actions-runner-controller: [Github Webhook HRA] Not able to get it working...
Hello everyone,
Since running into the scaling problems described in https://github.com/summerwind/actions-runner-controller/issues/206, I’ve found an efficient workaround.
I’m using multiple Kubernetes clusters (5, actually) with one GitHub Actions controller deployed on each one.
- Each controller manages a pool of 20 workers, autoscaled using the `type: PercentageRunnersBusy` method.
- Each controller uses its own GitHub App for GitHub API auth, which gives me approx. 6700 API calls per hour on each cluster.
- Each controller has a sync-period of 1m.
It’s working well, and it was the only solution I found to run 100 runners concurrently with the actions-runner-controller.
Anyway, that’s not why I’m here today. Since I’ve seen the new GitHub Webhook HRA feature, I absolutely need it so I can stop relying on this kind of workaround and use the controller “at scale”.
Unfortunately, I’m not able to get it working with the latest Helm chart version, 0.7.0.
I tried the `latest`, `v0.17.0`, and `canary` versions of the controller image, and I’m using the CRDs from the `master` branch.
When I declare the HRA like this:
    apiVersion: actions.summerwind.dev/v1alpha1
    kind: HorizontalRunnerAutoscaler
    metadata:
      name: actions-runner-aos-autoscaler
      namespace: default
    spec:
      scaleTargetRef:
        name: actions-runner-aos
      minReplicas: 1
      maxReplicas: 10
      scaleUpTriggers:
      - githubEvent:
          checkRun:
            types: ["created"]
            status: "queued"
        amount: 1
        duration: "5m"
the controller crashes with this log:
2021-03-08T14:40:39.333Z INFO controller-runtime.metrics metrics server is starting to listen {"addr": "127.0.0.1:8080"}
2021-03-08T14:40:39.333Z INFO controller-runtime.builder Registering a mutating webhook {"GVK": "actions.summerwind.dev/v1alpha1, Kind=Runner", "path": "/mutate-actions-summerwind-dev-v1alpha1-runner"}
2021-03-08T14:40:39.333Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-actions-summerwind-dev-v1alpha1-runner"}
2021-03-08T14:40:39.333Z INFO controller-runtime.builder Registering a validating webhook {"GVK": "actions.summerwind.dev/v1alpha1, Kind=Runner", "path": "/validate-actions-summerwind-dev-v1alpha1-runner"}
2021-03-08T14:40:39.333Z INFO controller-runtime.webhook registering webhook {"path": "/validate-actions-summerwind-dev-v1alpha1-runner"}
2021-03-08T14:40:39.333Z INFO controller-runtime.builder Registering a mutating webhook {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerDeployment", "path": "/mutate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2021-03-08T14:40:39.333Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2021-03-08T14:40:39.333Z INFO controller-runtime.builder Registering a validating webhook {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerDeployment", "path": "/validate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2021-03-08T14:40:39.333Z INFO controller-runtime.webhook registering webhook {"path": "/validate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2021-03-08T14:40:39.333Z INFO controller-runtime.builder Registering a mutating webhook {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerReplicaSet", "path": "/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2021-03-08T14:40:39.333Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2021-03-08T14:40:39.333Z INFO controller-runtime.builder Registering a validating webhook {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerReplicaSet", "path": "/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2021-03-08T14:40:39.334Z INFO controller-runtime.webhook registering webhook {"path": "/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2021-03-08T14:40:39.334Z INFO setup starting manager
2021-03-08T14:40:39.334Z INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
2021-03-08T14:40:39.435Z INFO controller-runtime.webhook.webhooks starting webhook server
2021-03-08T14:40:39.435Z INFO controller-runtime.certwatcher Updated current TLS certificate
2021-03-08T14:40:39.435Z INFO controller-runtime.webhook serving webhook server {"host": "", "port": 9443}
2021-03-08T14:40:39.436Z INFO controller-runtime.certwatcher Starting certificate watcher
2021-03-08T14:40:56.134Z DEBUG controller-runtime.manager.events Normal {"object": {"kind":"ConfigMap","namespace":"default","name":"controller-leader-election-helper","uid":"900760ed-cad7-435b-964f-e3694c664fbe","apiVersion":"v1","resourceVersion":"5323021"}, "reason": "LeaderElection", "message": "actions-controller-actions-runner-controller-554966bb8b-lbwvt_6caf86f4-a576-4e77-b0c5-51d19c018b26 became leader"}
2021-03-08T14:40:56.134Z INFO controller-runtime.controller Starting EventSource {"controller": "horizontalrunnerautoscaler", "source": "kind source: /, Kind="}
2021-03-08T14:40:56.134Z INFO controller-runtime.controller Starting EventSource {"controller": "runner", "source": "kind source: /, Kind="}
2021-03-08T14:40:56.134Z INFO controller-runtime.controller Starting EventSource {"controller": "runnerreplicaset", "source": "kind source: /, Kind="}
2021-03-08T14:40:56.134Z INFO controller-runtime.controller Starting EventSource {"controller": "runnerreplicaset", "source": "kind source: /, Kind="}
2021-03-08T14:40:56.135Z INFO controller-runtime.controller Starting EventSource {"controller": "runnerdeployment", "source": "kind source: /, Kind="}
2021-03-08T14:40:56.234Z INFO controller-runtime.controller Starting Controller {"controller": "horizontalrunnerautoscaler"}
2021-03-08T14:40:56.234Z INFO controller-runtime.controller Starting EventSource {"controller": "runner", "source": "kind source: /, Kind="}
2021-03-08T14:40:56.235Z INFO controller-runtime.controller Starting Controller {"controller": "runnerreplicaset"}
2021-03-08T14:40:56.235Z INFO controller-runtime.controller Starting EventSource {"controller": "runnerdeployment", "source": "kind source: /, Kind="}
2021-03-08T14:40:56.235Z INFO controller-runtime.controller Starting Controller {"controller": "runnerdeployment"}
2021-03-08T14:40:56.335Z INFO controller-runtime.controller Starting workers {"controller": "runnerreplicaset", "worker count": 1}
2021-03-08T14:40:56.335Z INFO controllers.RunnerReplicaSet debug {"runnerreplicaset": "default/actions-runner-aos-h9ppg", "desired": 1, "available": 1}
2021-03-08T14:40:56.335Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "runnerreplicaset", "request": "default/actions-runner-aos-h9ppg"}
2021-03-08T14:40:56.336Z INFO controller-runtime.controller Starting Controller {"controller": "runner"}
2021-03-08T14:40:56.335Z INFO controller-runtime.controller Starting workers {"controller": "horizontalrunnerautoscaler", "worker count": 1}
E0308 14:40:56.336609 1 runtime.go:78] Observed a panic: runtime.boundsError{x:0, y:0, signed:true, code:0x0} (runtime error: index out of range [0] with length 0)
goroutine 343 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x15aabe0, 0xc00027ed80)
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/runtime/runtime.go:74 +0xa6
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/runtime/runtime.go:48 +0x89
panic(0x15aabe0, 0xc00027ed80)
/usr/local/go/src/runtime/panic.go:969 +0x1b9
github.com/summerwind/actions-runner-controller/controllers.(*HorizontalRunnerAutoscalerReconciler).calculateReplicasByQueuedAndInProgressWorkflowRuns(0xc0002e2ac0, 0x13d38ba, 0x10, 0xc00027ed60, 0x1f, 0xc000704ca0, 0x12, 0x0, 0x0, 0xc00071cb40, ...)
/workspace/controllers/autoscaling.go:50 +0xe7e
github.com/summerwind/actions-runner-controller/controllers.(*HorizontalRunnerAutoscalerReconciler).determineDesiredReplicas(0xc0002e2ac0, 0x13d38ba, 0x10, 0xc00027ed60, 0x1f, 0xc000704ca0, 0x12, 0x0, 0x0, 0xc00071cb40, ...)
/workspace/controllers/autoscaling.go:31 +0xb8
github.com/summerwind/actions-runner-controller/controllers.(*HorizontalRunnerAutoscalerReconciler).computeReplicas(0xc0002e2ac0, 0x13d38ba, 0x10, 0xc00027ed60, 0x1f, 0xc000704ca0, 0x12, 0x0, 0x0, 0xc00071cb40, ...)
/workspace/controllers/horizontalrunnerautoscaler_controller.go:142 +0x7b
github.com/summerwind/actions-runner-controller/controllers.(*HorizontalRunnerAutoscalerReconciler).Reconcile(0xc0002e2ac0, 0xc00017a7e0, 0x7, 0xc00027f9e0, 0x1d, 0x428f095d4, 0xc000558cf0, 0xc0002d27e8, 0xc0002d27e0)
I tried deleting `minReplicas: 1` and `maxReplicas: 10` to follow the README.md example, but the controller is not happy with that either and keeps asking for `minReplicas` and `maxReplicas` to be set.
I know this feature is at an early stage, so I won’t be surprised if it isn’t working yet; I just wanted to be sure you are aware of this 😄
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 39 (6 by maintainers)
Commits related to this issue
- Use --watch-namespace flag to restrict the namespace to watch Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793172995 — committed to actions/actions-runner-controller by mumoshu 3 years ago
- Fix panic on scaling organizational runners Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793287133 — committed to actions/actions-runner-controller by mumoshu 3 years ago
- Fix panic on scaling organizational runners (#381) Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793287133 — committed to actions/actions-runner-controller by mumoshu 3 years ago
- Make webhook-based autoscaler github event logs more operator-friendly Adds fields like `pullRequest.base.ref` and `checkRun.status` that are useful for verifying the autoscaling behaviour without br... — committed to actions/actions-runner-controller by mumoshu 3 years ago
- Make webhook-based autoscaler github event logs more operator-friendly (#384) Adds fields like `pullRequest.base.ref` and `checkRun.status` that are useful for verifying the autoscaling behaviour wit... — committed to actions/actions-runner-controller by mumoshu 3 years ago
- Fix PercentageRunnersBusy scaling not working PercentageRunnerBusy seems to have regressed since #355 due to that RunnerDeployment.Spec.Selector is empty by default and the HRA controller was using t... — committed to actions/actions-runner-controller by mumoshu 3 years ago
- Fix PercentageRunnersBusy scaling not working (#386) PercentageRunnerBusy seems to have regressed since #355 due to that RunnerDeployment.Spec.Selector is empty by default and the HRA controller was ... — committed to actions/actions-runner-controller by mumoshu 3 years ago
- Disable metrics-based autoscaling by default when scaleUpTriggers are enabled Relates to https://github.com/summerwind/actions-runner-controller/pull/379#discussion_r592813661 Relates to https://gith... — committed to actions/actions-runner-controller by mumoshu 3 years ago
- Disable metrics-based autoscaling by default when scaleUpTriggers are enabled (#391) Relates to https://github.com/summerwind/actions-runner-controller/pull/379#discussion_r592813661 Relates to http... — committed to actions/actions-runner-controller by mumoshu 3 years ago
@mumoshu Hello Yusuke, I’ve heavily tested the new `watch-namespace` feature you’ve implemented. I can say that it’s working very well 😉 I’ve launched 1 cluster with 5 namespaced controllers, each one in charge of 20 runners, with a sync-period of 1m. And… it’s amazing, that’s it 😄 Nothing more to say except thanks a lot again.
By the way, I’m going to test the GitHub Webhook HRA a little more on my side to find the best autoscaling mechanism for my case. I think you’re right: `PercentageRunnersBusy` fits my case well, with multiple `watch-namespace` controllers. Now I’m able to scale up to 100 runners constantly without any GitHub API limitation, with 1 cluster! 🥇
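For readers who want to reproduce this layout, the fragment below sketches how one of the namespaced controllers might be started. Only the `--watch-namespace` flag (named in the related commit) and the 1m sync period come from this thread. The container name, the `--sync-period` spelling of that setting, and the `runners-team-a` namespace are assumptions to check against your own chart or manifest version.

```yaml
# Fragment of a controller Deployment's pod spec (names are hypothetical).
containers:
- name: manager                          # assumed container name
  args:
  - "--watch-namespace=runners-team-a"   # hypothetical namespace, one per controller
  - "--sync-period=1m"                   # sync period mentioned in this thread
```

Under this sketch, each of the five controllers would get a different namespace value and would only reconcile the runner resources created in that namespace.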
I had the same issue: the HorizontalRunnerAutoscaler still requires one of the “normal” scaling types under `metrics` as well as `scaleUpTriggers`. This works for me:
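The manifest that followed this comment is not preserved in this copy of the thread. Purely as an illustration, and not the commenter’s actual configuration, an HRA that declares both a `metrics` entry and `scaleUpTriggers` might look like the sketch below. The resource names are reused from the first post, and the threshold and factor values are assumed placeholders to tune for your own workload.

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: actions-runner-aos-autoscaler
  namespace: default
spec:
  scaleTargetRef:
    name: actions-runner-aos
  # Both bounds are required, with or without `metrics`.
  minReplicas: 1
  maxReplicas: 10
  # A "normal" pull-based metric declared alongside the webhook triggers.
  metrics:
  - type: PercentageRunnersBusy
    # Assumed, illustrative values; not taken from this thread.
    scaleUpThreshold: '0.75'
    scaleDownThreshold: '0.3'
    scaleUpFactor: '1.4'
    scaleDownFactor: '0.7'
  scaleUpTriggers:
  - githubEvent:
      checkRun:
        types: ["created"]
        status: "queued"
    amount: 1
    duration: "5m"
```

Note that other comments in this thread report that, since #391, the `metrics` block can be omitted so that `scaleUpTriggers` are used alone.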
@mumoshu It’s working again 😄 ! Thanks a lot, let’s test now 🚀
I’ve come to think this is confusing, and there’s no actual benefit in making it the default behavior. I’ve changed the controller code, and since #391, omitting `Metrics[]` just results in `ScaleUpTriggers[]` being used alone. That way, the controller completely skips GitHub API calls for autoscaling, which alleviates the rate-limit issue!
@theobolo Thanks! FYI, I’ve just merged #386 and the `canary` tag will be updated soon.
@theobolo I now believe it was due to a regression introduced in #355. #386 should fix it.
@avdhoot The fix should be available in the current `canary` image. Would you mind giving it a shot?
@avdhoot No. But omitting `metrics` results in the use of the TotalNumberOfQueuedAndInProgressWorkflowRuns metric: https://github.com/summerwind/actions-runner-controller/blob/4fa53153111489691c57cee9cd11fdafb9e3d5bd/controllers/autoscaling.go#L75
Also, `minReplicas` and `maxReplicas` are required regardless of whether you configure `metrics` or not (https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-792855412).