actions-runner-controller: Autoscaler does not scale down
runner config:

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: starcoin-runner-deployment
spec:
  template:
    spec:
      nodeSelector:
        doks.digitalocean.com/node-pool: ci-pool2
      image: starcoin/starcoin-runner:v2.275.1.20210104
      repository: starcoinorg/starcoin
      resources:
        requests:
          cpu: "24.0"
          memory: "48Gi"
      # If set to false, there is no privileged container and you cannot use docker.
      dockerEnabled: true
      # If set to true, the runner pod contains only 1 container, which is expected to be able to run docker, too.
      # The image summerwind/actions-runner-dind (or a custom one) should be used with the true value.
      dockerdWithinRunnerContainer: false
      # Valid if dockerdWithinRunnerContainer is not true
      dockerdContainerResources:
        requests:
          cpu: "24.0"
          memory: "48Gi"
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: starcoin-runner-deployment-autoscaler
spec:
  scaleTargetRef:
    name: starcoin-runner-deployment
  minReplicas: 1
  maxReplicas: 6
  scaleDownDelaySecondsAfterScaleOut: 120
  metrics:
  - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
    repositoryNames:
    - starcoinorg/starcoin
Controller flag: --sync-period=2m
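For context, one quick way to confirm the controller actually picked up that flag is to inspect the manager container's args. The namespace and deployment name below are assumptions based on a default install; adjust them to your cluster:

# Assumed default install: namespace actions-runner-system, deployment controller-manager.
kubectl -n actions-runner-system get deployment controller-manager \
  -o jsonpath='{.spec.template.spec.containers[*].args}'
# --sync-period=2m should appear in the printed args if the flag is in effect.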
kubectl get HorizontalRunnerAutoscaler
NAME                                     MIN   MAX   DESIRED
starcoin-runner-deployment-autoscaler    1     6     1
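When the count is stuck at the maximum, it helps to look past the summary columns at the autoscaler status and the individual Runner resources. A rough sketch (the <runner-name> placeholder is whatever kubectl get runners prints for the runner that refuses to go away):

# Full autoscaler object, including the status the DESIRED column is derived from.
kubectl get horizontalrunnerautoscaler starcoin-runner-deployment-autoscaler -o yaml

# Runner resources created by the RunnerDeployment; each one backs a runner pod.
kubectl get runners

# Events and details for a single stuck runner.
kubectl describe runner <runner-name>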
But if the runner autoscales up to 6, it does not scale back down, and even if I delete a runner pod manually, a new runner pod is automatically created in its place.
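Since the TotalNumberOfQueuedAndInProgressWorkflowRuns metric is computed from the repository's workflow runs, it can be worth checking what GitHub itself reports before blaming the controller. A hedged sketch using the standard "list workflow runs for a repository" API (GITHUB_TOKEN is assumed to be a personal access token with access to the repo):

# Number of queued workflow runs in the watched repository.
curl -s -H "Authorization: token $GITHUB_TOKEN" \
  "https://api.github.com/repos/starcoinorg/starcoin/actions/runs?status=queued" | jq '.total_count'

# Number of in-progress workflow runs.
curl -s -H "Authorization: token $GITHUB_TOKEN" \
  "https://api.github.com/repos/starcoinorg/starcoin/actions/runs?status=in_progress" | jq '.total_count'

# If both counts are 0 but DESIRED stays at 6, the autoscaler is not scaling down as configured.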
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 18
@jolestar Hey! Are you still using actions-runner-controller?
FYI, I’ve recently summarized how our controller can get stuck due to runners being unable to be registered for various reasons.
We’re far from “fixing” all the root causes because they vary a lot, but the universal fix may be #297, which I’m working on currently.
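If you want to check whether you are hitting the registration problem described above, the controller logs are usually where it shows up first. The namespace, deployment, and container names below are assumptions based on a default install; adjust as needed:

# Look for registration/token errors in the manager's recent log output.
kubectl -n actions-runner-system logs deploy/controller-manager -c manager --tail=200 \
  | grep -iE 'register|token'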
@jolestar Thanks for reporting! Glad to hear it worked.
@jolestar Thanks. I think we’re close. Would you mind sharing the result of
kubectl get po -o yaml starcoin-runner-deployment-vt5bk-2tvk8
(or that of any runner pod that failed in the same way), along with the output of the date command on your machine or controller pod?
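For completeness, the two pieces of information being requested can be collected like this (presumably so timestamps in the pod spec can be compared against the current clock):

# Dump the failing runner pod and record the current time in UTC.
kubectl get po -o yaml starcoin-runner-deployment-vt5bk-2tvk8 > runner-pod.yaml
date -u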