actions-runner-controller: Autoscaler does not scale down

runner config:

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: starcoin-runner-deployment
spec:
  template:
    spec:
      nodeSelector:
        doks.digitalocean.com/node-pool: ci-pool2
      image: starcoin/starcoin-runner:v2.275.1.20210104
      repository: starcoinorg/starcoin

      resources:
        requests:
          cpu: "24.0"
          memory: "48Gi"
      # If set to false, there is no privileged container and you cannot use docker.
      dockerEnabled: true
      # If set to true, the runner pod contains only one container, which is expected to be able to run docker, too.
      # The image summerwind/actions-runner-dind or a custom one should be used when this is set to true.
      dockerdWithinRunnerContainer: false
      # Valid if dockerdWithinRunnerContainer is not true
      dockerdContainerResources:
        requests:
          cpu: "24.0"
          memory: "48Gi"

---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: starcoin-runner-deployment-autoscaler
spec:
  scaleTargetRef:
    name: starcoin-runner-deployment
  minReplicas: 1
  maxReplicas: 6
  scaleDownDelaySecondsAfterScaleOut: 120
  metrics:
    - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
      repositoryNames:
        - starcoinorg/starcoin

  • --sync-period=2m

kubectl get HorizontalRunnerAutoscaler
NAME                                    MIN   MAX   DESIRED
starcoin-runner-deployment-autoscaler   1     6     1
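
For context, --sync-period is a flag on the actions-runner-controller manager process, not a field on the HorizontalRunnerAutoscaler. A minimal sketch of where it is usually set, assuming a stock installation (the Deployment name, namespace, and container name below are assumptions and may differ in your cluster):

# Fragment of the controller Deployment; only the relevant parts are shown.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: actions-runner-controller    # assumed name
  namespace: actions-runner-system   # assumed namespace
spec:
  template:
    spec:
      containers:
        - name: manager              # the controller container
          args:
            - "--sync-period=2m"     # how often the controller polls and reconciles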

But once the runners have scaled out to 6, they do not scale back down. Even if I delete a runner pod manually, a new runner pod is created automatically.
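
When this happens, it can help to check what the autoscaler and controller think is going on before deleting pods by hand. A hedged set of diagnostic commands (the controller namespace and deployment name are assumptions; adjust them to your install):

# Show the autoscaler's status, conditions and recent events
kubectl describe horizontalrunnerautoscaler starcoin-runner-deployment-autoscaler

# List the Runner resources the controller is maintaining; deleting only the pod
# is not enough, because the controller recreates pods for existing Runner objects
kubectl get runners

# The controller logs usually say why a runner is still considered busy or cannot be unregistered
kubectl logs -n actions-runner-system deploy/actions-runner-controller -c manager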

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 18

Most upvoted comments

@jolestar Hey! Are you still using actions-runner-controller?

FYI, I’ve recently summarized how our controller can get stuck when runners fail to register for various reasons.

We’re far from “fixing” all the root causes because they vary a lot, but the universal fix may be #297, which I’m currently working on.

@jolestar Thanks for reporting! Glad to hear it worked.

@jolestar Thanks. I think we’re close. Would you mind sharing the result of kubectl get po -o yaml starcoin-runner-deployment-vt5bk-2tvk8 (or that of any runner pod that failed in the same way), along with the output of the date command on your machine or the controller pod?
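
For anyone gathering the same information, the requested data amounts to roughly the following commands (the pod name is the one mentioned above; substitute any failed runner pod, and the exec form assumes the default controller namespace and deployment name):

# Dump the failed runner pod, including status and timestamps
kubectl get po starcoin-runner-deployment-vt5bk-2tvk8 -o yaml

# Current time, for comparing against the timestamps in the pod status
date -u
# ...or from inside the controller pod:
kubectl exec -n actions-runner-system deploy/actions-runner-controller -- date -u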