actions-runner-controller: webhook autoscaler doesn't recognize organization target

I keep getting this in logs:

github-webhook-server 2021-05-09T11:48:28.986Z	INFO	controllers.Runner	Scale target not found. If this is unexpected, ensure that there is exactly one repository-wide or organizational runner deployment that matches this webhook event	{"event": "check_run", "hookID": "296414796", "delivery": "787a18e0-b0bc-11eb-8d24-0ff6126380d2", "checkRun.status": "completed", "action": "completed"}

It looks like it can’t pull the org name or type from the event payload (but I might be misunderstanding the code).

This is my config:

---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: kubernetes-1cores-2048mi-summerwind-autoscaler
spec:
  scaleTargetRef:
    name: kubernetes-1cores-2048mi-summerwind
  scaleUpTriggers:
  - githubEvent:
      checkRun:
        types: ["created"]
        status: "queued"
    amount: 3
    duration: "5m"
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: kubernetes-1cores-2048mi-summerwind
spec:
  template:
    spec:
      organization: Myorg-Private
      labels:
        - kubernetes-1cores-2048mi-summerwind
      ephemeral: true
      env:
        - name: RUNNER_DEBUG
          value: "true"
        - name: ACTIONS_RUNNER_INPUT_LABELS
          value: kubernetes-1cores-2048mi-summerwind
      image: my-custom-image:v1
      resources:
        requests:
          cpu: "1"
          memory: "2048Mi"
        limits:
          cpu: "1"
          memory: "2048Mi"
      dockerEnabled: false

I’m deploying the controller using the Helm chart from my own branch, which shouldn’t matter, but just in case…

Here’s my values file:

authSecret:
  create: false
  name: controller-manager
createDummySecret: false
githubWebhookServer:
  enabled: true
  imagePullSecrets: []
  ingress:
    enabled: true
    hosts:
      - host: summerwind-webhook-listener.myorg.com
        paths:
          - backend:
              serviceName: summerwind-actions-runner-controller-webhook
              servicePort: http
    tls:
      - hosts:
          - summerwind-webhook-listener.myorg.com
        secretName: myorg.com
  nameOverride: summerwind-webhook-listener
  replicaCount: 1
  secret:
    create: true
    name: github-webhook-server
metrics:
  proxy:
    enabled: false
  serviceMonitor: true

About this issue

State: closed
Created 3 years ago
Comments: 33 (3 by maintainers)

Commits related to this issue

chore: Enhance acceptance test to cover webhook-based autoscaling for repo and org runners Adds what I used while verifying #534 — committed to actions/actions-runner-controller by mumoshu 3 years ago
Improve debug log in webhook-based autoscaling Adds some helpful debug log messages I have used while verifying #534 — committed to actions/actions-runner-controller by mumoshu 3 years ago

Most upvoted comments

@fabiano-amaral Thanks a lot for the info! I have not reproduced your issue yet, but I’ll definitely keep your information in mind and report back if there’s any update 👍

mumoshu on Nov 4, 2021

Yeah, I moved to the workflow_job event and I still have 5 runners in offline mode… I’ll wait a bit longer to see, but they were removed as soon as the jobs completed and still show as offline… I’ll run the same test workflow again to see whether it creates 5 new runners and then leaves them offline as well after it finishes. If it does, I’ll open a new issue with all the details…

UrosCvijan on Jul 26, 2022
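
For readers following along, here is a minimal sketch of what switching an HRA from check_run to workflow_job triggers might look like. It reuses the resource names from the original post; the `duration` value is an assumption, and the `workflowJob` trigger is only available in controller versions that support workflow_job events, so check this against the version you run.

```yaml
# Sketch only: HRA scaling on workflow_job events instead of check_run.
# Resource names are taken from the original post; "30m" is an assumed value.
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: kubernetes-1cores-2048mi-summerwind-autoscaler
spec:
  scaleTargetRef:
    name: kubernetes-1cores-2048mi-summerwind
  scaleUpTriggers:
  - githubEvent:
      workflowJob: {}
    duration: "30m"
```

The GitHub webhook itself also has to be configured to deliver workflow_job events, otherwise the trigger never fires.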

@fabiano-amaral Thanks for the info! Could you clarify a bit more? I have no idea how the minReplicas and maxReplicas of an HRA would affect it, because actions-runner-controller doesn’t use those fields to select the right HRA for a webhook event.

@mumoshu Yes, I totally agree with you, but the only thing that worked for us was adding these settings to the HRA.

Min and max can MAYBE be used to cap the number of jobs that can be started at the same time, so you can limit how many pods spawn, and minReplicas can keep idle jobs waiting for a webhook event.

But that’s only my theory; I haven’t debugged the source code. This information could be in the documentation.

fabiano-amaral on Nov 2, 2021
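
As an illustration of the workaround described above, this is a sketch of the HRA from the original post with `minReplicas` and `maxReplicas` added. The values 1 and 10 are assumptions chosen for the example. Whether these fields actually influence the "Scale target not found" message is unconfirmed, since the maintainer notes they are not used to match an HRA to a webhook event.

```yaml
# Sketch only: original HRA plus replica bounds (values are assumed).
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: kubernetes-1cores-2048mi-summerwind-autoscaler
spec:
  scaleTargetRef:
    name: kubernetes-1cores-2048mi-summerwind
  minReplicas: 1    # assumed: runners kept around while idle
  maxReplicas: 10   # assumed: upper bound on runner pods
  scaleUpTriggers:
  - githubEvent:
      checkRun:
        types: ["created"]
        status: "queued"
    amount: 3
    duration: "5m"
```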

BTW, to be clear, although this issue says webhook autoscaler doesn't recognize organization target, I’ve successfully tested it in my environment.

So I still believe this was some kind of user error, even though there might be documentation or operational enhancements we need to make.

mumoshu on May 26, 2021

// So depending on your requirement, you’d need to raise feature requests to GitHub, not us.

mumoshu on May 26, 2021

@awoimbee How can you differentiate runners? Basically, webhook event payloads do not contain information about which runner (with certain labels, groups, orgs, repositories, etc.) the webhook event is going to trigger a workflow job run on.

If your workflows have job names that are unique enough per org/repo/labels/groups/etc., then for scaling based on check run events you can set checkRun.names on the HRA so that the HRA only reacts to check runs with those names.

mumoshu on May 26, 2021
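
A sketch of the name-based filtering suggested above, again reusing the HRA from the original post. The job names `build` and `integration-test` are hypothetical placeholders; substitute the job names your workflows actually use.

```yaml
# Sketch only: restrict scale-up to check runs with specific names.
# "build" and "integration-test" are hypothetical job names.
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: kubernetes-1cores-2048mi-summerwind-autoscaler
spec:
  scaleTargetRef:
    name: kubernetes-1cores-2048mi-summerwind
  scaleUpTriggers:
  - githubEvent:
      checkRun:
        types: ["created"]
        status: "queued"
        names: ["build", "integration-test"]
    amount: 3
    duration: "5m"
```

With this in place, check_run events for jobs with other names should not scale this runner deployment, which is one way to approximate per-runner targeting when the payload itself carries no runner information.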