actions-runner-controller: RunnerSet Runners fail to start with "RUNNER_NAME must be set" after upgrade to 0.22.1

Describe the bug

After upgrading ARC to 0.22.1 (also reproduced with 0.22.2; chart versions 0.17.1/0.17.2), we noticed that newly created pods from our RunnerSet fail to start and show this error message in the logs:

RUNNER_NAME must be set

I rolled back the controller to 0.22.0 (chart 0.17.0), and the RunnerSets start normally again.

RunnerDeployments are working in all versions.

Checks

  • My actions-runner-controller version (v0.x.y) does support the feature
  • [] I’m using an unreleased version of the controller I built from HEAD of the default branch

To Reproduce

  1. Install actions-runner-controller chart 0.17.1 with the following values. The pre-created secret holds the credentials from the GitHub App that is registered in the organisation.
authSecret:
  name: controller-manager
scope:
  singleNamespace: true
image:
  pullPolicy: Always
resources:
  requests:
    cpu: 10m
    memory: 128Mi
  limits:
    cpu: 1
    memory: 256M
  2. Create a RunnerSet with this manifest:
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerSet
metadata:
  name: vpc-devservices-runnerset
  namespace: asys-vpc-github-runner
spec:
  ephemeral: false
  organization: our-organisation
  labels:
  - vpc-devservices-runnerset
  replicas: 1
  selector:
    matchLabels:
      app: vpc-devservices-runnerset
  serviceName: vpc-devservices-runnerset
  template:
    metadata:
      labels:
        app: vpc-devservices-runnerset
    spec:
      containers:
        - name: runner
          resources:
            requests:
              memory: 500Mi
              cpu: 10m
            limits:
              memory: 2Gi
              cpu: 1
  3. The runner container exits with RC=1 and the following log:
Waiting until Docker is avaliable or the timeout is reached
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
Github endpoint URL https://github.com/
RUNNER_NAME must be set

Log of the manager container in ARC:

I0405 15:05:33.209132       1 request.go:665] Waited for 1.033433053s due to client-side throttling, not priority and fairness, request: GET:https://10.240.16.1:443/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s
2022-04-05T15:05:33Z	INFO	controller-runtime.metrics	Metrics server is starting to listen	{"addr": "127.0.0.1:8080"}
2022-04-05T15:05:33Z	INFO	actions-runner-controller	Initializing actions-runner-controller	{"github-api-cache-duration": "9m50s", "sync-period": "10m0s", "runner-image": "summerwind/actions-runner:latest", "docker-image": "docker:dind", "common-runnner-labels": null, "watch-namespace": "asys-vpc-github-runner"}
2022-04-05T15:05:33Z	INFO	controller-runtime.builder	Registering a mutating webhook	{"GVK": "actions.summerwind.dev/v1alpha1, Kind=Runner", "path": "/mutate-actions-summerwind-dev-v1alpha1-runner"}
2022-04-05T15:05:33Z	INFO	controller-runtime.webhook	Registering webhook	{"path": "/mutate-actions-summerwind-dev-v1alpha1-runner"}
2022-04-05T15:05:33Z	INFO	controller-runtime.builder	Registering a validating webhook	{"GVK": "actions.summerwind.dev/v1alpha1, Kind=Runner", "path": "/validate-actions-summerwind-dev-v1alpha1-runner"}
2022-04-05T15:05:33Z	INFO	controller-runtime.webhook	Registering webhook	{"path": "/validate-actions-summerwind-dev-v1alpha1-runner"}
2022-04-05T15:05:33Z	INFO	controller-runtime.builder	Registering a mutating webhook	{"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerDeployment", "path": "/mutate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-04-05T15:05:33Z	INFO	controller-runtime.webhook	Registering webhook	{"path": "/mutate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-04-05T15:05:33Z	INFO	controller-runtime.builder	Registering a validating webhook	{"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerDeployment", "path": "/validate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-04-05T15:05:33Z	INFO	controller-runtime.webhook	Registering webhook	{"path": "/validate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-04-05T15:05:33Z	INFO	controller-runtime.builder	Registering a mutating webhook	{"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerReplicaSet", "path": "/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-04-05T15:05:33Z	INFO	controller-runtime.webhook	Registering webhook	{"path": "/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-04-05T15:05:33Z	INFO	controller-runtime.builder	Registering a validating webhook	{"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerReplicaSet", "path": "/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-04-05T15:05:33Z	INFO	controller-runtime.webhook	Registering webhook	{"path": "/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-04-05T15:05:33Z	INFO	controller-runtime.webhook	Registering webhook	{"path": "/mutate-runner-set-pod"}
2022-04-05T15:05:33Z	INFO	actions-runner-controller	starting manager
2022-04-05T15:05:33Z	INFO	controller-runtime.webhook.webhooks	Starting webhook server
2022-04-05T15:05:33Z	INFO	controller-runtime.certwatcher	Updated current TLS certificate
2022-04-05T15:05:33Z	INFO	controller-runtime.webhook	Serving webhook server	{"host": "", "port": 9443}
2022-04-05T15:05:33Z	INFO	Starting server	{"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8080"}
2022-04-05T15:05:33Z	INFO	controller-runtime.certwatcher	Starting certificate watcher
I0405 15:05:34.069226       1 leaderelection.go:248] attempting to acquire leader lease asys-vpc-github-runner/actions-runner-controller...
I0405 15:05:54.657952       1 leaderelection.go:258] successfully acquired lease asys-vpc-github-runner/actions-runner-controller
2022-04-05T15:05:54Z	DEBUG	events	Normal	{"object": {"kind":"ConfigMap","namespace":"asys-vpc-github-runner","name":"actions-runner-controller","uid":"01ec3990-6bd8-4257-a590-8693dfbe60ad","apiVersion":"v1","resourceVersion":"91592721"}, "reason": "LeaderElection", "message": "asys-vpc-actions-runner-controller-7bbb657bb4-jztvd_0b453fca-bf0e-447b-b38f-50fb76d9218b became leader"}
2022-04-05T15:05:54Z	DEBUG	events	Normal	{"object": {"kind":"Lease","namespace":"asys-vpc-github-runner","name":"actions-runner-controller","uid":"540b0336-c7c6-44bb-95bd-6c5c68e6d739","apiVersion":"coordination.k8s.io/v1","resourceVersion":"91592722"}, "reason": "LeaderElection", "message": "asys-vpc-actions-runner-controller-7bbb657bb4-jztvd_0b453fca-bf0e-447b-b38f-50fb76d9218b became leader"}
2022-04-05T15:05:54Z	INFO	controller.runner-controller	Starting EventSource	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner", "source": "kind source: *v1alpha1.Runner"}
2022-04-05T15:05:54Z	INFO	controller.runner-controller	Starting EventSource	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner", "source": "kind source: *v1.Pod"}
2022-04-05T15:05:54Z	INFO	controller.runner-controller	Starting Controller	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner"}
2022-04-05T15:05:54Z	INFO	controller.runnerdeployment-controller	Starting EventSource	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment", "source": "kind source: *v1alpha1.RunnerDeployment"}
2022-04-05T15:05:54Z	INFO	controller.runnerdeployment-controller	Starting EventSource	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment", "source": "kind source: *v1alpha1.RunnerReplicaSet"}
2022-04-05T15:05:54Z	INFO	controller.runnerdeployment-controller	Starting Controller	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment"}
2022-04-05T15:05:54Z	INFO	controller.runnerreplicaset-controller	Starting EventSource	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet", "source": "kind source: *v1alpha1.RunnerReplicaSet"}
2022-04-05T15:05:54Z	INFO	controller.runnerreplicaset-controller	Starting EventSource	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet", "source": "kind source: *v1alpha1.Runner"}
2022-04-05T15:05:54Z	INFO	controller.runnerreplicaset-controller	Starting Controller	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet"}
2022-04-05T15:05:54Z	INFO	controller.runnerpod-controller	Starting EventSource	{"reconciler group": "", "reconciler kind": "Pod", "source": "kind source: *v1.Pod"}
2022-04-05T15:05:54Z	INFO	controller.runnerpod-controller	Starting Controller	{"reconciler group": "", "reconciler kind": "Pod"}
2022-04-05T15:05:54Z	INFO	controller.runnerset-controller	Starting EventSource	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet", "source": "kind source: *v1alpha1.RunnerSet"}
2022-04-05T15:05:54Z	INFO	controller.runnerset-controller	Starting EventSource	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet", "source": "kind source: *v1.StatefulSet"}
2022-04-05T15:05:54Z	INFO	controller.runnerset-controller	Starting Controller	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet"}
2022-04-05T15:05:54Z	INFO	controller.horizontalrunnerautoscaler-controller	Starting EventSource	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "HorizontalRunnerAutoscaler", "source": "kind source: *v1alpha1.HorizontalRunnerAutoscaler"}
2022-04-05T15:05:54Z	INFO	controller.horizontalrunnerautoscaler-controller	Starting Controller	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "HorizontalRunnerAutoscaler"}
2022-04-05T15:05:54Z	INFO	controller.runner-controller	Starting workers	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner", "worker count": 1}
2022-04-05T15:05:54Z	INFO	controller.runnerdeployment-controller	Starting workers	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment", "worker count": 1}
2022-04-05T15:05:54Z	INFO	controller.runnerpod-controller	Starting workers	{"reconciler group": "", "reconciler kind": "Pod", "worker count": 1}
2022-04-05T15:05:54Z	INFO	controller.runnerreplicaset-controller	Starting workers	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet", "worker count": 1}
2022-04-05T15:05:54Z	INFO	controller.runnerset-controller	Starting workers	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet", "worker count": 1}
2022-04-05T15:05:54Z	INFO	controller.horizontalrunnerautoscaler-controller	Starting workers	{"reconciler group": "actions.summerwind.dev", "reconciler kind": "HorizontalRunnerAutoscaler", "worker count": 1}
2022-04-05T15:06:24Z	DEBUG	actions-runner-controller.runnerset	Created replica(s)	{"runnerset": "asys-vpc-github-runner/vpc-devservices-runnerset", "lastSyncTime": null, "effectiveTime": "<nil>", "templateHashDesired": "7bfb6646c9", "replicasDesired": 1, "replicasPending": 0, "replicasRunning": 0, "replicasMaybeRunning": 0, "templateHashObserved": [], "created": 1}
2022-04-05T15:06:24Z	DEBUG	actions-runner-controller.runnerset	Skipped reconcilation because owner is not synced yet	{"runnerset": "asys-vpc-github-runner/vpc-devservices-runnerset", "owner": "asys-vpc-github-runner/vpc-devservices-runnerset-24628", "pods": null}
  4. Delete the RunnerSet
  5. Roll back ARC to chart 0.17.0
  6. Recreate the RunnerSet => the runner pods start normally and register with the GitHub organization

Expected behavior

Runner Pods from the RunnerSet StatefulSets start up and register successfully in the GitHub organization.

Environment (please complete the following information):

  • Controller Version [0.22.1, 0.22.2]
  • Deployment Method [Helm]
  • Helm Chart Version [0.17.1, 0.17.2]
  • K8S 1.20.7

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 19

Most upvoted comments

@tbomberg Ah sorry I missed that you changed the namespace of the example RunnerSet to actions-runner-system. It should align with scope.watchNamespace and the namespaceSelector of mutatingwebhookconfig so almost everything looks good now.

The last missing piece, which I just noticed, might be that we missed labeling the actions-runner-system namespace with name=actions-runner-system.

Rereading your mutatingwebhookconfig:

  namespaceSelector:
    matchLabels:
      name: actions-runner-system

This says that the namespace must carry a label with the key name and the value actions-runner-system. A Helm chart is unable to modify an existing namespace to add that label, so it must be done by you. Try kubectl label ns actions-runner-system name=actions-runner-system.

I wish there was any way to let a mutating webhook match the target namespace by name, but apparently there’s no way 😢
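To illustrate why the missing label matters, here is a minimal Python sketch of how a matchLabels selector is evaluated: every key/value pair in the selector has to be present on the namespace, otherwise the webhook is not called for pods in it. That is presumably why the runner pods came up without RUNNER_NAME. The helper names selector_matches and missing_labels are made up for this example and are not part of ARC or Kubernetes, and the namespace label sets are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def selector_matches(match_labels: Dict[str, str],
                     namespace_labels: Dict[str, str]) -> bool:
    """Return True when every matchLabels pair is present on the namespace.

    An empty matchLabels dict matches every namespace.
    """
    return all(namespace_labels.get(key) == value
               for key, value in match_labels.items())


def missing_labels(match_labels: Dict[str, str],
                   namespace_labels: Dict[str, str]) -> List[Tuple[str, str]]:
    """List the (key, value) pairs the namespace still needs to match."""
    return [(key, value) for key, value in match_labels.items()
            if namespace_labels.get(key) != value]


# namespaceSelector.matchLabels as quoted from the mutatingwebhookconfig.
webhook_match_labels = {"name": "actions-runner-system"}

# Assumption: the namespace was created without any labels.
unlabeled_namespace: Dict[str, str] = {}
# The same namespace after the suggested kubectl label command.
labeled_namespace = {"name": "actions-runner-system"}

print(selector_matches(webhook_match_labels, unlabeled_namespace))  # False
print(missing_labels(webhook_match_labels, unlabeled_namespace))
print(selector_matches(webhook_match_labels, labeled_namespace))    # True
```

Each pair returned by missing_labels corresponds to one key=value argument you would pass to kubectl label ns.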

@rxa313 Thanks for reporting! Yes, neither ARC nor the chart labels the namespace automatically, so I believe this is where we need to update the documentation (or perhaps that of our chart, next to the descriptions of watchNamespace and singleNamespace).

@mumoshu I just wanted to let you know that this was not fixed in 0.24.1 or 0.25.2. In both cases I tried to upgrade from 0.21.1 to 0.25.2 after updating all CRDs, and I even wiped all CRDs from my cluster, uninstalled everything, and installed from scratch to make sure.

I was finally able to stop my StatefulSet from constantly restarting after running: kubectl label ns actions-runner-system name=actions-runner-system