kueue: Non-leading replica fails due to cert-controller not starting
What happened:
If I run two replicas, the manager crashes after a while; it looks like the health probe fails and the pod gets restarted.
What you expected to happen:
Both pods keep running fine.
How to reproduce it (as minimally and precisely as possible):
Run the Kueue controller manager with two replicas; after a while, the non-leading replica crashes.
Anything else we need to know?:
I1212 18:02:44.725166 1 leaderelection.go:250] attempting to acquire leader lease kueue-system/c1f6bfd2.kueue.x-k8s.io...
{"level":"info","ts":"2023-12-12T18:02:44.725142061Z","caller":"controller/controller.go:178","msg":"Starting EventSource","controller":"cert-rotator","source":"kind source: *v1.Secret"}
{"level":"info","ts":"2023-12-12T18:02:44.726132561Z","caller":"controller/controller.go:178","msg":"Starting EventSource","controller":"cert-rotator","source":"kind source: *unstructured.Unstructured"}
{"level":"info","ts":"2023-12-12T18:02:44.726421287Z","caller":"controller/controller.go:178","msg":"Starting EventSource","controller":"cert-rotator","source":"kind source: *unstructured.Unstructured"}
{"level":"info","ts":"2023-12-12T18:02:44.726453798Z","caller":"controller/controller.go:186","msg":"Starting Controller","controller":"cert-rotator"}
{"level":"error","ts":"2023-12-12T18:04:44.726776367Z","caller":"controller/controller.go:203","msg":"Could not wait for Cache to sync","controller":"cert-rotator","error":"failed to wait for cert-rotator caches to sync: timed out waiting for cache to be synced for Kind *v1.Secret","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:203\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:208\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:234\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/manager/runnable_group.go:223"}
{"level":"info","ts":"2023-12-12T18:04:44.72690754Z","caller":"manager/internal.go:516","msg":"Stopping and waiting for non leader election runnables"}
{"level":"info","ts":"2023-12-12T18:04:44.72693391Z","caller":"manager/internal.go:520","msg":"Stopping and waiting for leader election runnables"}
{"level":"error","ts":"2023-12-12T18:04:44.726934751Z","caller":"manager/internal.go:490","msg":"error received after stop sequence was engaged","error":"failed waiting for reader to sync","errorVerbose":"failed waiting for reader to sync\ngithub.com/open-policy-agent/cert-controller/pkg/rotator.(*CertRotator).Start\n\t/go/pkg/mod/github.com/open-policy-agent/cert-controller@v0.10.0/pkg/rotator/rotator.go:258\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/manager/runnable_group.go:223\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650","stacktrace":"sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/manager/internal.go:490"}
{"level":"info","ts":"2023-12-12T18:04:44.72773714Z","caller":"manager/internal.go:526","msg":"Stopping and waiting for caches"}
{"level":"info","ts":"2023-12-12T18:04:44.727923434Z","caller":"manager/internal.go:530","msg":"Stopping and waiting for webhooks"}
{"level":"info","ts":"2023-12-12T18:04:44.727976656Z","caller":"manager/internal.go:533","msg":"Stopping and waiting for HTTP servers"}
{"level":"info","ts":"2023-12-12T18:04:44.727990216Z","logger":"controller-runtime.metrics","caller":"server/server.go:231","msg":"Shutting down metrics server with timeout of 1 minute"}
{"level":"info","ts":"2023-12-12T18:04:44.727998496Z","caller":"manager/server.go:43","msg":"shutting down server","kind":"health probe","addr":"[::]:8081"}
{"level":"info","ts":"2023-12-12T18:04:44.728063958Z","caller":"manager/internal.go:537","msg":"Wait completed, proceeding to shutdown the manager"}
{"level":"error","ts":"2023-12-12T18:04:44.728084368Z","logger":"setup","caller":"kueue/main.go:182","msg":"Could not run manager","error":"failed to wait for cert-rotator caches to sync: timed out waiting for cache to be synced for Kind *v1.Secret","stacktrace":"main.main\n\t/workspace/cmd/kueue/main.go:182\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267"}
Environment:
- Kubernetes version (use kubectl version): Server Version: v1.28.3-eks-4f4795d
- Kueue version (use git describe --tags --dirty --always): v0.5.1
- Cloud provider or hardware configuration: EKS
- OS (e.g. cat /etc/os-release): Bottlerocket OS
- Kernel (e.g. uname -a):
- Install tools: helm
- Others:
A new version of cert-controller has been released with the fix, and I’ve opened #1509, which upgrades our dependency. That fixes the issue of non-leading replicas not starting.
I’ve also opened #1510 to track the work of making the visibility extension API server highly available.
@alculquicondor right, I don’t think it has anything to do with the probes.
So what happens in the non-leader-elected mode is the following: the cert-rotator’s caches never sync on the non-leading replica, which ends with the Could not run manager message.
controller-runtime exposes the LeaderElectionRunnable interface, so controllers can implement its NeedLeaderElection method to control whether the manager should start them in non-leader instances. Also, managed webhooks are always started, irrespective of leader election.
In the case of the OPA cert-controller, there is a RequireLeaderElection option that is correctly set by Kueue, but I suspect there is an issue in cert-controller that prevents it from being taken into account, which is the root cause of this issue. I’ll fix it upstream.
For the visibility extension API server, we would need to make sure it is safe to run multiple instances of ClusterQueueReconciler concurrently, or find a way to only run the read-only part?
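To make that concrete, here is a minimal sketch of the controller-runtime side, assuming only the manager.Runnable and manager.LeaderElectionRunnable interfaces; the certRotatorLike type and the leader-election ID are made-up placeholders, not Kueue’s or cert-controller’s actual code:

```go
package main

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

// certRotatorLike is a hypothetical runnable playing the role of the
// cert-rotator. Whether it runs on non-leading replicas is decided by
// NeedLeaderElection below; cert-controller's RequireLeaderElection option
// is presumably meant to feed into the same decision.
type certRotatorLike struct {
	requireLeaderElection bool
}

// Start implements manager.Runnable; the real cert rotation work would go here.
func (r *certRotatorLike) Start(ctx context.Context) error {
	<-ctx.Done()
	return nil
}

// NeedLeaderElection implements manager.LeaderElectionRunnable. Returning
// true means the manager only starts this runnable after it has won the
// leader election; returning false means it is started on every replica.
func (r *certRotatorLike) NeedLeaderElection() bool {
	return r.requireLeaderElection
}

func main() {
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), manager.Options{
		LeaderElection:   true,
		LeaderElectionID: "example.kueue.x-k8s.io", // placeholder ID
	})
	if err != nil {
		panic(err)
	}
	// Webhook servers registered with the manager do not go through this
	// interface and are started on every replica regardless.
	if err := mgr.Add(&certRotatorLike{requireLeaderElection: true}); err != nil {
		panic(err)
	}
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```

With wiring like this, a runnable whose NeedLeaderElection returns true is held back until the replica wins the election, while everything else (including webhook servers) comes up on every replica.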
Yeah, ideally all replicas should reply to webhooks.
However, we recently introduced this feature: https://github.com/kubernetes-sigs/kueue/tree/main/keps/168-2-pending-workloads-visibility. In this case, only the leader can respond. Another alternative would be for non-leaders to also maintain the queues (we do this in kube-scheduler), so that they can also respond to the API extension requests (sketched below).
I’m not actually sure what behavior controller-runtime applies here.
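For what it’s worth, here is a hedged sketch of one possible shape for the read path, relying only on the manager’s Elected() channel from controller-runtime; the visibilityServer type, its endpoint, and the port are invented for illustration and are not Kueue’s actual implementation:

```go
package main

import (
	"context"
	"errors"
	"net/http"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

// visibilityServer is a hypothetical stand-in for the visibility extension
// API server: it is started on every replica, but only the replica that has
// become the leader answers, mirroring the "only the leader can respond"
// behavior described above.
type visibilityServer struct {
	mgr manager.Manager
}

// NeedLeaderElection returns false so the manager also starts this runnable
// on non-leading replicas (like webhooks).
func (s *visibilityServer) NeedLeaderElection() bool { return false }

// Start implements manager.Runnable and serves a toy HTTP endpoint.
func (s *visibilityServer) Start(ctx context.Context) error {
	mux := http.NewServeMux()
	mux.HandleFunc("/pending-workloads", func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-s.mgr.Elected():
			// Leader: the queues are maintained here, so answer directly.
			w.Write([]byte("pending workloads ...\n"))
		default:
			// Non-leader: without locally maintained queues, the request
			// cannot be answered by this replica.
			http.Error(w, "not the leader", http.StatusServiceUnavailable)
		}
	})
	srv := &http.Server{Addr: ":8082", Handler: mux} // placeholder port
	go func() {
		<-ctx.Done()
		srv.Close()
	}()
	if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

func main() {
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), manager.Options{
		LeaderElection:   true,
		LeaderElectionID: "example.kueue.x-k8s.io", // placeholder ID
	})
	if err != nil {
		panic(err)
	}
	if err := mgr.Add(&visibilityServer{mgr: mgr}); err != nil {
		panic(err)
	}
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```

The default branch is where the alternative above would plug in: if non-leaders also maintained the queues, they could serve read-only answers there instead of refusing.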