rancher: [BUG] tigera-operator fails due to rancher-webhook denying access
Rancher Server Setup
- Rancher version: 2.7.2
- Installation option (Docker install/Helm Chart): docker
Information about the Cluster
- Kubernetes version: v1.24.11+k3s1
- Cluster Type (Local/Downstream): Downstream custom
User Information
- What is the role of the user logged in? Admin
Describe the bug
tigera-operator fails due to rancher-webhook denying access:
{"level":"info","ts":1681811352.3538423,"logger":"controller_installation","msg":"Failed to update object.","Name":"calico-system","Namespace":"","Kind":"Namespace","key":"/calico-system"}
{"level":"error","ts":1681811352.3538892,"logger":"controller_installation","msg":"Error creating / updating resource","Request.Namespace":"tigera-operator","Request.Name":"tigera-ca-private","reason":"ResourceUpdateError","error":"admission webhook \"rancher.cattle.io.namespaces\" denied the request: Unauthorized","stacktrace":"github.com/tigera/operator/pkg/controller/status.(*statusManager).SetDegraded\n\t/go/src/github.com/tigera/operator/pkg/controller/status/status.go:406\ngithub.com/tigera/operator/pkg/controller/installation.(*ReconcileInstallation).Reconcile\n\t/go/src/github.com/tigera/operator/pkg/controller/installation/core_controller.go:1358\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:234"}
2023-04-18T11:49:12.354158625+02:00 {"level":"error","ts":1681811352.3540413,"msg":"Reconciler error","controller":"tigera-installation-controller","object":{"name":"tigera-ca-private","namespace":"tigera-operator"},"namespace":"tigera-operator","name":"tigera-ca-private","reconcileID":"593d8d9e-272b-4009-a6d1-823e0a39883d","error":"admission webhook \"rancher.cattle.io.namespaces\" denied the request: Unauthorized","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:326\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.
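For context, the namespace update being denied here is the operator applying Pod Security Admission labels (as the related commits further down indicate). A rough illustration of the kind of object the webhook intercepts; the label value shown is an assumption and depends on the operator version:

```yaml
# Illustration only: roughly the namespace update the operator attempts and
# that the rancher.cattle.io.namespaces webhook rejects. Label value assumed.
apiVersion: v1
kind: Namespace
metadata:
  name: calico-system
  labels:
    pod-security.kubernetes.io/enforce: privileged
```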
To Reproduce
Run a cluster with tigera-operator for the Calico installation, then update Rancher from 2.7.1 to 2.7.2 and tigera-operator from 3.16 to 3.25.
Observe that the Calico installation does not update.
Result
Expected Result
tigera-operator can update without problems.
Additional context
Related to #41172?
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 4
- Comments: 15 (2 by maintainers)
Commits related to this issue
- Add allow PSA update step in rancher manager doc Add a step in rancher manager doc to allow Tigera Operator update PSA. This step is reported necessary for Calico Enterprise installation from rancher... — committed to hjiawei/docs by hjiawei a year ago
- Add allow PSA update step in rancher manager doc Add a step in rancher manager doc to allow Tigera Operator update PSA. This step is reported necessary for Calico Enterprise installation from rancher... — committed to tigera/docs by hjiawei a year ago
- Allow operator to set PSA in Rancher In Rancher, it is not enough to have `patch` permissions for a namespace in order to set PSA labels. It is also required to have the `updatepsa` permission on the... — committed to lindhe/trident by lindhe a year ago
Hi, same issue here with the trident-operator and Rancher 2.7.3.
level=error msg="error syncing 'trident': reconcile failed; error re-installing Trident 'trident' ; err: reconcile failed; failed to patch Trident installation namespace trident; admission webhook \"rancher.cattle.io.namespaces\" denied the request: Unauthorized, requeuing"
Same helm installation with Rancher 2.7.0 worked fine, …
I’ve managed to resolve this by adding some new ClusterRoles and ClusterRoleBindings. For instance for Tigera:
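A minimal sketch of the kind of ClusterRole and ClusterRoleBinding involved, assuming the `updatepsa` verb on `projects` in the `management.cattle.io` API group mentioned in the commits above (object names and the bound service account are illustrative):

```yaml
# Sketch only: grants the tigera-operator service account the Rancher-specific
# permission reportedly required to set PSA labels on its namespaces.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tigera-operator-psa        # illustrative name
rules:
  - apiGroups: ["management.cattle.io"]
    resources: ["projects"]
    verbs: ["updatepsa"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tigera-operator-psa        # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tigera-operator-psa
subjects:
  - kind: ServiceAccount
    name: tigera-operator           # assumed operator service account
    namespace: tigera-operator
```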
This resolves the issue we were having with Tigera Operator, and a similar set of resources resolves a similar issue with the OPA Gatekeeper Update Namespace Label deployment.
I too had the issue where Trident reported
'Failed to install Trident; err: failed to patch Trident installation namespace trident; admission webhook "rancher.cattle.io.namespaces" denied the request: Unauthorized'
I can confirm that upgrading to Rancher v2.7.5 has fixed the issue for me.
EDIT 1: No, upgrading to v2.7.5 did not fix it. 😞 I looked at the wrong cluster. Damn it.
EDIT 2: Alright, all is fine now! Big thanks to @justdan96 for posting the solution. In hindsight I should have tried that first, before all the other trouble I went through trying to debug this. 😌 Here’s my YAML for fixing Trident:
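Following the same pattern as the Tigera example above, a sketch for Trident; the service account name and namespace are assumptions:

```yaml
# Sketch only: the same updatepsa-on-projects grant, bound to the Trident operator.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: trident-operator-psa       # illustrative name
rules:
  - apiGroups: ["management.cattle.io"]
    resources: ["projects"]
    verbs: ["updatepsa"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: trident-operator-psa       # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: trident-operator-psa
subjects:
  - kind: ServiceAccount
    name: trident-operator          # assumed: the Trident operator's service account
    namespace: trident
```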
This leaves me wondering…

- `patch` privileges on the `namespace` resource is not enough for adding a label?
- The `trident` namespace is not part of any Rancher project. To me, it feels strange then that I need any privileges on the `projects` resource for things to work. Should that be changed in Rancher?

So, is this a bug in Rancher or a bug in the operators? I have to admit that it is a bit murky. On one hand, I generally expect Kubernetes manifests to work the same no matter if I deploy on Rancher, Minikube, AKS or vanilla Kubernetes. On the other hand, projects is a neat piece of management abstraction that Rancher provides, so that’s obviously something that needs to be handled when using Rancher.

If this should be added in the operator’s manifests in order to handle the “edge case of Rancher”, I assume it should be as simple as adding the relevant rule to the list of rules for the existing ClusterRole, right? Because it’s using the API group `management.cattle.io`, that should be unique and have no effect on other Kubernetes distributions. Am I correct in this assumption, or do we need some logic to determine which Kubernetes distribution we deploy to and have Helm render different manifests depending on that?

As far as I’m concerned, there is at least one thing that is 100% on Rancher’s responsibility and that they must fix: this API must be documented! And not just how to add PSA labels for namespaces, the whole `management.cattle.io` API group must be documented!

Getting the same issue with a fresh installation of the trident operator (NetApp). Recently updated to Rancher 2.7.2 and also bumped the version of trident, so not entirely sure what introduced the issue. Older installations of trident aren’t showing the issue, but they may have already updated the ns as desired, not sure.