cert-manager: Cannot install on GKE autopilot cluster due to mutatingwebhookconfigurations access denied

Describe the bug: Hi,

I am trying to install cert-manager on a “new” GKE Autopilot cluster via Helm.

Unfortunately, I ran into this error:

Error: rendered manifests contain a resource that already exists. Unable to continue with install: could not get information about the resource: mutatingwebhookconfigurations.admissionregistration.k8s.io "cert-manager-webhook" is forbidden: User "XXXX" cannot get resource "mutatingwebhookconfigurations" in API group "admissionregistration.k8s.io" at the cluster scope: GKEAutopilot authz: cluster scoped resource "mutatingwebhookconfigurations/" is managed and access is denied

Expected behaviour: The helm script should install cert-manager.

Steps to reproduce the bug:

  • Create a cluster on GKE with Autopilot
  • Run
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v1.2.0 \
  --create-namespace

Anything else we need to know?:

Environment details:

  • Kubernetes version: 1.18.12-gke.1210
  • Cloud-provider/provisioner: GKE
  • cert-manager version: 1.2.0
  • Install method: helm

/kind bug

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 6
  • Comments: 28 (3 by maintainers)

Most upvoted comments

Is there any workaround to install cert-manager on GKE Autopilot cluster? 🤔

Hey everyone

We’ve now added support for mutating webhooks in Autopilot. More details at https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview#webhooks_limitations

If you’re using the standard Helm chart, you can simply pass helm the appropriate parameter, as in:

helm install my-release --namespace cert-manager --version v1.5.4 jetstack/cert-manager --set global.leaderElection.namespace=cert-manager

Apart from the helm install command “failing” with the error message Error: failed post-install: timed out waiting for the condition, and the startupapicheck job never completing successfully, this worked like a charm for me on a newly spun-up Autopilot cluster (“Rapid” branch, of course). In spite of these errors, cert-manager seems to work just as it should, creating certificates as expected. (Full disclosure: I only did a quick test with a self-signed CA/issuer and a certificate based on that; no fancier use cases yet.)

Great work and thanks @vivekbagade !

I’ve been looking forward to this for some time, thanks @vivekbagade and everyone else who helped with this update to Autopilot.

Out of the box, there’s still an incompatibility with cert-manager on Autopilot. Cert-manager manages its leader election in configmaps in the kube-system namespace, but this is disallowed by Autopilot (seems like a reasonable choice). I used this Kustomization file to move those configmaps into the cert-manager namespace.

I thought I’d share in case it helps others. I’d also like to learn if anyone else would suggest a better approach.

---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - https://github.com/jetstack/cert-manager/releases/download/v1.5.3/cert-manager.yaml

patches:
  - target:
      kind: Role
      name: cert-manager:leaderelection
    patch: |-
      - op: replace
        path: /metadata/namespace
        value: cert-manager
  - target:
      kind: Role
      name: cert-manager-cainjector:leaderelection
    patch: |-
      - op: replace
        path: /metadata/namespace
        value: cert-manager
  - target:
      kind: RoleBinding
      name: cert-manager:leaderelection
    patch: |-
      - op: replace
        path: /metadata/namespace
        value: cert-manager
  - target:
      kind: RoleBinding
      name: cert-manager-cainjector:leaderelection
    patch: |-
      - op: replace
        path: /metadata/namespace
        value: cert-manager
  - target:
      kind: Deployment
      name: cert-manager
    patch: |-
      - op: replace
        path: /spec/template/spec/containers/0/args/2
        value: --leader-election-namespace=cert-manager
  - target:
      kind: Deployment
      name: cert-manager-cainjector
    patch: |-
      - op: replace
        path: /spec/template/spec/containers/0/args/1
        value: --leader-election-namespace=cert-manager

For anyone else who’s a little slow like me 🙃

Instead of kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/vX.Y.Z/cert-manager.yaml, do:

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install --create-namespace --namespace cert-manager --set installCRDs=true --set global.leaderElection.namespace=cert-manager cert-manager jetstack/cert-manager

If you already did kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/vX.Y.Z/cert-manager.yaml, uninstall it first: https://cert-manager.io/docs/installation/kubectl/#uninstalling

Now you can continue on with your kubectl apply -f issuer-lets-encrypt-staging.yaml

Status update on this: it seems the rapid release channel is no longer required in GKE Autopilot. Installed successfully with:

helm install --create-namespace --namespace cert-manager --set installCRDs=true --set global.leaderElection.namespace=cert-manager cert-manager jetstack/cert-manager

@olaf-2 I have tried Google’s managed certificates in the past; they were the reason I ended up switching to cert-manager in the first place.

In my 3 years of developing on the cloud (primarily AWS and GCP), it often seems to be the case that everything is advertised to “Just work”, but that is rarely the case.

GKE Autopilot has no support for 3rd party webhooks in its current state, which prevents many Kubernetes add-ons, cert-manager included, from operating correctly. This is reportedly (on Twitter: https://twitter.com/BagadeVivek/status/1365701217469534220 ) going to be fixed in an upcoming release.

The Helm-based install posted earlier in this thread (helm repo add, helm repo update, then helm install with --set installCRDs=true --set global.leaderElection.namespace=cert-manager) works for me on GKE Autopilot with k8s v1.24.100-gke.2300.

@pahuja21 You need to make sure you’re using the “rapid” channel for GKE Autopilot.

I’ve successfully installed cert-manager v1.6.1 using Helm to a cluster. As already noted in the thread, it’s necessary to override global.leaderElection.namespace from the default kube-system as it’s a “managed namespace” on Autopilot (see https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview#managed_namespaces).
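For anyone who prefers a values file over repeated --set flags, the override described above can be sketched as follows. This is only a minimal sketch: the file name values-autopilot.yaml is an assumption made for illustration, and installCRDs is included only because several commands in this thread set it.

```yaml
# values-autopilot.yaml (hypothetical file name, not from the thread)
# Mirrors the --set flags used in the commands in this thread.
global:
  leaderElection:
    # Defaults to kube-system, which is a managed namespace on Autopilot.
    namespace: cert-manager

# Same effect as --set installCRDs=true
installCRDs: true
```

It should then be possible to pass the file with helm's -f flag, for example: helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace -f values-autopilot.yaml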

I was just able to do a clean install with Helm 3 on an Autopilot “rapid” cluster, and did not run into a failure of the API readiness check job.

$ helm install cert-manager --namespace cert-manager jetstack/cert-manager --set global.leaderElection.namespace=cert-manager --set installCRDs=true --set prometheus.enabled=false
W1012 15:01:24.371788   35116 warnings.go:70] Autopilot set default resource requests for Deployment cert-manager/cert-manager-webhook, as resource requests were not specified. See http://g.co/gke/autopilot-defaults.
W1012 15:01:24.414269   35116 warnings.go:70] Autopilot set default resource requests for Deployment cert-manager/cert-manager-cainjector, as resource requests were not specified. See http://g.co/gke/autopilot-defaults.
W1012 15:01:24.456685   35116 warnings.go:70] Autopilot set default resource requests for Deployment cert-manager/cert-manager, as resource requests were not specified. See http://g.co/gke/autopilot-defaults.
W1012 15:01:24.959784   35116 warnings.go:70] AdmissionWebhookController: mutated namespaceselector of the webhooks to enforce GKE Autopilot policies.
W1012 15:01:37.079421   35116 warnings.go:70] Autopilot set default resource requests for Job cert-manager/cert-manager-startupapicheck, as resource requests were not specified. See http://g.co/gke/autopilot-defaults.
NAME: cert-manager
LAST DEPLOYED: Tue Oct 12 15:01:13 2021
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager v1.5.4 has been deployed successfully!

That said, while the chart installs and the pods start, the log output of cert-manager-startupapicheck is:

Not ready: the cert-manager webhook CA bundle is not injected yet

repeated until it eventually times out, so I am not currently able to get it working.
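As a side note on the "Autopilot set default resource requests" warnings in the output above: they appear because the chart does not specify resource requests by default. A minimal sketch of Helm values that set them explicitly is shown below. The CPU and memory numbers are placeholders chosen for illustration, not recommendations from this thread, and Autopilot may still adjust them to fit its own limits. This does not address the startupapicheck timeout.

```yaml
# Illustrative Helm values; all request sizes below are placeholder assumptions.
# Controller (Deployment cert-manager/cert-manager)
resources:
  requests:
    cpu: 250m
    memory: 512Mi

# Deployment cert-manager/cert-manager-webhook
webhook:
  resources:
    requests:
      cpu: 250m
      memory: 512Mi

# Deployment cert-manager/cert-manager-cainjector
cainjector:
  resources:
    requests:
      cpu: 250m
      memory: 512Mi

# Job cert-manager/cert-manager-startupapicheck
startupapicheck:
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
```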

Thanks @mattbates - the --set global.leaderElection.namespace=cert-manager flag is indeed the key.

As of today (11/22/21), this works on the regular channel (1.21.5-gke.1302) as well. Make sure to install through Helm; I haven’t had any luck with kubectl apply.

Piggybacking off @bradjones1’s solution with a bit of a tweak, I used the following to set up cert-manager on a GKE Autopilot “rapid” cluster and did not run into any failure.

$ helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --version v1.5.4 --set global.leaderElection.namespace=cert-manager --set installCRDs=true --set prometheus.enabled=false

NAME: cert-manager
LAST DEPLOYED: Thu Oct 14 15:45:12 2021
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager v1.5.4 has been deployed successfully!

We cannot action this, so we will close it. However, we can leave it pinned for informational purposes.

/close

For anyone getting here via Google, the chart now installs correctly on 1.21+. As of today (22nd Sep), to get onto 1.21+ you may need to be on the “Rapid” release branch. This can be configured under “Advanced” cluster settings on cluster creation and cannot be changed for existing clusters. I imagine it won’t be long until the Regular release branch gets 1.21, but I’m not sure of the exact date.