cert-manager: kubernetes 1.16: secret "cert-manager-webhook-webhook-tls" not found

Describe the bug:

Installing cert-manager ends with the webhook failing to start:

MountVolume.SetUp failed for volume "certs" : secret "cert-manager-webhook-webhook-tls" not found

Expected behaviour:

All pods start without errors.

Steps to reproduce the bug:

Simply install cert-manager from helm or static manifests

Anything else we need to know?:

Installation result with helm

helm ls --namespace cert-manager 
NAME        	NAMESPACE   	REVISION	UPDATED                                	STATUS  	CHART               	APP VERSION
cert-manager	cert-manager	2       	2019-12-16 18:40:14.296856384 +0100 CET	deployed	cert-manager-v0.12.0	v0.12.0    

and the pods

kubectl get pods --all-namespaces 
NAMESPACE       NAME                                       READY   STATUS              RESTARTS   AGE
cert-manager    cert-manager-784bc9c58b-xq25x              1/1     Running             0          20m
cert-manager    cert-manager-cainjector-85fbdf788-d8s5l    0/1     CrashLoopBackOff    9          28m
cert-manager    cert-manager-webhook-76f9b64b45-brpp5      0/1     ContainerCreating   0          28m
default         multitool                                  1/1     Running             0          88m
ingress-nginx   default-http-backend-67cf578fc4-lr5jw      1/1     Running             0          32h
ingress-nginx   nginx-ingress-controller-7gczj             1/1     Running             0          32h
ingress-nginx   nginx-ingress-controller-x5j2x             1/1     Running             0          32h
kube-system     calico-kube-controllers-5fd6f588f8-jhtl5   1/1     Running             1          107m
kube-system     calico-node-82s74                          1/1     Running             0          92m
kube-system     calico-node-qv7fg                          1/1     Running             0          92m
kube-system     coredns-5c59fd465f-nlwcw                   1/1     Running             0          32h
kube-system     coredns-5c59fd465f-z8jvg                   1/1     Running             0          32h
kube-system     coredns-autoscaler-d765c8497-hrkzk         1/1     Running             0          32h
kube-system     metrics-server-64f6dffb84-5mwrk            1/1     Running             0          32h
kube-system     rke-coredns-addon-deploy-job-mldcf         0/1     Completed           0          32h
kube-system     rke-ingress-controller-deploy-job-wxvt7    0/1     Completed           0          32h
kube-system     rke-metrics-addon-deploy-job-szd4v         0/1     Completed           0          32h
kube-system     rke-network-plugin-deploy-job-d9cbg        0/1     Completed           0          32h

and there is definitely no secret named cert-manager-webhook-webhook-tls

kubectl  get secret  -n cert-manager
NAME                                  TYPE                                  DATA   AGE
cert-manager-cainjector-token-m65nj   kubernetes.io/service-account-token   3      18m
cert-manager-token-rzmdx              kubernetes.io/service-account-token   3      18m
cert-manager-webhook-token-59qnz      kubernetes.io/service-account-token   3      18m

Pod details cert-manager-cainjector

 kubectl describe pod  cert-manager-cainjector-6659d6844d-mpxc7 -n cert-manager
Name:         cert-manager-cainjector-6659d6844d-mpxc7
Namespace:    cert-manager
Priority:     0
Node:         x.x.x.x/192.168.100.2
Start Time:   Tue, 17 Dec 2019 17:55:34 +0100
Labels:       app=cainjector
              app.kubernetes.io/instance=cert-manager
              app.kubernetes.io/managed-by=Tiller
              app.kubernetes.io/name=cainjector
              helm.sh/chart=cert-manager-v0.12.0
              pod-template-hash=6659d6844d
Annotations:  cni.projectcalico.org/podIP: 10.42.111.203/32
Status:       Running
IP:           10.42.111.203
IPs:
  IP:           10.42.111.203
Controlled By:  ReplicaSet/cert-manager-cainjector-6659d6844d
Containers:
  cert-manager:
    Container ID:  docker://674aeca3b8baed3c230c349e9bfea0f50b3cc287adddb6733e282e306712ed49
    Image:         quay.io/jetstack/cert-manager-cainjector:v0.12.0
    Image ID:      docker-pullable://quay.io/jetstack/cert-manager-cainjector@sha256:9ff6923f6c567573103816796df283d03256bc7a9edb7450542e106b349cf34a
    Port:          <none>
    Host Port:     <none>
    Args:
      --v=2
      --leader-election-namespace=kube-system
    State:          Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Tue, 17 Dec 2019 17:56:11 +0100
      Finished:     Tue, 17 Dec 2019 17:56:41 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Tue, 17 Dec 2019 17:55:38 +0100
      Finished:     Tue, 17 Dec 2019 17:56:08 +0100
    Ready:          False
    Restart Count:  1
    Environment:
      POD_NAMESPACE:  cert-manager (v1:metadata.namespace)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from cert-manager-cainjector-token-lhz85 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  cert-manager-cainjector-token-lhz85:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cert-manager-cainjector-token-lhz85
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age               From                Message
  ----     ------     ----              ----                -------
  Normal   Scheduled  <unknown>         default-scheduler   Successfully assigned cert-manager/cert-manager-cainjector-6659d6844d-mpxc7 to x.x.x.x
  Normal   Pulled     9s (x2 over 42s)  kubelet, x.x.x.x  Container image "quay.io/jetstack/cert-manager-cainjector:v0.12.0" already present on machine
  Normal   Created    8s (x2 over 41s)  kubelet, x.x.x.x  Created container cert-manager
  Normal   Started    8s (x2 over 41s)  kubelet, x.x.x.x  Started container cert-manager
  Warning  BackOff    <invalid>         kubelet, x.x.x.x  Back-off restarting failed container

Pod details cert-manager-webhook

kubectl describe pod cert-manager-webhook-547567b88f-b7fzk    -n cert-manager
Name:           cert-manager-webhook-547567b88f-b7fzk
Namespace:      cert-manager
Priority:       0
Node:           x.x.x.x/192.168.100.1
Start Time:     Tue, 17 Dec 2019 17:55:36 +0100
Labels:         app=webhook
                app.kubernetes.io/instance=cert-manager
                app.kubernetes.io/managed-by=Tiller
                app.kubernetes.io/name=webhook
                helm.sh/chart=cert-manager-v0.12.0
                pod-template-hash=547567b88f
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  ReplicaSet/cert-manager-webhook-547567b88f
Containers:
  cert-manager:
    Container ID:  
    Image:         quay.io/jetstack/cert-manager-webhook:v0.12.0
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Args:
      --v=2
      --secure-port=10250
      --tls-cert-file=/certs/tls.crt
      --tls-private-key-file=/certs/tls.key
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:6080/livez delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:6080/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:  cert-manager (v1:metadata.namespace)
    Mounts:
      /certs from certs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cert-manager-webhook-token-lf56p (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cert-manager-webhook-tls
    Optional:    false
  cert-manager-webhook-token-lf56p:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cert-manager-webhook-token-lf56p
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason       Age                       From                   Message
  ----     ------       ----                      ----                   -------
  Normal   Scheduled    <unknown>                 default-scheduler      Successfully assigned cert-manager/cert-manager-webhook-547567b88f-b7fzk to y.y.y.y
  Warning  FailedMount  <invalid>                 kubelet, y.y.y.y  Unable to attach or mount volumes: unmounted volumes=[certs], unattached volumes=[cert-manager-webhook-token-lf56p certs]: timed out waiting for the condition
  Warning  FailedMount  <invalid> (x9 over 118s)  kubelet, y.y.y.y  MountVolume.SetUp failed for volume "certs" : secret "cert-manager-webhook-tls" not found

Environment details:

  • Kubernetes version (e.g. v1.10.2): v1.16.2
  • Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): baremetal
  • cert-manager version (e.g. v0.4.0): 0.10.0, 0.11.0 and 0.12.0
  • Install method (e.g. helm or static manifests): helm and static manifests

/kind bug

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 12
  • Comments: 43 (1 by maintainers)

Most upvoted comments

same problem here on gke with kubernetes 1.14

I’m not sure if this is the issue anyone else in this thread is running into, but I was able to solve this error by deploying everything into the cert-manager namespace and adding the following to the Helm chart’s values.yaml:

---
global:
  rbac:
    create: true

  leaderElection:
    namespace: cert-manager
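
One way to see which namespace the cainjector actually uses for leader election, before and after applying values like these, is to read the flag back from its Deployment. This is a minimal sketch and not part of the original comment: the function name `leader_election_ns` and the `KUBECTL` override are hypothetical, and the Deployment name `cert-manager-cainjector` is assumed from the pod names in the report.

```shell
# Sketch only: leader_election_ns and the KUBECTL override are hypothetical helpers.
KUBECTL="${KUBECTL:-kubectl}"

# Print the value of --leader-election-namespace from the first container
# of a Deployment. Prints nothing if the flag is not set.
leader_election_ns() {
  deploy="${1:-cert-manager-cainjector}"   # assumed name (release "cert-manager")
  ns="${2:-cert-manager}"
  $KUBECTL -n "$ns" get deployment "$deploy" \
    -o jsonpath='{.spec.template.spec.containers[0].args}' |
    grep -o -- '--leader-election-namespace=[a-z0-9-]*' |
    cut -d= -f2
}
```

In the report above the cainjector was started with `--leader-election-namespace=kube-system`, so calling `leader_election_ns` against that install would be expected to print `kube-system`.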

Thanks @filipweidemann for your input, this saved my day 😉 However, I figured that tainting may not be necessary. I did the following:

  1. deleted namespace cert-manager

    kubectl delete ns cert-manager --force --grace-period=0
    
  2. created/modified the manifest according to your suggestion

    helm template cert-manager jetstack/cert-manager --namespace cert-manager > cert-manager.yml
    

    Then add nodeSelector to deployments in cert-manager.yml

  3. labeled the master node

    kubectl label node <master node name> schedule-certmanager=true
    
  4. created ns cert-manager (no additional labels added)

    kubectl create ns cert-manager
    
  5. applied manifest

    kubectl apply -f cert-manager.yml 
    

Result

kubectl -n cert-manager get pods
NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-55798cbfdf-mtbz6              1/1     Running   0          3m38s
cert-manager-cainjector-5b5d88b76b-drgbm   1/1     Running   0          3m38s
cert-manager-webhook-656f59b5d5-zn6sb      1/1     Running   0          3m38s
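
Step 2 above does not show how the nodeSelector was added. As an alternative to editing cert-manager.yml by hand, the same selector could probably be applied to Deployments that already exist with `kubectl patch`. This is a sketch under assumptions, not the commenter's method: the function name `pin_to_labeled_node` and the `KUBECTL` override are hypothetical, and the Deployment names are inferred from the pod names in the result.

```shell
# Sketch only: pin_to_labeled_node and the KUBECTL override are hypothetical helpers.
KUBECTL="${KUBECTL:-kubectl}"

# Add nodeSelector schedule-certmanager=true to one Deployment's pod template,
# matching the node label set in step 3.
pin_to_labeled_node() {
  deploy="$1"
  ns="${2:-cert-manager}"
  $KUBECTL -n "$ns" patch deployment "$deploy" --type merge \
    -p '{"spec":{"template":{"spec":{"nodeSelector":{"schedule-certmanager":"true"}}}}}'
}

# Example (Deployment names assumed from the pod names above):
# for d in cert-manager cert-manager-cainjector cert-manager-webhook; do
#   pin_to_labeled_node "$d"
# done
```

Patching the pod template triggers a rollout, so the pods are recreated on the labeled node.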

Thanks @ioben, your solution works well for me with Helm v2. I have no idea why setting global.leaderElection.namespace="cert-manager" resolves the earlier problem of the missing cert-manager-webhook-tls secret.

helm install --name my-release --namespace cert-manager jetstack/cert-manager --version "v0.12.0" --set global.leaderElection.namespace="cert-manager" --set global.podSecurityPolicy.enabled=true
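
To confirm that an install like this really produces the missing secret, one could poll for it after the install returns. The sketch below is an illustration only: `wait_for_secret` and the `KUBECTL` override are hypothetical names, and the default secret name is taken from the `FailedMount` event in the report (the issue title spells it `cert-manager-webhook-webhook-tls`, so pass whichever name your pod spec references).

```shell
# Sketch only: wait_for_secret and the KUBECTL override are hypothetical helpers.
KUBECTL="${KUBECTL:-kubectl}"

# Poll until the secret exists or the attempts run out.
# Prints "present" and returns 0 on success, "missing" and returns 1 otherwise.
wait_for_secret() {
  name="${1:-cert-manager-webhook-tls}"
  ns="${2:-cert-manager}"
  attempts="${3:-30}"
  delay="${4:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if $KUBECTL -n "$ns" get secret "$name" >/dev/null 2>&1; then
      echo "present"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "missing"
  return 1
}
```

If the result is `missing`, the cainjector logs (`kubectl -n cert-manager logs deploy/cert-manager-cainjector`) are a reasonable next place to look, since that pod was in CrashLoopBackOff in the report.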

Hi,

First of all, thanks to the maintainers for the time and effort put into this OSS project.

I have been dealing with this issue for the past few days, banging my head against a wall as to why things didn’t work as they should. Some context:

I have 2 clusters, both on GCP, one being production, and another one being a scaled-down version, for staging/testing. I had successfully deployed v0.12 to staging with no issues, but was facing this particular issue on the production cluster. I had tried copying the secret from staging to production, which seemed to solve this issue, but I was then facing other problems further down the pipeline, where CertificateRequests and Orders were not being created automatically by Certificates and Issuers/ClusterIssuers.

Stuff I tried:

  • Every version between 0.8 and 0.12
  • Multiple install/uninstall processes
  • “manual” install and helm install
  • disabling webhooks
  • sacrificing several animals to the gods of kubernetes

In the end, here’s what I learned, and how it fixed the problem for me: At the time of my experiments above, I was using Helm v3, without having explicitly migrated from Helm 2 to 3. As Helm 3 does not detect Helm 2 releases, I was not aware that a version of cert-manager installed with Helm 2 was still on my production cluster. Even with all the installs/uninstalls above, something must have survived, and was most likely causing issues.

So, the solution for me was:

  • downgrade my helm client to v2
  • uninstall the old cert-manager version (in my case, it was v0.8)
  • Install cert-manager v0.12 (still using helm 2)
  • claim back the lives of the sacrificed animals, as the gods didn’t help at all

Hope this helps someone else
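
For anyone wanting to check for a surviving Helm 2 release without downgrading the client first, the release records can be listed directly with kubectl. This is a hedged sketch: it assumes Tiller used its default storage (ConfigMaps labelled `OWNER=TILLER` in `kube-system`) and that the old release was named `cert-manager`. The function name `list_helm2_records` and the `KUBECTL` override are hypothetical.

```shell
# Sketch only: list_helm2_records and the KUBECTL override are hypothetical helpers.
KUBECTL="${KUBECTL:-kubectl}"

# List Helm 2 release records for a release name.
# Assumes Tiller's default storage: ConfigMaps labelled OWNER=TILLER,
# stored in the Tiller namespace (kube-system unless configured otherwise).
list_helm2_records() {
  release="${1:-cert-manager}"      # assumed release name
  tiller_ns="${2:-kube-system}"     # assumed Tiller namespace
  $KUBECTL -n "$tiller_ns" get configmap -l "OWNER=TILLER,NAME=$release" -o name
}
```

Any output here suggests that Helm 2 still has a record of the release, which Helm 3 will not show in `helm ls`.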