cert-manager: kubernetes 1.16: secret "cert-manager-webhook-webhook-tls" not found
Describe the bug:
Installing cert-manager ends with
webhook fails to start: MountVolume.SetUp failed for volume "certs" : secret "cert-manager-webhook-webhook-tls" not found
Expected behaviour:
No errors; all pods start and become Ready.
Steps to reproduce the bug:
Simply install cert-manager via Helm or the static manifests (a typical command sequence is sketched below).
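For reference, a v0.12 install looks roughly like this (commands paraphrased from the v0.12 docs; treat URLs and flags as best-effort reconstructions, not verbatim):
# install the CustomResourceDefinitions before the chart itself
kubectl apply --validate=false -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.12/deploy/manifests/00-crds.yaml
# add the Jetstack chart repository and install with Helm 2
helm repo add jetstack https://charts.jetstack.io
helm install --name cert-manager --namespace cert-manager jetstack/cert-manager --version v0.12.0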
Anything else we need to know?:
Installation result with Helm:
helm ls --namespace cert-manager
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
cert-manager cert-manager 2 2019-12-16 18:40:14.296856384 +0100 CET deployed cert-manager-v0.12.0 v0.12.0
and the pods
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
cert-manager cert-manager-784bc9c58b-xq25x 1/1 Running 0 20m
cert-manager cert-manager-cainjector-85fbdf788-d8s5l 0/1 CrashLoopBackOff 9 28m
cert-manager cert-manager-webhook-76f9b64b45-brpp5 0/1 ContainerCreating 0 28m
default multitool 1/1 Running 0 88m
ingress-nginx default-http-backend-67cf578fc4-lr5jw 1/1 Running 0 32h
ingress-nginx nginx-ingress-controller-7gczj 1/1 Running 0 32h
ingress-nginx nginx-ingress-controller-x5j2x 1/1 Running 0 32h
kube-system calico-kube-controllers-5fd6f588f8-jhtl5 1/1 Running 1 107m
kube-system calico-node-82s74 1/1 Running 0 92m
kube-system calico-node-qv7fg 1/1 Running 0 92m
kube-system coredns-5c59fd465f-nlwcw 1/1 Running 0 32h
kube-system coredns-5c59fd465f-z8jvg 1/1 Running 0 32h
kube-system coredns-autoscaler-d765c8497-hrkzk 1/1 Running 0 32h
kube-system metrics-server-64f6dffb84-5mwrk 1/1 Running 0 32h
kube-system rke-coredns-addon-deploy-job-mldcf 0/1 Completed 0 32h
kube-system rke-ingress-controller-deploy-job-wxvt7 0/1 Completed 0 32h
kube-system rke-metrics-addon-deploy-job-szd4v 0/1 Completed 0 32h
kube-system rke-network-plugin-deploy-job-d9cbg 0/1 Completed 0 32h
and there is definitely no such secret cert-manager-webhook-webhook-tls:
kubectl get secret -n cert-manager
NAME TYPE DATA AGE
cert-manager-cainjector-token-m65nj kubernetes.io/service-account-token 3 18m
cert-manager-token-rzmdx kubernetes.io/service-account-token 3 18m
cert-manager-webhook-token-59qnz kubernetes.io/service-account-token 3 18m
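Since the cainjector is crash-looping while the controller pod is Running, the cainjector logs are a natural first diagnostic for why the webhook's serving secret never appears (a sketch, assuming the default deployment name and labels):
# logs of the current and the previously crashed cainjector container
kubectl logs -n cert-manager deploy/cert-manager-cainjector
kubectl logs -n cert-manager -l app=cainjector --previous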
Pod details cert-manager-cainjector
kubectl describe pod cert-manager-cainjector-6659d6844d-mpxc7 -n cert-manager
Name: cert-manager-cainjector-6659d6844d-mpxc7
Namespace: cert-manager
Priority: 0
Node: x.x.x.x/192.168.100.2
Start Time: Tue, 17 Dec 2019 17:55:34 +0100
Labels: app=cainjector
app.kubernetes.io/instance=cert-manager
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=cainjector
helm.sh/chart=cert-manager-v0.12.0
pod-template-hash=6659d6844d
Annotations: cni.projectcalico.org/podIP: 10.42.111.203/32
Status: Running
IP: 10.42.111.203
IPs:
IP: 10.42.111.203
Controlled By: ReplicaSet/cert-manager-cainjector-6659d6844d
Containers:
cert-manager:
Container ID: docker://674aeca3b8baed3c230c349e9bfea0f50b3cc287adddb6733e282e306712ed49
Image: quay.io/jetstack/cert-manager-cainjector:v0.12.0
Image ID: docker-pullable://quay.io/jetstack/cert-manager-cainjector@sha256:9ff6923f6c567573103816796df283d03256bc7a9edb7450542e106b349cf34a
Port: <none>
Host Port: <none>
Args:
--v=2
--leader-election-namespace=kube-system
State: Terminated
Reason: Error
Exit Code: 255
Started: Tue, 17 Dec 2019 17:56:11 +0100
Finished: Tue, 17 Dec 2019 17:56:41 +0100
Last State: Terminated
Reason: Error
Exit Code: 255
Started: Tue, 17 Dec 2019 17:55:38 +0100
Finished: Tue, 17 Dec 2019 17:56:08 +0100
Ready: False
Restart Count: 1
Environment:
POD_NAMESPACE: cert-manager (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from cert-manager-cainjector-token-lhz85 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
cert-manager-cainjector-token-lhz85:
Type: Secret (a volume populated by a Secret)
SecretName: cert-manager-cainjector-token-lhz85
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned cert-manager/cert-manager-cainjector-6659d6844d-mpxc7 to x.x.x.x
Normal Pulled 9s (x2 over 42s) kubelet, x.x.x.x Container image "quay.io/jetstack/cert-manager-cainjector:v0.12.0" already present on machine
Normal Created 8s (x2 over 41s) kubelet, x.x.x.x Created container cert-manager
Normal Started 8s (x2 over 41s) kubelet, x.x.x.x Started container cert-manager
Warning BackOff <invalid> kubelet, x.x.x.x Back-off restarting failed container
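Note the --leader-election-namespace=kube-system argument in the cainjector output above; the workarounds later in this thread override exactly that setting via global.leaderElection.namespace. To double-check what arguments a running deployment was actually given (a sketch using the default deployment name):
kubectl get deploy cert-manager-cainjector -n cert-manager -o jsonpath='{.spec.template.spec.containers[0].args}'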
Pod details cert-manager-webhook
kubectl describe pod cert-manager-webhook-547567b88f-b7fzk -n cert-manager
Name: cert-manager-webhook-547567b88f-b7fzk
Namespace: cert-manager
Priority: 0
Node: x.x.x.x/192.168.100.1
Start Time: Tue, 17 Dec 2019 17:55:36 +0100
Labels: app=webhook
app.kubernetes.io/instance=cert-manager
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=webhook
helm.sh/chart=cert-manager-v0.12.0
pod-template-hash=547567b88f
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/cert-manager-webhook-547567b88f
Containers:
cert-manager:
Container ID:
Image: quay.io/jetstack/cert-manager-webhook:v0.12.0
Image ID:
Port: <none>
Host Port: <none>
Args:
--v=2
--secure-port=10250
--tls-cert-file=/certs/tls.crt
--tls-private-key-file=/certs/tls.key
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Liveness: http-get http://:6080/livez delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:6080/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAMESPACE: cert-manager (v1:metadata.namespace)
Mounts:
/certs from certs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from cert-manager-webhook-token-lf56p (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
certs:
Type: Secret (a volume populated by a Secret)
SecretName: cert-manager-webhook-tls
Optional: false
cert-manager-webhook-token-lf56p:
Type: Secret (a volume populated by a Secret)
SecretName: cert-manager-webhook-token-lf56p
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned cert-manager/cert-manager-webhook-547567b88f-b7fzk to y.y.y.y
Warning FailedMount <invalid> kubelet, y.y.y.y Unable to attach or mount volumes: unmounted volumes=[certs], unattached volumes=[cert-manager-webhook-token-lf56p certs]: timed out waiting for the condition
Warning FailedMount <invalid> (x9 over 118s) kubelet, y.y.y.y MountVolume.SetUp failed for volume "certs" : secret "cert-manager-webhook-tls" not found
Possible related issues (mostly closed):
- https://github.com/jetstack/cert-manager/issues/2052
- https://github.com/jetstack/cert-manager/issues/2038
Environment details:
- Kubernetes version (e.g. v1.10.2): v1.16.2
- Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): baremetal
- cert-manager version (e.g. v0.4.0): 0.10.0, 0.11.0 and 0.12.0
- Install method (e.g. helm or static manifests): helm and static manifests
/kind bug
Commits related to this issue
- set namespace for leader election https://github.com/jetstack/cert-manager/issues/2484 — committed to greenpeace/global-redirects-cert-manager by deleted user 4 years ago
- deploy to prod (#1) * Initial commit * update perms * set validation rule for namespace https://github.com/helm/charts/issues/10856 * because lint * set namespace for leader election ... — committed to greenpeace/global-redirects-cert-manager by jencub 4 years ago
Same problem here on GKE with Kubernetes 1.14.
I’m not sure if this is the issue anyone else in this thread is running into, but I was able to solve this error by deploying everything into the cert-manager namespace and adding the following to the Helm chart’s values.yaml (see the sketch below):
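Judging from the --set flag quoted in a later comment, the values.yaml snippet was presumably something like:
# assumed reconstruction, inferred from --set global.leaderElection.namespace="cert-manager" below;
# not the commenter's verbatim snippet
global:
  leaderElection:
    # run leader election in the cert-manager namespace instead of kube-system
    namespace: cert-manager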
thanks @filipweidemann for your input, this saved my day 😉 However, I figured that tainting may not have been necessary. I did the following:
- deleted the cert-manager namespace
- created/modified the manifest according to your suggestion, then added a nodeSelector to the deployments in cert-manager.yml (see the sketch below)
- labeled the master node
- created the cert-manager namespace (no additional labels added)
- applied the manifest
Result:
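For illustration, the nodeSelector change described above might look like this (a sketch; node-role.kubernetes.io/master is an assumed label key, use whatever label was applied to the master node):
# hypothetical fragment of one of the Deployments in cert-manager.yml
spec:
  template:
    spec:
      nodeSelector:
        # matches the label added to the master node in the step above
        node-role.kubernetes.io/master: ""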
Thanks @ioben, your solution works well for me with Helm v2. I have no idea why setting global.leaderElection.namespace="cert-manager" resolves the earlier issue of the missing cert-manager-webhook-tls secret:
helm install --name my-release --namespace cert-manager jetstack/cert-manager --version "v0.12.0" --set global.leaderElection.namespace="cert-manager" --set global.podSecurityPolicy.enabled=true
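A quick way to confirm the fix took effect (the serving secret's exact name depends on the Helm release name, so a pattern match is safest):
# the webhook TLS secret should now be listed
kubectl get secret -n cert-manager | grep webhook-tls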
Hi,
First of all, thanks to the maintainers for the time and effort put into this OSS project.
I have been dealing with this issue for the past few days, banging my head against a wall as to why things didn’t work as they should. Some context:
I have 2 clusters, both on GCP, one being production and the other a scaled-down version for staging/testing. I had successfully deployed v0.12 to staging with no issues, but was facing this particular issue on the production cluster. I had tried copying the secret from staging to production, which seemed to solve this issue, but I was then facing other problems further down the pipeline, where CertificateRequests and Orders were not being created automatically by Certificates and Issuers/ClusterIssuers. Stuff I tried:
In the end, here’s what I learned, and how it fixed the problem for me: at the time of my experiments above, I was using Helm v3, without having explicitly migrated from Helm 2 to 3. As Helm 3 does not detect Helm 2 releases, I was not aware that there was a Helm 2-installed version of cert-manager on my production cluster. Even with all the installs/uninstalls above, something must have survived, and was most likely causing issues.
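If you suspect the same situation, the Helm 2 leftovers are usually easy to find: Tiller stores its release data as ConfigMaps in kube-system by default, labeled OWNER=TILLER (a sketch, assuming the default Tiller storage backend):
# list Helm 2 release records that Helm 3 will not show
kubectl get configmaps -n kube-system -l OWNER=TILLER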
So, the solution for me was:
Hope this helps someone else.