cert-manager: Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: TLS handshake timeout
Bugs should be filed for issues encountered whilst operating cert-manager. You should first attempt to resolve your issues through the community support channels, e.g. Slack, in order to rule out individual configuration errors. Please provide as much detail as possible.
Describe the bug:
Cluster Issuer installation fails with TLS handshake timeout
kubectl apply -f cert-issuer-letsencrypt-prd.yml -n cert-manager
Error from server (InternalError): error when creating "cert-issuer-letsencrypt-prd.yml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: TLS handshake timeout
Expected behaviour:
kubectl apply -f cert-issuer-letsencrypt-prd.yml -n cert-manager
works successfully and does not generate an error
Steps to reproduce the bug:
- Create cert-manager namespace:
  kubectl create ns cert-manager
- Install cert-manager using Helm 3:
  helm install cert-manager jetstack/cert-manager --namespace cert-manager
  NAME: cert-manager
  LAST DEPLOYED: Sat Feb 15 11:40:28 2020
  NAMESPACE: cert-manager
  STATUS: deployed
  REVISION: 1
  TEST SUITE: None
  NOTES: cert-manager has been deployed successfully!
- Add secret letsencrypt-prd:
  kubectl -n cert-manager apply -f cert-cloudflare-api-key-secret.yml
- Create cluster-issuer:
  kubectl apply -f cert-issuer-letsencrypt-prd.yml -n cert-manager

cert-issuer-letsencrypt-prd.yml:
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-prd
  namespace: cert-manager
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: xxxx
    privateKeySecretRef:
      name: letsencrypt-prd
    solvers:
    - dns01:
        cloudflare:
          email: xxxx
          apiKeySecretRef:
            name: cloudflare-api-key-secret
            key: api-key
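The cert-cloudflare-api-key-secret.yml referenced in the steps above is not shown; a Secret matching the apiKeySecretRef in this ClusterIssuer would typically look like the sketch below (the name and key come from the manifest above, the value is a placeholder):

apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-api-key-secret
  namespace: cert-manager
type: Opaque
stringData:
  api-key: "<cloudflare-global-api-key>"   # placeholder, not a real key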
Anything else we need to know?:
Environment details:
- Kubernetes version (e.g. v1.10.2):
v1.17.2
- Cloud-provider/provisioner (e.g. GKE, kops AWS, etc):
bare-metal
- cert-manager version (e.g. v0.4.0):
0.13.0
- Install method (e.g. helm or static manifests):
helm 3
/kind bug
Pods are running fine, no restarts
kubectl -n cert-manager get pods
NAME READY STATUS RESTARTS AGE
cert-manager-c6cb4cbdf-djqt4 1/1 Running 0 37m
cert-manager-cainjector-76f7596c4-wsb4z 1/1 Running 0 37m
cert-manager-webhook-8575f88c85-xf7w2 1/1 Running 0 31m
CRDs are there:
kubectl get crd | grep cert-manager
certificaterequests.cert-manager.io 2020-02-15T10:39:37Z
certificates.cert-manager.io 2020-02-15T10:39:38Z
challenges.acme.cert-manager.io 2020-02-15T10:39:38Z
clusterissuers.cert-manager.io 2020-02-15T10:39:39Z
issuers.cert-manager.io 2020-02-15T10:39:40Z
orders.acme.cert-manager.io 2020-02-15T10:39:40Z
The logs of the cert-manager-webhook pod repeatedly show http: TLS handshake error from 10.42.152.128:5067: EOF
kubectl -n cert-manager logs cert-manager-webhook-8575f88c85-xf7w2
I0215 10:47:05.409158 1 main.go:64] "msg"="enabling TLS as certificate file flags specified"
I0215 10:47:05.409423 1 server.go:126] "msg"="listening for insecure healthz connections" "address"=":6080"
I0215 10:47:05.409471 1 server.go:138] "msg"="listening for secure connections" "address"=":10250"
I0215 10:47:05.409495 1 server.go:155] "msg"="registered pprof handlers"
I0215 10:47:05.409672 1 tls_file_source.go:144] "msg"="detected private key or certificate data on disk has changed. reloading certificate"
2020/02/15 10:48:46 http: TLS handshake error from 10.42.152.128:25427: EOF
2020/02/15 10:53:56 http: TLS handshake error from 10.42.152.128:48126: EOF
2020/02/15 10:59:06 http: TLS handshake error from 10.42.152.128:21683: EOF
2020/02/15 11:04:16 http: TLS handshake error from 10.42.152.128:9457: EOF
2020/02/15 11:09:26 http: TLS handshake error from 10.42.152.128:41640: EOF
2020/02/15 11:14:36 http: TLS handshake error from 10.42.152.128:56638: EOF
Here are the logs from the cert-manager pod:
kubectl -n cert-manager logs cert-manager-c6cb4cbdf-fzdmj
I0215 12:23:31.410690 1 start.go:76] cert-manager "msg"="starting controller" "git-commit"="6d9200f9d" "version"="v0.13.0"
W0215 12:23:31.410750 1 client_config.go:543] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0215 12:23:31.411825 1 controller.go:167] cert-manager/controller/build-context "msg"="configured acme dns01 nameservers" "nameservers"=["10.43.0.10:53"]
I0215 12:23:31.412082 1 controller.go:130] cert-manager/controller "msg"="starting leader election"
I0215 12:23:31.412188 1 metrics.go:202] cert-manager/metrics "msg"="listening for connections on" "address"="0.0.0.0:9402"
I0215 12:23:31.412836 1 leaderelection.go:242] attempting to acquire leader lease kube-system/cert-manager-controller...
I0215 12:24:50.921660 1 leaderelection.go:252] successfully acquired lease kube-system/cert-manager-controller
I0215 12:24:50.922007 1 controller.go:172] cert-manager/controller/certificaterequests "msg"="new certificate request controller registered" "type"="selfsigned"
I0215 12:24:50.922139 1 controller.go:172] cert-manager/controller/certificaterequests "msg"="new certificate request controller registered" "type"="venafi"
I0215 12:24:50.922149 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificaterequests-issuer-selfsigned"
I0215 12:24:50.922214 1 controller.go:74] cert-manager/controller/certificaterequests-issuer-selfsigned "msg"="starting control loop"
I0215 12:24:50.922245 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificaterequests-issuer-venafi"
I0215 12:24:50.922291 1 controller.go:74] cert-manager/controller/certificaterequests-issuer-venafi "msg"="starting control loop"
I0215 12:24:50.922307 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="clusterissuers"
I0215 12:24:50.922338 1 controller.go:74] cert-manager/controller/clusterissuers "msg"="starting control loop"
I0215 12:24:50.922376 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="webhook-bootstrap"
I0215 12:24:50.922407 1 controller.go:74] cert-manager/controller/webhook-bootstrap "msg"="starting control loop"
I0215 12:24:50.922412 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="issuers"
I0215 12:24:50.922474 1 controller.go:74] cert-manager/controller/issuers "msg"="starting control loop"
I0215 12:24:50.922537 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="orders"
I0215 12:24:50.922578 1 controller.go:74] cert-manager/controller/orders "msg"="starting control loop"
I0215 12:24:50.922602 1 controller.go:172] cert-manager/controller/certificaterequests "msg"="new certificate request controller registered" "type"="acme"
I0215 12:24:50.922711 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificaterequests-issuer-acme"
I0215 12:24:50.922736 1 controller.go:172] cert-manager/controller/certificaterequests "msg"="new certificate request controller registered" "type"="vault"
I0215 12:24:50.922740 1 controller.go:74] cert-manager/controller/certificaterequests-issuer-acme "msg"="starting control loop"
I0215 12:24:50.922855 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificaterequests-issuer-vault"
I0215 12:24:50.922904 1 controller.go:74] cert-manager/controller/certificaterequests-issuer-vault "msg"="starting control loop"
I0215 12:24:50.922982 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificates"
I0215 12:24:50.923031 1 controller.go:74] cert-manager/controller/certificates "msg"="starting control loop"
I0215 12:24:50.923042 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="ingress-shim"
I0215 12:24:50.923073 1 controller.go:74] cert-manager/controller/ingress-shim "msg"="starting control loop"
I0215 12:24:51.025320 1 controller.go:172] cert-manager/controller/certificaterequests "msg"="new certificate request controller registered" "type"="ca"
I0215 12:24:51.025331 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="challenges"
I0215 12:24:51.025385 1 controller.go:74] cert-manager/controller/challenges "msg"="starting control loop"
I0215 12:24:51.025430 1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificaterequests-issuer-ca"
I0215 12:24:51.025473 1 controller.go:74] cert-manager/controller/certificaterequests-issuer-ca "msg"="starting control loop"
I0215 12:24:51.122618 1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cert-manager-webhook-ca"
I0215 12:24:51.122638 1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cloudflare-api-key-secret"
I0215 12:24:51.122650 1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cert-manager-webhook-tls"
I0215 12:24:51.122669 1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/sh.helm.release.v1.cert-manager.v1"
I0215 12:24:51.122674 1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cloudflare-api-key-secret"
I0215 12:24:51.122705 1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cert-manager-token-5tdm7"
I0215 12:24:51.122729 1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cert-manager-token-5tdm7"
I0215 12:24:51.122780 1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cert-manager-webhook-token-6hpwz"
I0215 12:24:51.122729 1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/sh.helm.release.v1.cert-manager.v1"
I0215 12:24:51.122805 1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cert-manager-webhook-token-6hpwz"
I0215 12:24:51.122840 1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/default-token-vlftn"
I0215 12:24:51.122867 1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/default-token-vlftn"
I0215 12:24:51.122618 1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cert-manager-cainjector-token-lzwbt"
I0215 12:24:51.122903 1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cert-manager-cainjector-token-lzwbt"
I0215 12:24:51.123241 1 controller.go:129] cert-manager/controller/ingress-shim "msg"="syncing item" "key"="kube-system/dashboard-kubernetes-dashboard"
I0215 12:24:51.123256 1 controller.go:197] cert-manager/controller/webhook-bootstrap/webhook-bootstrap/ca-secret "msg"="ca certificate already up to date" "resource_kind"="Secret" "resource_name"="cert-manager-webhook-ca" "resource_namespace"="cert-manager"
I0215 12:24:51.123281 1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cert-manager-webhook-ca"
I0215 12:24:51.123420 1 controller.go:255] cert-manager/controller/webhook-bootstrap/webhook-bootstrap/ca-secret "msg"="serving certificate already up to date" "resource_kind"="Secret" "resource_name"="cert-manager-webhook-tls" "resource_namespace"="cert-manager"
I0215 12:24:51.123450 1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cert-manager-webhook-tls"
I had an issue deploying a ClusterIssuer, the error was:
Solved as:
@Antiarchitect Only your solution worked for me!
Steps taken:
@turkenh I am seeing the same issue but no errors in my events. I am following the same approach as you, i.e. deploy cert-manager first and then the issuer with a separate Helm chart. Just as you had observed, I do not see the error if I deploy my Issuer after a few seconds (~60). Back-to-back installations of cert-manager and the Issuer certainly throw the following error: Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority

@munnerz, I am certainly seeing the same error as the original issue with v0.15.0. Any thoughts on why I am seeing the error when I deploy the Issuer immediately after the cert-manager deployment and NOT when I deploy the Issuer after a bit of a wait? This still appears to be a bug. Do you want me to open another issue to track this?
Using cert-manager v0.15.0, which was released yesterday. With installCRDs set to true, I am still getting the same error as above: our scripts deploy another Helm chart which contains cert-manager resources just after the cert-manager Helm release reports ready, and helm fails with the above error. However, if I try to create the resources after some time, I don't get any errors. So it looks like a timing issue, but I was not getting it with v0.15.0-alpha.0.

Waiting a while as mentioned earlier seems to do the trick, so there is probably a timing issue somewhere. Tested on 0.15.2.
The following works for me (you might wanna tinker with the timer):
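The snippet from this comment is not preserved above; purely as an illustration, a minimal wait-and-retry along the lines commenters describe (deployment name, retry count and timings are assumptions) could look like:

# Wait for the webhook Deployment to become available, then retry the apply a few times.
kubectl -n cert-manager rollout status deployment/cert-manager-webhook --timeout=120s
for i in 1 2 3 4 5; do
  kubectl apply -f cert-issuer-letsencrypt-prd.yml -n cert-manager && break
  echo "webhook not ready yet, retrying in 15s..."
  sleep 15
done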
We've made significant improvements to the way TLS is managed in the upcoming v0.15 release, as well as adding an installCRDs option to the Helm chart which will correctly handle updating service names and the conversion webhook configuration when deploying into namespaces other than cert-manager, or when using a Helm release name other than cert-manager. I think this issue can now be closed after this, and if anyone is still running into issues I'd advise you to try the new v0.15.0-alpha.1 release and report back! (To be safe, it may be best to start 'fresh' in case you have a currently broken configuration.)

Still not sure why it does not work with the webhook. Also not sure whether this is really the best approach. Also interestingly, the webhook was working on the initial setup of my cluster back in January. I did add an additional node and updated the underlying OS. Not sure yet why it stopped working…
@munnerz Thank you for your help. I deployed cert-manager using the kubectl command below:
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.0-alpha.1/cert-manager.yaml
Everything is working fine, as you can see:
kubectl get pod,service,endpoints -n cert-manager
NAME                                            READY   STATUS    RESTARTS   AGE
pod/cert-manager-5bb5b9dcf8-sb52s               1/1     Running   0          28m
pod/cert-manager-cainjector-869f7868b7-rrrw2    1/1     Running   0          28m
pod/cert-manager-webhook-79d78c45cd-7fxfs       1/1     Running   0          28m

NAME                           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/cert-manager           ClusterIP   10.99.239.39   <none>        9402/TCP   28m
service/cert-manager-webhook   ClusterIP   10.99.49.145   <none>        443/TCP    28m

NAME                             ENDPOINTS           AGE
endpoints/cert-manager           10.244.4.60:9402    28m
endpoints/cert-manager-webhook   10.244.5.75:10250   28m

But when I try to create the issuer and certificate, I get the timeout and context deadline exceeded errors.
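When the Service and endpoints look healthy like this but calls still time out, one basic check is whether the webhook is reachable from inside the cluster at all. This is my own diagnostic suggestion, not from the thread; the curlimages/curl image and flags are assumptions:

# A TLS-level response (even an HTTP 404) means the network path to the webhook works;
# a hang or timeout instead points at CNI, MTU, or routing problems.
kubectl run webhook-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl -vk --max-time 10 https://cert-manager-webhook.cert-manager.svc:443/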
I've solved my problems with sed 😃
But you should remove not only the CRDs but also the webhook configurations if they were improperly configured before.
We ran into this, and the specific resource that was conflicting was the cert-manager-webhook-ca secret, which had been left over from a previous installation that was removed manually. When I looked at the details, that secret had been created two years before the new version of cert-manager was installed. I was able to simply run kubectl delete -f https://github.com/jetstack/cert-manager/releases/download/v[X.X]/cert-manager.yaml, which removed everything in that namespace (including old stuff), and then re-ran kubectl apply .... After doing that, I confirmed that the secret was new, and everything started working. HTH

I fixed this problem on my hard upgrade from v0.10 to v0.15 by deleting the cert-manager-webhook-ca secret, because it is not updated automatically if it already exists.
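A sketch of the clean-up these two comments describe, i.e. removing the stale webhook CA/TLS secrets so they get regenerated (the secret names are the ones visible in the controller logs above; the pod labels are assumptions based on a default Helm chart install):

kubectl -n cert-manager delete secret cert-manager-webhook-ca cert-manager-webhook-tls
# Restart the webhook and cainjector so fresh certificates are issued and injected.
kubectl -n cert-manager delete pod -l app.kubernetes.io/name=webhook
kubectl -n cert-manager delete pod -l app.kubernetes.io/name=cainjector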
I am having the same symptom, and I am sure it is something with my Weave CNI, because it worked with the AWS VPC CNI.
I even tried tcpdump on the cert-manager and cert-manager-webhook pods; surprisingly, there is no traffic on the webhook port.
Hi @papanito, my configuration is the same, but with Kubernetes version 1.16, and I tried to install cert-manager today using the static manifests instead of Helm.
I had exactly the same issue and solved it by following this page: https://cert-manager.io/docs/installation/compatibility/ . In particular, I have used cert-manager-no-webhook.yaml instead of cert-manager.yaml. You can consider whether this option is suitable for you.

So now I have finished my configuration and HTTPS works fine. I followed https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nginx-ingress-with-cert-manager-on-digitalocean-kubernetes . A note that I'm using bare metal.
This problem may be caused by the CNI. After I modified Calico's MTU, the problem was solved:
"mtu": 1440 -> "mtu": 1420
Using Hetzner cloud servers here, and the problem was indeed fixed by changing the MTU, not cert-manager.
Changing the Calico MTU from 1440 to 1400 or 1420 fixed the error when running test-resource.yaml.
MTU change:
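The actual change from this comment is not preserved; one common way to apply it, assuming Calico was installed from the upstream calico.yaml manifest (which renders the CNI MTU from the calico-config ConfigMap), is roughly:

# Lower the veth MTU (value taken from the comments above) and restart calico-node
# so the CNI configuration is re-rendered with the new MTU.
kubectl -n kube-system patch configmap calico-config --type merge -p '{"data":{"veth_mtu":"1400"}}'
kubectl -n kube-system rollout restart daemonset/calico-node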
I had a similar issue and found out that my kube-controller-manager and kube-apiserver pods had a wrongly configured NO_PROXY that did not exclude .svc from proxied traffic. I had to change /etc/kubernetes/manifests/*.yaml on the master node.
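For illustration only, the relevant part of such a static pod manifest might look like the excerpt below; the proxy address and the exact NO_PROXY entries are assumptions (the 10.42.x/10.43.x ranges are simply the Pod/Service CIDRs that appear in the logs above, use your own):

# /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt of the container spec)
    env:
    - name: HTTPS_PROXY
      value: "http://proxy.example.com:3128"   # hypothetical corporate proxy
    - name: NO_PROXY
      value: ".svc,.svc.cluster.local,10.42.0.0/16,10.43.0.0/16,localhost,127.0.0.1"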
Hi, I ran into the same issue as @zzaareer on a Rancher Kubernetes cluster. I have successfully deployed cert-manager via Helm v3:
but when I try to install the test resources, I get the following error:
I attached a sidecar to the cert-manager pod for debugging, and it shows me that I can resolve cert-manager-webhook.cert-manager.svc, but the IP is not answering a ping. I've resolved the IP to 10.43.179.12 and this matches my svc/cert-manager-webhook service. When I do k port-forward service/cert-manager-webhook 9090:443 and call localhost:9090 in my browser, I see that the API is up. But why is my cert-manager not reaching the webhook pod?

Changing to a newer version (v1.8.0 in the curl command) also helped for me! Thanks
Potential resolution:
In our case, our cert-manager-webhooks pod had been running for nearly a year. We suspect it was using some sort of out-of-date internal cluster cert. After deleting the webhook pod, the Deployment spun up a new one without the issue.
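For reference, the equivalent one-liner (the label is an assumption based on a default Helm chart installation):

kubectl -n cert-manager delete pod -l app.kubernetes.io/name=webhook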
Can you file a separate issue for that?
I'm not sure if this is helpful, but FYI: attempting to apply this via kubectl -k (kustomize) failed, but kubectl -f succeeded. I don't know how to research this further.
EDIT: potentially very relevant, I was working with a possibly very bad mix of kubectl versions:

I followed the same action plan and it is working, but after that I can't describe or delete the issuer; it gives me the following error:
conversion webhook for cert-manager.io/v1alpha2, Kind=Issuer failed: Post https://cert-manager-webhook.not-cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found.
Any idea?
Hi @TylerIlunga and @Antiarchitect,
I have the same issue, and with that fix I've already created an issuer. But when I try to describe the created issuer, it returns this error:
conversion webhook for cert-manager.io/v1alpha2, Kind=Issuer failed: Post https://cert-manager-webhook.not-cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found.
Here https://github.com/jetstack/cert-manager/issues/2752#issuecomment-605966908 you can find the answer from @munnerz that explains the issue, the reason behind it, and a possible workaround very well.
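A quick way to check which Service and namespace the conversion webhook on a CRD actually points at (the jsonpath below is for apiextensions.k8s.io/v1 CRDs; on older v1beta1 CRDs the field lives under .spec.conversion.webhookClientConfig instead):

kubectl get crd issuers.cert-manager.io \
  -o jsonpath='{.spec.conversion.webhook.clientConfig.service}{"\n"}'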
Got this error too.

Reason: the node MTU is smaller than the cert-manager-webhook pod MTU, so the TLS response packet is not able to reach the node. Solution: adjust the cert-manager-webhook pod MTU to (node MTU - 20).

Since I also had this error, which bothered me for quite some time, I want to share my story 😉 Maybe it helps someone. To get rid of this error I also did the workaround mentioned in https://github.com/jetstack/cert-manager/issues/2602#issuecomment-669091541 . I'm installing cert-manager via the official Helm chart. While I tried to upgrade from cert-manager v1.4.x to v1.5.0, the startupapicheck failed; it also tries to call the webhook, and it also exited with context deadline exceeded. While you can disable this check, I really wanted to find out the real cause.

While chatting with a team mate about that issue, he asked the right question: are you aware that the K8s control plane tries to connect to that webhook? 😉 I have no idea why I ignored that fact for quite a while… In my case there was just no connection from the control plane to the "Pod network", and I actually never needed one. So the controller nodes tried to connect to https://cert-manager-webhook.cert-manager.svc:443/validate?timeout=30s and of course had no idea where to route that request, because there was simply no network route from the controller nodes' network to the "Pod network" (the K8s Pod and Service IP range).

My solution for now is to make the webhook listen on port 30001 on the host network too, so the controller nodes can communicate with the webhook via the host network (see the values sketch below). Of course, worker needs to be replaced with a real hostname. And to avoid a certificate error, --dynamic-serving-dns-names is also needed: a list of valid DNS names can be included there so that the webhook TLS certificate matches the hostname in the URL.

I had a 60 second wait built into my script and it still failed; I came back 10 minutes later, tried this, and it worked.
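The values snippet from the host-network comment above is not preserved; as a rough, hedged sketch, the description maps to something like the following Helm values (webhook.hostNetwork, webhook.securePort and webhook.extraArgs exist in recent charts, but verify the keys, the port and the hostname against your chart version before use):

# values.yaml for the cert-manager chart (sketch, see caveats above)
webhook:
  hostNetwork: true
  securePort: 30001          # webhook listens here; reachable via the node's host network
  extraArgs:
  # 'worker' is a placeholder for a real node hostname
  - --dynamic-serving-dns-names=cert-manager-webhook.cert-manager.svc,cert-manager-webhook,worker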
@munnerz, can you please consider re-opening this? It keeps happening in 0.15.2.
I ended up finding another unique solution to this problem, and all of cert-manager is working at full capacity for me now. My setup was:
To fix, for some reason I had to make an adjustment to the calico network IP pool configuration away from the default. I downloaded the calico setup YAML (https://docs.projectcalico.org/manifests/calico.yaml), and then I edited this snippet
to
After deleting the default created IP pool and restarting calico, I reinstalled cert-manager and it began working as intended.
I am not sure exactly why this change fixed all my problems.
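The before/after snippet from this comment is not preserved. For orientation only, the part of calico.yaml that is usually edited for the IP pool is the (commented-out by default) pod CIDR env var on the calico-node container; the value below is a placeholder, not what the commenter used:

# calico.yaml excerpt, calico-node container environment
- name: CALICO_IPV4POOL_CIDR
  value: "192.168.0.0/16"   # replace with the pod CIDR your cluster actually uses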
I have been dealing with this issue for a couple of days now. After the 0.15.0 alpha came out today I thought this issue would be resolved, but I continue to suffer the same issue.
Also, I don't think @Antiarchitect's solution is actually a real solution, since it necessitates deleting the webhook configurations, effectively disabling the webhook service. I think the issue is related to TLS connection establishment, but I am not sure why none of the ciphers work.
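For reference, the webhook configurations being referred to can be listed (and, with the caveat above, deleted) like this:

kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations | grep cert-manager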
Same here, having upgraded from v0.11 to 0.14.1. The mandatory webhook component seems to have borked. Our new webhook pod is accessible on cert-manager-webhook.our-namespace.svc:443, and I've tried the hostNetwork suggestion and waiting for the pod to come up before creating the ClusterIssuer resource. No dice. Rolling back to < v0.14 until all the open issues about this are closed. May I suggest a patch to make the webhook optional again in the meantime?