cert-manager: Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: TLS handshake timeout

Bugs should be filed for issues encountered whilst operating cert-manager. You should first attempt to resolve your issues through the community support channels, e.g. Slack, in order to rule out individual configuration errors. Please provide as much detail as possible.

Describe the bug:

Cluster Issuer installation fails with TLS handshake timeout

kubectl apply -f cert-issuer-letsencrypt-prd.yml -n cert-manager
Error from server (InternalError): error when creating "cert-issuer-letsencrypt-prd.yml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: TLS handshake timeout

Expected behaviour:

kubectl apply -f cert-issuer-letsencrypt-prd.yml -n cert-manager

works successfully and does not generate an error

Steps to reproduce the bug:

  1. Create cert-manager ns

    kubectl create ns cert-manager
    
  2. Install cert-manager using helm 3

    helm install cert-manager jetstack/cert-manager --namespace cert-manager
    NAME: cert-manager
    LAST DEPLOYED: Sat Feb 15 11:40:28 2020
    NAMESPACE: cert-manager
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    NOTES:
    cert-manager has been deployed successfully!
    
  3. Add secret cloudflare-api-key-secret

    kubectl -n cert-manager apply -f cert-cloudflare-api-key-secret.yml
    
  4. Create cluster-issuer

    kubectl apply -f cert-issuer-letsencrypt-prd.yml -n cert-manager  
    

cert-issuer-letsencrypt-prd.yml

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-prd
  namespace: cert-manager
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: xxxx
    privateKeySecretRef: 
      name: letsencrypt-prd
    solvers:
    - dns01:
        cloudflare:
          email: xxxx
          apiKeySecretRef:
            name: cloudflare-api-key-secret
            key: api-key

Anything else we need to know?:

Environment details:

  • Kubernetes version (e.g. v1.10.2): v1.17.2
  • Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): bare-metal
  • cert-manager version (e.g. v0.4.0): 0.13.0
  • Install method (e.g. helm or static manifests): helm 3

/kind bug

Pods are running fine, no restarts

kubectl -n cert-manager get pods
NAME                                      READY   STATUS    RESTARTS   AGE
cert-manager-c6cb4cbdf-djqt4              1/1     Running   0          37m
cert-manager-cainjector-76f7596c4-wsb4z   1/1     Running   0          37m
cert-manager-webhook-8575f88c85-xf7w2     1/1     Running   0          31m

CRDs are there

kubectl get crd | grep cert-manager
certificaterequests.cert-manager.io                     2020-02-15T10:39:37Z
certificates.cert-manager.io                            2020-02-15T10:39:38Z
challenges.acme.cert-manager.io                         2020-02-15T10:39:38Z
clusterissuers.cert-manager.io                          2020-02-15T10:39:39Z
issuers.cert-manager.io                                 2020-02-15T10:39:40Z
orders.acme.cert-manager.io                             2020-02-15T10:39:40Z

Logs of the cert-manager-webhook pod repeatedly show http: TLS handshake error from 10.42.152.128:5067: EOF

kubectl -n cert-manager logs cert-manager-webhook-8575f88c85-xf7w2
I0215 10:47:05.409158       1 main.go:64]  "msg"="enabling TLS as certificate file flags specified"  
I0215 10:47:05.409423       1 server.go:126]  "msg"="listening for insecure healthz connections"  "address"=":6080"
I0215 10:47:05.409471       1 server.go:138]  "msg"="listening for secure connections"  "address"=":10250"
I0215 10:47:05.409495       1 server.go:155]  "msg"="registered pprof handlers"  
I0215 10:47:05.409672       1 tls_file_source.go:144]  "msg"="detected private key or certificate data on disk has changed. reloading certificate"  
2020/02/15 10:48:46 http: TLS handshake error from 10.42.152.128:25427: EOF
2020/02/15 10:53:56 http: TLS handshake error from 10.42.152.128:48126: EOF
2020/02/15 10:59:06 http: TLS handshake error from 10.42.152.128:21683: EOF
2020/02/15 11:04:16 http: TLS handshake error from 10.42.152.128:9457: EOF
2020/02/15 11:09:26 http: TLS handshake error from 10.42.152.128:41640: EOF
2020/02/15 11:14:36 http: TLS handshake error from 10.42.152.128:56638: EOF
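
The handshake errors above suggest the API server opens a TCP connection but never completes the TLS exchange. To rule out basic connectivity, the same handshake can be attempted from inside the cluster; a hedged sketch (the debug image and pod name are just examples):

kubectl -n cert-manager get svc,endpoints cert-manager-webhook

# run a throwaway pod and attempt the TLS handshake ourselves;
# -k skips certificate verification since only handshake completion matters here
kubectl run curl-debug --rm -it --restart=Never --image=curlimages/curl -- \
  curl -kv https://cert-manager-webhook.cert-manager.svc:443/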

Here are the logs from the cert-manager pod:

kubectl -n cert-manager logs cert-manager-c6cb4cbdf-fzdmj
I0215 12:23:31.410690       1 start.go:76] cert-manager "msg"="starting controller"  "git-commit"="6d9200f9d" "version"="v0.13.0"
W0215 12:23:31.410750       1 client_config.go:543] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0215 12:23:31.411825       1 controller.go:167] cert-manager/controller/build-context "msg"="configured acme dns01 nameservers" "nameservers"=["10.43.0.10:53"] 
I0215 12:23:31.412082       1 controller.go:130] cert-manager/controller "msg"="starting leader election"  
I0215 12:23:31.412188       1 metrics.go:202] cert-manager/metrics "msg"="listening for connections on" "address"="0.0.0.0:9402" 
I0215 12:23:31.412836       1 leaderelection.go:242] attempting to acquire leader lease  kube-system/cert-manager-controller...
I0215 12:24:50.921660       1 leaderelection.go:252] successfully acquired lease kube-system/cert-manager-controller
I0215 12:24:50.922007       1 controller.go:172] cert-manager/controller/certificaterequests "msg"="new certificate request controller registered"  "type"="selfsigned"
I0215 12:24:50.922139       1 controller.go:172] cert-manager/controller/certificaterequests "msg"="new certificate request controller registered"  "type"="venafi"
I0215 12:24:50.922149       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificaterequests-issuer-selfsigned" 
I0215 12:24:50.922214       1 controller.go:74] cert-manager/controller/certificaterequests-issuer-selfsigned "msg"="starting control loop"  
I0215 12:24:50.922245       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificaterequests-issuer-venafi" 
I0215 12:24:50.922291       1 controller.go:74] cert-manager/controller/certificaterequests-issuer-venafi "msg"="starting control loop"  
I0215 12:24:50.922307       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="clusterissuers" 
I0215 12:24:50.922338       1 controller.go:74] cert-manager/controller/clusterissuers "msg"="starting control loop"  
I0215 12:24:50.922376       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="webhook-bootstrap" 
I0215 12:24:50.922407       1 controller.go:74] cert-manager/controller/webhook-bootstrap "msg"="starting control loop"  
I0215 12:24:50.922412       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="issuers" 
I0215 12:24:50.922474       1 controller.go:74] cert-manager/controller/issuers "msg"="starting control loop"  
I0215 12:24:50.922537       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="orders" 
I0215 12:24:50.922578       1 controller.go:74] cert-manager/controller/orders "msg"="starting control loop"  
I0215 12:24:50.922602       1 controller.go:172] cert-manager/controller/certificaterequests "msg"="new certificate request controller registered"  "type"="acme"
I0215 12:24:50.922711       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificaterequests-issuer-acme" 
I0215 12:24:50.922736       1 controller.go:172] cert-manager/controller/certificaterequests "msg"="new certificate request controller registered"  "type"="vault"
I0215 12:24:50.922740       1 controller.go:74] cert-manager/controller/certificaterequests-issuer-acme "msg"="starting control loop"  
I0215 12:24:50.922855       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificaterequests-issuer-vault" 
I0215 12:24:50.922904       1 controller.go:74] cert-manager/controller/certificaterequests-issuer-vault "msg"="starting control loop"  
I0215 12:24:50.922982       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificates" 
I0215 12:24:50.923031       1 controller.go:74] cert-manager/controller/certificates "msg"="starting control loop"  
I0215 12:24:50.923042       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="ingress-shim" 
I0215 12:24:50.923073       1 controller.go:74] cert-manager/controller/ingress-shim "msg"="starting control loop"  
I0215 12:24:51.025320       1 controller.go:172] cert-manager/controller/certificaterequests "msg"="new certificate request controller registered"  "type"="ca"
I0215 12:24:51.025331       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="challenges" 
I0215 12:24:51.025385       1 controller.go:74] cert-manager/controller/challenges "msg"="starting control loop"  
I0215 12:24:51.025430       1 controller.go:101] cert-manager/controller "msg"="starting controller" "controller"="certificaterequests-issuer-ca" 
I0215 12:24:51.025473       1 controller.go:74] cert-manager/controller/certificaterequests-issuer-ca "msg"="starting control loop"  
I0215 12:24:51.122618       1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cert-manager-webhook-ca" 
I0215 12:24:51.122638       1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cloudflare-api-key-secret" 
I0215 12:24:51.122650       1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cert-manager-webhook-tls" 
I0215 12:24:51.122669       1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/sh.helm.release.v1.cert-manager.v1" 
I0215 12:24:51.122674       1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cloudflare-api-key-secret" 
I0215 12:24:51.122705       1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cert-manager-token-5tdm7" 
I0215 12:24:51.122729       1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cert-manager-token-5tdm7" 
I0215 12:24:51.122780       1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cert-manager-webhook-token-6hpwz" 
I0215 12:24:51.122729       1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/sh.helm.release.v1.cert-manager.v1" 
I0215 12:24:51.122805       1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cert-manager-webhook-token-6hpwz" 
I0215 12:24:51.122840       1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/default-token-vlftn" 
I0215 12:24:51.122867       1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/default-token-vlftn" 
I0215 12:24:51.122618       1 controller.go:129] cert-manager/controller/webhook-bootstrap "msg"="syncing item" "key"="cert-manager/cert-manager-cainjector-token-lzwbt" 
I0215 12:24:51.122903       1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cert-manager-cainjector-token-lzwbt" 
I0215 12:24:51.123241       1 controller.go:129] cert-manager/controller/ingress-shim "msg"="syncing item" "key"="kube-system/dashboard-kubernetes-dashboard" 
I0215 12:24:51.123256       1 controller.go:197] cert-manager/controller/webhook-bootstrap/webhook-bootstrap/ca-secret "msg"="ca certificate already up to date" "resource_kind"="Secret" "resource_name"="cert-manager-webhook-ca" "resource_namespace"="cert-manager" 
I0215 12:24:51.123281       1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cert-manager-webhook-ca" 
I0215 12:24:51.123420       1 controller.go:255] cert-manager/controller/webhook-bootstrap/webhook-bootstrap/ca-secret "msg"="serving certificate already up to date" "resource_kind"="Secret" "resource_name"="cert-manager-webhook-tls" "resource_namespace"="cert-manager" 
I0215 12:24:51.123450       1 controller.go:135] cert-manager/controller/webhook-bootstrap "msg"="finished processing work item" "key"="cert-manager/cert-manager-webhook-tls"

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 70 (3 by maintainers)

Most upvoted comments

I had an issue deploying a ClusterIssuer; the error was:

Internal error occurred: failed calling webhook \"webhook.cert-manager.io\": Post https://cert-manager-webhook.cert-manager.svc:443/validate?timeout=30s: context deadline exceeded

Solved it as follows:

$ helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v0.16.0 \
  --set installCRDs=true

$ kubectl delete mutatingwebhookconfiguration.admissionregistration.k8s.io cert-manager-webhook
$ kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io cert-manager-webhook
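
A quick sanity check that both were actually removed (this should print nothing once they are gone):

kubectl get mutatingwebhookconfigurations,validatingwebhookconfigurations | grep cert-manager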

@Antiarchitect Only your solution worked for me!

Steps taken:

kubectl delete mutatingwebhookconfiguration.admissionregistration.k8s.io cert-manager-webhook
kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io cert-manager-webhook
namespace="not-cert-manager"
curl -s -L "https://github.com/jetstack/cert-manager/releases/download/v0.14.1/cert-manager.crds.yaml" 2>&1 | sed -e "s/namespace: cert-manager/namespace: ${namespace}/" -e "s/cert-manager.io\/inject-ca-from-secret: cert-manager\/cert-manager-webhook-tls/cert-manager.io\/inject-ca-from-secret: ${namespace}\/${namespace}-cert-manager-webhook-tls/" |  kubectl apply --validate=false -f -

@turkenh I am seeing the same issue but no errors in my events. I am following the same approach as you, i.e., deploy cert-manager first and then the issuer with a separate helm chart. Just as you had observed, I do not see the error if I deploy my Issuer after a few seconds (~60). Back-to-back installations of cert-manager and the Issuer certainly throw the following error:

Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority

@munnerz, I am certainly seeing the same error as the original issue with v0.15.0. Any thoughts on why I am seeing the error when I deploy the Issuer immediately after the cert-manager deployment and NOT when I deploy the Issuer after a bit of a wait? This still appears to be a bug. Do you want me to open another issue to track this?

Using cert-manager v0.15.0, which was released yesterday. With installCRDs set to true, I am still getting the same error as above:

failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority

Our scripts deploy another helm chart containing cert-manager resources just after the cert-manager helm release reports ready, and helm fails with the above error. However, if I try to create the resources after some time, I don’t get any errors. So it looks like a timing issue, but I was not getting it with v0.15.0-alpha.0.

Waiting a while as mentioned earlier seems to do the trick, so there is probably a timing issue somewhere. Tested on 0.15.2.

The following works for me (you might wanna tinker with the timer):

kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.2/cert-manager.yaml
kubectl -n cert-manager rollout status deploy/cert-manager-webhook
sleep 20
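
A rough sketch of polling instead of a fixed sleep, assuming some manifest that goes through the webhook (test-issuer.yaml here is just a placeholder); note that --dry-run=server needs a reasonably recent kubectl, older versions spell it --server-dry-run:

# keep retrying a server-side dry-run until the admission webhook answers
until kubectl apply --dry-run=server -f test-issuer.yaml >/dev/null 2>&1; do
  echo "webhook not ready yet, retrying..."
  sleep 5
done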

We’ve made significant improvements to the way TLS is managed in the upcoming v0.15 release, as well as adding an installCRDs option to the Helm chart which will handle correctly updating service names and conversion webhook configuration when deploying into namespaces other than cert-manager or using a Helm release name other than cert-manager.

I think this issue can now be closed after this, and if anyone is still running into issues I’d advise you to try the new v0.15.0-alpha.1 release and reporting back! (to be safe, it may be best to start ‘fresh’ in case you have a currently broken configuration!)

Still not sure why it does not work with the webhook. I’m also not sure whether this is really the best approach, as

Doing this may expose your cluster to miss-configuration problems that in some cases could cause cert-manager to stop working altogether (i.e. if invalid types are set for fields on cert-manager resources).

Also interestingly, the webhook was working on the initial setup of my cluster back in January. I did add an additional node and updated the underlying OS. Not sure yet why it stopped working…

@munnerz Thank you for your help. I deployed cert-manager using the kubectl command below:

kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.0-alpha.1/cert-manager.yaml

Everything is working fine, as you can see:

kubectl get pod,service,endpoints -n cert-manager
NAME                                           READY   STATUS    RESTARTS   AGE
pod/cert-manager-5bb5b9dcf8-sb52s              1/1     Running   0          28m
pod/cert-manager-cainjector-869f7868b7-rrrw2   1/1     Running   0          28m
pod/cert-manager-webhook-79d78c45cd-7fxfs      1/1     Running   0          28m

NAME                           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/cert-manager           ClusterIP   10.99.239.39   <none>        9402/TCP   28m
service/cert-manager-webhook   ClusterIP   10.99.49.145   <none>        443/TCP    28m

NAME                             ENDPOINTS           AGE
endpoints/cert-manager           10.244.4.60:9402    28m
endpoints/cert-manager-webhook   10.244.5.75:10250   28m

But when I try to create the issuer and certificate, I get the timeout and context deadline exceeded errors.

I’ve solved my problems with sed 😃

namespace="not-cert-manager"
curl -s -L "https://github.com/jetstack/cert-manager/releases/download/v0.14.1/cert-manager.crds.yaml" 2>&1 | sed -e "s/namespace: cert-manager/namespace: ${namespace}/" -e "s/cert-manager.io\/inject-ca-from-secret: cert-manager\/cert-manager-webhook-tls/cert-manager.io\/inject-ca-from-secret: ${namespace}\/${namespace}-cert-manager-webhook-tls/" |  kubectl apply --validate=false -f -

But you should remove not only the CRDs but also

mutatingwebhookconfigurations.admissionregistration.k8s.io
validatingwebhookconfigurations.admissionregistration.k8s.io

if they were improperly configured before

We ran into this, and the specific resource that was conflicting was the cert-manager-webhook-ca secret, which had been left over from a previous installation that was removed manually. When I looked at the details, that secret had been created two years before the new version of cert-manager was installed. I was able to simply run kubectl delete -f https://github.com/jetstack/cert-manager/releases/download/v[X.X]/cert-manager.yaml, which removed everything in that namespace (including old stuff), and then re-ran kubectl apply .... After doing that, I confirmed that the secret was new, and everything started working. HTH

Fixed this problem on my hard upgrade from v0.10 to v0.15 by deleting the "cert-manager-webhook-ca" secret, because it is not updated automatically if it already exists.
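
For anyone checking for the same stale secret, something along these lines shows its age before deleting it (secret name as in the comments above):

kubectl -n cert-manager get secret cert-manager-webhook-ca -o jsonpath='{.metadata.creationTimestamp}{"\n"}'
kubectl -n cert-manager delete secret cert-manager-webhook-ca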

I am having the same symptom. And I am sure it is something with my Weave CNI, because it worked with AWS VPC CNI.

I even tried tcpdump on the cert-manager and cert-manager-webhook pods; surprisingly, there is no traffic on the webhook port.

Hi @papanito, my configuration is the same but with Kubernetes version 1.16, and I tried to install cert-manager today using the static manifests instead of helm.

I had exactly the same issue and solved it by following this page: https://cert-manager.io/docs/installation/compatibility/ . In particular, I used cert-manager-no-webhook.yaml instead of cert-manager.yaml. You can consider whether this option is suitable for you.

So now I have finished my configuration and HTTPS works fine. I followed https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nginx-ingress-with-cert-manager-on-digitalocean-kubernetes . Note that I’m using bare metal.

This problem may be caused by the CNI; after I modified the MTU of Calico, the problem was solved.

"mtu": 1440 -> "mtu": 1420

{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "log_level": "info",
      "log_file_path": "/var/log/calico/cni/cni.log",
      "datastore_type": "kubernetes",
      "nodename": "k3s-operator-1",
      "mtu": 1420,
      "ipam": {
          "type": "calico-ipam"
      },
      "policy": {
          "type": "k8s"
      },
      "kubernetes": {
          "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
      }
    },
    {
      "type": "portmap",
      "snat": true,
      "capabilities": {"portMappings": true}
    },
    {
      "type": "bandwidth",
      "capabilities": {"bandwidth": true}
    }
  ]
}

Using Hetzner cloud servers here, and the problem was indeed fixed by changing the MTU, not cert-manager.

Changing the Calico MTU from 1440 to 1400 or 1420 fixed the error when running test-resources.yaml:

Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post "https://cert-manager-webhook.cert-manager.svc:443/validate?timeout=10s": context deadline exceeded

MTU Change:

kubectl patch configmap/calico-config -n kube-system --type merge \
  -p '{"data":{"veth_mtu": "1400"}}'
kubectl rollout restart daemonset calico-node -n kube-system
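
To confirm the patch took effect, the ConfigMap key patched above can be read back:

kubectl -n kube-system get configmap calico-config -o jsonpath='{.data.veth_mtu}{"\n"}'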

I had a similar issue and found out that my kube-controller-manager and kube-apiserver pods had a wrongly configured NO_PROXY that did not exclude .svc from proxied traffic. I had to change /etc/kubernetes/manifests/*.yaml on the master node.
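
A hedged way to spot the same misconfiguration on a control-plane node (paths as in the comment above; the variable may also be spelled no_proxy):

grep -i no_proxy /etc/kubernetes/manifests/kube-apiserver.yaml \
  /etc/kubernetes/manifests/kube-controller-manager.yaml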

Hi, I ran into the same issue as @zzaareer on a Rancher Kubernetes cluster. I successfully deployed cert-manager via helm v3:

helm install cert-manager jetstack/cert-manager --namespace cert-manager --version v0.15.0 --set installCRDs=true --description "install cert-manager"

but when I try to install the test resources, I get the following error:

Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: context deadline exceeded

I attached a sidecar to the cert-manager pod for debugging, and it shows that I can resolve cert-manager-webhook.cert-manager.svc, but the IP does not answer a ping.

I’ve resolved the IP to 10.43.179.12 and this matches my svc/cert-manager-webhook service. When I do k port-forward service/cert-manager-webhook 9090:443 and call localhost:9090 in my browser, I see that the API is up. But why is my cert-manager not reaching the webhook pod?
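
Note that ClusterIPs generally do not answer ICMP, so the failed ping by itself proves little. A more telling check is whether the Service actually targets the webhook's secure port (10250 in the logs and endpoints shown earlier):

kubectl -n cert-manager get svc cert-manager-webhook -o jsonpath='{.spec.ports[0].targetPort}{"\n"}'
kubectl -n cert-manager get endpoints cert-manager-webhook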

curl -s -L "https://github.com/jetstack/cert-manager/releases/download/v0.14.1/cert-manager.crds.yaml" 2>&1 \
  | sed -e "s/namespace: cert-manager/namespace: ${namespace}/" \
    -e "s/cert-manager.io\/inject-ca-from-secret: cert-manager\/cert-manager-webhook-tls/cert-manager.io\/inject-ca-from-secret: ${namespace}\/${namespace}-cert-manager-webhook-tls/" \
  |  kubectl apply --validate=false -f -

Changing the version to a newer one (v1.8.0 in the curl URL) also helped for me! Thanks

Potential resolution:

In our case, our cert-manager-webhook pod had been running for nearly a year. We suspect it was using some sort of out-of-date internal cluster cert. After deleting the webhook pod, the Deployment spun up a new one without the issue.

This happens when you put any annotations on the Issuer or ClusterIssuer resources; it causes them to fail the validating webhook.

Can you file a separate issue for that?

I’m not sure if this is helpful, but an FYI: attempting to apply this via kubectl apply -k (kustomize) failed, but kubectl apply -f succeeded. I don’t know how to research this further.

EDIT: potentially very relevant, I was working with a possibly very bad mix of kubectl client and server versions:

λ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-10T21:53:58Z", GoVersion:"go1.14.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:36:19Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

I followed the same action plan as @Antiarchitect’s steps above and it is working, but after that I can’t describe or delete the issuer; it gives me the following error:

conversion webhook for cert-manager.io/v1alpha2, Kind=Issuer failed: Post https://cert-manager-webhook.not-cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found.

Any idea?

Hi @TylerIlunga and @Antiarchitect,

I have the same issue, and with that fix I already created an issuer. But when I try to describe the created issuer, it returns this error:

conversion webhook for cert-manager.io/v1alpha2, Kind=Issuer failed: Post https://cert-manager-webhook.not-cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found.

Here https://github.com/jetstack/cert-manager/issues/2752#issuecomment-605966908 you can find the answer of @munnerz that explains very well the issue, the reason behind and a possible workaround.

Got this error too.

Reason: the node MTU is smaller than the cert-manager-webhook pod MTU, so the TLS response packets cannot reach the node. Solution: adjust the cert-manager-webhook pod MTU to (node MTU - 20).
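
A hedged way to compare the two values (eth0 is an assumption about the interface name, and the webhook image may be too minimal to exec into, in which case a debug sidecar is a fallback):

# on the node
cat /sys/class/net/eth0/mtu

# inside the webhook pod (requires cat in the image)
kubectl -n cert-manager exec deploy/cert-manager-webhook -- cat /sys/class/net/eth0/mtu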

Since I also had the error

Internal error occurred: failed calling webhook \"webhook.cert-manager.io\":
Post https://cert-manager-webhook.cert-manager.svc:443/validate?timeout=30s: context deadline exceeded

which bothered me for quite some time, I want to share my story 😉 Maybe it helps someone. To make this error go away, I also ran

$ kubectl delete mutatingwebhookconfiguration.admissionregistration.k8s.io cert-manager-webhook
$ kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io cert-manager-webhook

as mentioned in https://github.com/jetstack/cert-manager/issues/2602#issuecomment-669091541. I’m installing cert-manager via the official Helm chart. When I tried to upgrade from cert-manager v1.4.x to v1.5.0, the startupapicheck, which also calls the webhook, failed and exited with context deadline exceeded. While you can disable this check, I really wanted to find out the real cause.

So while chatting with a teammate about the issue, he asked the right question: are you aware that the K8s control plane tries to connect to that webhook? 😉 I have no idea why I ignored that fact for quite a while… In my case there was simply no connection from the control plane to the “Pod network”, and I had never actually needed one. So the controller nodes tried to connect to https://cert-manager-webhook.cert-manager.svc:443/validate?timeout=30s and of course had no idea where to route that request, because there was no network route from the controller nodes’ network to the “Pod network” (the K8s Pod and Service IP range).

My solution for now actually looks like this:

helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v1.5.3 \
  --set installCRDs=true \
  --set global.leaderElection.namespace="cert-manager" \
  --set webhook.securePort="30001" \
  --set webhook.hostNetwork=true \
  --set webhook.url.host="worker:30001" \
  --set webhook.extraArgs="{--dynamic-serving-dns-names=worker:30001}"

This makes the webhook listen on port 30001 on the host network too, so now the controller nodes can communicate with the webhook via the host network. Of course, worker needs to be replaced with a real hostname. And to avoid a certificate error, --dynamic-serving-dns-names is also needed: it takes a list of valid DNS names so that the webhook TLS certificate matches the hostname in the URL.
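
To verify the serving certificate actually picked up the extra name (worker is the placeholder hostname from above; -ext needs OpenSSL 1.1.1+):

echo | openssl s_client -connect worker:30001 -servername worker 2>/dev/null \
  | openssl x509 -noout -ext subjectAltName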

I had a 60-second wait built into my script and it still failed; I came back 10 minutes later, tried the rollout status + sleep approach mentioned earlier, and it worked.

@munnerz, can you please consider re-opening this? It keeps happening in 0.15.2.

I ended up finding another unique solution to this problem, and all of cert-manager is working at full capacity for me now. My setup was:

Kubernetes 1.18
Calico CNI
Bare Metal
Cert-Manager 0.14.1

To fix it, for some reason I had to adjust the Calico IP pool configuration away from the default. I downloaded the Calico setup YAML (https://docs.projectcalico.org/manifests/calico.yaml) and edited this snippet

- name: CALICO_IPV4POOL_IPIP
  value: "Always"

to

- name: CALICO_IPV4POOL_IPIP
  value: "Never"

After deleting the default IP pool and restarting Calico, I reinstalled cert-manager, and it began working as intended.

I am not sure exactly why this change fixed all my problems.
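
For reference, the "delete the default IP pool and restart Calico" step can be sketched roughly as below, assuming Calico uses the Kubernetes datastore and the stock pool name default-ipv4-ippool:

kubectl delete ippool.crd.projectcalico.org default-ipv4-ippool
kubectl -n kube-system rollout restart daemonset/calico-node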

I have been dealing with this issue for a couple of days now. After the 0.15.0 alpha came out today I thought this issue would be resolved, but I continue to see the same problem.

Also, I don’t think @Antiarchitect’s solution is actually a real solution, since it necessitates deleting the webhook configurations, effectively disabling the webhook service. I think the issue is related to TLS connection establishment, but I am not sure why none of the ciphers work.

Same here, having upgraded from v0.11 to v0.14.1. The mandatory webhook component seems to have borked. Our new webhook pod is accessible at cert-manager-webhook.our-namespace.svc:443, and I’ve tried the hostNetwork suggestion and waiting for the pod to come up before creating the ClusterIssuer resource. No dice. Rolling back to < v0.14 until all the open issues about this are closed. May I suggest a patch to make the webhook optional again in the meantime?