cert-manager: Failed calling webhook "webhook.cert-manager.io": connect: connection refused

🌟 tl;dr: If you hit "connect: connection refused", you can debug the error by reading the section "Error 1: connect: connection refused" of The Definitive Debugging Guide for the cert-manager Webhook Pod.

Describe the bug: When I try to create a ClusterIssuer (after waiting for the cert-manager pod to be ready), I get:

Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: dial tcp 10.152.183.156:443: connect: connection refused
Error: error executing "/tmp/terraform_1028679806.sh": Process exited with status 1

Expected behaviour: ClusterIssuer is created.

Steps to reproduce the bug: In my setup Terraform runs the following script after provisioning Ubuntu 20.04 on DigitalOcean:

snap wait system seed.loaded
snap install microk8s --classic --channel=1.18/stable
export PATH=$PATH:/snap/bin/
microk8s status --wait-ready
microk8s enable dns dashboard helm3 ingress registry

microk8s helm3 repo add jetstack https://charts.jetstack.io
microk8s helm3 repo update
microk8s kubectl create namespace cert-manager
microk8s helm3 install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v0.15.2 \
  --set installCRDs=true \
  --set ingressShim.defaultIssuerName=letsencrypt-prod \
  --set ingressShim.defaultIssuerKind=ClusterIssuer \
  --set ingressShim.defaultIssuerGroup=cert-manager.io

while [[ $(microk8s kubectl get pods -n cert-manager -l app=cert-manager -o 'jsonpath={..status.conditions[?(@.type=="Ready")].status}') != "True" ]]
do
  echo "Waiting for cert-manager pods to be ready."
  sleep 5
done

cat <<EOF | microk8s kubectl apply -f -
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <removed>
    privateKeySecretRef:
      name: <removed>
    solvers:
    - selector: {}
    - http01:
        ingress:
          class: nginx
---
EOF

Anything else we need to know?: The “wait” loop runs for about 30 seconds before proceeding and throwing the error.
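
A possible explanation for the short wait, offered as an assumption rather than a confirmed diagnosis: the selector app=cert-manager may match only the controller pod, so the loop can finish while the webhook pod is still starting. It is worth checking with "kubectl get pods -n cert-manager --show-labels". The sketch below shows a stricter wait. The helper names all_ready and wait_for_webhook are hypothetical, and the Deployment name cert-manager-webhook is inferred from the Service name in the error message, not verified against the chart.

```shell
#!/bin/sh

# all_ready: succeed only if the argument is a non-empty,
# whitespace-separated list in which every word is "True".
# Hypothetical helper; it covers the case where the jsonpath query
# returns one status per pod, e.g. "True True False".
all_ready() {
  # Word-split the argument into positional parameters.
  set -- $1
  [ "$#" -gt 0 ] || return 1
  for status in "$@"; do
    [ "$status" = "True" ] || return 1
  done
  return 0
}

# wait_for_webhook: block until the webhook Deployment reports Available.
# Assumes the Deployment is named cert-manager-webhook and lives in the
# cert-manager namespace (an inference from the error message).
# Optional first argument: timeout, default 120s.
wait_for_webhook() {
  microk8s kubectl wait --namespace cert-manager \
    --for=condition=Available deployment/cert-manager-webhook \
    --timeout="${1:-120s}"
}
```

Calling wait_for_webhook before the "cat <<EOF | microk8s kubectl apply -f -" step would replace the hand-written loop. Even an Available Deployment does not guarantee that the API server can reach the webhook, so this narrows the problem rather than ruling it out.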

Environment details:

Kubernetes version (e.g. v1.10.2): microk8s 1.18
Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): DigitalOcean
cert-manager version (e.g. v0.4.0): 0.15.2
Install method (e.g. helm or static manifests): helm3

/kind bug

About this issue

State: closed
Created 4 years ago
Comments: 16 (4 by maintainers)

Most upvoted comments

I didn’t see connect: connection refused for some reason… It seems your controller nodes cannot reach the network where the cert-manager pod runs. Is there a firewall in place that could block this? That would cause the refuse…

meyskens on Aug 5, 2020
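
To follow up on the reachability question, one way to start is to compare the address the API server dialled with the webhook Service and its Endpoints. The sketch below is a minimal illustration: dial_target and check_webhook_service are hypothetical helper names, and the Service name and namespace are taken from the error message in the report.

```shell
#!/bin/sh

# dial_target: print the "IP:port" that the API server tried to dial,
# extracted from a "failed calling webhook" error message.
# Prints nothing if the message has no "dial tcp IP:port:" part.
dial_target() {
  printf '%s\n' "$1" | sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\): .*/\1/p'
}

# check_webhook_service: show the Service and then its Endpoints so the
# ClusterIP can be compared with the dialled address.
# Assumes Service cert-manager-webhook in namespace cert-manager.
check_webhook_service() {
  microk8s kubectl get service cert-manager-webhook \
    --namespace cert-manager -o wide
  microk8s kubectl get endpoints cert-manager-webhook \
    --namespace cert-manager
}
```

If the Endpoints list is empty, the webhook pod is probably not Ready yet or the Service selector does not match it. If Endpoints are present and the connection is still refused, the network path between the API server and the pod (the firewall case raised above) becomes the more likely suspect.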

We have new webhook debugging documentation; I suggest checking it out at https://cert-manager.io/docs/concepts/webhook/. If we missed any use case, PRs are welcome!

meyskens on Oct 8, 2020

kind create cluster
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager --set installCRDs=true --version v1.0.1 jetstack/cert-manager

version

Client Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.6-beta.0", GitCommit:"e7f962ba86f4ce7033828210ca3556393c377bcc", GitTreeState:"clean", BuildDate:"2020-01-15T08:26:26Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-14T07:30:52Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}

kubectl apply -f resources/letsencrypt-issuer.yaml
Error from server (InternalError): error when creating "resources/letsencrypt-issuer.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post "https://cert-manager-webhook.default.svc:443/mutate?timeout=10s": dial tcp 10.101.92.30:443: connect: connection refused
Error from server (InternalError): error when creating "resources/letsencrypt-issuer.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post "https://cert-manager-webhook.default.svc:443/mutate?timeout=10s": dial tcp 10.101.92.30:443: connect: connection refused

log

kubectl logs cert-manager-webhook-7c6f7f585f-rwp9z
W0925 14:18:30.748694       1 client_config.go:608] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0925 14:18:30.749012       1 webhook.go:57] cert-manager/webhook "msg"="using dynamic certificate generating using CA stored in Secret resource"  "secret_name"="cert-manager-webhook-ca" "secret_namespace"="default"
I0925 14:18:30.749480       1 server.go:146] cert-manager/webhook "msg"="listening for insecure healthz connections"  "address"=":6080"
I0925 14:18:30.749655       1 server.go:159] cert-manager/webhook "msg"="listening for secure connections"  "address"=":10250"
I0925 14:18:30.749755       1 server.go:185] cert-manager/webhook "msg"="registered pprof handlers"
I0925 14:18:30.751258       1 reflector.go:207] Starting reflector *v1.Secret (1m0s) from external/io_k8s_client_go/tools/cache/reflector.go:156
I0925 14:18:31.787704       1 dynamic_source.go:199] cert-manager/webhook "msg"="Updated serving TLS certificate"

I'm noticing a very similar, if not the same, issue on a fresh install of v1.0.1 into kind.

AlexsJones on Sep 25, 2020
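
In the log above, the serving certificate is updated about a second after the webhook starts listening. If the apply ran before the webhook was reachable, retrying it may be enough; this is a guess, not something the thread confirms. A minimal sketch of a retry wrapper follows. The name retry is a hypothetical helper, and it works around the timing rather than fixing any underlying network problem.

```shell
#!/bin/sh

# retry: run a command up to ATTEMPTS times, sleeping DELAY seconds
# between failed attempts. Returns 0 as soon as the command succeeds,
# or 1 once all attempts have failed.
# Usage: retry ATTEMPTS DELAY command [args...]
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Example (not executed here): try the apply for up to one minute.
# retry 12 5 kubectl apply -f resources/letsencrypt-issuer.yaml
```

If the apply still fails after a generous number of retries, the timing explanation is unlikely and the Service, Endpoints, and network path deserve a closer look.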