cert-manager: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found

Describe the bug:

I installed cert-manager in the kube-system namespace with this command: helm install cert-manager jetstack/cert-manager --namespace kube-system -f values.yaml --wait, and everything seemed fine. While installing Prometheus on the same cluster, I had problems generating certificates for the Alertmanager ingress.

Here is the error I found in the cert-manager logs. I only changed the alertmanager domain for security.

kubectl get certificates le-prometheus-alertmanager-tls -o json

Error from server: conversion webhook for &{map[apiVersion:cert-manager.io/v1alpha2 kind:Certificate metadata:map[creationTimestamp:2020-03-25T16:18:11Z generation:1 labels:map[app:prometheus chart:prometheus-11.0.3 component:alertmanager heritage:Helm release:prometheus] name:le-prometheus-alertmanager-tls namespace:kube-system ownerReferences:[map[apiVersion:extensions/v1beta1 blockOwnerDeletion:true controller:true kind:Ingress name:prometheus-alertmanager uid:a225d045-1e39-4238-9352-06af00638c87]] uid:e73287ca-6db4-4e74-8e93-675eebfc8dcb] spec:map[dnsNames:[alertmanager.example.com] issuerRef:map[group:cert-manager.io kind:ClusterIssuer name:letsencrypt-production] secretName:le-prometheus-alertmanager-tls] status:map[conditions:[map[lastTransitionTime:2020-03-25T16:18:12Z message:Waiting for CertificateRequest "le-prometheus-alertmanager-tls-1268665689" to complete reason:InProgress status:False type:Ready]]]]} failed: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found

From what I understand, the problem is that cert-manager tries to connect to the webhook in the default namespace (cert-manager) instead of the one in which it was installed (kube-system).
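The namespace being targeted can be read straight off the failing URL, because in-cluster service URLs have the form https://<service>.<namespace>.svc:<port>/<path>. A minimal sketch, assuming that URL layout; parse_webhook_url is a hypothetical helper name, not part of cert-manager or kubectl:

```shell
#!/bin/sh
# Split a webhook URL of the form https://<service>.<namespace>.svc:<port>/<path>
# into its service and namespace parts. parse_webhook_url is a hypothetical helper.
parse_webhook_url() {
  host="${1#https://}"   # strip the scheme
  host="${host%%:*}"     # strip the port, path and query string
  host="${host%.svc}"    # strip the cluster-local .svc suffix
  # Namespace names cannot contain dots, so the last label is the namespace.
  printf 'service=%s namespace=%s\n' "${host%.*}" "${host##*.}"
}

parse_webhook_url 'https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s'
# prints: service=cert-manager-webhook namespace=cert-manager
```

Comparing that output with the namespace the chart was actually installed into (kube-system in this report) shows the mismatch directly.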

Expected behavior: I expect cert-manager to call the webhook in the correct namespace.

Steps to reproduce the bug:

  • Install cert-manager in kube-system with this command: helm install cert-manager jetstack/cert-manager --namespace kube-system -f values.yaml --wait

Changed conf for values.yaml:

extraArgs:
  - --cluster-resource-namespace=kube-system
ingressShim:
  defaultIssuerName: "letsencrypt-production"
  defaultIssuerKind: "ClusterIssuer"

  • Install Prometheus in kube-system with this command: helm install prometheus stable/prometheus --namespace=kube-system -f values.yaml

Changed conf for values.yaml: all ingresses enabled (set to true), with TLS.

Anything else we need to know?:

Environment details:

  • Kubernetes version (e.g. v1.15.10-eks-bac369):
  • Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): EKS
  • cert-manager version (e.g. v0.4.0): v0.14.1
  • Install method (e.g. helm or static manifests): helm

/kind bug

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 19
  • Comments: 38 (3 by maintainers)

Most upvoted comments

This has now been resolved with the new --set installCRDs=true option that can be used when installing the Helm chart in the latest (v0.15.0-alpha.X) versions. We are still in the process of writing new installation docs to include info on this feature, but if you’d like to give it a go it should clear all this up 😄

same

Same issue. Installed to kube-system and can’t use kubectl describe or get.

Edit: I’ve manually changed the CRD YAML config as follows:

Download the file https://github.com/jetstack/cert-manager/releases/download/v0.14.1/cert-manager.crds.yaml and manually replace:

  1. Annotations

Replace all occurrences (there should be 6 of them):

cert-manager.io/inject-ca-from-secret: cert-manager/cert-manager-webhook-tls

With:

cert-manager.io/inject-ca-from-secret: kube-system/cert-manager-webhook-tls

  2. Namespace definitions

Replace all occurrences (there should be 6 of them):

namespace: cert-manager

With:

namespace: kube-system

Then run kubectl apply --validate=false -f cert-manager.crds.yaml (cert-manager.crds.yaml being the local file you just edited) instead of the command provided in the official docs.

After applying everything works as expected.
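The two manual replacements above could also be scripted. A minimal sketch, assuming the downloaded manifest contains the lines exactly as quoted in this comment (optionally wrapped in quotes); rewrite_crd_namespace and the output file name are hypothetical:

```shell
#!/bin/sh
# Rewrite the hardcoded "cert-manager" namespace in a downloaded CRD manifest.
# rewrite_crd_namespace is a hypothetical helper; it prints the patched
# manifest to stdout and leaves the input file untouched.
rewrite_crd_namespace() {
  crd_file="$1"
  target_ns="$2"
  sed \
    -e "s|\(inject-ca-from-secret: ['\"]*\)cert-manager/|\1${target_ns}/|" \
    -e "s|\(namespace: ['\"]*\)cert-manager\(['\"]*\)\$|\1${target_ns}\2|" \
    "$crd_file"
}

# Usage (left commented out so that sourcing this file has no side effects):
# rewrite_crd_namespace cert-manager.crds.yaml kube-system > cert-manager.crds.patched.yaml
# kubectl apply --validate=false -f cert-manager.crds.patched.yaml
```

The second expression is anchored to the end of the line so that names such as cert-manager-webhook are not touched. It is worth diffing the patched file against the original before applying it.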

For those struggling: it seems that if the Helm release name is anything other than cert-manager (for example, a generated name), the service is named after the release (something like service/cert-manager-1595315110-webhook). Since the service is expected to be named cert-manager-webhook, it can’t be found. This can be solved by using cert-manager as the release name when installing the chart (and probably cert-manager as the namespace too).

Anyway, that still didn’t solve the problem for me, so what I did was:

  • Removed the chart: helm uninstall <release_name>
  • Removed the custom resource definitions: kubectl delete -f https://github.com/jetstack/cert-manager/releases/download/v0.16.0-alpha.1/cert-manager.crds.yaml
  • Installed the chart again with: helm install jetstack/cert-manager --generate-name --set installCRDs=true
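The naming effect described in this comment can be sketched as follows. This assumes the chart follows the conventional Helm "fullname" helper (use the release name as-is if it already contains the chart name, otherwise append the chart name) and ignores the usual 63-character truncation; webhook_service_name is a hypothetical helper, not something the chart ships:

```shell
#!/bin/sh
# Predict the webhook Service name for a given Helm release name, assuming
# the conventional Helm "fullname" rule. Hypothetical helper for illustration.
webhook_service_name() {
  release="$1"
  chart="cert-manager"
  case "$release" in
    *"$chart"*) fullname="$release" ;;          # release already contains the chart name
    *)          fullname="$release-$chart" ;;   # otherwise the chart name is appended
  esac
  printf '%s-webhook\n' "$fullname"
}

webhook_service_name cert-manager-1595315110
# prints: cert-manager-1595315110-webhook
```

Only a release named exactly cert-manager yields cert-manager-webhook, which is the name the standalone CRD manifest expects.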

I have cert manager deployed to the “cert-manager” namespace and this issue still exists. I am unable to delete and describe resources.

Currently this problem occurs because the namespace is hardcoded in the chart. A workaround is to get the YAML and manually change all the namespaces to the one you want. It would be nice if the chart supported this.

What worked for me in the end was:

helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v0.15.2 \
  --set installCRDs=true

I believe --set installCRDs=true does the trick

The chart is actually okay, and does not hardcode the namespace. The issue here is that the CRD manifests also have to hardcode a namespace name, and the CRDs are not managed by the Helm chart and thus cannot be templated.

There are a number of places in the CRDs that make reference to the namespace cert-manager, including in spec.conversion as well as metadata.annotations - you’ll need to make sure to adjust all of these in order to have this work. Alternatively, just deploying into the cert-manager namespace is easiest 😅
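To see which lines of a downloaded CRD manifest carry a namespace reference before editing them, something like the following may help. It is a rough sketch: list_crd_namespace_refs is a hypothetical helper, and the pattern may also match schema properties that happen to be called namespace, so the output needs a manual review:

```shell
#!/bin/sh
# List (with line numbers) the lines of a CRD manifest that mention the CA
# injection annotation or a namespace field. Hypothetical helper for review only.
list_crd_namespace_refs() {
  grep -n -E "inject-ca-from-secret:|^[[:space:]]*namespace:" "$1"
}

# Usage:
# list_crd_namespace_refs cert-manager.crds.yaml
```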

We’re hoping to be able to improve this in future, but we require https://github.com/helm/helm/issues/7735 to be addressed before we can.

When using helm upgrade --install rather than helm install for the initial deployment, we’re seeing this error in any subsequent deployments:

Error: failed to create resource: Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority

AKS 1.16.10, cert-manager v0.15.2, helm/tiller v2.16.9

Try removing the secret cert-manager-webhook-ca; it will be regenerated. Then restart the cert-manager-webhook pod (probably not necessary).
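As commands, that suggestion might look like the sketch below. It assumes the release is named cert-manager, so that the Deployment is called cert-manager-webhook, and it uses kubectl rollout restart to recreate the pod. reset_webhook_ca and the KUBECTL override are hypothetical conveniences, the latter so the commands can be previewed with KUBECTL=echo:

```shell
#!/bin/sh
# Delete the webhook CA secret so it is regenerated, then restart the webhook.
# reset_webhook_ca is a hypothetical helper; the resource names assume a
# release called cert-manager. Set KUBECTL=echo to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

reset_webhook_ca() {
  ns="${1:-cert-manager}"
  "$KUBECTL" delete secret cert-manager-webhook-ca --namespace "$ns"
  "$KUBECTL" rollout restart deployment cert-manager-webhook --namespace "$ns"
}

# Usage:
# reset_webhook_ca cert-manager
```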

--set installCRDs=true worked out for me as well!

@gerrywastaken I think it’s not related to Pulumi, as I just ran into the same issue without Pulumi.

I had to downgrade to https://github.com/jetstack/cert-manager/releases/tag/v1.1.1 to make it work. There is also a comment regarding the installCRDs flag in the release notes.

Interestingly, I’m getting the same error despite having installed it in the cert-manager namespace.

I am having a similar problem using v0.14.2. I deploy cert-manager in the “cert-manager” namespace using Airship Armada for the helm install, which forces the use of “--name xyz”. I set ‘Values.webhook.serviceName = xyz-cert-manager-webhook’ to correct the DNS names in the certificate. I can create an issuer, but get, describe, etc. on the issuer return the following error:

Error from server: conversion webhook for cert-manager.io/v1alpha2, Kind=Issuer failed: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found

Why is the wrong webhook DNS name being used here? (The correct one is used when the issuer is created.)

I want to install cert-manager to namespace cert-manager-stage, namespace cert-manager-prod, and so on.

I think you should fix this quickly.

There should be only one cert-manager per cluster. You can, however, create multiple Issuers in different namespaces.

For other poor souls like me who have been surfing the Internet for several days and have NOT customized the cert-manager namespace, please try restarting your webhook pod 🤦‍♂️

In our case we were having the dreadful remote error: tls: bad certificate errors and they are gone after a restart.

We’re using an old version of cert-manager (0.8.0) which worked just fine for a long time until this. My guess is that one of the secrets it’s using was replaced but there is no logic to re-read them. I don’t know what it was, but after contemplating a re-install in production, finding it apparently solved like this was worth sharing with others.

Also be sure to update the --dynamic-serving-dns-names flag in the cert-manager-webhook deployment, which contains cert-manager as the namespace, if you are changing the namespace name.
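A small sketch of how the value of that flag could be rebuilt for another namespace. The three-name layout (service, service.namespace, service.namespace.svc) is an assumption about how the flag is populated, so compare the result with the value currently set on your deployment; webhook_dns_names is a hypothetical helper:

```shell
#!/bin/sh
# Build a comma-separated DNS name list for the webhook serving certificate.
# The layout is an assumption; check it against the existing flag value.
webhook_dns_names() {
  ns="$1"
  svc="${2:-cert-manager-webhook}"   # Service name, defaults to the standard one
  printf '%s,%s.%s,%s.%s.svc\n' "$svc" "$svc" "$ns" "$svc" "$ns"
}

webhook_dns_names kube-system
# prints: cert-manager-webhook,cert-manager-webhook.kube-system,cert-manager-webhook.kube-system.svc
```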

This problem may be caused by the CNI. After I changed the MTU of Calico, the problem was solved.

"mtu": 1440 -> "mtu": 1420

{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "log_level": "info",
      "log_file_path": "/var/log/calico/cni/cni.log",
      "datastore_type": "kubernetes",
      "nodename": "k3s-operator-1",
      "mtu": 1420,
      "ipam": {
          "type": "calico-ipam"
      },
      "policy": {
          "type": "k8s"
      },
      "kubernetes": {
          "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
      }
    },
    {
      "type": "portmap",
      "snat": true,
      "capabilities": {"portMappings": true}
    },
    {
      "type": "bandwidth",
      "capabilities": {"bandwidth": true}
    }
  ]
}

@boris-savic Hi there,

I have tried your solution. Applying it went OK, but now when I run kubectl get clusterissuers I receive:

Error from server: conversion webhook for cert-manager.io/v1alpha2, Kind=ClusterIssuer failed: Post https://cert-manager-webhook.kube-system.svc:443/convert?timeout=30s: x509: certificate signed by unknown authority

I’m wondering if that was also the case for you.

@boris-savic One cert-manager per cluster is a restriction and should be documented. And in any case, users should be able to choose their namespace easily.

@holdenkilbride the only workaround that I can see is to manually edit the CRD before applying it to match your desired namespace (ref: https://github.com/jetstack/cert-manager/issues/2752#issuecomment-605883456)