cert-manager: Verifying Install: "failed calling admission webhook" (Azure, GKE private cluster)

Describe the bug: Upon re-installing cert-manager and trying to verify the install, the admission API is failing, as shown in the following describe output:

kubectl describe APIService v1beta1.admission.certmanager.k8s.io
Name:         v1beta1.admission.certmanager.k8s.io
Namespace:
Labels:       app=webhook
              chart=webhook-v0.6.4
              heritage=Tiller
              release=cert-manager
Annotations:  <none>
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2019-03-01T10:08:13Z
  Resource Version:    13956808
  Self Link:           /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.admission.certmanager.k8s.io
  UID:                 ecc47923-3c09-11e9-bae6-6e4899a3d5f0
Spec:
  Ca Bundle:               LS0tLS1<removed for brevity>LS0tCg==
  Group:                   admission.certmanager.k8s.io
  Group Priority Minimum:  1000
  Service:
    Name:            cert-manager-webhook
    Namespace:       cert-manager
  Version:           v1beta1
  Version Priority:  15
Status:
  Conditions:
    Last Transition Time:  2019-03-01T10:08:13Z
    Message:               no response from https://10.0.233.160:443: Get https://10.0.233.160:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

This manifests when trying to apply the test-resources.yaml for verifying the install, with the following output:

kubectl apply -f test-resources.yaml
namespace "cert-manager-test" created
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling admission webhook "issuers.admission.certmanager.k8s.io": the server is currently unable to handle the request
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling admission webhook "certificates.admission.certmanager.k8s.io": the server is currently unable to handle the request
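
For reference, test-resources.yaml is the verification manifest from the docs; it looks roughly like this (reconstructed from the v0.6-era documentation, so treat the exact fields as approximate):

apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-test
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager-test
spec:
  selfSigned: {}
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: cert-manager-test
spec:
  commonName: example.com
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned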

Expected behaviour: The test resources should be created successfully, with no errors.

Steps to reproduce the bug:

Note: I have removed all other items from my cluster and, following the install of the CRDs, created the namespace, labelled the namespace, then tried the install via Helm using the following commands:

kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/v0.6.2/deploy/manifests/00-crds.yaml
kubectl create namespace cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
helm install --name cert-manager --namespace cert-manager --version v0.6.6 stable/cert-manager
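
For anyone hitting this, a quick sanity check is whether the webhook pod is running, whether its service has endpoints, and what the APIService reports; roughly the following (resource names assume the chart defaults):

kubectl -n cert-manager get pods
kubectl -n cert-manager get svc,endpoints cert-manager-webhook
kubectl describe apiservice v1beta1.admission.certmanager.k8s.io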

Anything else we need to know?: I have previously installed cert-manager successfully on this cluster. I was then trying to get nginx-ingress working but got into a bit of a mess, so I deleted all resources created (via Helm) and tidied up any orphaned objects so I could start from scratch. However, I’m now running into this issue.

The only similar issue I’ve seen is https://github.com/helm/charts/issues/10869, but I’m unsure what its resolution was.

All other objects appear to have been created and started successfully. Having gone through the logs for the different pods, I haven’t been able to find any other error messages.

Environment details:

  • Kubernetes version (e.g. v1.10.2): v1.11.3
  • Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): Azure
  • cert-manager version (e.g. v0.4.0): 0.6.6
  • Install method (e.g. helm or static manifests): Helm

/kind bug

Most upvoted comments

I’m also experiencing all of the issues listed in this thread.

Commands that I ran:

kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.8/deploy/manifests/00-crds.yaml
kubectl create namespace cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
helm install --name cert-manager --namespace cert-manager --version v0.8.1 jetstack/cert-manager

Output from the kube-apiserver:

I0624 17:14:56.867048       1 controller.go:608] quota admission added evaluator for: certificates.certmanager.k8s.io
I0624 17:14:56.900181       1 controller.go:608] quota admission added evaluator for: issuers.certmanager.k8s.io
I0624 17:14:59.674043       1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.admission.certmanager.k8s.io
E0624 17:15:00.493680       1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[X-Content-Type-Options:[nosniff] Content-Type:[text/plain; charset=utf-8]]
I0624 17:15:00.493691       1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.
I0624 17:15:06.565081       1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.admission.certmanager.k8s.io
E0624 17:15:06.565268       1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0624 17:15:06.565291       1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.
I0624 17:15:10.182483       1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.admission.certmanager.k8s.io
E0624 17:15:10.199673       1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: OpenAPI spec does not exists
I0624 17:15:10.199697       1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.

Output from k get apiservices.apiregistration.k8s.io shows the following:

NAME                                   SERVICE                             AVAILABLE   AGE
v1alpha1.certmanager.k8s.io            Local                               True        9m
v1beta1.admission.certmanager.k8s.io   cert-manager/cert-manager-webhook   True        8m
v1beta1.certificates.k8s.io            Local                               True        16m

This was performed on a brand new, fresh k8s cluster running on Ubuntu on bare metal using RKE to set up the cluster.

Kubernetes version: 1.13.5
Helm version: 2.13.0
cert-manager version: 0.8.1

We ended up having to punt on cert-manager because of this issue. We are going to deploy a self-signed cert for the Nginx ingress for now and reevaluate when cert-manager resolves these issues.

I am facing the exact same issue with version 0.9.1 as well. Any update on this issue?

I’m also getting this issue; the webhook pod never comes up:

MountVolume.SetUp failed for volume "certs" : secrets "cert-manager-webhook-webhook-tls" not found

The secret doesn’t exist.
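
One thing worth checking here is whether the secret was ever created and what the CA injector is logging; something like this (the deployment and secret names vary between chart versions):

kubectl -n cert-manager get secret cert-manager-webhook-webhook-tls
kubectl -n cert-manager logs deploy/cert-manager-cainjector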

I’m hitting the same problem on a GKE private cluster. I’ve attempted to allow maximal access on port 6443, but I’m hitting the same issue (the test fails with "failed calling admission webhook") and I get the same error from kubectl describe APIService v1beta1.admission.certmanager.k8s.io:

Status:
  Conditions:
    Last Transition Time:  2019-03-07T20:29:58Z
    Message:               no response from https://10.149.2.15:6443: Get https://10.149.2.15:6443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available

I’ve given up on getting the webhook to work for now and am sticking with cert-manager-no-webhook.yaml, but I’d love a resolution to this issue.

We’re hitting the same issue in GKE. Is there any staff follow-up to this issue?

Guys, the biggest problem is that you are installing it without the webhook enabled (webhook.enabled=false). If you do that, you cannot use ClusterIssuers, because the kube APIService is not there.

So what I did is:

kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.9/deploy/manifests/00-crds.yaml
kubectl create ns cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation="true"
helm install cert-manager jetstack/cert-manager --namespace cert-manager --version v0.9.0 --set ingressShim.defaultIssuerName=letsencrypt --set ingressShim.defaultIssuerKind=ClusterIssuer

After that, creating a ClusterIssuer works and I can see that certificates are created automatically.
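
For reference, the ClusterIssuer for that setup looks roughly like this (this uses the old v1alpha1 ACME format of that era; the email, secret name, and solver are placeholders):

apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com
    privateKeySecretRef:
      name: letsencrypt-account-key
    http01: {}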

I had the same issue, and I am a bit afraid it is due to the network policies I have in place.

Here is my workflow to migrate from the install with the webhook to the one without - NO WARRANTY!

kubectl get -o yaml \
   --all-namespaces \
   issuer,clusterissuer,certificates,orders,challenges > cert-manager-backup.yaml

# Delete old stuff ! - WATCH OUT YOU DELETE THE NAMESPACE AND ALL YOUR CUSTOM SECRETS E.G. FOR YOUR CLUSTER ISSUER
kubectl delete -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.7/deploy/manifests/cert-manager.yaml

kubectl create -f your-custom-secrets-in-the-cert-manager-namespace-e-g-aws-creds.yaml

# Now deploy new setup
curl https://raw.githubusercontent.com/jetstack/cert-manager/release-0.7/deploy/manifests/cert-manager-no-webhook.yaml  > cert-manager-no-webhook.yml
kubectl apply -f cert-manager-no-webhook.yml

# Recreate missing things through backup
kubectl create -f cert-manager-backup.yaml
This worked for me.

I’m also getting this on bare metal, and I’m scratching my head as to what to do about it. In case the details are useful:

I have a functioning set of pods:

NAME                                       READY   STATUS    RESTARTS   AGE     IP            NODE                 NOMINATED NODE   READINESS GATES
cert-manager-68cfd787b6-h2bz6              1/1     Running   0          13h     10.42.2.115   node-int-worker-01   <none>           <none>
cert-manager-cainjector-5975fd64c5-6gm98   1/1     Running   0          13h     10.42.2.114   node-int-worker-01   <none>           <none>
cert-manager-webhook-5c7f95fd44-84cz4      1/1     Running   0          2m26s   10.42.2.117   node-int-worker-01   <none>           <none>

But when I try to apply my Issuer YAML:

apiVersion: certmanager.k8s.io/v1alpha1
kind: Issuer
metadata:
  name: letsencrypt-staging
  namespace: default
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: my@email.com

    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-staging

    # ACME DNS-01 provider configurations
    dns01:

      # Here we define a list of DNS-01 providers that can solve DNS challenges
      providers:

        - name: cloudflare-dns
          cloudflare:
            email: my@email.com
            # A secretKeyRef to a cloudflare api key
            apiKeySecretRef:
              name: cloudflare-api-key
              key: api-key.txt

I get:

Error from server (InternalError): error when creating "/tmp/tmp7_0rbu2s/lets-encrypt-issuer.yaml": Internal error occurred: failed calling webhook "issuers.admission.certmanager.k8s.io":
the server is currently unable to handle the request

I’d love pointers on how to debug further. This is as far as I’ve gotten:

Somewhere along my Google searching I came across "kubectl get apiservice", which let me see the following:

NAME                                   SERVICE                             AVAILABLE                      AGE
v1.                                    Local                               True                           2d
v1.apps                                Local                               True                           2d
v1.authentication.k8s.io               Local                               True                           2d
v1.authorization.k8s.io                Local                               True                           2d
v1.autoscaling                         Local                               True                           2d
v1.batch                               Local                               True                           2d
v1.crd.projectcalico.org               Local                               True                           2d
v1.monitoring.coreos.com               Local                               True                           14h
v1.networking.k8s.io                   Local                               True                           2d
v1.rbac.authorization.k8s.io           Local                               True                           2d
v1.storage.k8s.io                      Local                               True                           2d
v1alpha1.certmanager.k8s.io            Local                               True                           13h
v1beta1.admission.certmanager.k8s.io   cert-manager/cert-manager-webhook   False (FailedDiscoveryCheck)   13h
v1beta1.admissionregistration.k8s.io   Local                               True                           2d
v1beta1.apiextensions.k8s.io           Local                               True                           2d
v1beta1.apps                           Local                               True                           2d
v1beta1.authentication.k8s.io          Local                               True                           2d
v1beta1.authorization.k8s.io           Local                               True                           2d
v1beta1.batch                          Local                               True                           2d
v1beta1.certificates.k8s.io            Local                               True                           2d
v1beta1.coordination.k8s.io            Local                               True                           2d
v1beta1.events.k8s.io                  Local                               True                           2d
v1beta1.extensions                     Local                               True                           2d
v1beta1.metrics.k8s.io                 kube-system/metrics-server          False (FailedDiscoveryCheck)   2d
v1beta1.policy                         Local                               True                           2d
v1beta1.rbac.authorization.k8s.io      Local                               True                           2d
v1beta1.scheduling.k8s.io              Local                               True                           2d
v1beta1.storage.k8s.io                 Local                               True                           2d
v1beta2.apps                           Local                               True                           2d
v2beta1.autoscaling                    Local                               True                           2d
v2beta2.autoscaling                    Local                               True                           2d
v3.cluster.cattle.io                   Local                               True                           2d

Notably, the “v1beta1.admission.certmanager.k8s.io” service seems to be failing its availability checks. Looking into it, I see:

Name:         v1beta1.admission.certmanager.k8s.io
Namespace:
Labels:       app=webhook
              chart=webhook-v0.8.1
              heritage=Tiller
              release=cert-manager
Annotations:  certmanager.k8s.io/inject-ca-from: cert-manager/cert-manager-webhook-webhook-tls
              kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{"certmanager.k8s.io/inject-ca-from":"cert-ma...
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2019-06-22T07:10:56Z
  Resource Version:    108892
  Self Link:           /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.admission.certmanager.k8s.io
  UID:                 e10e43e3-94bc-11e9-a957-0244a03303e1
Spec:
  Ca Bundle:               long-ca-string-here
  Group:                   admission.certmanager.k8s.io
  Group Priority Minimum:  1000
  Service:
    Name:            cert-manager-webhook
    Namespace:       cert-manager
  Version:           v1beta1
  Version Priority:  15
Status:
  Conditions:
    Last Transition Time:  2019-06-22T07:10:56Z
    Message:               no response from https://10.43.216.179:443: Get https://10.43.216.179:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while
awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

The message about being unable to connect to https://10.43.216.179 looks suspicious, so I look into my services:

NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
cert-manager-webhook   ClusterIP   10.43.216.179   <none>        443/TCP   13h

And they seem fine? Describing the svc shows selectors that match the pod itself, so all of that seems to be running?

I’m not sure if it’s a connection issue; I’m also unsure which node the API service is running on, or how to debug connectivity issues from there.
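
One generic way to probe this from inside the cluster (a sketch; the IP is the webhook service's ClusterIP from above) is to confirm the service has endpoints and then curl it from a throwaway pod:

kubectl -n cert-manager get endpoints cert-manager-webhook
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl -k -m 5 https://10.43.216.179:443/

Any TLS response at all would mean the pod network path works; a timeout like the one in the APIService status would point at the path between the apiserver and the service network, which can differ from the pod-to-pod path.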

If it helps, this whole cluster is a bunch of VMs brought up by Vagrant. The Vagrantfile looks like this:

# -*- mode: ruby -*-
# vim: ft=ruby ts=2 sw=2 sts=2 noexpandtab


Vagrant.configure("2") do |config|

  ["int","ext"].each_with_index do |cluster, index_cluster|
    ["control", "etcd", "worker"].each_with_index do |role, index_role|
      (1..1).each_with_index do |num, index_num|
        box_name = "node-#{cluster}-#{role}-#{num.to_s.rjust(2,'0')}"

        config.vm.define box_name do |box|
          l_mac_address="0E000000#{index_cluster}#{index_role}#{index_num}1"

          box.vm.box = "ubuntu/bionic64"
          box.vm.hostname = box_name
          box.disksize.size = '20GB'
          box.vm.network "public_network",
            use_dhcp_assigned_default_route: true,
            bridge: "eno2",
            mac: l_mac_address

          box.vm.provider "virtualbox" do |vb|
            vb.name = box_name

            if "worker" == role then
              vb.cpus = "8"
              vb.memory = "8192"
            else
              vb.cpus = "2"
              vb.memory = "4096"
            end

          end

          box.vm.provision :shell, :path => "bootstrap.sh"

          box.vm.provision "ansible" do |ansible|
            ansible.playbook = "playbook.yml"
            ansible.compatibility_mode = "2.0"
          end
        end
      end
    end
  end
end

Apologies for the deluge of information, but I’m hoping someone else has run into this.

Ditto, on GKE, freshly minted cluster.

I am seeing this same issue with v0.7 as well, installing via the static manifests.

After running the following command it worked, so it looks like an access-role issue:

kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous

Please use this command for test purposes only, as it grants anyone access to perform any action on the cluster. THIS IS NOT A FIX.

We were having the same issue using the Flux Helm operator. To share some insight from the past: upgrading Helm charts across “major” updates/version bumps never really worked for us. Usually we just delete (--purge) the release before making that kind of bigger leap.

So apparently one of our clusters got rid of the v1beta1.admission.certmanager.k8s.io APIService by itself when the Helm release was deleted. The other ones got stuck with the aforementioned “failed calling admission webhook”.

Coming from v0.6.x, it seems that v0.10 now has v1beta1.webhook.certmanager.k8s.io instead of v1beta1.admission.certmanager.k8s.io.

TL;DR: I tried deleting the Helm release multiple times, but the “old” APIService didn’t get removed. So I cleaned up with kubectl delete apiservice v1beta1.admission.certmanager.k8s.io. Everything’s gucci.
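
The cleanup amounts to something like this (list first to see what's actually left over):

kubectl get apiservice | grep certmanager
kubectl delete apiservice v1beta1.admission.certmanager.k8s.io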

@woodwardmatt using cert-manager without the webhook actually works fine, just don’t submit invalid resources! Curious what you’re using instead of cert-manager to get SSL certs on k8s. Did you go back to buying them and copying them into k8s secrets by hand?

kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous

I’m not a k8s expert, but this command looks like it grants cluster-admin permissions to the anonymous system account, which doesn’t sound like a good idea to me. @sasiedu maybe add a disclaimer to your post?

@igor47 sorry, I didn’t add a disclaimer. I was suggesting what might be the problem; I didn’t ask anyone to use the command, and I haven’t figured out the actual permissions needed. Thanks for alerting me.

It also worked for me when I allowed port 6443 in a firewall rule for my private GKE cluster. The way to troubleshoot: install the custom resources, install cert-manager following the standard steps, then check for v1beta1.admission.certmanager.k8s.io cert-manager/cert-manager-webhook False (FailedDiscoveryCheck) using kubectl get apiservice.

Then describe the APIService to find out which port is blocked.

But can anyone tell me how safe it is to expose port 6443 on a private GKE cluster? 😃
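
For anyone wanting to replicate the firewall change described above, it is along these lines (a sketch; the network, node tag, and master CIDR are placeholders for your cluster's values):

gcloud compute firewall-rules create gke-master-to-webhook \
    --network <your-network> \
    --direction INGRESS \
    --action ALLOW \
    --rules tcp:6443 \
    --source-ranges <master-ipv4-cidr> \
    --target-tags <gke-node-tag>

Scoping --source-ranges to the master's CIDR rather than 0.0.0.0/0 means the port is only reachable from the control plane, which addresses the exposure concern above.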

I came here because I got mail marked ACTION REQUIRED: Let's Encrypt will only support cert-manager instances at 0.8.0 (the current jetstack/cert-manager version) and onwards in a couple of weeks. The air is getting thin. I'm experiencing the same issues on 0.9.1.

    kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.9/deploy/manifests/00-crds.yaml
    kubectl label namespace cert-manager certmanager.k8s.io/disable-validation="true"
    helm upgrade -i \
        --namespace cert-manager \
        --set ingressShim.defaultIssuerName=letsencrypt \
        --set ingressShim.defaultIssuerKind=ClusterIssuer \
        --set webhook.enabled=false \
        cert-manager \
        jetstack/cert-manager

That just doesn’t work. I thought that was a stable release.

I had a leftover ValidatingWebhookConfiguration that was not deleted; because of this there was an error. I do not use the cert-manager webhook.
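
Finding and removing a stale webhook configuration looks something like this (the exact resource name varies by chart version, so list before deleting):

kubectl get validatingwebhookconfigurations
kubectl delete validatingwebhookconfiguration cert-manager-webhook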