cert-manager: Regression: HTTP-01 challenges fail with Istio, Traefik, ingress-gce and Azure AGIC (kubernetes.io/ingress.class not set anymore in 1.5 and 1.6)
Short description: if you use Traefik, Istio, ingress-gce, Azure AGIC or NGINX Inc’s NGINX Ingress, upgrading from cert-manager 1.4 or below to cert-manager 1.5 or 1.6 will silently break the HTTP-01 solver.
- Which versions are affected? You will be affected if you are using Kubernetes 1.19 and above in conjunction with cert-manager 1.5.0, 1.5.1, 1.5.2, 1.5.3, 1.5.4, 1.6.0, 1.6.1, 1.6.2, and 1.6.3 and that you use an Issuer or ClusterIssuer with an HTTP-01 solver configured (
spec.acme.solvers[].http01
). - Why? Since cert-manager 1.5 (https://github.com/jetstack/cert-manager/pull/4225), the solver does not set the annotation
kubernetes.io/ingress.class
anymore; instead, it sets the fieldingressClassName
, introduced with Kubernetes 1.18 (both in Ingress v1 and v1beta1) which requires the presence of an IngressClass (depending on the ingress controller, see below table). - A workaround is to create an IngressClass resource, but that comes as a surprise to the user. This breakage is “invisible” to the user, making it hard to debug. But the workaround does not work for Azure’ AGIC since it was using annotations with a slash, which aren’t compatible with
ingressClassName
(https://github.com/jetstack/cert-manager/issues/4547).
Who is affected?
Anyone using cert-manager 1.5 or 1.6 as well as one of the ingress controllers affected:
Ingress controller | Is affected? |
---|---|
Istio | yes |
Traefik | yes |
Contour | no |
Gloo Edge | no |
ingress-nginx | no |
NGINX Ingress (NGINX Inc) | yes but mitigated by the fact the Helm charts creates an IngressClass nginx for you |
Azure AGIC | yes |
ingress-gce | yes |
Notes:
- Istio and Traefik did not change the behaviour of the annotation ever, which means an IngressClass is never mandatory when using the annotation. On the other side, the
ingressClassName
requires you to have an IngressClass present on the cluster. This is why upgrading to 1.5 or 1.6 will break people’s HTTP-01 ACME solver. - Contour uses the “default controller” behaviour. The “default controller” is a controller which picks up Ingress resources that do not have the annotation
kubernetes.io/ingress.class
. With 1.18, Contour added--ingress-class-name
(disabled by default) which allows to set a different string to be matched with the annotation or ingressClassName on the Ingress. With this flag, the controller changes from “default controller” to “only matching class” behaviour and ignores any ingress that does not have the annotation or the ingressClassName. Contour does not mind whether the IngressClass resource exists, and only matches ingresses with the same ingressClassName or annotation. - Gloo uses the “default controller” behaviour. With 1.18, Gloo added
REQUIRE_INGRESS_CLASS=true
which turns on the “only matching class” behaviour. Gloo does not mind whether the IngressClass resource exists. - ingress-nginx uses the “default controller” behaviour. ingress-nginx changed the behaviour of the annotation in their v1.0.0: before v1.0.0, the annotation did not require an IngressClass, but the ingressClassName field did. With v1.0.0, both the annotation and ingressClassName require an IngressClass to exist. The present issue would not appear anyways since the ingress-nginx Helm chart creates an IngressClass for you by default.
- F5’s NGINX Ingress are the worst compatibility-wise: they broke people between two minor versions of their controller. They changed the behaviour of the annotation when Kubernetes 1.18 is detected. When 1.17 and below is detected (ingressClassName did not exist yet), the annotation does not require an IngressClass to exist. When 1.18 and above is detected, both the annotation and the ingressClassName require an IngressClass to exist. The present issue is mitigated by the fact the F5 NGINX Ingress Helm chart creates an IngressClass for you by default.
- Azure AGIC did not change the behaviour associated with the annotation. It does not support the
ingressClassName
field. Another issue with AGIC is that its default class name isazure/application-gateway
, which is fine for the annotation but not fine for ingressClassName (#4547). - ingress-gce does not support the
ingressClassName
field.
Example with Istio:
This issue stems from ~a thread~ (the Slack was discontinued, the link is dead) on the OpenFaaS Slack. A special thanks to @alexellis who was very helpful in finding out this issue.
In 1.5, the HTTP-01 solver behaviour changed with regard to the ingress.class annotation. Let us use the following ACME issuer:
apiVersion: cert-manager.io/v1
kind: Issuer
spec:
acme:
solvers:
- http01:
ingress:
class: istio
Before 1.5, cert-manager was creating a temporary Ingress resource that had the annotation kubernetes.io/ingress.class
:
apiVersion: networking.k8s.io/v1beta # ✨
kind: Ingress
metadata:
name: cm-acme-http-solver-qq8hf
annotations:
kubernetes.io/ingress.class: istio
spec:
rules:
- host: test.dev.mydomain.net
http:
paths:
- path: /.well-known/acme-challenge/5rVG6s5v4WJ_3WifvTzZtstttojFPRb76IwFMOHez9s
backend:
serviceName: cm-acme-http-solver-q4s47
servicePort: 8089
After 1.5, with the support of Ingress v1, the temporary Ingress “lost” its annotation:
apiVersion: networking.k8s.io/v1 # ✨
kind: Ingress
metadata:
name: cm-acme-http-solver-qq8hf
spec:
ingressClassName: istio
rules:
- host: test.dev.mydomain.net
http:
paths:
- path: /.well-known/acme-challenge/5rVG6s5v4WJ_3WifvTzZtstttojFPRb76IwFMOHez9s
pathType: ImplementationSpecific
backend:
service
name: cm-acme-http-solver-q4s47
port:
number: 8089
As detailed in the Istio page Configuring ingress using an Ingress resource, Istio has two different ways of operating:
- The standalone annotation mode: when the annotation
kubernetes.io/ingress.class
is present, it knows that Istio has to take care of this Ingress. - The IngressClass mode: when the field
ingressClassName
is used, Istio expects an IngressClass to be present with the nameistio
. If it does not exist, nothing happens.
With cert-manager 1.5, cert-manager switched from the standalone annotation mode to the IngressClass mode, breaking anyone using Istio’s Ingress controller (on clusters after Kubernetes 1.18 since that only happens when Ingress v1 is available).
I think we made a wrong assumption when we wrote the v1 internal conversion here. I think the annotation, although “deprecated”, doesn’t have the same behaviour as the ingressClassName
field:
% kubectl explain ingress.spec.ingressClassName
KIND: Ingress
VERSION: networking.k8s.io/v1
FIELD: ingressClassName <string>
DESCRIPTION:
IngressClassName is the name of the IngressClass cluster resource. The
associated IngressClass defines which controller will implement the
resource. This replaces the deprecated `kubernetes.io/ingress.class`
annotation. For backwards compatibility, when that annotation is set, it
must be given precedence over this field. The controller may emit a warning
if the field and annotation have different values. Implementations of this
API should ignore Ingresses without a class specified. An IngressClass
resource may be marked as default, which can be used to set a default value
for this field. For more information, refer to the IngressClass
documentation.
I think that the two modes (using “deprecated” standalone annotation and using the IngressClass) serve different purposes, and cert-manager should not switch from the standalone annotation mode to the IngressClass mode.
Related:
- Use of
ingressClassName
with the NGINX Ingress controller - Use of
ingressClassName
with Istio - Use of
ingressClassName
with Traefik - Use of
ingressClassName
with Ambassador - https://github.com/jetstack/cert-manager/issues/4547
- https://github.com/cert-manager/website/issues/709
- https://github.com/jetstack/cert-manager/issues/4562
- https://github.com/jetstack/cert-manager/pull/4470
Slack threads:
- Is the deprecated annotation going to be removed within the lifespan of v1 Ingress?
- Imprecision with the sentence “without a class specified” in the
ingressClassName
documentation
/kind bug /area acme/http01
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 5
- Comments: 26 (17 by maintainers)
Commits related to this issue
- Always use the kubernetes.io/ingress.class annotation (#4537) Signed-off-by: Jake Sanders <i@am.so-aweso.me> — committed to jakexks/cert-manager by jakexks 2 years ago
- Always use the kubernetes.io/ingress.class annotation (#4537) Signed-off-by: Jake Sanders <i@am.so-aweso.me> — committed to jakexks/cert-manager by jakexks 2 years ago
- Always use the kubernetes.io/ingress.class annotation (#4537) Signed-off-by: Jake Sanders <i@am.so-aweso.me> — committed to jakexks/cert-manager by jakexks 2 years ago
- Merge pull request #4762 from jakexks/use-only-ingress-annotation Always use the kubernetes.io/ingress.class annotation (#4537) — committed to cert-manager/cert-manager by jetstack-bot 2 years ago
- Revert "Always use the kubernetes.io/ingress.class annotation (#4537)" — committed to cert-manager/cert-manager by maelvls 2 years ago
- Always use the kubernetes.io/ingress.class annotation (#4537) Signed-off-by: Jake Sanders <i@am.so-aweso.me> — committed to jetstack-bot/cert-manager by jakexks 2 years ago
- Merge pull request #4785 from jetstack-bot/cherry-pick-4762-to-release-1.6 [release-1.6] Always use the kubernetes.io/ingress.class annotation (#4537) — committed to cert-manager/cert-manager by jetstack-bot 2 years ago
- Bump cert-manager version (#9) * Bump cert-manager version Regression in cert-manager v1.5.4 has caused various solvers to break, fixed in v1.7.0. Details here: https://github.com/cert-manager/cer... — committed to terraform-iaac/terraform-kubernetes-cert-manager by ajostergaard 2 years ago
- decrease version => fix breaking change cf. https://github.com/jetstack/cert-manager/issues/4537 — committed to kube-components-stack/helm-charts by MalibuKoKo 2 years ago
- decrease cert-manager version => fix breaking change cf. https://github.com/jetstack/cert-manager/issues/4537 — committed to kube-components-stack/helm-charts by MalibuKoKo 2 years ago
The plan for today:
release-1.5
(https://github.com/jetstack/cert-manager/pull/4783)release-1.6
(https://github.com/jetstack/cert-manager/pull/4785)That’s my current thinking, yep… on the premise that:
Therefore for maximum compatibility, it’d appear setting the annotation is the more broadly supported option.
I do also think some of the wording in the Kubernetes API types should be revised however - the annotation is denoted as deprecated, but there’s not actual any defined deprecation path since the annotation behaviour does exist in the v1 API (and I think it’s all been a little bit mixed up with the deprecation of the v1beta1 schema).
Plan:
ingress.class
on a v1 Ingress with thecert-manager.io/issuer
annotation~class
field behavior (Kubernetes 1.18 and before versus and 1.19 and above)~Update 15 January 2022: we have determined that this is a regression (which we classify as critical bug on the Supported releases page). The plan is to revert to the pre-1.5 behavior and backport the fix to cert-manager 1.5 and cert-manager 1.6.
Hm, taking a look at the v1.5 release, it appears that this doesn’t apply to v1.5.0-v1.5.3, but v1.5.4 is where we actually changed this behaviour: https://github.com/jetstack/cert-manager/blob/v1.5.4/pkg/issuer/acme/http/ingress.go#L154
This is the PR that backported it: https://github.com/jetstack/cert-manager/pull/4472
This is quite surprising given it’s quite a significant change to be backported - we’ve also broken our patch release guarantee too (some users on older v1.5 versions may upgrade to v1.5.4 to receive security/critical fixes, which will unexpectedly break renewal) 😬
I think we should revert this cherry-pick and release a new v1.5 release, and advise users to not upgrade to v1.5.4 if they make use of the ACME issuer.
I just noticed the annotation is present in Ingress v1beta1 but not in Ingress v1. I don’t understand why it was removed. I thought “deprecated” meant “to be removed later”, so the annotation should still be present on the Ingress v1.
Just looked at two of the most popular implementations to see if they have a version that supports kube 1.19, but does not look at the
ingressClassName
field. I realize that it may not be feasible to do it for allI do wonder whether any of the controller implementation may stop looking at the annotation, see i.e https://github.com/kubernetes/ingress-nginx/issues/8103
There’s some good ideas about how we can roll forward to use the new field, but I’m conscious we’ve effectively broken users and we should really roll back before attempting to go forward again…
Also just digging into https://github.com/jetstack/cert-manager/blob/7f9fceef907d9ba15e700cd8836000f78548e907/internal/ingress/convert.go - I didn’t realise we’re doing this, but we need to also be certain we keep these conversions up to date as new fields are added to Ingress resources when we upgrade our k8s dependencies, else we’ll potentially even remove fields from user ingresses (if they are using
edit-in-place
).You can actually already see an example of this that the generator picked up on itself here I think: https://github.com/jetstack/cert-manager/blob/7f9fceef907d9ba15e700cd8836000f78548e907/internal/ingress/convert.go#L386-L387