cert-manager: Multi-domain wildcard certificate validation fails when using DNS delegation
Describe the bug:
I have two domains (contoso.com and example.com) and would like to manage DNS01 challenges using a single DNS zone (acme.example.com, see https://cert-manager.io/docs/configuration/acme/dns01/#delegated-domains-for-dns01). In the example.com and contoso.com zones I have created CNAME records that point to _acme-challenge.acme.example.com (I am interested only in wildcard certificates for those domains, like *.example.com). If I request a certificate for only one CN, for example *.example.com or *.contoso.com, everything works fine: the _acme-challenge TXT record is created in the acme.example.com zone and the challenge is solved. However, if I request a single certificate that includes both CNs at the same time (*.example.com and *.contoso.com), only one challenge is solved while the second one remains stuck with this status: Waiting for DNS-01 challenge propagation: DNS record for "example.com" not yet propagated
I am using Route53 DNS, and there you cannot have two TXT records with the same name - instead you have one _acme-challenge record which may have multiple values. It seems that there is a race condition during the challenge:
- The TXT record with the first challenge value is provisioned
- Once the first challenge is solved, the TXT record is updated by appending the value for the second challenge
- However, the whole TXT record is then deleted immediately, without waiting for the second challenge to be solved, instead of removing only the value for the first challenge and leaving the value for the second one
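To make the suspected sequence concrete, here is a minimal in-memory sketch of the three steps above. It is not cert-manager code and does not call the Route53 API; the `zone` dict, the function names, and the token strings are all hypothetical stand-ins chosen for illustration.

```python
# Hypothetical in-memory stand-in for the delegated zone:
# record name -> list of TXT values (one record set, many values).
zone = {}

RECORD = "_acme-challenge.acme.example.com"


def present(zone, name, value):
    """Add a challenge value to the TXT record set, creating it if needed."""
    values = zone.setdefault(name, [])
    if value not in values:
        values.append(value)


def cleanup_whole_record(zone, name):
    """Cleanup as the report suspects it behaves: drop the entire record set."""
    zone.pop(name, None)


# Step 1: the first challenge value is provisioned.
present(zone, RECORD, "token-for-example.com")
# Step 2: the value for the second challenge is appended to the same record set.
present(zone, RECORD, "token-for-contoso.com")
# Step 3: cleanup for the first challenge removes the whole record set.
cleanup_whole_record(zone, RECORD)

# The second challenge's value is gone before it could be validated.
second_still_published = "token-for-contoso.com" in zone.get(RECORD, [])
print(second_still_published)  # False
```

If the real solver behaves like `cleanup_whole_record`, the second challenge would never see its value propagate, which would match the status message above.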
Expected behaviour:
The certificate should be provisioned and both challenges should be solved. I expect a single _acme-challenge TXT record to exist with multiple values assigned. Upon solving one challenge, only that one value should be removed, instead of deleting the whole TXT record. The TXT record should be deleted only if there are no other challenges pending.
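The expected cleanup could be sketched roughly as follows. This is only an illustration under assumptions: the dict-based `zone`, the function name `cleanup_single_value`, and the token strings are hypothetical, and a real fix would have to express the same logic through the DNS provider's API.

```python
def cleanup_single_value(zone, name, value):
    """Remove one challenge value from a TXT record set.

    The record set itself is deleted only when no values remain,
    i.e. when no other challenge is still pending on that name.
    """
    values = zone.get(name)
    if values is None:
        return  # nothing to clean up
    remaining = [v for v in values if v != value]
    if remaining:
        zone[name] = remaining
    else:
        del zone[name]


# Hypothetical record set holding values for two pending challenges.
record = "_acme-challenge.acme.example.com"
zone = {record: ["token-for-example.com", "token-for-contoso.com"]}

# First challenge solved: only its value is removed.
cleanup_single_value(zone, record, "token-for-example.com")
after_first = list(zone.get(record, []))

# Second challenge solved: the record set is now empty and gets deleted.
cleanup_single_value(zone, record, "token-for-contoso.com")
record_exists_at_end = record in zone
```

After the first call the record still carries the second challenge's value; only the second call removes the record set entirely.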
Steps to reproduce the bug:
- All DNS zones are in Route53
- example.com zone delegates acme challenges to acme.example.com zone
- contoso.com zone delegates acme challenges to acme.example.com zone
- ClusterIssuer:
  apiVersion: cert-manager.io/v1
  kind: ClusterIssuer
  metadata:
    name: le-issuer
  spec:
    acme:
      email: john.doe@gmail.com
      server: https://acme-v02.api.letsencrypt.org/directory
      privateKeySecretRef:
        name: letsencrypt-account-key
      solvers:
      - dns01:
          cnameStrategy: Follow
          route53:
            region: *******************
            accessKeyID: *******************
            hostedZoneID: *******************
            secretAccessKeySecretRef:
              name: letsencrypt-aws
              key: secret-access-key
        selector:
          dnsNames:
          - '*.example.com'
          - '*.contoso.com'
- Certificate:
  apiVersion: cert-manager.io/v1
  kind: Certificate
  metadata:
    name: wildcard
  spec:
    dnsNames:
    - '*.example.com'
    - '*.contoso.com'
    issuerRef:
      group: cert-manager.io
      kind: ClusterIssuer
      name: le-issuer
    secretName: wildcard
Anything else we need to know?:
Slack discussion: https://kubernetes.slack.com/archives/C4NV3DWUC/p1605512221087200
Related code: https://github.com/jetstack/cert-manager/blob/master/pkg/issuer/acme/dns/route53/route53.go#L173-L175
Environment details:
- Kubernetes version: v1.16.13
- Cloud-provider/provisioner: AWS
- cert-manager version: 1.0.4
- Install method: helm
/kind bug
About this issue
- State: closed
- Created 4 years ago
- Reactions: 11
- Comments: 16 (5 by maintainers)
I think I might be facing this exact same issue.
However I see minor differences:
Another similar report is https://github.com/jetstack/cert-manager/issues/3608
/remove-lifecycle stale
Running into the same problem (when delegating with multiple CNAME records pointing to the same DNS name) with GCP's CloudDNS, so it seems to affect not only AWS's Route53.
I would guess the cleanup at https://github.com/jetstack/cert-manager/blob/master/pkg/issuer/acme/dns/clouddns/clouddns.go#L162-L170 is the problem, but I'm not very well versed in Go or the cert-manager code base.