cert-manager: Wrong SOA record while updating delegated _acme-challenge zone.
Describe the bug:
We have delegated our _acme-challenge domains to a BIND DNS server.
Delegation records:
_acme-challenge.kube.example.com IN NS dnsle-01.example.com.
_acme-challenge.kube.example.com IN NS dnsle-02.example.com.
Delegated Zone:
$ORIGIN _acme-challenge.kube.example.com.
$TTL 300
@ IN SOA dnsle-01.example.com. noc.example.com. (
1605083984 ; serial
7200 ; refresh
3600 ; retry
1209600 ; expire
3600 ; min
)
@ IN NS dnsle-01.example.com.
@ IN NS dnsle-02.example.com.
Updating via nsupdate or getting a cert using certbot works fine.
tcpdumps:
# cert-manager
85.10.233.254.54466 > 10.200.4.41.domain: [udp sum ok] 171 update [2n] [1au] SOA? example.com. ns: _acme-challenge.kube.example.com. ANY [0s] TXT, _acme-challenge.kube.example.com. [1m] TXT "QZzPbDQhXIfSbMN_InFWeLOYWd5owMLrlwC6gF-mD0A" ar: acme-key. ANY [0s] TSIG hmac-sha512. fudge=300 maclen=64 origid=171 error=0 otherlen=0 (278)
# nsupdate (manual)
85.10.233.254.39730 > 10.200.4.41.domain: [udp sum ok] 41562 update [2n] [1au] SOA? _acme-challenge.kube.example.com. ns: _acme-challenge.kube.example.com. [5m] TXT "test", _acme-challenge.kube.example.com. [5m] TXT "hallo2" ar: acme-key. ANY [0s] TSIG hmac-sha512. fudge=300 maclen=64 origid=41562 error=0 otherlen=0 (203)
It seems the SOA entry (the zone section of the update) is incorrectly set to example.com. BIND then tries to update the wrong zone, which causes the "error"="DNS update failed: dns: bad authentication" in cert-manager's logs (I checked the auth using --v=5, it's correct).
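To illustrate the lookup I would expect, here is a minimal sketch in Go. It is not cert-manager's actual code: `closestZone` and the `hasSOA` callback are hypothetical names, and the callback stands in for a real SOA query. The idea is to start at the full record name and walk up one label at a time, stopping at the first name that has its own SOA, so the delegated zone wins over example.com.

```go
package main

import (
	"fmt"
	"strings"
)

// closestZone walks up the labels of fqdn and returns the first name for
// which hasSOA reports true. hasSOA is a hypothetical stand-in for a real
// SOA query against the authoritative servers.
func closestZone(fqdn string, hasSOA func(name string) bool) (string, bool) {
	// Normalize to a fully qualified name with exactly one trailing dot.
	name := strings.TrimSuffix(fqdn, ".") + "."
	for name != "." && name != "" {
		if hasSOA(name) {
			return name, true
		}
		// Drop the leftmost label and try the parent name.
		name = name[strings.Index(name, ".")+1:]
	}
	return "", false
}

func main() {
	// Assumed zone apexes, mirroring the setup described in this issue.
	zones := map[string]bool{
		"example.com.":                      true,
		"_acme-challenge.kube.example.com.": true,
	}
	hasSOA := func(name string) bool { return zones[name] }

	zone, ok := closestZone("_acme-challenge.kube.example.com.", hasSOA)
	fmt.Println(zone, ok)
}
```

With the two apexes above, the walk stops at the delegated zone on the first step, which matches the zone that the manual nsupdate request targets.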
I already tried setting a nameserver (see below) which did not change the behavior:
--dns01-recursive-nameservers-only
--dns01-recursive-nameservers="1.1.1.1:53"
Sidenote: I changed the domain for privacy reasons.
Expected behaviour:
Setting the TXT record on the correct domain using a correct SOA entry.
Steps to reproduce the bug:
It's possible to reproduce this using the examples above as configuration.
Anything else we need to know?:
For now we can't switch to CNAME-based verification (which only works if I use a complete domain; otherwise I get the same issue), since certbot does not support that directly.
Environment details:
- Kubernetes version: 4.5.0-0.okd-2020-10-15-235428 (v1.18.3)
- Cloud-provider/provisioner: Bind
- cert-manager version: 1.0.4
- Install method: static manifests
/kind bug
About this issue
- State: closed
- Created 4 years ago
- Comments: 39 (21 by maintainers)
Hi @maelvls ,
I will start this at beginning of next week and keep you posted 😄
Nope, and as previously stated, it works perfectly with certbot and dig.
With a subdelegated setup like this:
mydomain.com main DNS zone (SOA is mydomain.com):
_acme-challenge IN NS letsencrypt.mydomain.com
and a nameserver for _acme-challenge.mydomain.com (SOA is _acme-challenge.mydomain.com, as it should be). The subdelegated zone on the letsencrypt.mydomain.com DNS server:
@ IN TXT "mysupertxtrecord"
If I do:
@maelvls I agree with the reasoning, and I would like to see this fixed ASAP too. So I will get back to it and write a first implementation of the followNS function. I will keep you posted on the progress.
Thanks,
Hi, so I made a custom modification to a webhook to check that the TXT record could be updated. It works pretty well.
However, the DNS propagation check on the cert-manager side is failing because of this SOA search.
If I use dig, the record is properly updated and DNS validation should succeed.
I wonder why the propagation check has to go through a recursive query? Couldn't we just "trust" standard DNS resolution for this? A simple
net.LookupTXT("_acme-challenge.mydomain.com")
gives me the right answer.
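As a rough sketch of what I mean (not cert-manager's actual propagation check), the function below only asks the standard resolver for the TXT records and looks for the expected value. `txtPropagated` and `lookupTXTFunc` are names I made up for illustration; the lookup is passed in as a parameter so it can be swapped for a fake in tests, and `net.LookupTXT` is the real standard library call.

```go
package main

import (
	"fmt"
	"net"
)

// lookupTXTFunc has the same signature as net.LookupTXT, so the real
// resolver or a test double can be passed in. (Hypothetical helper type.)
type lookupTXTFunc func(name string) ([]string, error)

// txtPropagated reports whether the TXT record set for fqdn, as seen by
// the given lookup function, contains the expected challenge value.
func txtPropagated(lookup lookupTXTFunc, fqdn, want string) (bool, error) {
	records, err := lookup(fqdn)
	if err != nil {
		return false, err
	}
	for _, r := range records {
		if r == want {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Uses the system resolver; the name and value are the placeholders
	// from the example setup above.
	ok, err := txtPropagated(net.LookupTXT, "_acme-challenge.mydomain.com", "mysupertxtrecord")
	fmt.Println(ok, err)
}
```

One caveat with this approach: the system resolver can return cached answers, including cached negative ones, so a check like this may lag behind what the authoritative servers already serve.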