coredns: "serve_stale" option in the "cache" plugin behaves incorrectly
What happened: serve_stale does not update the NXDOMAIN status if it gets constantly hammered by requests.
What you expected to happen: CoreDNS updates the record once an upstream DNS starts to return an A record after returning NXDOMAIN.
How to reproduce it (as minimally and precisely as possible): Start by repeatedly querying a CoreDNS instance for a non-existent domain. It correctly returns NXDOMAIN.
while sleep 0.5; do dig test.default.svc.cluster.local @169.254.20.10; done
Create a matching A record on the upstream DNS (I simply created a Service in Kubernetes), then verify that the upstream resolves it:
$ dig test.default.svc.cluster.local. @192.168.0.10 +short
10.10.50.53
The while loop above keeps returning NXDOMAIN indefinitely.
Removing the serve_stale option alleviates the issue.
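For clarity, the workaround amounts to the cache block from the Corefile listed under Environment with only the serve_stale line dropped. This is a sketch of that single change; every other value is copied from the reported configuration.

```
cache {
    success 39936
    denial 9984
    prefetch 10 1m 25%
    # serve_stale removed; nothing else changed
}
```

With this block the loop picks up the new A record, which is what points at serve_stale as the trigger.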
Anything else we need to know?:
Note that these tests were performed not against the primary CoreDNS of a Kubernetes cluster, but against a secondary one that forwards requests to the primary (a node-level caching mechanism).
Environment:
- the version of CoreDNS: 1.6.6
- Corefile:
.:53 {
    errors {
        consolidate 10s ".* i/o timeout$"
        consolidate 10s ".* write: operation not permitted$"
    }
    cache {
        success 39936
        denial 9984
        prefetch 10 1m 25%
        serve_stale
    }
    reload 2s
    loop
    bind 192.168.0.10 169.254.20.10
    forward . 192.168.0.10 192.168.0.10 192.168.0.10 {
        max_fails 0
    }
    prometheus 127.0.0.1:9254
    health 127.0.0.1:9225
}
- logs, if applicable:
- OS (e.g: cat /etc/os-release):
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
- Others:
About this issue
- State: closed
- Created 4 years ago
- Comments: 15 (10 by maintainers)
Commits related to this issue
- Stop serving stale responses from negative cache. Serving stale responses from the negative cache interacts poorly with other plugins and seems to confuse users. Fixes #3586. — committed to gonzalop/coredns by gonzalop 4 years ago
- Stop serving stale responses from negative cache. Serving stale responses from the negative cache interacts poorly with other plugins and seems to confuse users. Fixes #3586. Signed-off-by: Gonzalo... — committed to gonzalop/coredns by gonzalop 4 years ago
The default TTL for denial-of-existence responses is 1800 seconds according to the docs, and the example config doesn't override TTLs.
I haven't verified that this is what happens here, but it looks like the expected behavior, regardless of whether serve_stale is used.
cc: @gonzalop
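One way to check the TTL hypothesis above would be to cap the negative-cache TTL and see whether the NXDOMAIN answer then expires quickly. The fragment below is only a sketch: it relies on the optional TTL argument that the cache plugin documentation describes for the denial directive, and the value 30 is an arbitrary assumption chosen for illustration, not something taken from this issue.

```
cache {
    success 39936
    # capacity 9984 as in the report; 30 is an assumed max TTL in seconds
    denial 9984 30
    prefetch 10 1m 25%
    serve_stale
}
```

If the loop still returns NXDOMAIN well past the capped TTL, the default negative TTL alone would not explain the report.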