calico: DNS lookups from Nginx to CoreDNS in eBPF mode fail when the CoreDNS Pods move
Slack Thread: https://calicousers.slack.com/archives/CPEPF833L/p1614095727031700
We discovered that when Nginx does DNS lookups, it uses the same source port for all of its DNS queries. When operating in eBPF mode, Calico performs the “service-to-ip” mapping the first time a connection is created, and then keeps that mapping for the duration of that connection. Because Nginx uses the same source port each time, Calico tracks every query as the same connection. If your backing pods (CoreDNS in this case) move, their IPs are never updated for this connection. This causes all future DNS lookups from Nginx to fail, breaking Nginx in many different ways.
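The failure mode described above can be illustrated with a toy model. This is not Calico's code: it simply assumes a table keyed by source and destination address and port, where the backend is picked once per key. The class name, the service IP 10.100.0.10, and the replacement pod IP 100.64.150.20 are made up for illustration; the other addresses come from the tcpdump output in this report.

```python
from typing import Dict, List, Tuple

# (src_ip, src_port, service_ip, service_port): hypothetical flow key
FlowKey = Tuple[str, int, str, int]


class ToyServiceNat:
    """Toy model of a per-flow service-to-backend mapping (not Calico's code)."""

    def __init__(self, backends: List[str]) -> None:
        self.backends = list(backends)
        self.flows: Dict[FlowKey, str] = {}

    def resolve(self, key: FlowKey) -> str:
        # The backend is chosen only the first time a flow key is seen;
        # later packets with the same key reuse the stored mapping.
        if key not in self.flows:
            self.flows[key] = self.backends[0]
        return self.flows[key]

    def replace_backends(self, backends: List[str]) -> None:
        # Models the CoreDNS pods moving: the backend list changes,
        # but existing flow entries are left untouched.
        self.backends = list(backends)


nat = ToyServiceNat(["100.64.147.127"])
# Nginx reuses one source port, so every query has the same flow key.
nginx_flow = ("100.64.99.171", 53869, "10.100.0.10", 53)
first = nat.resolve(nginx_flow)
nat.replace_backends(["100.64.150.20"])  # hypothetical new pod IP
stale = nat.resolve(nginx_flow)  # same key, so the old backend is returned
fresh = nat.resolve(("100.64.99.171", 40000, "10.100.0.10", 53))
print(first, stale, fresh)
```

In this model the flow that reuses port 53869 keeps pointing at the old backend, while a flow from any other source port picks up the new one.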
Expected Behavior
If the backing pod for an already-created connection has gone away, I think that any caching Calico does for that connection should be wiped, so that when a new session is created (even if it's from the same source port), a new mapping is created.
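As a rough sketch of the requested behavior, assuming the cached mappings can be viewed as a dictionary from flow key to backend IP: when the set of live backends changes, every entry pointing at a backend that no longer exists is dropped. The function name, the service IP, and the second pod IP are hypothetical; this is not how Calico stores its state.

```python
from typing import Dict, Iterable, Tuple

# (src_ip, src_port, service_ip, service_port): hypothetical flow key
FlowKey = Tuple[str, int, str, int]


def purge_stale_flows(
    flows: Dict[FlowKey, str], live_backends: Iterable[str]
) -> Dict[FlowKey, str]:
    """Return only the cached flows whose backend still exists."""
    live = set(live_backends)
    return {key: backend for key, backend in flows.items() if backend in live}


flows = {
    # Entry created before the CoreDNS pods moved.
    ("100.64.99.171", 53869, "10.100.0.10", 53): "100.64.147.127",
    # Entry that already points at a pod that is still running.
    ("100.64.99.171", 40000, "10.100.0.10", 53): "100.64.150.20",
}
remaining = purge_stale_flows(flows, ["100.64.150.20"])
print(remaining)
```

With the stale entry gone, the next query from source port 53869 would be treated as a new session and mapped to a live backend.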
Steps to Reproduce (for bugs)
See the slack thread above for more details…
- Create a cluster running with eBPF mode.
- Create a service “guestbook”
- Create a service “guestbook-proxy” that is an nginx reverse proxy pointing to “guestbook.<namespace>”. (Make sure to set Nginx up to do DNS lookups against the coredns service)
- kubectl rollout restart deployment/coredns -n kube-system
- kubectl rollout restart deployment/guestbook
At this point, the guestbook-proxy pod should be unable to look up the backend address for the guestbook service. It will report DNS lookup timeouts.
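To watch for the timeouts without involving Nginx, a small probe can imitate its behavior by keeping a single UDP socket open, so that every query leaves from the same source port. The sketch below is an untested assumption about how to mimic that pattern, not a verified reproducer; `build_query` and `probe_loop` are hypothetical helper names, and the resolver address is whatever ClusterIP your DNS service uses.

```python
import socket
import struct
import time
from typing import List


def build_query(name: str, txid: int, qtype: int = 1) -> bytes:
    """Build a minimal DNS query (qtype 1 = A, 28 = AAAA), recursion desired."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Root label terminator, then QTYPE and QCLASS (1 = IN).
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def probe_loop(
    resolver_ip: str,
    name: str,
    rounds: int = 5,
    interval: float = 5.0,
    timeout: float = 2.0,
) -> List[bool]:
    """Send queries over one long-lived socket; True per round if answered."""
    results: List[bool] = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # One socket for all rounds, so the source port never changes.
        sock.connect((resolver_ip, 53))
        sock.settimeout(timeout)
        for i in range(rounds):
            txid = 0x1000 + i
            try:
                sock.send(build_query(name, txid))
                reply = sock.recv(4096)
                ok = len(reply) >= 2 and struct.unpack("!H", reply[:2])[0] == txid
            except OSError:
                # Covers timeouts and ICMP errors such as port unreachable.
                ok = False
            results.append(ok)
            time.sleep(interval)
    return results
```

Calling something like `probe_loop("<coredns service IP>", "guestbook.<namespace>.svc.cluster.local", rounds=60)` from a pod while restarting CoreDNS should, if the behavior matches the report, show the results switching from True to False once the CoreDNS pods have moved.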
Context
We make heavy use of reverse proxies in our environment… not only for our own apps, but even for Loki (the “loki-gateway” pod is an Nginx reverse proxy). We’ve had several outages in our staging environment caused by our CoreDNS pods changing IPs and DNS resolution breaking.
Your Environment
- Kubernetes: EKS 1.19
- Calico: 3.18.2
- OS: Bottlerocket v1.0.7 (pre-release)
Logs
tcpdump of the DNS lookups that work
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xa0a3!] 44873+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 34455, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xa0a3!] 44873+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 34456, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x5833!] 56505+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 34456, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x5833!] 56505+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 58486, offset 0, flags [DF], proto UDP (17), length 329)
100.64.147.127.53 > 100.64.99.171.53869: [udp sum ok] 56505 q: AAAA? foo.okta.com. 2/1/0 foo.okta.com. CNAME ok3-crtrs.tng.okta.com., ok3-crtrs.tng.okta.com. CNAME ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. ns: elb.us-east-1.amazonaws.com. SOA ns-1826.awsdns-36.co.uk. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 60 (301)
IP (tos 0x0, ttl 253, id 58486, offset 0, flags [DF], proto UDP (17), length 329)
100.64.147.127.53 > 100.64.99.171.53869: [udp sum ok] 56505 q: AAAA? foo.okta.com. 2/1/0 foo.okta.com. CNAME ok3-crtrs.tng.okta.com., ok3-crtrs.tng.okta.com. CNAME ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. ns: elb.us-east-1.amazonaws.com. SOA ns-1826.awsdns-36.co.uk. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 60 (301)
IP (tos 0x0, ttl 254, id 58487, offset 0, flags [DF], proto UDP (17), length 452)
100.64.147.127.53 > 100.64.99.171.53869: [udp sum ok] 44873 q: A? foo.okta.com. 5/0/0 foo.okta.com. CNAME ok3-crtrs.tng.okta.com., ok3-crtrs.tng.okta.com. CNAME ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com., ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.152, ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.153, ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.160 (424)
IP (tos 0x0, ttl 253, id 58487, offset 0, flags [DF], proto UDP (17), length 452)
100.64.147.127.53 > 100.64.99.171.53869: [udp sum ok] 44873 q: A? foo.okta.com. 5/0/0 foo.okta.com. CNAME ok3-crtrs.tng.okta.com., ok3-crtrs.tng.okta.com. CNAME ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com., ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.152, ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.153, ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.160 (424)
tcpdump from the same host, after coredns has moved
IP (tos 0x0, ttl 255, id 12183, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 12183, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 12184, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 12184, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 13049, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 13049, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 13050, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 13050, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 13824, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 13824, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 13825, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 13825, offset 0, flags [DF], proto UDP (17), length 63)
100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 59601, offset 0, flags [DF], proto UDP (17), length 81)
About this issue
- State: open
- Created 3 years ago
- Reactions: 5
- Comments: 16 (8 by maintainers)
Commits related to this issue
- felix/bpf: allow using CTLB workaround just for UDP Issues described in https://github.com/projectcalico/calico/issues/4509 are only UDP related. We need to turn off CTLB for UDP. TCP would resolve b... — committed to tomastigera/project-calico-calico by tomastigera 2 years ago
Writing to note that this issue still exists.
I am actively working on resolving this issue properly. We want to get the benefit of connect-time load balancing as much as possible, so we want to use the slower path only when necessary. There are a whole bunch of scenarios. Stay tuned 🙏 it may be a 🎄 📦
@tomastigera, @lwr20 Any chance this is going to be looked at soonish?