calico: DNS lookups from Nginx to CoreDNS in eBPF mode fail when the CoreDNS Pods move

Slack Thread: https://calicousers.slack.com/archives/CPEPF833L/p1614095727031700

We discovered that when Nginx does DNS lookups, it uses the same source port for all of its DNS queries. In eBPF mode, Calico resolves the “service-to-IP” mapping the first time a connection is created and then keeps that mapping for the lifetime of that connection. Because Nginx uses the same source port each time, Calico tracks every query as part of the same connection. If the backing pods (CoreDNS in this case) move, their IPs are never updated for this connection. From then on, every DNS lookup from Nginx fails, which breaks Nginx in many different ways.
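
To make the failure mode concrete, here is a minimal toy model of per-connection service NAT. It is only a sketch of the behavior described above, not Calico's actual eBPF code: the class name `ServiceNat`, its methods, and all IP addresses are hypothetical and chosen for illustration.

```python
class ServiceNat:
    """Toy model of per-connection service NAT (hypothetical, not Calico's code)."""

    def __init__(self, backends):
        # service (ip, port) -> list of backend pod IPs
        self.backends = {svc: list(ips) for svc, ips in backends.items()}
        # flow key -> backend chosen when the flow was first seen
        self.conntrack = {}

    def forward(self, src_ip, src_port, svc_ip, svc_port, proto="udp"):
        key = (proto, src_ip, src_port, svc_ip, svc_port)
        if key not in self.conntrack:
            # The service-to-IP mapping is resolved only for a new flow.
            self.conntrack[key] = self.backends[(svc_ip, svc_port)][0]
        return self.conntrack[key]

    def set_backends(self, svc, ips):
        # Backends change, but existing flow entries are left untouched.
        self.backends[svc] = list(ips)


if __name__ == "__main__":
    dns_svc = ("10.96.0.10", 53)                  # hypothetical service IP
    nat = ServiceNat({dns_svc: ["10.0.0.5"]})     # hypothetical CoreDNS pod IP

    print(nat.forward("10.0.1.20", 53869, *dns_svc))  # 10.0.0.5
    nat.set_backends(dns_svc, ["10.0.0.9"])           # CoreDNS pod moved
    print(nat.forward("10.0.1.20", 53869, *dns_svc))  # still 10.0.0.5 (stale)
    print(nat.forward("10.0.1.20", 40000, *dns_svc))  # 10.0.0.9 (new source port)
```

In this model, a client that picks a fresh source port per query recovers immediately, while a client that reuses one port, as Nginx does here, keeps hitting the stale entry.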

Expected Behavior

If the backing pod for an already-created connection has gone away, I think any cached mapping that Calico holds for it should be wiped, so that when a new session is created (even if it's from the same source port), a new mapping is created.
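
One way to read this request, sketched as a toy model: when a service's backends change, drop every tracked flow that still points at a backend that no longer exists. This is an illustration of the requested behavior only, not a description of how Calico implements or plans to implement it. The class `PurgingServiceNat` and all addresses are hypothetical.

```python
class PurgingServiceNat:
    """Toy model: stale flow entries are purged when a backend disappears."""

    def __init__(self, backends):
        # service (ip, port) -> list of backend pod IPs
        self.backends = {svc: list(ips) for svc, ips in backends.items()}
        # flow key -> backend chosen when the flow was first seen
        self.conntrack = {}

    def forward(self, src_ip, src_port, svc_ip, svc_port, proto="udp"):
        key = (proto, src_ip, src_port, svc_ip, svc_port)
        if key not in self.conntrack:
            self.conntrack[key] = self.backends[(svc_ip, svc_port)][0]
        return self.conntrack[key]

    def set_backends(self, svc, ips):
        """Update backends and remove flows whose backend is gone.

        Returns the number of purged flow entries.
        """
        self.backends[svc] = list(ips)
        live = set(ips)
        stale = [
            key for key, backend in self.conntrack.items()
            if (key[3], key[4]) == svc and backend not in live
        ]
        for key in stale:
            del self.conntrack[key]
        return len(stale)


if __name__ == "__main__":
    dns_svc = ("10.96.0.10", 53)                        # hypothetical service IP
    nat = PurgingServiceNat({dns_svc: ["10.0.0.5"]})    # hypothetical pod IP

    print(nat.forward("10.0.1.20", 53869, *dns_svc))    # 10.0.0.5
    print(nat.set_backends(dns_svc, ["10.0.0.9"]))      # 1 entry purged
    print(nat.forward("10.0.1.20", 53869, *dns_svc))    # 10.0.0.9
```

Flows whose backend is still alive are kept, so only connections that could never succeed again are remapped.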

Steps to Reproduce (for bugs)

See the slack thread above for more details…

  1. Create a cluster running with eBPF mode.
  2. Create a service “guestbook”
  3. Create a service “guestbook-proxy” that is an nginx reverse proxy pointing to “guestbook.<namespace>”. (Make sure to set Nginx up to do DNS lookups against the coredns service)
  4. kubectl rollout restart deployment/coredns -n kube-system
  5. kubectl rollout restart deployment/guestbook

At this point, the guestbook-proxy pod should be unable to look up the backend address for the guestbook service. It will report DNS lookup timeouts.

Context

We make heavy use of reverse proxies in our environment, not only for our own apps but even for Loki (the “loki-gateway” pod is an Nginx reverse proxy). We’ve had several outages in our staging environment because our CoreDNS pods changed IPs and DNS resolution broke.

Your Environment

  • Kubernetes: EKS 1.19
  • Calico: 3.18.2
  • OS: Bottlerocket v1.0.7 (pre-release)

Logs

tcpdump of the DNS lookups that work

    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xa0a3!] 44873+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 34455, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xa0a3!] 44873+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 34456, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x5833!] 56505+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 34456, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x5833!] 56505+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 58486, offset 0, flags [DF], proto UDP (17), length 329)
    100.64.147.127.53 > 100.64.99.171.53869: [udp sum ok] 56505 q: AAAA? foo.okta.com. 2/1/0 foo.okta.com. CNAME ok3-crtrs.tng.okta.com., ok3-crtrs.tng.okta.com. CNAME ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. ns: elb.us-east-1.amazonaws.com. SOA ns-1826.awsdns-36.co.uk. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 60 (301)
IP (tos 0x0, ttl 253, id 58486, offset 0, flags [DF], proto UDP (17), length 329)
    100.64.147.127.53 > 100.64.99.171.53869: [udp sum ok] 56505 q: AAAA? foo.okta.com. 2/1/0 foo.okta.com. CNAME ok3-crtrs.tng.okta.com., ok3-crtrs.tng.okta.com. CNAME ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. ns: elb.us-east-1.amazonaws.com. SOA ns-1826.awsdns-36.co.uk. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 60 (301)
IP (tos 0x0, ttl 254, id 58487, offset 0, flags [DF], proto UDP (17), length 452)
    100.64.147.127.53 > 100.64.99.171.53869: [udp sum ok] 44873 q: A? foo.okta.com. 5/0/0 foo.okta.com. CNAME ok3-crtrs.tng.okta.com., ok3-crtrs.tng.okta.com. CNAME ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com., ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.152, ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.153, ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.160 (424)
IP (tos 0x0, ttl 253, id 58487, offset 0, flags [DF], proto UDP (17), length 452)
    100.64.147.127.53 > 100.64.99.171.53869: [udp sum ok] 44873 q: A? foo.okta.com. 5/0/0 foo.okta.com. CNAME ok3-crtrs.tng.okta.com., ok3-crtrs.tng.okta.com. CNAME ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com., ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.152, ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.153, ok3-crtr-tls12-nlb-dfef298ffc8f82ca.elb.us-east-1.amazonaws.com. A 18.209.113.160 (424)

tcpdump from the same host, after coredns has moved

IP (tos 0x0, ttl 255, id 12183, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 12183, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 12184, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 12184, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 13049, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 13049, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 13050, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 13050, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 13824, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 13824, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 13825, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 254, id 13825, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 59601, offset 0, flags [DF], proto UDP (17), length 81)
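
In both captures, every query leaves from source port 53869. A small sketch for checking this on a longer capture: it counts the distinct (source IP, source port) pairs sending to port 53 in `tcpdump -v` style text like the output above. The function name and the regular expression are my own assumptions and only handle the IPv4 line format shown here.

```python
import re
from collections import Counter

# Matches the indented flow line, e.g.
#   100.64.99.171.53869 > 100.64.147.127.53: [...]
FLOW_RE = re.compile(
    r"^\s*(\d+(?:\.\d+){3})\.(\d+) > (\d+(?:\.\d+){3})\.(\d+):"
)


def dns_query_source_ports(tcpdump_text):
    """Count packets per (source IP, source port) sent to destination port 53."""
    ports = Counter()
    for line in tcpdump_text.splitlines():
        match = FLOW_RE.match(line)
        if match and match.group(4) == "53":
            ports[(match.group(1), int(match.group(2)))] += 1
    return ports


# Two packets copied from the capture taken after CoreDNS moved.
SAMPLE = """\
IP (tos 0x0, ttl 255, id 12183, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0x6a0f!] 58845+ A? foo.okta.com. (35)
IP (tos 0x0, ttl 255, id 12184, offset 0, flags [DF], proto UDP (17), length 63)
    100.64.99.171.53869 > 100.64.147.127.53: [bad udp cksum 0xbfe7 -> 0xb160!] 33676+ AAAA? foo.okta.com. (35)
"""

if __name__ == "__main__":
    for (ip, port), count in dns_query_source_ports(SAMPLE).items():
        print(f"{ip}:{port} sent {count} DNS queries")
```

A single (IP, port) pair carrying all the queries suggests a resolver that reuses one UDP socket, which matches the behavior this issue describes.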

About this issue

  • State: open
  • Created 3 years ago
  • Reactions: 5
  • Comments: 16 (8 by maintainers)

Most upvoted comments

Writing to note that this issue still exists.

I am actively working on resolving this issue properly. We want to keep the benefit of connect-time load balancing as much as possible, so we want to use the slower path only when necessary. There are a whole bunch of scenarios to cover. Stay tuned 🙏 it may be a 🎄 📦

@tomastigera, @lwr20 Any chance this is going to be looked at soonish?