kubernetes: DNS intermittent delays of 5s
Is this a BUG REPORT or FEATURE REQUEST?: /kind bug
What happened: DNS lookup is sometimes taking 5 seconds.
What you expected to happen: No delays in DNS.
How to reproduce it (as minimally and precisely as possible):
- Create a cluster in AWS using kops with cni networking:
kops create cluster --node-count 3 --zones eu-west-1a,eu-west-1b,eu-west-1c --master-zones eu-west-1a,eu-west-1b,eu-west-1c --dns-zone kube.example.com --node-size t2.medium --master-size t2.medium --topology private --networking cni --cloud-labels "Env=Staging" ${NAME}
- CNI plugin:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
- Run this script in any pod that has curl:
var=1
while true; do
  res=$( { curl -o /dev/null -s -w '%{time_namelookup}\n' http://www.google.com; } 2>&1 )
  var=$((var+1))
  # a leading digit of 1-9 means name resolution took at least one second
  if [[ $res =~ ^[1-9] ]]; then
    now=$(date +"%T")
    echo "$var slow: $res $now"
    break
  fi
done
Anything else we need to know?:
- I am encountering this issue in both staging and production clusters, but for some reason the staging cluster sees far more 5s delays.
- Delays happen for both external names (google.com) and internal ones, such as service.namespace.
- Happens on both Kubernetes 1.6 and 1.7, but I did not encounter these issues on 1.5 (though the setup was a bit different; there was no CNI back then).
- Have not tested with 1.7 without CNI yet.
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.2", GitCommit:"bdaeafa71f6c7c04636251031f93464384d54963", GitTreeState:"clean", BuildDate:"2017-10-24T19:48:57Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.10", GitCommit:"bebdeb749f1fa3da9e1312c4b08e439c404b3136", GitTreeState:"clean", BuildDate:"2017-11-03T16:31:49Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration:
AWS
- OS (e.g. from /etc/os-release):
PRETTY_NAME="Ubuntu 16.04.3 LTS"
- Kernel (e.g. uname -a):
Linux ingress-nginx-3882489562-438sm 4.4.65-k8s #1 SMP Tue May 2 15:48:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Similar issues
- https://github.com/kubernetes/dns/issues/96 - closed but seems to be exactly the same
- https://github.com/kubernetes/kubernetes/issues/45976 - has some comments matching this issue, but is heading toward fixing the kube-dns up/down-scaling problem rather than the intermittent failures.
/sig network
About this issue
- State: closed
- Created 6 years ago
- Reactions: 89
- Comments: 256 (111 by maintainers)
Commits related to this issue
- Switch from alpine to jessie-slim for runner utility images The Alpine based images have a nasty problem with DNS failures that tends to surface when running them in Kubernetes. After a fair amount ... — committed to dgrove-oss/openwhisk by dgrove-oss 6 years ago
- Switch CoreDNS to a ds with host networking To work around kubernetes#56903: https://github.com/kubernetes/kubernetes/issues/56903 — committed to zenjoy/terraform-render-bootkube by deleted user 5 years ago
- removed the AAAA query from the default DNS behavior to decrease the corresponding overhead, see: https://github.com/kubernetes/kubernetes/issues/56903#issuecomment-409603030 — committed to inter169/musl by inter169 5 years ago
In my tests, using this option in /etc/resolv.conf:
options single-request-reopen
fixed the problem. But I can't find a "clean" way to put it on pods in Kubernetes 1.8. What I do:
@mikksoone Could you try whether it solves your problem too?
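For what it's worth, later Kubernetes versions (1.9+, where the pod-level dnsConfig field exists) offer a cleaner route than mutating resolv.conf by hand. A sketch, with an illustrative pod name:

```yaml
# Sketch: injecting the resolver option via the pod spec (requires the
# dnsConfig field, available from Kubernetes 1.9). Pod name is illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  dnsConfig:
    options:
      - name: single-request-reopen
  containers:
    - name: app
      image: nginx
```

Note that single-request-reopen is a glibc resolver option; musl-based images (Alpine) ignore it.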
ld-musl-x86_64.so.1-fix-musl-1.1.18-r3-alpine3.7.tar.gz
This bug covers a few different types of slow DNS queries. In my use case, the slow DNS requests occurred on alpine:3.7/3.8 because the additional AAAA query sent along with the A query by default was timing out. Since there is currently no need to query AAAA records by default, my solution was to remove AAAA resolution from the implementation (musl/src/network/lookup_name.c, function 'name_from_dns'), and the code fix worked for me:
without the fix, it took around 5 secs:
after replacing /lib/ld-musl-x86_64.so.1 with the new one containing my fix (attached in this comment) in the Alpine Linux docker image, the slow DNS queries went away:
My Node.js app's response time also decreased on the patched Alpine.
I pushed a docker image with this fix to a public repository: https://hub.docker.com/r/geekidea/alpine-a/, and you can use such docker images as below:
Just to update, I’ve submitted two patches to fix the conntrack races in the kernel - http://patchwork.ozlabs.org/patch/937963/ (accepted) and http://patchwork.ozlabs.org/patch/952939/ (waiting for a review).
If both are accepted, then the timeout cases due to the races will be eliminated for those who run only one instance of a DNS server; for others, the timeout hit rate should decrease.
Completely eliminating the timeouts when there is more than one DNS server instance is a non-trivial task and is still WIP.
Doesn’t solve the issue for me. Even with this option in resolv.conf I get timeouts of 5s, 2.5s and 3.5s - and they happen very often, twice per minute or so.
Note for Go users on Alpine: the Go 1.13 DNS resolver will support use-vc (golang/go#29594) and single-request (golang/go#29661).
@brb Tested with 5.0.0-rc6 the error rate has gone down to zero!
Just in case someone got here because of dns delays, in our case it was arp table overflow on the nodes (arp -n showing more than 1000 entries). Increasing the limits solved the problem.
I just posted a little write-up about our journey troubleshooting the issue, and how we worked around it in production: https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/.
Alpine 3.18, which includes musl 1.2.4, seems to have finally fixed this issue:
https://www.alpinelinux.org/posts/Alpine-3.18.0-released.html
We at Pinterest are using kernel 5.0 and the default iptables setup, but are still hitting this issue pretty badly:
Here is a pcap trace that clearly shows UDP packets not being forwarded out to DNS, with the client side hitting the 5s timeout / DNS-level retries. 10.3.253.87 and 10.3.212.90 are user pods, and 10.3.23.54.domain is the DNS pod.
I ran
conntrack -S
and there are no insertion failures, which indicates that races 1 and 2 mentioned in this blog are already fixed and we are hitting race 3. There are a few more action items we are trying at the moment:
Will keep you all updated, but if anyone has already tried any of the 3 options and has a failure/success story to share, I'd very much appreciate it.
/cc @thockin @brb
The second kernel patch to mitigate the problem got accepted (context: https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts) and it is out in Linux 5.0-rc6.
Please test it and report whether it has reduced the timeout hit rate. Thanks.
We wrote a blog post describing the technical details of the problem and presenting the kernel fixes: https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts.
I've been having this issue for some time on Kubernetes 1.7 and 1.8; DNS queries were being dropped from time to time. Yesterday I upgraded my cluster from 1.8.10 to 1.9.6 (kops from 1.8 to 1.9.0-alpha.3) and started having this same issue ALL THE TIME. The workaround suggested in this issue has no effect and I can't find any way of stopping it. I've made a small workaround by assigning the most requested (and problematic) DNS names to fixed IPs in /etc/hosts. Any idea where the real problem is? I'll test with a brand-new cluster on the same versions and report back.
Same here, small clusters, no arp nor QPS limits.
Setting
dnsPolicy: Default
works without delays, but unfortunately this cannot be used for all deployments. I think we found another fix for this: use TCP mode for DNS requests in a container by adding the following to the spec:
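The spec snippet itself did not survive in this thread; what follows is an assumed reconstruction using the dnsConfig field (Kubernetes 1.9+) with the glibc use-vc option, which forces TCP lookups:

```yaml
# Assumed reconstruction: force TCP for DNS lookups from the pod spec.
# use-vc is a glibc resolver option; musl-based images (Alpine) ignore it.
spec:
  dnsConfig:
    options:
      - name: use-vc
```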
@axot @chrisghill we use coredns as daemonset and dnsmasq in front. We also use conntrack bypass and have only the dns infrastructure on the same node. The postmortem of an incident that triggered the change from our dnsmasq + tc-flannel and coredns backend via serviceip can be found at https://github.com/zalando-incubator/kubernetes-on-aws/blob/dev/docs/postmortems/jan-2019-dns-outage.md Our setup can be found at https://github.com/zalando-incubator/kubernetes-on-aws/tree/dev/cluster/manifests/coredns-local. We boot the nodes with conntrack bypass https://github.com/zalando-incubator/kubernetes-on-aws/blob/dev/cluster/node-pools/worker-default/userdata.clc.yaml#L115 to dns ports on the node ip https://github.com/zalando-incubator/kubernetes-on-aws/blob/dev/cluster/node-pools/worker-default/userdata.clc.yaml#L217 via environment variable set https://github.com/zalando-incubator/kubernetes-on-aws/blob/dev/cluster/node-pools/worker-default/userdata.clc.yaml#L154 I hope this helps you to build a solid dns infrastructure in kubernetes.
NodeLocalDNS uses TCP connections from the local DNS servers to the cluster DNS Service IP, which are more robust than UDP. That is, a single packet drop kills a UDP transaction, while TCP recovers comparatively quickly via retransmission.
Anyway, to completely eliminate the DNS timeout issue, use Linux kernel v5.0 (the final release is probably next week) and Cilium for k8s networking. The latter replaces kube-proxy for accessing Services with its own BPF-based LB implementation, which selects an endpoint based on a packet hash. So in the case of two racing DNS requests, the same endpoint is selected, and thus neither packet is dropped => no DNS timeouts.
https://github.com/torvalds/linux/commit/4e35c1cb9460240e983a01745b5f29fe3a4d8e39
@thockin @bowei
requesting your feedback, therefore tagging you.
Could this be of any interest here: https://tech.xing.com/a-reason-for-unexplained-connection-timeouts-on-kubernetes-docker-abd041cf7e02
There are multiple reports of this problem in the Kubernetes project, and it would be great to have it resolved for everyone.
We are facing the same issue. Applying the
single-request-reopen
parameter to our pods' resolv.conf "fixes" the issue, but there is one other piece of information I'd like to add. We noticed that if we change the DNS address in one of our pods' resolv.conf to one of our core-dns pods' addresses, everything works fine with no timeouts. But when we go through the default configuration, which is the core-dns Service's address, we get the intermittent 5-second delay. Since the single-request-reopen parameter controls the usage of one socket for more than one DNS request, it might be that k8s' Service implementation is somehow confused by receiving more than one request through the same socket.

Correct, this was a design decision and discussed extensively as part of the KEP process: see https://github.com/kubernetes/enhancements/pull/1005#pullrequestreview-231381118 and two other PRs linked earlier.
One of the initial ideas which I really liked was to shadow a ClusterIP of kube-dns in-cluster Deployment for node-local-dns (https://github.com/kubernetes/community/pull/2842#discussion-diff-227648753R91) – this was not added to the spec to avoid tying it too much to the current implementation of how Services works, but I’d assume this scenario would still work (with kube-proxy in iptables mode only).
Still, node-local-dns is the recommended solution and is now in beta. If necessary, HA can be added in some way or another.
We (uSwitch) run it in production across multi-AZ clusters. It’s in non-HA setup (we decided to only consider adding HA if/once it starts causing any stability issues). We can’t be happier with it as our DNS long-tail latency improved significantly and DNS became a non-issue.
I can recommend node-problem-detector to monitor node-local-dns. Specifically, we have 3 custom plugins running tests every second (one testing node-local-dns itself, another testing upstream external DNS, and the last testing in-cluster kube-dns). This can be coupled with draino to cordon and drain nodes experiencing issues with node-local-dns. In practice we have yet to see node-local-dns break on its own; the only cases of node-local-dns flakiness so far coincided with either node loss or a whole-kernel (temporary) freeze.
edit:
Good point. In our clusters we terminate nodes by age (current max age is 1 week) with the help of surtr, so our node-local-dns DaemonSet is configured with updateStrategy set to OnDelete. No more frivolous upgrades with service interruption!

FYI: single-request-reopen didn't help in my case (https://github.com/kubernetes/kubernetes/issues/56903#issuecomment-359897058). Kubernetes 1.10, AWS.
OS in container:
I don’t know if Netty or OpenJDK support this option
I am, by the way, experiencing the same thing. With Kubernetes 1.10 + CoreOS + Weave + CoreDNS/kube-dns, I see constant 5s latency on DNS resolution. tcpdump shows that the first AAAA requests get lost somehow: https://hastebin.com/banulayire.swift. With single-request or single-request-reopen, the issue is gone.
https://github.com/kubernetes/kubernetes/issues/62628
use-vc didn't work for us (AKS). The queries were consistent, but they all took about 8.5 seconds. However, single-request-reopen worked.

Try to add
to your resolv.conf. It will force TCP for DNS lookups and will work around this issue with ease.
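The option itself is elided in the comment above; given the "force TCP" description it is presumably the glibc use-vc resolver option (an assumption). As a resolv.conf fragment:

```
options use-vc
```

Note this is honored by glibc but not by musl-based images.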
At this point, running kube-dns or dnsmasq on every node becomes very attractive.
Various issues discuss this - #45363 for instance.
No single workaround or fix covers every use case, so users, sysadmins, and app developers should consider which one suits their situation, along with the costs and risks of patching tc/iptables scripts, updating to a new kernel, or patching musl libc.
In my use case, a nodejs app's responses timed out for 5+ secs when requesting some URIs (like http://wx.qlogo.cn/**) on an Alpine Linux docker image, so I coded the fix (https://github.com/kubernetes/kubernetes/issues/56903#issuecomment-409603030) in musl libc; for me that carries less cost and risk than patching tc scripts/iptables rules or upgrading the hosts' kernel. I also don't think tc can work very well for nodes hosting 1000k+ live tcp connections (or conntrack entries).
We faced the same issue on a small self-managed cluster. The problem was solved by scaling the coreDNS deployment down to 1 pod. This is a strange and unexpected workaround, but it solved the problem for us.
Cluster info:
@zhan849 Alternatively, you could use Cilium's kube-proxy implementation, which does not suffer from the conntrack races (it does not use netfilter/iptables).
https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/
– Quentin Machu
+1 I would like to understand more as well. It’s really crazy that an issue so fundamental exists for kubernetes.
@szuecs is there any configuration you are aware of that would actually eliminate this issue? Currently the only solution I know of is to switch to TCP for DNS (use-vc), but that's not supported by all distros. I also assume that using Calico or VPC routing on AWS would bypass it, because all pods have routable IP addresses and thus NAT is never needed.
Do you think that running coredns on each node with hostNetwork: true would work?
See https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/ for a more in-depth description of the issue on Kubernetes and a workaround that doesn’t involve rewriting musl or glibc.
@KIVagant
This is a libc option, so it generally depends not on the application but on the libc used, unless they ship their own implementation of the resolver stack (which I don't know for OpenJDK/Netty).
Please, however, confirm that the issue actually comes from the conntrack race condition we are talking about, and not something else. To confirm, run
watch -n1 conntrack -S
and check whether the insert_failed column increases as you are getting timeouts. My tc-based workaround should fix your issue if that's the case.
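To make that check scriptable, a small helper can sum the insert_failed counters across CPUs. A sketch (the helper name is mine; conntrack-tools must be installed to produce real input):

```shell
#!/usr/bin/env bash
# Sum the insert_failed counters printed by `conntrack -S`
# (one line per CPU, e.g. "cpu=0 found=26 invalid=5 insert=0 insert_failed=3 ...").
sum_insert_failed() {
  grep -o 'insert_failed=[0-9]*' | cut -d= -f2 | awk '{s+=$1} END {print s+0}'
}

# On a node you would run:
#   conntrack -S | sum_insert_failed
# A non-zero, growing number while timeouts occur points at the conntrack race.
```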
Nodelocal DNSCache uses TCP for all upstream DNS queries (alleviation 3 mentioned in your previous comment), in addition to skipping connection tracking for client-pod-to-nodelocal DNS requests. It can be configured so that client pods continue to use the same DNS server IP, so the only change would be deploying the DaemonSet. There are at least a couple of comments in this issue about clusters seeing significant improvement in DNS reliability and performance after deploying nodelocal dnscache. Hope that feedback helps.
@szuecs please note that there is upstream support for that and stable in 1.18 (haven’t tried it myself): https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/
I guess this does not solve the root cause of the ndots problem, although it is probably amortized. I guess it will just solve the conntrack races, which are the issue causing the intermittent delays.
@zhan849 why not using a daemonset and bypass conntrack? We use dnsmasq in front of coredns in a daemonset pod and use cloudinit and some systemd units to set resolv.conf values via kubelet to the node local dnsmasq running in hostnetwork. Works great even if we spike dns traffic to oomkill coredns. This happened in a nodejs heavy cluster. https://github.com/zalando-incubator/kubernetes-on-aws/tree/dev/cluster/manifests/coredns-local Everything else is in systemd units I can’t share.
Here you go: uswitch/node-problem-detector#4 (specifically https://github.com/uswitch/node-problem-detector/pull/4/commits/e124ea7241a671978551dd13f6b3ffaf87f96a80).
You might want to parameterize and change bits and pieces there, as it is tailored to a specific setup. Log an issue in the repo or ping me on k8s Slack with any questions, so that we can keep this comment thread on topic.
@realdimas any chance of making the source for your node-problem-detector plugins available? They sound super useful. Thanks much for all the info.
Folks, I’d like to reiterate in this comment thread what Tim Hockin, Pavithra Ramesh and others mentioned already: node-local DNS cache daemon is considered to be a solution to multiple causes of DNS long-tail latency/timeout issue.
Please have a look at
The source code of the node-local DNS cache implementation can be found at https://github.com/kubernetes/dns/tree/898b99f8a72a547329ea7e4b28f63bc79375cac2/cmd/node-cache. It is essentially a minimalist CoreDNS caching daemon with a static non-routable IP address and an embedded wrapper, which takes care of setting up the dummy network interface and exempting its flows from connection tracking: https://github.com/kubernetes/dns/tree/898b99f8a72a547329ea7e4b28f63bc79375cac2/cmd/node-cache
Canonical manifests for installation of the DaemonSet: https://github.com/kubernetes/kubernetes/tree/0216ccf80a604b15bb19752dccd23ac2e62f1e10/cluster/addons/dns/nodelocaldns
Once installed, configured and verified – all you have to do is to re-point kubelet’s ClusterDNS to it.
While this feature officially went into alpha only in v1.13 (and graduated to beta in Kubernetes v1.15), it does not require you to run a recent version. It actually works well even with Kubernetes v1.12!
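Once the DaemonSet is installed, re-pointing the kubelet is a small config change. A sketch, assuming the link-local address 169.254.20.10 used by the canonical manifests:

```yaml
# KubeletConfiguration fragment (assumption: nodelocaldns listening on
# 169.254.20.10, the address used in the canonical manifests).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDNS:
  - 169.254.20.10
```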
Thanks @Quentin-M based on the description given by @xeor , it looks like a different issue than a kernel race and requests being dropped. That’s why i was suggesting to open a new issue and discuss there.
@xeor what you are describing doesn’t seem at all related to this issue.
From a networking perspective, the DNS query is the same whether you've initiated it from dig or curl. If dig always works and curl fails only every 5 seconds (?), this sounds like something else altogether, but maybe I misunderstand your test case.

@xeor @prameshj we are running K8s 1.13.2 with nodelocaldns on ~10 different clusters (CoreOS 1967.6.0, kernel 4.14.96) and have not had any issues with DNS timeouts since moving the whole fleet to this setup.
The problem is that we would increase the number of DNS requests to kube-dns and the number of conntrack entries/DNATs, which might be causing the latency problem to begin with. For a cache miss, nodelocal is going to query kube-dns as well. You are right that the nodelocal connections to kube-dns will be TCP, so those would at least get an RST instead of being blackholed.
Just to be clear though, using options single-request doesn't resolve the issue; it just makes it 99% unlikely to happen for most people. The problem of two packets going out right back to back can still happen, but without them being sent back to back it's really unlikely.
And as he also stated above, decreasing the timeout so that it retries more often also gives it a better chance of not colliding with another packet, helping one of them make it through, though you will still see a delay.
@ahmadalli for me, I was able to get Alpine to respect the timeout in /etc/resolv.conf; setting it to 1 decreased the impact of this. It still happened just as frequently, but it took my 5/10/15-second timeouts down to 1/2/3 seconds. Eventually I moved to Debian though… This is a very frustrating problem.
I don't know too much about it, and I ran into all this after I made the Debian switch, but there may be a CNI-layer solution somewhere:
https://github.com/projectcalico/calico/issues/2073 https://github.com/weaveworks/weave/issues/3287 https://github.com/coreos/flannel/issues/1004
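The timeout tweak described above can be expressed as resolv.conf options. A fragment (values illustrative; glibc honors these, and musl support varies by version — this shortens the stall but does not remove the race):

```
options timeout:1 attempts:3
```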
Just want to put a record down here. Adding
options single-request
to resolv.conf resolved the problem for us in K8s production. I would think the option "single-request" better fits the problem scenario than "single-request-reopen", but you can try either one and see which works for you.
@jcperezamin for awareness: https://github.com/zalando-incubator/kubernetes-on-aws/blob/dev/docs/postmortems/jan-2019-dns-outage.md. We now use a DaemonSet to run a two-container pod with dnsmasq+coredns and add a separate firewall rule to bypass conntrack. This works without building a new base image and rebuilding all your containers.
I think https://github.com/kubernetes/kubernetes/issues/70707 and the related issues are a better subscribe target. The feature is also already available as an alpha in 1.13+
@thockin, where can we track the per-node cache design and work?
@nitin302 as mentioned by @mikksoone, setting
dnsPolicy: Default
in your pod fixes the issue (for us, anyway). However, it means you won't be able to resolve internal services by name; you'll have to expose them in order to access them.

@marshallford Hey there. The most common issue is that I built the image assuming the community was running their DNS server on port 5353 (unprivileged), so the rules applied to port 5353 rather than 53. A contributor and I changed the script/Dockerfile yesterday to default to 53 instead. Tell me if that helps! You do not need to blow away any nodes or do anything besides running the container on every node. You should be able to see the tc rules, and conntrack's insert_failed being less insane.
https://github.com/Quentin-M/weave-tc
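The dnsPolicy: Default trade-off described above, in pod-spec form; a sketch (pod name and image are illustrative):

```yaml
# Sketch: the pod inherits the node's /etc/resolv.conf, bypassing the cluster
# DNS Service IP (and the conntrack race), at the cost of losing in-cluster
# name resolution. Pod name and image are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: external-only-client
spec:
  dnsPolicy: Default
  containers:
    - name: app
      image: curlimages/curl
      command: ["sleep", "infinity"]
```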
For what it's worth for future readers, we have hammered out a solution that seems to work really well.
Solution overview: run dnsmasq on each node, expose it to the pods via a static IP on a new local adapter (the same IP on every node), and point the kubelet's --cluster-dns at this IP.
How the solution performs Test Cluster details:
We used this dns-tester to run some tests.
Using the stock arrangement, we see 1 lookup failure in every 4000 requests. This sounds like a small number, but in our testing the probability is higher for initial requests. For example, though the tester sees failures in 1 of 4000 requests, manually running curl fails with a DNS lookup error more like one time in 50.
Using the solution described here, we see one lookup failure in approximately 1M requests. Further, lookups with curl seem to never fail. It seems to us that this arrangement actually bypasses, rather than simply masks, the UDP packet loss. We have not confirmed this with conntrack yet.
More about the solution: we run dnsmasq on the nodes (see above for examples; we actually installed it on the base image for our nodes, but running it as a DaemonSet as suggested by @szuecs would work well if you prefer).
Then, we created a link-local adapter on the nodes on a static ip ( we used 192.18.0.1 ):
Dnsmasq listens on this ip:
Since this IP is the same across all nodes, we can simply pass
--cluster-dns 192.18.0.1,<kubedns>
. This addresses the difficulty of setting it in kops, where we have limited control over the manifests and the nodeup cycle.
This solution meets all of our requirements:
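The per-node setup described above might look like the following; the adapter IP is the commenter's value, the kube-dns ClusterIP (100.64.0.10) is an assumption, and the interface commands need root on each node:

```
# Commands run on each node (root), per the description above:
#   ip link add dns0 type dummy
#   ip addr add 192.18.0.1/32 dev dns0
#   ip link set dns0 up
# dnsmasq.conf fragment (adapter IP from the comment; kube-dns IP assumed):
listen-address=192.18.0.1
bind-interfaces
server=/cluster.local/100.64.0.10
cache-size=10000
```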
To set the DNS resolver to the right EC2 node we do this:
This way every pod has an /etc/resolv.conf that differs per node, and the first nameserver is the node-local one. The nameserver target is this dnsmasq DaemonSet: https://github.com/zalando-incubator/kubernetes-on-aws/blob/dev/cluster/manifests/kube-dns/node-local-daemonset.yaml
@maxlaverse Thank you for your note about the typo, for the word usage of “fix” that I replaced with “mitigate”, and for your great initial troubleshooting work!
This is what we use daily, yes.
This is a great find; I was not aware of it at all. Given the wording of the documentation you provided, it would sound like one needs to unload the netfilter conntrack module explicitly to avoid double-tracking, which we haven't done. The information in the Kubernetes documentation regarding ipvs is actually confusing, as it states that nf_conntrack_ipv4 must be loaded.
Thanks for sharing this @Quentin-M.
I believe there is a typo here and you meant does not fix the packet loss.
I’ll use this opportunity to clarify one point 😉
I hope the article about connection timeouts on Kubernetes was not misleading, but this flag
NF_NAT_RANGE_PROTO_RANDOM_FULLY
(or its user-space switch equivalent --random-fully) is not supposed to fix anything at all. It allows starting from a random number when incrementally looking for a free port available for a network translation. It mitigates the issue but doesn't solve it. The conntrack record is still created in one of the first hooks of the POSTROUTING chain and inserted into the conntrack table in one of the last, leading to a race condition.

I have one question that I think was asked a few times across the related GitHub issues on this topic, and I believe the answer could interest people here. You've tested the ipvs back-end of kube-proxy and it didn't solve the issue. Do you know why? The LVS wiki states that the ipvs module uses its own connection-tracking system. Since there is only network address translation to be done, and no filtering or other kind of mangling involved, I had good hope it would work better.

Just joining some dots: https://github.com/weaveworks/weave/issues/3287 https://github.com/kubernetes/kubernetes/issues/45976
and while I'm talking about dots, let me recommend using fully-qualified names where possible, e.g. google.com. (the trailing dot stops the resolver from following the search path, so you don't get lookups for google.com.svc.cluster.local. and so on).

Can you please share the set of scripts/commands you used to generate the above stats?
Also, does someone have a Grafana dashboard template for kube-dns?
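To see why the trailing-dot recommendation above matters, the resolver's search-path expansion can be sketched as a function (a simplification of glibc's ndots logic; the function name is mine):

```shell
#!/usr/bin/env bash
# Simplified sketch of glibc's search-path logic: a relative name with
# fewer dots than ndots is tried against each search domain first.
# A trailing dot makes the name absolute: exactly one query.
expand_queries() {
  local name=$1 ndots=$2; shift 2
  if [[ $name == *. ]]; then
    printf '%s\n' "${name%.}"       # absolute name: single lookup
    return
  fi
  local dots=${name//[^.]/}         # keep only the dots, then count them
  if (( ${#dots} < ndots )); then
    local d
    for d in "$@"; do printf '%s.%s\n' "$name" "$d"; done
  fi
  printf '%s\n' "$name"             # literal name is tried last
}
```

With the typical in-cluster ndots:5, "google.com" triggers one query per search domain (each usually as both A and AAAA) before the literal name, while "google.com." goes straight out.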
@bowei sadly this happens in very small clusters as well for us, ones that have so few containers that there is no feasible way we’d be hitting the QPS limit from AWS
We have the same issue within all of our kops deployed aws clusters (5). We tried moving from weave to flannel to rule out the CNI but the issue is the same. Our kube-dns pods are healthy, one on every host and they have not crashed recently.
Our ARP tables are nowhere near full (usually fewer than 100 entries).