kops: arp_cache: neighbor table overflow!
- What kops version are you running? The command kops version will display this information. 1.8
- What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag. 1.8.6
- What cloud provider are you using? aws
I’m seeing a lot of arp_cache neighbor table overflow messages in my production cluster. According to this blog post about large clusters, https://blog.openai.com/scaling-kubernetes-to-2500-nodes/, the solution is to increase the maximum size of the ARP cache table. Can I configure these sysctl options:
net.ipv4.neigh.default.gc_thresh1
net.ipv4.neigh.default.gc_thresh2
net.ipv4.neigh.default.gc_thresh3
using kops?
thanks!
About this issue
- State: closed
- Created 6 years ago
- Comments: 19 (16 by maintainers)
Commits related to this issue
- [Calico] Activate node controller in calico-kube-controllers and add CALICO_K8S_NODE_REF in calico-node, this commit fixes #3224 and #4533 — committed to felipejfc/kops by felipejfc 6 years ago
- [Calico] Activate node controller in calico-kube-controllers and add CALICO_K8S_NODE_REF in calico-node, this commit fixes #3224 and #4533 — committed to vendrov/kops by felipejfc 6 years ago
- [Calico] Activate node controller in calico-kube-controllers and add CALICO_K8S_NODE_REF in calico-node, this commit fixes #3224 and #4533 — committed to rdrgmnzs/kops by felipejfc 6 years ago
No, it shouldn’t have one for every pod, because the nodes themselves are the next hops for traffic, not the individual pod IPs. Instead, you’ll get an ARP entry for each node in the cluster. So a given node’s ARP cache should hold roughly num_pods_on_that_node + num_nodes_in_cluster entries.
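To put numbers on that estimate, here is a rough Python sketch that compares the expected entry count against the three gc_thresh values. The function names are made up for illustration, and the thresholds of 128, 512 and 1024 are the usual Linux defaults, assumed here rather than read from a real node, so check the actual values with sysctl before relying on the result.

```python
# Assumed stock Linux values for net.ipv4.neigh.default.gc_thresh1/2/3.
# A tuned node will differ; these are not read from the running system.
DEFAULT_GC_THRESH = (128, 512, 1024)


def estimate_arp_entries(pods_on_node: int, nodes_in_cluster: int) -> int:
    """Rough neighbor-table size for one node: local pods plus one entry per node."""
    return pods_on_node + nodes_in_cluster


def classify(entries: int, gc_thresh=DEFAULT_GC_THRESH) -> str:
    """Say where an entry count sits relative to the three thresholds."""
    thresh1, thresh2, thresh3 = gc_thresh
    if entries >= thresh3:
        return "hard-limit"   # at or above gc_thresh3: overflow messages are likely
    if entries >= thresh2:
        return "soft-limit"   # at or above gc_thresh2: garbage collection gets aggressive
    if entries >= thresh1:
        return "gc-eligible"  # at or above gc_thresh1: entries may be garbage collected
    return "ok"               # below gc_thresh1: the collector leaves the table alone


if __name__ == "__main__":
    # Hypothetical example: 110 pods on the node, 2500 nodes in the cluster.
    entries = estimate_arp_entries(110, 2500)
    print(entries, classify(entries))
```

With the hypothetical figures in the example (110 pods, 2500 nodes), the estimate is 2610 entries, well past the assumed default gc_thresh3 of 1024, which would be consistent with the overflow messages on a large cluster.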