kube-router: Sporadic "Kernel error" when a lot of pods are scheduled

What happened?

When I drain a node and its pods get rescheduled onto another machine, more often than not the node that receives the sudden spike of pods to schedule gets the following in its logs:

E0802 21:30:35.086351       1 service_endpoints_sync.go:73] Error syncing ipvs svc iptables rules to permit traffic to service VIP's: failed to sync ipset: ipset v7.15: Kernel error received: Resource busy                                                                             
E0802 21:33:04.158225       1 network_policy_controller.go:283] Failed to cleanup stale ipsets: failed to delete ipset KUBE-SRC-FRNJANT7TJPCNM5F due to ipset v7.15: Set cannot be destroyed: it is in use by a kernel component                                                          
E0802 21:33:15.078398       1 network_policy_controller.go:259] Aborting sync. Failed to sync network policy chains: %!v(MISSING)failed to perform ipset restore: ipset v7.15: Error in line 693: Kernel error received: Resource busy                               

And when it happens pods lose network connectivity (at least to service IPs).

I think the commands that update these kernel resources should have their own, tighter retry logic - e.g. retry in a loop for up to 10 seconds when the error is "Resource busy".
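
A minimal sketch of the kind of retry I have in mind, assuming the sync boils down to plain `ipset restore` calls (the function name, the 250 ms interval, and the 10-second budget below are illustrative, not kube-router's actual code):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// runIPSetRestore is a hypothetical wrapper around `ipset restore` that
// retries for up to retryBudget when the kernel reports "Resource busy",
// instead of aborting the whole sync and waiting for the next full sync.
func runIPSetRestore(rules string, retryBudget time.Duration) error {
	deadline := time.Now().Add(retryBudget)
	for {
		cmd := exec.Command("ipset", "-exist", "restore")
		cmd.Stdin = strings.NewReader(rules)
		out, err := cmd.CombinedOutput()
		if err == nil {
			return nil
		}
		// Only retry the transient "Resource busy" kernel error; treat
		// anything else (or exceeding the budget) as fatal.
		if !strings.Contains(string(out), "Resource busy") || time.Now().After(deadline) {
			return fmt.Errorf("ipset restore failed: %v: %s", err, out)
		}
		time.Sleep(250 * time.Millisecond)
	}
}

func main() {
	// Example usage with a trivial restore payload (requires root).
	rules := "create demo-set hash:ip\n"
	if err := runIPSetRestore(rules, 10*time.Second); err != nil {
		fmt.Println(err)
	}
}
```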

On the other hand, I don't entirely understand what "Resource busy" actually means in this context. Is it a race condition around updating/removing kernel objects?
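
As a data point, the "it is in use by a kernel component" failure above is easy to reproduce outside kube-router by destroying a set that an iptables rule still references; whether the "Resource busy" errors during restore are the same kind of in-use conflict or a timing race between controllers is exactly what I can't tell. A rough sketch, with made-up set and rule names (requires root):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// run executes a command and prints its combined output so the exact
// ipset/iptables error text is visible.
func run(name string, args ...string) {
	out, err := exec.Command(name, args...).CombinedOutput()
	fmt.Printf("$ %s %s\n%s", name, strings.Join(args, " "), out)
	if err != nil {
		fmt.Printf("(exit: %v)\n", err)
	}
}

func main() {
	// Made-up names, purely for illustration; requires root.
	run("ipset", "create", "demo-src-set", "hash:ip")
	// Reference the set from an iptables rule...
	run("iptables", "-A", "INPUT", "-m", "set", "--match-set", "demo-src-set", "src", "-j", "DROP")
	// ...then try to destroy it while it is still referenced. This fails
	// with "Set cannot be destroyed: it is in use by a kernel component".
	run("ipset", "destroy", "demo-src-set")

	// Cleanup: delete the referencing rule first, then the destroy succeeds.
	run("iptables", "-D", "INPUT", "-m", "set", "--match-set", "demo-src-set", "src", "-j", "DROP")
	run("ipset", "destroy", "demo-src-set")
}
```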

What did you expect to happen? I think it should not happen at all.

How can we reproduce the behavior you experienced? Steps to reproduce the behavior:

  1. Schedule a significant number of pods onto one node at once (e.g. by draining another node)

**Screenshots / Architecture Diagrams / Network Topologies** If applicable, add those here to help explain your problem.

**System Information (please complete the following information):**

  • Kube-Router Version (kube-router --version): 1.5.4
  • Kube-Router Parameters: --run-router=true --run-firewall=true --run-service-proxy=true --bgp-graceful-restart=true --kubeconfig=/var/lib/kube-router/kubeconfig --hairpin-mode=true --metrics-port=8080
  • Kubernetes Version (kubectl version): 1.26.4
  • Cloud Type: bare metal
  • Kubernetes Deployment Type: kubeadm
  • Kube-Router Deployment Type: DaemonSet
  • Cluster Size: ~15 nodes

**Logs, other output, metrics** Please provide logs, other kinds of output, or observed metrics here.

**Additional context** Add any other context about the problem here.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 15 (9 by maintainers)

Most upvoted comments

I have found that I cannot reproduce it on any of my development clusters, only in prod. Hence I will do that the next time a kernel security bulletin is issued (and a new Canonical kernel is released): that usually happens at least every two weeks.

Hmm… So this shouldn't happen, because we have our own mutex surrounding the ipset logic, and all the functions you posted above have the correct locking logic.

However, I can't argue with the logs. Can you please run kube-router with -v 1 as one of the options, re-create the scenario, and post the logs? I'm especially interested in lines like:

  • Attempting to attain ipset mutex lock
  • Attained ipset mutex lock, continuing...
  • Returned ipset mutex lock

and the ordering of those messages from the various controllers around the time of the errors.
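
For reference, the pattern behind those messages is roughly a single shared mutex that serializes ipset mutations across the controllers. A simplified sketch (illustrative only, not the actual kube-router code, though the klog verbosity and message strings match the lines above):

```go
package main

import (
	"sync"

	"k8s.io/klog/v2"
)

// ipsetMutex sketches a process-wide lock that serializes ipset mutations
// across the controllers (network policy, service proxy, ...).
var ipsetMutex sync.Mutex

// withIPSetLock runs fn while holding the shared ipset lock and emits the
// V(1) trace lines quoted above, so lock acquisition order shows up in logs.
func withIPSetLock(fn func() error) error {
	klog.V(1).Info("Attempting to attain ipset mutex lock")
	ipsetMutex.Lock()
	klog.V(1).Info("Attained ipset mutex lock, continuing...")
	defer func() {
		ipsetMutex.Unlock()
		klog.V(1).Info("Returned ipset mutex lock")
	}()
	return fn()
}

func main() {
	_ = withIPSetLock(func() error {
		// ipset restore / destroy calls would happen here.
		return nil
	})
}
```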