kube-router: Sporadic "Kernel error" when a lot of pods are scheduled

What happened?

When I drain a node and its pods get rescheduled onto another machine, more often than not the node that receives the sudden spike of pods to schedule gets the following in its logs:

E0802 21:30:35.086351       1 service_endpoints_sync.go:73] Error syncing ipvs svc iptables rules to permit traffic to service VIP's: failed to sync ipset: ipset v7.15: Kernel error received: Resource busy                                                                             
E0802 21:33:04.158225       1 network_policy_controller.go:283] Failed to cleanup stale ipsets: failed to delete ipset KUBE-SRC-FRNJANT7TJPCNM5F due to ipset v7.15: Set cannot be destroyed: it is in use by a kernel component                                                          
E0802 21:33:15.078398       1 network_policy_controller.go:259] Aborting sync. Failed to sync network policy chains: %!v(MISSING)failed to perform ipset restore: ipset v7.15: Error in line 693: Kernel error received: Resource busy                               

And when it happens pods lose network connectivity (at least to service IPs).

I think the commands that update these kernel resources should have their own, tighter retry logic - e.g. retry in a loop for up to 10 seconds when the error is "Resource busy".
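
A minimal sketch of the kind of retry I have in mind, assuming the sync boils down to plain `ipset restore` calls (the function name, the 250 ms interval, and the 10-second budget below are illustrative, not kube-router's actual code):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// runIPSetRestore is a hypothetical wrapper around `ipset restore` that
// retries for up to retryBudget when the kernel reports "Resource busy",
// instead of aborting the whole sync and waiting for the next full sync.
func runIPSetRestore(rules string, retryBudget time.Duration) error {
	deadline := time.Now().Add(retryBudget)
	for {
		cmd := exec.Command("ipset", "-exist", "restore")
		cmd.Stdin = strings.NewReader(rules)
		out, err := cmd.CombinedOutput()
		if err == nil {
			return nil
		}
		// Only retry the transient "Resource busy" kernel error; treat
		// anything else (or exceeding the budget) as fatal.
		if !strings.Contains(string(out), "Resource busy") || time.Now().After(deadline) {
			return fmt.Errorf("ipset restore failed: %v: %s", err, out)
		}
		time.Sleep(250 * time.Millisecond)
	}
}

func main() {
	// Example usage with a trivial restore payload (requires root).
	rules := "create demo-set hash:ip\n"
	if err := runIPSetRestore(rules, 10*time.Second); err != nil {
		fmt.Println(err)
	}
}
```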

On the other hand, I don't entirely understand what "Resource busy" actually means in this context. Is it a race condition around updating/removing kernel objects?
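
As a data point, the "it is in use by a kernel component" failure above is easy to reproduce outside kube-router by destroying a set that an iptables rule still references; whether the "Resource busy" errors during restore are the same kind of in-use conflict or a timing race between controllers is exactly what I can't tell. A rough sketch, with made-up set and rule names (requires root):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// run executes a command and prints its combined output so the exact
// ipset/iptables error text is visible.
func run(name string, args ...string) {
	out, err := exec.Command(name, args...).CombinedOutput()
	fmt.Printf("$ %s %s\n%s", name, strings.Join(args, " "), out)
	if err != nil {
		fmt.Printf("(exit: %v)\n", err)
	}
}

func main() {
	// Made-up names, purely for illustration; requires root.
	run("ipset", "create", "demo-src-set", "hash:ip")
	// Reference the set from an iptables rule...
	run("iptables", "-A", "INPUT", "-m", "set", "--match-set", "demo-src-set", "src", "-j", "DROP")
	// ...then try to destroy it while it is still referenced. This fails
	// with "Set cannot be destroyed: it is in use by a kernel component".
	run("ipset", "destroy", "demo-src-set")

	// Cleanup: delete the referencing rule first, then the destroy succeeds.
	run("iptables", "-D", "INPUT", "-m", "set", "--match-set", "demo-src-set", "src", "-j", "DROP")
	run("ipset", "destroy", "demo-src-set")
}
```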

What did you expect to happen? I think it should not happen at all.

How can we reproduce the behavior you experienced? Steps to reproduce the behavior:

  1. Schedule a significant number of pods onto one node at once (e.g. by draining another node)

**Screenshots / Architecture Diagrams / Network Topologies** If applicable, add those here to help explain your problem.

**System Information (please complete the following information):**

  • Kube-Router Version (kube-router --version): 1.5.4
  • Kube-Router Parameters: --run-router=true --run-firewall=true --run-service-proxy=true --bgp-graceful-restart=true --kubeconfig=/var/lib/kube-router/kubeconfig --hairpin-mode=true --metrics-port=8080
  • Kubernetes Version (kubectl version): 1.26.4
  • Cloud Type: bare metal
  • Kubernetes Deployment Type: kubeadm
  • Kube-Router Deployment Type: DaemonSet
  • Cluster Size: ~15 nodes

**Logs, other output, metrics** Please provide logs, other kinds of output, or observed metrics here.

**Additional context** Add any other context about the problem here.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 15 (9 by maintainers)

Most upvoted comments

I have found that I cannot reproduce it on any of my development clusters, only in prod. Hence I will do that the next time a kernel security bulletin is issued (and a new Canonical kernel is released): that usually happens at least every two weeks.

Hmm… So this shouldn't happen, because we have our own mutex surrounding the ipset logic, and all the functions you posted above have the correct locking logic.

However, I can't argue with the logs. Can you please run kube-router with -v 1 as one of the options, re-create the scenario, and post the logs? I'm especially interested in lines like:

  • Attempting to attain ipset mutex lock
  • Attained ipset mutex lock, continuing...
  • Returned ipset mutex lock

and the ordering of those messages from the various controllers around the time of the errors.
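
For reference, the pattern behind those messages is roughly a single shared mutex that serializes ipset mutations across the controllers. A simplified sketch (illustrative only, not the actual kube-router code, though the klog verbosity and message strings match the lines above):

```go
package main

import (
	"sync"

	"k8s.io/klog/v2"
)

// ipsetMutex sketches a process-wide lock that serializes ipset mutations
// across the controllers (network policy, service proxy, ...).
var ipsetMutex sync.Mutex

// withIPSetLock runs fn while holding the shared ipset lock and emits the
// V(1) trace lines quoted above, so lock acquisition order shows up in logs.
func withIPSetLock(fn func() error) error {
	klog.V(1).Info("Attempting to attain ipset mutex lock")
	ipsetMutex.Lock()
	klog.V(1).Info("Attained ipset mutex lock, continuing...")
	defer func() {
		ipsetMutex.Unlock()
		klog.V(1).Info("Returned ipset mutex lock")
	}()
	return fn()
}

func main() {
	_ = withIPSetLock(func() error {
		// ipset restore / destroy calls would happen here.
		return nil
	})
}
```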