kubernetes: after re-tainting a node, controller manager cannot delete pod
What happened?
One StatefulSet pod, "statefulsettestpod-0", is running on node2. The pod's tolerations are:
```yaml
tolerations:
- effect: NoExecute
  key: node.kubernetes.io/not-ready
  operator: Exists
  tolerationSeconds: 30
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 30
- effect: NoExecute
  key: node.kubernetes.io/unschedulable
  operator: Exists
  tolerationSeconds: 30
```
I tainted the node with the following commands:
```shell
kubectl cordon node2
kubectl taint node node2 node.kubernetes.io/unschedulable:NoExecute
kubectl taint node node2 node.kubernetes.io/unschedulable:NoSchedule
```
The pod "statefulsettestpod-0" was deleted by the controller-manager (taint_manager). Then I removed the taints with the following commands:
```shell
kubectl uncordon node2
kubectl taint node node2 node.kubernetes.io/unschedulable-
```
The pod "statefulsettestpod-0" is then re-created on node2. If I quickly taint the node again (within one minute), the StatefulSet pod is not deleted by the controller-manager, and it keeps running on the tainted node forever.
When I checked the controller-manager log, I found this message:
```
W0102 07:42:44.106398 1 timed_workers.go:115] Trying to add already existing work for &{NamespacedName:67a41957-f57d-4156-ae53-dc796a3c66f2/statefulsettestpod-0}. Skipping.
```
Looking at the taint_manager code, when the taint_manager deletes a pod successfully, the pod's key is not removed from the worker queue:
```go
func (q *TimedWorkerQueue) getWrappedWorkerFunc(key string) func(args *WorkArgs) error {
	return func(args *WorkArgs) error {
		err := q.workFunc(args)
		q.Lock()
		defer q.Unlock()
		if err == nil {
			// To avoid duplicated calls we keep the key in the queue, to prevent
			// subsequent additions.
			q.workers[key] = nil
		} else {
			delete(q.workers, key)
		}
		return err
	}
}
```
What did you expect to happen?
The pod should be deleted after the node is tainted again.
How can we reproduce it (as minimally and precisely as possible)?
1. Taint the node.
2. Remove the taint from the node.
3. Taint the node again quickly (within one minute).
Anything else we need to know?
No response
Kubernetes version
1.20.5
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, …) and versions (if applicable)
About this issue
- Original URL
- State: open
- Created 2 years ago
- Comments: 19 (16 by maintainers)
@jlsong01 Thanks for pinging. Sorry for letting this fall off my radar. Please go ahead and send a PR. I will be more than happy to review it.