kubernetes: Update to tolerationSeconds doesn't work after tainting node
What happened?
Found while experimenting with tolerations and taints.
You can still update tolerationSeconds on a pod after a taint with effect NoExecute has been added to its node. If you describe the pod, it shows the updated tolerationSeconds under Tolerations, but the change has no effect: the pod is still evicted after the original tolerationSeconds has elapsed.
What did you expect to happen?
After changing tolerationSeconds on a pod running on a tainted node, the pod should be evicted earlier or later according to the updated tolerationSeconds.
How can we reproduce it (as minimally and precisely as possible)?
- Start a local-up cluster, or have a normal k8s cluster
- Create a pod with tolerations and make sure the pod is running. The following is an example:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
  tolerations:
  - key: "CriticalAddonsOnly"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 100
- Taint the node where the pod is running
kubectl taint node <node_name> CriticalAddonsOnly:NoExecute
- Patch the pod with a smaller or larger tolerationSeconds
kubectl patch pod nginx --patch '{"spec": {"tolerations": [{"effect": "NoExecute", "key": "CriticalAddonsOnly", "operator": "Exists", "tolerationSeconds": 10}]}}'
- Describe the pod to check that tolerationSeconds has been updated, then watch the pod to see whether the updated tolerationSeconds takes effect. You will see that the pod keeps running and is evicted after 100s instead of 10s.
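The timing in the steps above can be measured rather than eyeballed. The following is a minimal sketch, not part of the original report: the helper names closest_match and measure_eviction, the 300s timeout, and passing the node name as the first argument are all assumptions made for illustration. It also assumes the pod from the manifest above is already running. It reuses the taint and patch commands from the steps and reports whether the observed time is closer to the original or the updated tolerationSeconds.

```shell
#!/bin/sh
# Sketch only: helper names, the 300s timeout, and the argument handling are
# illustrative assumptions. The pod "nginx" must already be running.

ORIGINAL_SECONDS=100   # tolerationSeconds in the pod manifest
UPDATED_SECONDS=10     # tolerationSeconds applied by the patch

# closest_match ELAPSED A B: print whichever of A or B is nearer to ELAPSED.
closest_match() {
  cm_da=$(( $1 - $2 ))
  cm_db=$(( $1 - $3 ))
  if [ "$cm_da" -lt 0 ]; then cm_da=$(( 0 - cm_da )); fi
  if [ "$cm_db" -lt 0 ]; then cm_db=$(( 0 - cm_db )); fi
  if [ "$cm_da" -le "$cm_db" ]; then echo "$2"; else echo "$3"; fi
}

# measure_eviction NODE: taint the node, patch the pod, time its removal.
measure_eviction() {
  kubectl taint node "$1" CriticalAddonsOnly:NoExecute
  me_start=$(date +%s)
  kubectl patch pod nginx --patch "{\"spec\": {\"tolerations\": [{\"effect\": \"NoExecute\", \"key\": \"CriticalAddonsOnly\", \"operator\": \"Exists\", \"tolerationSeconds\": ${UPDATED_SECONDS}}]}}"
  # Block until the pod object is gone; this includes container shutdown time.
  if ! kubectl wait --for=delete pod/nginx --timeout=300s; then
    echo "pod still present after 300s"
    return 1
  fi
  me_elapsed=$(( $(date +%s) - me_start ))
  echo "pod gone after ${me_elapsed}s; closest to tolerationSeconds=$(closest_match "$me_elapsed" "$ORIGINAL_SECONDS" "$UPDATED_SECONDS")"
}

# Only touch the cluster when a node name is given on the command line.
if [ "$#" -ge 1 ]; then
  measure_eviction "$1"
fi
```

Saved under a name of your choice and run with the node name as its only argument, the script would be expected to report a value close to 100 on an affected cluster and close to 10 if the update were honored. The measured time includes the pod's termination period, so treat it as approximate.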
Anything else we need to know?
In the tests, there is an assumption that the toleration duration can't be extended, but I couldn't find any docs or commits that explain why; it seems no one has touched it since it was added to Kubernetes. Users will be confused by the mismatch between the updated tolerationSeconds shown in the pod spec and the actual eviction time.
Kubernetes version
master
Cloud provider
OS version
# On Linux:
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
$ uname -a
Linux flexbp 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, …) and versions (if applicable)
About this issue
- State: closed
- Created 2 years ago
- Comments: 15 (10 by maintainers)
Hi @Abirdcfly, I'm currently researching goroutine leaks in these related packages, so I may be able to help solve this problem. Sorry to disrupt your research.