kubernetes: Missing taint: node.kubernetes.io/unreachable:NoExecute when nodes enter NotReady state
What happened:
When the worker nodes enter a NotReady state, the taint node.kubernetes.io/unreachable:NoExecute
is not added to the node. Only the node.kubernetes.io/unreachable:NoSchedule
taint is added. The effect is that pods running on such workers remain in the Running
state and are never rescheduled.
This is similar to #97100, but under different conditions. We actually discovered it while testing patch #98168, which works only if you shut down worker nodes; it does not work if you also shut down the masters.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-caas-infra01 NotReady infra 72d v1.20.6
k8s-caas-infra02 NotReady infra 72d v1.20.6
k8s-caas-infra03 NotReady infra 72d v1.20.6
k8s-caas-master01 Ready control-plane,master 72d v1.20.6
k8s-caas-master02 Ready control-plane,master 72d v1.20.6
k8s-caas-master03 Ready control-plane,master 72d v1.20.6
k8s-caas-worker01 NotReady worker 72d v1.20.6
k8s-caas-worker02 NotReady worker 72d v1.20.6
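For reference, the taints on every node can be listed at a glance with custom columns; a quick check like the following (a sketch, output formatting may vary slightly by kubectl version) makes it easy to see whether each NotReady node is missing the NoExecute taint:
$ kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key,EFFECTS:.spec.taints[*].effect'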
$ kubectl -n auth-system get pods -o wide | grep infra
keycloak-0 1/1 Running 0 98m 10.38.70.92 k8s-caas-infra01 <none> <none>
keycloak-keycloak-operator-654ff77bf5-hv6c9 1/1 Running 0 107m 10.38.136.220 k8s-caas-infra03 <none> <none>
keycloak-postgresql-7654cccbb7-8s6fr 1/1 Running 0 107m 10.38.136.232 k8s-caas-infra03 <none> <none>
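For context, pods normally carry default tolerations for node.kubernetes.io/unreachable:NoExecute and node.kubernetes.io/not-ready:NoExecute with tolerationSeconds: 300 (added by the DefaultTolerationSeconds admission plugin), so eviction only starts once a NoExecute taint is actually placed on the node; without it, the 300s countdown never begins. The tolerations on one of the stuck pods can be inspected like this (a sketch, pod name taken from the listing above):
$ kubectl -n auth-system get pod keycloak-0 -o jsonpath='{.spec.tolerations}'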
$ kubectl describe node k8s-caas-infra03
Name: k8s-caas-infra03
Roles: infra
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-caas-infra03
kubernetes.io/os=linux
node-role.kubernetes.io/infra=
Annotations: csi.volume.kubernetes.io/nodeid: {"rbd.csi.ceph.com":"k8s-caas-infra03"}
kubeadm.alpha.kubernetes.io/cri-socket: /run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 10.9.5.68/24
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 17 Feb 2021 12:26:12 +0100
Taints: node.kubernetes.io/unreachable:NoSchedule
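As a manual check (not a fix for the controller behavior), the missing taint can be added by hand to confirm that eviction then proceeds as expected; something like the following, assuming the node shown above:
$ kubectl taint nodes k8s-caas-infra03 node.kubernetes.io/unreachable:NoExecute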
What you expected to happen:
We expect the following taints, as per the documentation:
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unreachable:NoSchedule
node.kubernetes.io/not-ready:NoExecute
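These taints are driven by the node's Ready condition: when the kubelet stops reporting, the condition goes to Unknown and the node lifecycle controller is expected to add both the NoSchedule and NoExecute unreachable taints. A quick way to dump the conditions of an affected node (a sketch):
$ kubectl get node k8s-caas-infra03 -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'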
How to reproduce it (as minimally and precisely as possible):
This issue was observed during an operational test on an on-prem Kubernetes v1.20.6 cluster. We have a high-availability control plane with 3 masters and a few workers distributed across two stretched data centers. To exercise the DR capability, we switched off all the VMs, both masters and workers.
- Shut down all the masters and workers.
- Restart only the masters.
- After the masters become Ready, restart a couple of workers.
- At this point we expect the system (after the eviction time of 300s) to "move" pods from the workers that are NotReady to the workers that are Ready (see the check sketched after this list).
- Check the taints on the remaining workers that are still NotReady.
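If it helps triage, the node lifecycle controller settings can be inspected on a kubeadm cluster (assumed here from the cri-socket annotation above) by dumping the kube-controller-manager flags; options such as --node-monitor-grace-period control how quickly NotReady/unreachable nodes get tainted. A sketch:
$ kubectl -n kube-system get pods -l component=kube-controller-manager -o jsonpath='{.items[0].spec.containers[0].command}'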
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version): v1.20.6
- Cloud provider or hardware configuration: on-premises running on RHEV
- OS (e.g. cat /etc/os-release): Ubuntu 20.04.2 LTS (Focal Fossa)
- Kernel (e.g. uname -a): 5.4.0-72-generic
- Install tools: kubeadm
- Network plugin and version (if this is a network-related bug): calico
- Others:
About this issue
- State: closed
- Created 3 years ago
- Reactions: 3
- Comments: 25 (10 by maintainers)
@k8s-triage-robot: Closing this issue.
I simply switched off all the nodes in step 1 (brutally powering off the VMs), not a graceful shutdown. This was because the initial intent of the test was a DR simulation. Hope this helps.