kubernetes: Missing taint: node.kubernetes.io/unreachable:NoExecute when nodes enter NotReady state

What happened:

When worker nodes enter the NotReady state, the taint node.kubernetes.io/unreachable:NoExecute is not added to the node. Only the node.kubernetes.io/unreachable:NoSchedule taint is added. As a result, pods running on those workers remain in the Running state and are never rescheduled.

This is similar to #97100, but under different conditions. We discovered it while testing the patch in #98168. That patch works only if you shut down the worker nodes; it does not work if you shut down the masters too.

$ kubectl get nodes
NAME                STATUS     ROLES                  AGE   VERSION
k8s-caas-infra01    NotReady   infra                  72d   v1.20.6
k8s-caas-infra02    NotReady   infra                  72d   v1.20.6
k8s-caas-infra03    NotReady   infra                  72d   v1.20.6
k8s-caas-master01   Ready      control-plane,master   72d   v1.20.6
k8s-caas-master02   Ready      control-plane,master   72d   v1.20.6
k8s-caas-master03   Ready      control-plane,master   72d   v1.20.6
k8s-caas-worker01   NotReady   worker                 72d   v1.20.6
k8s-caas-worker02   NotReady   worker                 72d   v1.20.6
$ kubectl -n auth-system get pods -o wide | grep infra
keycloak-0                                    1/1     Running   0          98m     k8s-caas-infra01   <none>           <none>
keycloak-keycloak-operator-654ff77bf5-hv6c9   1/1     Running   0          107m   k8s-caas-infra03   <none>           <none>
keycloak-postgresql-7654cccbb7-8s6fr          1/1     Running   0          107m   k8s-caas-infra03   <none>           <none>
$ kubectl describe node k8s-caas-infra03
Name:               k8s-caas-infra03
Roles:              infra
Labels:             beta.kubernetes.io/arch=amd64
Annotations:        csi.volume.kubernetes.io/nodeid: {"rbd.csi.ceph.com":"k8s-caas-infra03"}
                    kubeadm.alpha.kubernetes.io/cri-socket: /run/containerd/containerd.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 17 Feb 2021 12:26:12 +0100
Taints:             node.kubernetes.io/unreachable:NoSchedule
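
To spot the affected nodes without reading each `kubectl describe` output, a check along these lines might help. It is only a sketch: it assumes the node list has been exported with `kubectl get nodes -o json` and parsed into a dict, and the function name `nodes_missing_noexecute` is hypothetical. It treats a node as unreachable when its Ready condition is Unknown, which is the state that normally goes with the node.kubernetes.io/unreachable taint.

```python
from typing import List

UNREACHABLE_KEY = "node.kubernetes.io/unreachable"


def nodes_missing_noexecute(node_list: dict) -> List[str]:
    """Return names of unreachable nodes that lack the unreachable:NoExecute taint.

    `node_list` is assumed to be the parsed output of `kubectl get nodes -o json`.
    """
    missing = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        # A node the control plane has lost contact with reports Ready=Unknown.
        if ready is None or ready.get("status") != "Unknown":
            continue
        # spec.taints is absent (or null) when the node has no taints at all.
        taints = node.get("spec", {}).get("taints") or []
        has_noexecute = any(
            t.get("key") == UNREACHABLE_KEY and t.get("effect") == "NoExecute"
            for t in taints
        )
        if not has_noexecute:
            missing.append(node["metadata"]["name"])
    return missing
```

Feeding it the result of `json.loads` on the exported node list would, in the situation shown above, be expected to report nodes such as k8s-caas-infra03, which carries only the NoSchedule taint.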

What you expected to happen:

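We expect the NotReady nodes to carry both the node.kubernetes.io/unreachable:NoExecute and the node.kubernetes.io/unreachable:NoSchedule taints, as per the documentation.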

How to reproduce it (as minimally and precisely as possible):

This issue was observed during an operational test on an on-prem Kubernetes v1.20.6 cluster. We have a high-availability control plane with 3 masters and a few workers distributed across two stretched data centers. To exercise the DR capability, we switched off all the VMs, both masters and workers.

  1. shut down all the masters and workers
  2. restart only the masters
  3. after the masters become Ready, restart a couple of workers
  4. at this point we expect the system (after the eviction time of 300s) to “move” pods from the workers that are NotReady to the workers that are Ready
  5. check taints on the remaining workers that are still NotReady
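
Why the missing taint matters for step 4 can be sketched with a simplified model of NoExecute handling. This is not the controller's actual code; the function names are hypothetical, and the model assumes the pods carry the tolerations Kubernetes normally adds by default (node.kubernetes.io/not-ready and node.kubernetes.io/unreachable, effect NoExecute, tolerationSeconds 300), which is where the 300s eviction time comes from.

```python
from typing import List, Optional


def tolerates(toleration: dict, taint: dict) -> bool:
    """Simplified toleration/taint matching (effect, key, operator, value)."""
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False
    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")
    if not key:
        # An empty key only matches everything when the operator is Exists.
        return operator == "Exists"
    if key != taint["key"]:
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def eviction_delay(taints: List[dict], tolerations: List[dict]) -> Optional[int]:
    """Seconds until eviction, 0 for immediate, None if the pod is never evicted."""
    delays = []
    for taint in taints:
        if taint["effect"] != "NoExecute":
            continue  # NoSchedule only blocks new pods; running pods stay put
        match = next((t for t in tolerations if tolerates(t, taint)), None)
        if match is None:
            return 0  # untolerated NoExecute taint: evict right away
        if match.get("tolerationSeconds") is not None:
            delays.append(match["tolerationSeconds"])
    return min(delays) if delays else None


# Default tolerations assumed to be present on the pods.
default_tolerations = [
    {"key": "node.kubernetes.io/not-ready", "operator": "Exists",
     "effect": "NoExecute", "tolerationSeconds": 300},
    {"key": "node.kubernetes.io/unreachable", "operator": "Exists",
     "effect": "NoExecute", "tolerationSeconds": 300},
]
observed = [{"key": "node.kubernetes.io/unreachable", "effect": "NoSchedule"}]
expected = observed + [{"key": "node.kubernetes.io/unreachable", "effect": "NoExecute"}]

print(eviction_delay(observed, default_tolerations))  # None: pods are never evicted
print(eviction_delay(expected, default_tolerations))  # 300
```

Under this model, the taints observed in step 5 never trigger eviction, which matches the pods staying in Running state on the NotReady workers.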

Anything else we need to know?:


  • Kubernetes version (use kubectl version): v1.20.6
  • Cloud provider or hardware configuration: on-premises running on RHEV
  • OS (e.g: cat /etc/os-release): Ubuntu 20.04.2 LTS (Focal Fossa)
  • Kernel (e.g. uname -a): 5.4.0-72-generic
  • Install tools: kubeadm
  • Network plugin and version (if this is a network-related bug): calico
  • Others:

I think you have to elaborate on this scenario more for us to reproduce it.

In step 1 I simply powered off all the node VMs abruptly; it was not a graceful shutdown. This is because the initial intent of the test was a DR simulation. Hope this helps.