kubernetes: Pods stuck in Pending status - kube-scheduler 1.19.13

What happened:

After upgrading the core k8s components from 1.18 to 1.19.13, we hit a problem with Pending pods. From time to time, pods (from different deployments) get stuck in the Pending status. In the kube-scheduler logs we see a new event that started appearing after the upgrade, for example: Pod kube-system/prometheus-thanos-frontend-86f4f8df77-h52tl **doesn't exist in informer cache**: pod "prometheus-thanos-frontend-86f4f8df77-h52tl" not found. Meanwhile, ClusterAutoscaler still checks whether there is an appropriate node for the pod: Pod kube-system.prometheus-thanos-frontend-86f4f8df77-h52tl marked as unschedulable can be scheduled on node ***.eu-west-1.compute.internal (based on hinting). Ignoring in scale up.

It looks like kube-scheduler simply forgot about the pod: after a couple of attempts it stops trying to schedule it, and we see no logs/events for hours. On the other hand, ClusterAutoscaler keeps trying to schedule it and even spins up new nodes.
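For anyone hitting the same symptom, a quick way to spot affected pods and the log event described above might look like this (a sketch; the `component=kube-scheduler` label selector is an assumption and depends on how kube-scheduler is deployed in your cluster):

```shell
# List pods stuck in Pending across all namespaces
kubectl get pods --all-namespaces --field-selector=status.phase=Pending

# Search the scheduler logs for the "informer cache" event;
# adjust the label selector (or use the static pod name) for your setup
kubectl -n kube-system logs -l component=kube-scheduler --tail=1000 \
  | grep "doesn't exist in informer cache"
```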

What you expected to happen:

Pending pods are assigned to nodes.

How to reproduce it (as minimally and precisely as possible):

We don’t see any pattern; it happens from time to time with completely different deployments.

Anything else we need to know?:

If we restart kube-scheduler so that the leader changes, the Pending pods are successfully scheduled.
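For reference, one way to force a leader change is roughly the following (a sketch; it assumes kube-scheduler runs as labelled pods in kube-system with lease-based leader election, which may differ in your deployment):

```shell
# Inspect the current scheduler leader (if leader election uses a Lease;
# older setups may record the leader on an Endpoints annotation instead)
kubectl -n kube-system get lease kube-scheduler -o yaml

# Restart the scheduler pods so a new leader is elected; the new leader
# starts with a fresh informer cache and picks up the Pending pods
kubectl -n kube-system delete pod -l component=kube-scheduler
```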

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.0", GitCommit:"af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38", GitTreeState:"clean", BuildDate:"2020-12-08T17:59:43Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.13", GitCommit:"53c7b65d4531a749cd3a7004c5212d23daa044a9", GitTreeState:"clean", BuildDate:"2021-07-15T20:53:19Z", GoVersion:"go1.15.14", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: AWS
  • OS (e.g: cat /etc/os-release): CentOS Linux 7 (Core)
  • Kernel (e.g. uname -a): Linux ***.eu-west-1.compute.internal 3.10.0-1160.25.1.el7.x86_64 #1 SMP Wed Apr 28 21:49:45 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others: ClusterAutoscaler 1.19.1, kube-scheduler 1.19.13

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 40 (34 by maintainers)

Most upvoted comments

I’m on the same team as OP. The cherry-pick fixed it. Thank you all for the quick resolution on a very tight deadline.

The fix is now in 1.19.15 😃

FYI, we are a bit tight on the deadline for 1.19 patch releases. I’m asking sig-release if we can squeeze this cherry-pick.

In theory, the fix for 1.19 could have been limited to changing only the line above to use the independent pod informer.

Yeah, I considered that, but it’s also theoretically riskier, as we don’t have soak time for such a change. See https://github.com/kubernetes/kubernetes/pull/105015#issuecomment-919574618 for context.