kubernetes: Pods sharing a PVC on a single node cluster fail to schedule

/kind bug
/sig scheduling

What happened: I run 50 Nginx containers on a single-node cluster via a Deployment with 50 replicas. Those containers have a volumeMount via a PVC backed by a HostPath volume (/data/kubernetes/tokens). Sometimes kube-scheduler leaves some pods in the Pending state while others reach the Running state. The reported error is that the PVC is not bound, yet describing that PVC shows it is bound and used by the Pending pod (and 49 others).

Deleting the pod in question helps (the new copy gets scheduled), and so does restarting the scheduler container, which leads to the conclusion that the bug is in the scheduler itself.

What you expected to happen: All 50 containers always schedule successfully.

How to reproduce it (as minimally and precisely as possible): EDIT: Reproduced with minikube: https://github.com/kubernetes/kubernetes/issues/73216#issuecomment-457204530

Nginx deployment.yaml used: https://gist.github.com/tuminoid/56f939e5a06a16be72bbd155ecc75b2f
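
For readers who cannot open the gist, here is a rough sketch of the setup described under "What happened". It is not the manifest from the gist: the object names (`tokens-pv`, `tokens-pvc`, `nginx`), the `manual` storage class, the 1Gi size, the access mode, the image tag, and the in-container mount path are all assumptions made for illustration. Only the HostPath location and the replica count come from the report.

```yaml
# Hypothetical reconstruction; names, sizes and class are assumed, not taken from the gist.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: tokens-pv                  # assumed name
spec:
  capacity:
    storage: 1Gi                   # assumed size
  accessModes:
    - ReadWriteOnce                # assumed; all pods land on the same node
  storageClassName: manual         # assumed; ties the PV and PVC together
  hostPath:
    path: /data/kubernetes/tokens  # path given in the report
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tokens-pvc                 # assumed name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx                      # assumed name
spec:
  replicas: 50                     # replica count given in the report
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx             # assumed image tag
          volumeMounts:
            - name: tokens
              mountPath: /tokens   # assumed mount path inside the container
      volumes:
        - name: tokens
          persistentVolumeClaim:
            claimName: tokens-pvc  # every replica shares this one claim
```

The point of the sketch is that all 50 replicas reference the same claim, so the scheduler evaluates the same PVC binding for every pod it places.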

Anything else we need to know?: Logs, describes etc: https://gist.github.com/tuminoid/5e6367070eb95d45c0d59550b888cd0e

Sometimes all pods come up, sometimes only some; in this example, scheduling got stuck after only one pod had been scheduled. Getting stuck after one is a very special case, but it may underline that the issue is somehow random/racy.

Maybe relevant issues/PRs? #72045, #71551

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.4", GitCommit:"f49fa022dbe63faafd0da106ef7e05a29721d3f1", GitTreeState:"clean", BuildDate:"2018-12-14T07:10:00Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.4", GitCommit:"f49fa022dbe63faafd0da106ef7e05a29721d3f1", GitTreeState:"clean", BuildDate:"2018-12-14T06:59:37Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

Reproduces on 1.13.1+, 1.12.4+ and 1.14.0-alpha1.

  • Cloud provider or hardware configuration: Minikube / self-rolled VM
  • OS (e.g. from /etc/os-release): Minikube / RHEL 7.6
  • Kernel (e.g. uname -a): Linux kubesingle 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 15 17:36:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: Minikube (kubeadm) / manual
  • Others:

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 16 (10 by maintainers)

Most upvoted comments

EDIT: Reproduced it on 1.12.5; it just took longer. It seems #72558 needs to be cherry-picked to 1.12 too. Ignore the old comment.