kubernetes: Race condition between kubelet and scheduler

What happened:

The pod passed the scheduler (it left the Pending phase) but failed to start because the kubelet rejected it with the error:

Mar 07 01:37:03 n1-standard-2-ubuntu-gke-1804-1-16-v20200330-317d5885 kubelet[7842]: I0307 01:37:03.668354    7842 predicate.go:143] Predicate failed on Pod: hugepages-q9b44_hugepages-test-5981(a32177e0-ea6e-49f5-8f25-a946530f34f4), for reason: Node didn't have enough resource: hugepages-2Mi, requested: 6291456, used: 6291456, capacity: 10485760

This means that the scheduler and the kubelet have different views of the requested resources.

What you expected to happen:

I expect that if the scheduler admitted the pod and the pod started, it should not subsequently be rejected by the kubelet because of insufficient resources.

How to reproduce it (as minimally and precisely as possible):

It failed in CI: https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-serial-hugepages/1368372352512954368

Steps (a rough command-level sketch follows the list):

  • configure hugepages on the node (five 2Mi hugepages)
  • start a pod that requests 6Mi of 2Mi hugepages
  • delete the namespace in which the pod was started
  • restart the kubelet
  • start an additional pod that requests 6Mi of 2Mi hugepages
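
A rough command-level sketch of the steps above (the namespace, pod name, and image are placeholders; the CI job's actual manifests may differ):

# 1. Pre-allocate five 2Mi hugepages on the node (2Mi is the default hugepage
#    size on x86_64); the kubelet usually has to be restarted afterwards so the
#    node reports the new capacity -- assumed here.
echo 5 | sudo tee /proc/sys/vm/nr_hugepages
sudo systemctl restart kubelet

# 2. Start a pod that requests 6Mi of hugepages-2Mi (hugepage limits must equal
#    requests, so only limits are set and the requests default to them).
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Namespace
metadata:
  name: hugepages-test
---
apiVersion: v1
kind: Pod
metadata:
  name: hugepages-pod-1
  namespace: hugepages-test
spec:
  containers:
  - name: app
    image: k8s.gcr.io/pause:3.2
    resources:
      limits:
        hugepages-2Mi: 6Mi
        memory: 64Mi
EOF

# 3. Delete the namespace the pod runs in, then restart the kubelet.
kubectl delete namespace hugepages-test
sudo systemctl restart kubelet

# 4. Start another pod with the same hugepages-2Mi request (same manifest,
#    different name/namespace); in roughly one run out of ten the kubelet
#    rejects it with OutOfhugepages-2Mi even though the scheduler admitted it.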

Anything else we need to know?:

It happens rarely, in roughly 10% of runs.

The problem is that the scheduler and the kubelet use different mechanisms and data structures to track the resource requests of active pods.
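
To compare the two sides from outside (the node name below is a placeholder): the scheduler works from the pod requests and node capacity/allocatable published through the API server, while the kubelet's own accounting only shows up in its log, as in the predicate failure above. The API-server-side view can be dumped like this:

# Capacity, allocatable, and allocated hugepages as the API server reports them
# -- the data the scheduler bases its decision on.
kubectl describe node <node-name> | grep -i hugepages

# Pods currently bound to the node; the kubelet computes its own "used" figure
# from its list of active pods, which is where the two views can drift apart.
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name>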

Environment:

  • Kubernetes version (use kubectl version): master
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16 (12 by maintainers)

Most upvoted comments

@cynepco3hahue I’ve noticed a lot of similar error messages in the kubelet logs of successful tests. Here is an example taken from the latest available test run (https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-serial-hugepages/1397069286773624832/artifacts/n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae/kubelet.log):

$ wget https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-serial-hugepages/1397069286773624832/artifacts/n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae/kubelet.log -q -O- | grep enough
May 25 06:06:15 n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae kubelet[4610]: I0525 06:06:15.486675    4610 predicate.go:144] "Predicate failed on Pod" pod="hugepages-test-3636/hugepages-fzbc2" err="Node didn't have enough resource: hugepages-1Gi, requested: 1073741824, used: 0, capacity: 0"
May 25 06:06:15 n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae kubelet[4610]: I0525 06:06:15.487002    4610 event.go:291] "Event occurred" object="hugepages-test-3636/hugepages-fzbc2" kind="Pod" apiVersion="v1" type="Warning" reason="OutOfhugepages-1Gi" message="Node didn't have enough resource: hugepages-1Gi, requested: 1073741824, used: 0, capacity: 0"
May 25 06:06:15 n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae kubelet[4610]: I0525 06:06:15.499191    4610 status_manager.go:595] "Status for pod updated successfully" pod="hugepages-test-3636/hugepages-fzbc2" statusVersion=1 status={Phase:Failed Conditions:[] Message:Pod Node didn't have enough resource: hugepages-1Gi, requested: 1073741824, used: 0, capacity: 0 Reason:OutOfhugepages-1Gi NominatedNodeName: HostIP: PodIP: PodIPs:[] StartTime:2021-05-25 06:06:00 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[] QOSClass: EphemeralContainerStatuses:[]}
May 25 06:06:38 n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae kubelet[5456]: I0525 06:06:38.554059    5456 predicate.go:144] "Predicate failed on Pod" pod="hugepages-test-3328/hugepages-2rvtg" err="Node didn't have enough resource: hugepages-2Mi, requested: 6291456, used: 0, capacity: 0"
May 25 06:06:38 n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae kubelet[5456]: I0525 06:06:38.555957    5456 event.go:291] "Event occurred" object="hugepages-test-3328/hugepages-2rvtg" kind="Pod" apiVersion="v1" type="Warning" reason="OutOfhugepages-2Mi" message="Node didn't have enough resource: hugepages-2Mi, requested: 6291456, used: 0, capacity: 0"
May 25 06:06:38 n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae kubelet[5456]: I0525 06:06:38.565518    5456 status_manager.go:595] "Status for pod updated successfully" pod="hugepages-test-3328/hugepages-2rvtg" statusVersion=1 status={Phase:Failed Conditions:[] Message:Pod Node didn't have enough resource: hugepages-2Mi, requested: 6291456, used: 0, capacity: 0 Reason:OutOfhugepages-2Mi NominatedNodeName: HostIP: PodIP: PodIPs:[] StartTime:2021-05-25 06:06:25 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[] QOSClass: EphemeralContainerStatuses:[]}
May 25 06:07:01 n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae kubelet[6371]: I0525 06:07:01.576772    6371 predicate.go:144] "Predicate failed on Pod" pod="hugepages-test-1137/hugepages-6sz8m" err="Node didn't have enough resource: hugepages-2Mi, requested: 6291456, used: 0, capacity: 0"
May 25 06:07:01 n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae kubelet[6371]: I0525 06:07:01.577296    6371 event.go:291] "Event occurred" object="hugepages-test-1137/hugepages-6sz8m" kind="Pod" apiVersion="v1" type="Warning" reason="OutOfhugepages-2Mi" message="Node didn't have enough resource: hugepages-2Mi, requested: 6291456, used: 0, capacity: 0"
May 25 06:07:01 n1-standard-2-ubuntu-gke-2004-1-20-v20210401-ec1b0cae kubelet[6371]: I0525 06:07:01.586173    6371 status_manager.go:595] "Status for pod updated successfully" pod="hugepages-test-1137/hugepages-6sz8m" statusVersion=1 status={Phase:Failed Conditions:[] Message:Pod Node didn't have enough resource: hugepages-2Mi, requested: 6291456, used: 0, capacity: 0 Reason:OutOfhugepages-2Mi NominatedNodeName: HostIP: PodIP: PodIPs:[] StartTime:2021-05-25 06:06:49 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[] QOSClass: EphemeralContainerStatuses:[]}

The fact that the node capacity for the requested hugepages is 0 for three pods in a row makes me suspect that the test creates pods requesting hugepages that are not actually allocated on the node, or something similar.
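
One way to check whether the node actually had the relevant hugepages pre-allocated at that point (standard Linux procfs/sysfs paths, nothing specific to this CI job):

# Kernel view of pre-allocated hugepages, per size; these feed the capacity the
# kubelet advertises for hugepages-2Mi and hugepages-1Gi.
grep -i hugepages /proc/meminfo
cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages 2>/dev/null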

Can you help investigate this?