kubernetes: Equivalence cache pod hashing doesn't work for StatefulSets with PVCs

Is this a BUG REPORT or FEATURE REQUEST?: @kubernetes/sig-storage-bugs @kubernetes/sig-scheduling-bugs /assign

What happened: When I run my e2e test on my own cluster with only my feature gates enabled, the StatefulSet test passes. However, when the same test runs in CI, where all feature gates are enabled, it fails. I suspect an integration issue with other alpha scheduling features.

The symptom is that my predicate runs against only one node for 10 minutes. After 10 minutes, the predicate runs across all the nodes again. This only hits the second pod in the StatefulSet; the first StatefulSet pod has no problems.

I1122 23:40:20.418567       5 scheduler_binder.go:133] FindPodVolumes for pod "e2e-tests-persistent-local-volumes-test-jbrwq/local-volume-statefulset-1", node "bootstrap-e2e-minion-group-tb58"
I1122 23:40:20.440905       5 scheduler_binder.go:133] FindPodVolumes for pod "e2e-tests-persistent-local-volumes-test-jbrwq/local-volume-statefulset-1", node "bootstrap-e2e-minion-group-tb58"
I1122 23:40:20.493613       5 scheduler_binder.go:133] FindPodVolumes for pod "e2e-tests-persistent-local-volumes-test-jbrwq/local-volume-statefulset-1", node "bootstrap-e2e-minion-group-tb58"
I1122 23:40:20.792998       5 scheduler_binder.go:133] FindPodVolumes for pod "e2e-tests-persistent-local-volumes-test-jbrwq/local-volume-statefulset-1", node "bootstrap-e2e-minion-group-tb58"
...
I1122 23:50:14.795526       5 scheduler_binder.go:133] FindPodVolumes for pod "e2e-tests-persistent-local-volumes-test-jbrwq/local-volume-statefulset-1", node "bootstrap-e2e-minion-group-tb58"
I1122 23:50:14.795964       5 scheduler_binder.go:133] FindPodVolumes for pod "e2e-tests-persistent-local-volumes-test-jbrwq/local-volume-statefulset-1", node "bootstrap-e2e-master"
I1122 23:50:14.796184       5 scheduler_binder.go:133] FindPodVolumes for pod "e2e-tests-persistent-local-volumes-test-jbrwq/local-volume-statefulset-1", node "bootstrap-e2e-minion-group-3dbd"
I1122 23:50:14.796385       5 scheduler_binder.go:133] FindPodVolumes for pod "e2e-tests-persistent-local-volumes-test-jbrwq/local-volume-statefulset-1", node "bootstrap-e2e-minion-group-k9mj"

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 25 (25 by maintainers)

Most upvoted comments

Sure, I will send out a fix immediately.

Ah, I see. If you include the UIDs of all the Pod’s PVCs in the hash, then each StatefulSet pod gets a different hash, while all pods of a ReplicaSet still share the same hash. This will also work for pods created by operators. Yes, this sounds much better.
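
To make the idea above concrete, here is a minimal sketch in Go. It is not the scheduler's actual equivalence cache code: the `pod` struct, `hashByOwner`, and `hashByOwnerAndPVCs` are hypothetical names invented for illustration, and FNV-1a is just a convenient standard-library hash. It only shows why hashing the owner alone collapses all StatefulSet replicas into one equivalence class, and how mixing in the PVC UIDs separates them.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// pod is a hypothetical, stripped-down stand-in for a real Pod object.
type pod struct {
	Name     string
	OwnerUID string   // UID of the controller (StatefulSet, ReplicaSet, ...)
	PVCUIDs  []string // UIDs of the PVCs referenced by the pod's volumes
}

// hashByOwner models the problematic assumption: pods with the same
// owner are always treated as equivalent.
func hashByOwner(p pod) uint64 {
	h := fnv.New64a()
	h.Write([]byte(p.OwnerUID))
	return h.Sum64()
}

// hashByOwnerAndPVCs models the proposed fix: the PVC UIDs are mixed
// into the hash, sorted so that volume order does not matter.
func hashByOwnerAndPVCs(p pod) uint64 {
	uids := append([]string(nil), p.PVCUIDs...) // copy before sorting
	sort.Strings(uids)
	h := fnv.New64a()
	h.Write([]byte(p.OwnerUID))
	for _, u := range uids {
		h.Write([]byte{0}) // separator between fields
		h.Write([]byte(u))
	}
	return h.Sum64()
}

func main() {
	// Two StatefulSet replicas: same owner, one distinct PVC each.
	sts0 := pod{Name: "web-0", OwnerUID: "sts-uid", PVCUIDs: []string{"pvc-uid-0"}}
	sts1 := pod{Name: "web-1", OwnerUID: "sts-uid", PVCUIDs: []string{"pvc-uid-1"}}
	// Two ReplicaSet replicas: same owner, same shared PVC.
	rs0 := pod{Name: "api-a", OwnerUID: "rs-uid", PVCUIDs: []string{"shared-pvc-uid"}}
	rs1 := pod{Name: "api-b", OwnerUID: "rs-uid", PVCUIDs: []string{"shared-pvc-uid"}}

	fmt.Println("owner only, StatefulSet pods equal:", hashByOwner(sts0) == hashByOwner(sts1))
	fmt.Println("with PVCs, StatefulSet pods equal: ", hashByOwnerAndPVCs(sts0) == hashByOwnerAndPVCs(sts1))
	fmt.Println("with PVCs, ReplicaSet pods equal:  ", hashByOwnerAndPVCs(rs0) == hashByOwnerAndPVCs(rs1))
}
```

With the owner-only hash, `web-1` would reuse predicate results cached for `web-0`, even though its volume can only be satisfied on a different node. With the PVC UIDs included, the two StatefulSet pods fall into separate classes, while the ReplicaSet pods still share one.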

@resouer @bsalamat I am looking through the equivalence cache unit tests, and TestGetHashEquivalencePod looks very suspicious. If I understand the test correctly, as long as the OwnerReference is the same, pods generate the same hash value even if their names differ. That assumption does not hold for StatefulSets, because each pod replica in a StatefulSet can have different volume specs, so the pods should not be considered equivalent when evaluating predicates.