kubernetes: Modifying nodeSelector on StatefulSet doesn't reschedule Pods
/kind bug
What happened:
Changing the nodeSelector of a StatefulSet doesn't trigger rescheduling of its existing Pods. I kubectl apply the StatefulSet below and wait for its Pods to get scheduled onto Nodes with the label node_type: type1. Then I change the nodeSelector label to node_type: type2 and do kubectl apply again.
```yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: sstest
  labels:
    app: sstest
spec:
  replicas: 2
  serviceName: "service"
  template:
    metadata:
      labels:
        app: sstest
    spec:
      nodeSelector:
        node_type: type1
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
```
What you expected to happen:
I expect the Pods to be rescheduled to the type2 Nodes, but nothing happens. The Pods only get rescheduled if I manually kill them using kubectl delete pod sstest-0.
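A minimal sketch of that manual workaround, assuming the two-replica StatefulSet above (Pod names sstest-0 and sstest-1):

```sh
# Delete each Pod so the StatefulSet controller recreates it from the
# updated template, which now carries the new nodeSelector.
kubectl delete pod sstest-0
kubectl delete pod sstest-1

# Check which Nodes the replacement Pods landed on.
kubectl get pods -l app=sstest -o wide
```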
How to reproduce it (as minimally and precisely as possible):
- Create a cluster in which some Nodes have the label node_type: type1 while other Nodes have the label node_type: type2.
- Apply the above StatefulSet definition to the cluster.
- Change the nodeSelector from node_type: type1 to node_type: type2 in the StatefulSet definition file.
- Apply the file again.
- Kill a Pod manually to verify that it gets rescheduled to another Node (see the command sketch below).
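A sketch of those steps as kubectl commands; the node names and manifest filename are assumptions:

```sh
# Label two Nodes so the selector has somewhere to move between.
kubectl label node node-a node_type=type1
kubectl label node node-b node_type=type2

# Apply the StatefulSet with nodeSelector node_type: type1.
kubectl apply -f sstest.yaml
kubectl get pods -l app=sstest -o wide   # Pods land on node-a

# Edit sstest.yaml so the nodeSelector reads node_type: type2, then re-apply.
kubectl apply -f sstest.yaml
kubectl get pods -l app=sstest -o wide   # Pods stay where they were

# Deleting a Pod shows that the new selector does apply on recreation.
kubectl delete pod sstest-0
kubectl get pods -l app=sstest -o wide   # sstest-0 comes back on node-b
```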
Anything else we need to know?:
I tested the same scenario with Deployments. In that case the rescheduling worked as expected.
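For comparison, the equivalent change on a Deployment triggers a rolling update straight away; a sketch with an assumed Deployment name:

```sh
# Patching the Pod template's nodeSelector on a Deployment rolls the Pods,
# so they are rescheduled onto the type2 Nodes without manual deletion.
kubectl patch deployment nginx-test -p \
  '{"spec":{"template":{"spec":{"nodeSelector":{"node_type":"type2"}}}}}'
kubectl rollout status deployment nginx-test
```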
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.4", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"clean", BuildDate:"2017-11-20T05:28:34Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.4-gke.1", GitCommit:"04502ae78d522a3d410de3710e1550cfb16dad4a", GitTreeState:"clean", BuildDate:"2017-12-08T17:24:53Z", GoVersion:"go1.8.3b4", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration: GKE
- OS (e.g. from /etc/os-release): cos
About this issue
- State: open
- Created 6 years ago
- Reactions: 10
- Comments: 37 (14 by maintainers)
Commits related to this issue
- #57838 Reschedule StatefulSet's Pods if nodeSelector has changed — committed to antoniaklja/kubernetes by antoniaklja 6 years ago
- feat: allow changes to nodeSelector This makes it possible to patch the underlying StatefulSet to change the nodeSelector to schedule pods on alternate nodes. Due to kubernetes/kubernetes#57838, the ... — committed to politics-rewired/postgres by bchrobot 5 years ago
Is it safe to assume this is not the intended behavior and will be fixed sometime in the future?
Same issue here 😕
It seems to be even worse in 1.21.2: not even deleting a Pod works; the new Pod still has the old nodeSelector. I had to resort to deleting the StatefulSet and creating it again.
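If the StatefulSet object itself has to be replaced, an orphaning delete keeps the existing Pods running while the object is recreated; a sketch (kubectl 1.20+ spells the flag --cascade=orphan, older clients use --cascade=false):

```sh
# Delete only the StatefulSet object; its Pods are orphaned, not deleted.
kubectl delete statefulset sstest --cascade=orphan

# Re-create the StatefulSet with the new nodeSelector; it re-adopts the
# existing Pods via the app=sstest label.
kubectl apply -f sstest.yaml

# Delete Pods one at a time so they come back with the new selector.
kubectl delete pod sstest-0
```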
Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten /remove-lifecycle stale
Yup, this is still a problem. Just witnessed it on 1.19.11, which happens to be my blocker to upgrading 😢
Yup, makes sense.
/remove-lifecycle rotten /reopen
Can this issue be re-opened please? cc @nikhita
@fejta-bot: Closing this issue.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/lifecycle frozen
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
/remove-lifecycle stale
/remove-lifecycle rotten
@adam-sandor @dims I’ll have a look