kubernetes: deployments do not support (honor) container restartPolicy
Steps: Create a deployment file and set restartPolicy to “Never”.
Result:
The Deployment "foo" is invalid.
spec.template.spec.restartPolicy: Unsupported value: "Never": supported values: Always
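For reference, a minimal manifest that reproduces the validation error above (the name and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
    spec:
      restartPolicy: Never   # rejected at validation: Deployments only accept Always
      containers:
        - name: foo
          image: busybox
```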
About this issue
- State: closed
- Created 8 years ago
- Reactions: 70
- Comments: 93 (7 by maintainers)
Links to this issue
Commits related to this issue
- Remove restart policy 1. It is part of the pod and not the container 2. Deployment can only use `restartPolicy: Always` https://github.com/kubernetes/kubernetes/issues/24725 Helm silently ignores th... — committed to cloudfoundry/eirini-release by alex-slynko 5 years ago
- Remove restart policy 1. It is part of the pod and not the container 2. Deployment can only use `restartPolicy: Always` https://github.com/kubernetes/kubernetes/issues/24725 Helm silently ignores th... — committed to idev4u/eirini-release by alex-slynko 5 years ago
If deployments support only restartPolicy: Always, why does that parameter exist at all? It doesn’t make much sense to have a ‘parameter’ that can have only one value…
What is the alternative for running one-time or periodic tasks, like data import or backups?
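For one-time tasks, the usual answer is a Job, whose pod template does accept restartPolicy: Never or OnFailure. A minimal sketch (the name, image, and command are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: data-import
spec:
  backoffLimit: 0            # do not retry the pod on failure
  template:
    spec:
      restartPolicy: Never   # allowed for Jobs, unlike Deployments
      containers:
        - name: import
          image: busybox
          command: ["sh", "-c", "echo importing data"]
```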
Does anyone know why deployments only support restartPolicy == Always? It’s not clear from the documentation and I can’t spot the reason easily in the code.
I think there are legitimate use cases for “Never”. For example, I have a pod running on a node and it has a liveness probe. That node then fails in some way that makes it remain up but perform slowly. The liveness probe fails and the kubelet will now try to recreate the container. But the problem is with the node, so the new container doesn’t become healthy. And my ReplicaSet doesn’t see the pod as failed, so it doesn’t try to create a new pod elsewhere.
If I could set the restart policy to Never then I imagine in this scenario: liveness probe fails, container terminated, pod marked as failed, replica set creates a new pod (maybe or maybe not on the same node).
Or have I misunderstood something?
Sorry for a late follow up, but what if I’d like to deliberately kill the pods instead of restarting? I’ve got a deployment with pods that are somewhat buggy. They do not restore themselves well after one of the containers fails and thus my deployment becomes unstable after some time. How can I maintain a Deployment (and consequently a ReplicaSet) that would kill and recreate failing Pods rather than restarting them?
Another example that is difficult to handle: sidecars and sidecar proxies, in particular. If an application container does not properly clean up after itself (e.g. UDS sockets), then the pod stays alive since the sidecar is still running. The application keeps trying to restart, and keeps failing since it attempts to use some resources that have not been cleaned up.
I’m experiencing the same problem. Ugh, three years since the ticket was opened. You need to fix the documentation!
+1. Our application is designed in such a way that if a container crashes, it will fail to start again. We need the option spec.template.spec.restartPolicy: “Never”.
Why was this closed? This issue is important.
@ichekrygin Deployments only support restartPolicy = Always; so do Replication Controllers, Replica Sets, and Daemon Sets. I’ll document this.
I also think it would be great to reopen this.
I genuinely cannot believe the amount of people requesting this and the stubbornness displayed in this issue.
There is a clear use-case for this feature.
I am very confused - can the maintainers explain clearly why this issue has been closed multiple times when there is clear community desire for this feature? We just ran into a cluster-level failure triggered by the fact that we could not have the calico-typha pod re-scheduled on a different host due to a random port conflict (tl;dr: something on the host randomly used port 9093, causing calico to fail to start up and bind its metrics listener). The desired behavior here seems very reasonable… can someone clarify why it’s still not implemented?
Why is this issue closed? How do I get k8s to not restart a pod? What is the point of a parameter if it can only take one value?
I wonder why, three years after this was raised, the documentation here https://v1-12.docs.kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy still says otherwise, while I get the validation error on my kubectl version.
The Never value is needed for troubleshooting purposes, so that logs are not lost during development stages when time matters, e.g. when the fluentd configuration is not complete yet. Kindly add support for the restart policy Never.
So many years have passed and so many people have raised this question, yet it just doesn’t get solved. I can’t understand it.
I’d like to have the ability to set Never in order to be able to debug in a pod which is having some issues only in a kubernetes environment.
I think restartPolicy: recreate is more like it. It’s not about restarting; it’s more of ”terminate this pod and replace it with a new one” instead of trying to restart containers.
Just use a pod then, you don’t need a deployment.
Edit: unless you do need a deployment for some reason, idk. But in my experience, to do what you described I would just use a pod instead of a deployment.
Please reopen, because this is critical and particularly bad when combined with an unstable volume mount bug like this: https://github.com/kubernetes/kubernetes/issues/67643 or a dead FUSE daemon.
If any volume mount is broken for a pod, we have no choice but to restart the whole pod. We have added a liveness probe and an in-code health check to commit suicide after a write failure or any mountpoint abnormality. But if restartPolicy is Always, the pod simply enters a crash loop without being recreated.
We used to be able to set “Never” on deployments and it worked as expected. However, we recently found that this is now enforced and our existing mechanism stops working.
Another possible solution is to add a time limit to CrashLoopBackOff and kill the pod after the limit. We can write a script to do so, but it would be better if this were built into the official controller or kubelet.
@llech you can always use a cron job. But I fully agree: if there is just one supported policy, why does that parameter exist at all? Is it a placeholder for future extensions?
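For periodic tasks, the cron job suggestion maps to a CronJob; a minimal sketch (batch/v1 on recent clusters, batch/v1beta1 on older ones; the schedule, name, image, and command are placeholders):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-backup
spec:
  schedule: "0 2 * * *"              # every night at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure   # Job pods allow OnFailure or Never
          containers:
            - name: backup
              image: busybox
              command: ["sh", "-c", "echo running backup"]
```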
Or if you won’t fix it, could you please fix the documentation, specifically https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy ?
And if you’re going to say, “well, Pod Containers can have a non-Always restart policy, but not when they’re wrapped in a Deployment” then maybe a note to that effect on the restart-policy page would be helpful?
I mean is there some practical reason that the deployment can’t support never or is it just a principle or something? Can we please have it?
On Sat, Mar 11, 2023 at 9:53 PM Petrus Repo @.***> wrote:
Compiling the various known use cases for “If Pod-XYZABC exits for any reason, do not restart it, provide me a new one”, either mentioned in other issues or from my own personal experience (providing infrastructure consulting to startups), so as to avoid confusing it with issues that would be solved with “restartTogether” or other potential fixes.
- Persistent errors that survive restarts
- Security of container appspace
Note: for most of the examples listed above, the issue is not with the PVC itself or its data, so those PVCs are reused across pods; a PVC is also relatively easy to “reset” with an entrypoint script or sidecar, though this may overlap with “mount issues” in the driver.
In general, while most issues could probably be “fixed” by “fixing” the application software - especially application logic failures - it is important to recognise that a large percentage of the people in charge of a company’s kubernetes infrastructure may not have the ability to make such changes. They may not even have a say in the applications involved, as many enterprises have started forcing the move to docker/cloud “at all costs”.
In all the above cases, implementing healthchecks does not fully “work”. While it does help K8S operators notice the issue (an infinite restart loop), it means they will need manual intervention with vanilla k8s. It really prevented some operators from being able to “sleep in peace”, knowing that if enough of their replicas entered a restart loop between their daily/weekly interventions, it would mean downtime (that was until I advised automating the restarts on a schedule).
As a result of the above, I have seen the following being done in the field to work around this lack of a “replace pod on error” feature.
Finally, in my opinion, it’s the ultimate expression of “Cattle” vs “Pets”: if there is any issue, I kill the pod and get a new one. And if operators want to live by this ideal to the extreme, so be it =)
In general, most of the above is fixed if “restartPolicy: Never” is accepted and new pods are created on failure, or, as mentioned, if a new type of “ReplicaSet” is created to support this behaviour.
(To be clear, this should not be the default due to the potential increase in resource drain; such usage should be a decision of the operator.)
Please re-open this; we need the pod not to restart. What’s the point of “Never” if we can’t use it?
I’m experiencing the same problem, +1. We have some sidecars running in our pod which will keep running forever, and we want the pod to fail and restart if the primary container fails.
@kubernetes/kubectl we should document what restart policy deployment support.
I would like the ability to make a deployment not restart, in order to understand what happened when it failed and to block the constant reboots that may happen, for example during debugging.
For now I’m trying a workaround: changing the Deployment into a Job, because that way it seems possible to not restart on failure.
Is the documentation supposed to be up to date? I feel like I’m missing something here: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy
Please don’t spam.
I also have a use case like @kachkaev where I would prefer to have a pod die and be rescheduled as a new pod than be restarted. This feature would be useful.
Just my 2 cents: I need restart never to run load-testing tools. I cannot launch them as bare pods (which support restart never) and still have more than one replica. Naturally I shifted to looking at a deployment, but the same incompatibility exists. If my pods restart, it will result in a never-ending load test. In the case of my database load-testing tools, they sample rows out of a database to generate read operations. Over time that sample set gets larger and larger because the pods are restarting. This effectively halts the cluster.
I am looking forward to ephemeral containers for this use case, but I fear they may have the same drawbacks. I need a way to launch containers in parallel without restarts.
Mh, I must’ve missed that, sorry
I kinda understand that, but I don’t see a difference between restarting a Pod and just letting it die and creating a new one; the end result is the desired number of replicas
The same thing that happens when you delete a Pod or a node dies; the current number of replicas is below the desired one, therefore the ReplicaSet controller creates a new pod
Additionally, it doesn’t click for me why it’s necessary for Deployments to be restartPolicy: Always. I get that that’s your desired concept, but I don’t see any drawbacks to allowing other policies and leaving Always as the default, and on the other side I see quite a few advantages, like that this portion of the community would benefit from this. Writing a custom controller is not something you’d write in a day and push to production, but I’d think allowing other policies would be.
As for your examples;
I wouldn’t say that it’s in any way close to being akin to a Job, as Jobs run to completion and stop. Deployments keep a number of replicas up, forever.
Also, there is a very big difference between a Deployment with restartPolicy: Never and creating the Pods manually; the first one is automatic. There are also advantages like https://github.com/kubernetes/kubernetes/issues/24725#issuecomment-561396435.
Ok, so I think its clear from the community that there is a desire to have a way of saying “If Pod-XYZABC exits for any reason, do not restart it. Let it die and go away.” I think beyond that, the mechanics of how we accomplish such a thing is up to the experts in the community who understand the tooling the most.
As a non-expert, my mental model is that a Deployment handles replacing Pods that go away quite well, so if we could just tell the PodSpec to “go away when the processes exit”, that feels like the answer people are grasping for. However, perhaps that is simply not how the interaction between a Pod, a Deployment, and the Scheduler works… so maybe the parameter should be somewhere else.
I will note, though, that the community is confused by the appearance of the restartPolicy parameter on the PodSpec itself in this case (though the argument that it’s just part of the PodSpec, so it can’t be hidden, is reasonable). It just seems to me like the most intuitive behavior would be to keep restartPolicy: Always as the default, and alternatively allow restartPolicy: Never.
Please add OnFailure and Never. +1, it needs to be configurable as OnFailure or Never; Always does not fit the real use cases here.
Well, to be fair, the parameter exists because it is inherited from the pod spec. A deployment spec contains a pod spec with some constraints, one of them being that restartPolicy can only be Always.
I think the reasoning behind this is that deployments are not jobs. Like a container must be able to cope with the fact that it can be shut down at any time, a deployment must be able to cope with the fact that it is kept alive by restarts.
Probably not quite. A rolling restart is something triggered by a user, while what I’m asking for is pod recreation on failure (rather than a restart).
We would like this functionality too, particularly restartPolicy: OnFailure.
Our application sometimes needs to restart, but that triggers alerts because the Pod restarted. Currently there is no failureRestart vs successRestart metric, so we can’t distinguish between these two cases. If the Pod just got deleted on container exit and then recreated, that would solve that little annoyance for us.
But our maybe-weird use case aside, one big question which I didn’t find answered here is still open:
Why does this constraint exist in the first place?
Maybe if you could explain why the DeploymentSpec only allows PodSpecs with restartPolicy: Always we can accept that, but as it stands now I can only surmise that it’s just some arbitrary decision that has been made.
I feel like a broken record… but this is a really pretty big issue. There are tons of cases where you do not want to restart a broken pod, but replace it. I am really surprised at how little movement there is here. 😕
Comments in this issue mix two separate aspects:
Sounds like this should get addressed by https://github.com/kubernetes/enhancements/pull/912. My workaround for now has been to set up a ServiceAccount that can delete pods, and then run kubectl delete pods $HOSTNAME from within my container.
I was also confused by the failure message, which said ‘Unsupported value: “Never”: supported values: “Always”’. There is no document saying that this flag isn’t supported by Deployments.
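The self-delete workaround above needs RBAC that lets the pod delete pods in its own namespace. A minimal sketch (all names here are hypothetical):

```yaml
# ServiceAccount the pod runs as
apiVersion: v1
kind: ServiceAccount
metadata:
  name: self-deleter
---
# Role granting only pod deletion in the same namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-deleter
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["delete"]
---
# Bind the Role to the ServiceAccount
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: self-deleter-can-delete-pods
subjects:
  - kind: ServiceAccount
    name: self-deleter
roleRef:
  kind: Role
  name: pod-deleter
  apiGroup: rbac.authorization.k8s.io
```

The Deployment’s pod template would then set serviceAccountName: self-deleter so that kubectl delete pods $HOSTNAME from inside the container is authorized.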
I think I have a different use case than the ones I’ve read above. I have a sidecar which is failing on startup that I need to troubleshoot. The kink is that it’s in an OpenShift environment where I don’t have direct access via kubectl; I can only access the container logs via its web UI, and every time the container restarts the UI loses the old container’s log output. I tried setting the restart policy to “Never” to prevent this from happening and see if that would preserve the logs long enough for me to read them, but got the error that this isn’t supported, which led me here.
@cwrau This has been explained in this thread. The deployment controller is designed for maintaining availability of long-running services that are not routinely expected to exit (web/application servers, queue workers that poll for new work forever, etc.).
Having a deployment with a restart policy of OnFailure is basically akin to having a Job; Jobs are designed to run to completion and not restart.
With a restart policy of Never, you may as well create the pods directly.
I’d suggest that if you need some other behavior, it’s a different class of thing, and you should consider writing a custom controller to handle it.
Jobs are immutable, I need a run-to-completion that can be shot again and again whenever I want.
+1, it needs to be configurable as OnFailure or Never; Always does not fit the real use cases here.
This has me so confused. I am creating a deployment that automatically creates a ReplicaSet; I am not creating the ReplicaSet directly. How do I get that ReplicaSet to have restartPolicy: Never?
Edit: oh, now I see that ReplicaSets also only support restartPolicy = Always 😦 even more confused now.
Another edit: seems like I just need to use a job instead of a deployment.
A final edit: since our applications are pretty ephemeral and can easily be recreated, we switched to naked pods instead of deployments or jobs and it is working great 😃
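For the record, a naked Pod does honor the policy, since no controller is keeping it alive. A minimal sketch (name, image, and command are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-worker
spec:
  restartPolicy: Never   # honored here; nothing restarts or replaces the pod
  containers:
    - name: worker
      image: busybox
      command: ["sh", "-c", "echo doing work once"]
```

The trade-off is the one discussed in this thread: if the node dies or is drained, nothing reschedules the pod elsewhere.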
I have a similar requirement. @kachkaev
I’ll throw another rock into this huge pile.
So, I was working on upgrading Kubernetes (oh noes!). Among mountains of things that don’t work, there’s this particular tidbit: after you’ve done running kubeadm upgrade apply, you need to drain the node. And if you happen to have single pods on that node not managed by deployments / replica sets / daemon sets and so on – well, tough luck: instead of moving your pods to a different node, they are going to be killed.
Sucks, right? Well, that’s not the end of the story… I was about to get smart and just automatically wrap any pod in a deployment object – how bad can it be, right? – and now I’m here – fun, fun, fun!
@JoeAshworth https://twitter.com/dims/status/1593348306054397952?s=20&t=hK-e1SZ_nyaLwlGbYnN53g
This is the whole point of why we need this. We want the pod to lose its reservation. The pod blocks our limited GPU resources and we want them to be free for other pods; deployments that never exit keep those resources blocked.
For me, I don’t get why we would want both restarting pods and restarting containers within the pod. Especially if the pod has one container: why do we need to restart the container, and why can’t it just be a new pod? A deployment starts a new pod when the pod fails. Why do we need to implement all the features on all the different objects? It doesn’t make sense to me. And preventing the Never or OnFailure options just because Always is supported? That seems totally silly.
I completely agree, considering this would be very much appreciated.
Deployment templates are pod specs, and this is good because you specify a pod after all. And a pod may have an arbitrary restartPolicy. But as part of a deployment spec, there is the additional constraint that restartPolicy can only be “always”.
The alternative would be to have pod-spec and pod-within-deployment-template-spec, which I would find even more confusing.
It’s part of the nature of the deployment that it keeps things alive. If this is unfortunate for you, then deployment is not what you should use.
An interesting use case mentioned here is that some people need to tear down and recreate the whole pod. @diranged You may want that, too? But note that this could not be achieved with restartPolicy anyway. It would need a wholly new top-level setting in the YAML spec.
That is not what we are saying… we are saying that there are deployment-like models where you cannot or do not want the pod to restart on a failure; you want a replacement pod that has a new identity.
If your expectation is that your containers exit, you should use a Job. If you are concerned about these building up, then use a label and delete the old jobs before you spin up a new one. Also, timestamping is better than random gibberish.
+1, it needs to be configurable as OnFailure or Never; Always does not fit the real use cases here.