kubernetes: Rolling updates should fail early if a pod fails to start

In 1.0.x, rolling update didn’t block, so we could check things like a failed pod while waiting for the update to complete.

In 1.1.x, rolling update does block, up to the timeout. This is great, since we previously had to script that externally. However, it waits for the entire timeout even if the update fails immediately. For instance, if you specify a non-existent image, the pod reports a PullImageError. At that point the rolling update could abort immediately; instead, it blocks for the full 5-minute timeout. When you develop and deploy frequently, this 5-minute wait becomes painful.

I suggest that rolling-update fail immediately if any of the new pods is in a state other than ‘Pending’ or ‘Running’. I’m not sure whether there is a defined set of non-error states we can use.
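To make the suggestion above concrete, here is a minimal sketch of a wait loop that fails fast. It is not how kubectl implements rolling-update. The names `wait_for_pods`, `get_phases`, `RollingUpdateFailed`, and `NON_ERROR_PHASES` are hypothetical, and treating only ‘Pending’ and ‘Running’ as non-error states is the assumption from this issue, not a confirmed rule.

```python
import time
from typing import Callable, Iterable

# Assumption from the issue text: only these phases count as "not an error"
# while the update is still in progress.
NON_ERROR_PHASES = {"Pending", "Running"}


class RollingUpdateFailed(Exception):
    """Raised as soon as a new pod is seen in an unexpected phase."""


def wait_for_pods(
    get_phases: Callable[[], Iterable[str]],  # hypothetical: returns the phase of each new pod
    timeout_s: float = 300.0,                 # the 5-minute timeout mentioned above
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once all pods are Running, False on timeout.

    Raises RollingUpdateFailed immediately if any pod leaves the
    non-error phases, instead of waiting out the timeout.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        phases = list(get_phases())
        bad = [p for p in phases if p not in NON_ERROR_PHASES]
        if bad:
            raise RollingUpdateFailed("pods in error phases: " + ", ".join(bad))
        if phases and all(p == "Running" for p in phases):
            return True
        sleep(poll_s)
    return False
```

The phase source, sleep function, and clock are passed in as parameters so the loop can be exercised without a cluster; in a real script `get_phases` would query the API server for the pods created by the update.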

About this issue

  • State: open
  • Created 9 years ago
  • Reactions: 2
  • Comments: 21 (17 by maintainers)

Most upvoted comments

The more I think about it, the more I think it’s a bug in how the kubelet works with Docker. I think it should report the pod as Failed if the image pull fails. Then we can have some sensible policies to deal with the failure (whether that is to abort the rolling update or to try pulling the image again).
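The two policies mentioned in the comment above (abort, or try the pull again) can be sketched as a small decision function. This is an illustration only: `PodStatus`, `decide`, `IMAGE_PULL_REASONS`, and `max_pull_retries` are hypothetical names, and the reason string is the one quoted in the issue; other cluster versions may report different reasons, so treat the set as an assumption to adjust.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: container waiting reasons that indicate an image pull failure.
# "PullImageError" is the reason quoted in the issue.
IMAGE_PULL_REASONS = {"PullImageError"}


@dataclass
class PodStatus:
    """Hypothetical, simplified view of a pod's status."""
    phase: str
    waiting_reasons: List[str] = field(default_factory=list)
    pull_attempts: int = 0  # image pulls attempted so far


def decide(pod: PodStatus, max_pull_retries: int = 0) -> str:
    """Return "abort", "retry", or "wait" for a single new pod."""
    if pod.phase == "Failed":
        return "abort"
    if any(r in IMAGE_PULL_REASONS for r in pod.waiting_reasons):
        # Allow one initial attempt plus max_pull_retries retries.
        if pod.pull_attempts <= max_pull_retries:
            return "retry"
        return "abort"
    return "wait"
```

With the default `max_pull_retries=0`, a pod whose first pull failed is aborted straight away, which matches the fail-early behaviour requested in the issue; a larger value models the "try pulling the image again" policy.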