kubernetes: Enhancement: Marking a pending pod as failed after a certain timeout
What would you like to be added?
We would like to add configurable timeouts that allow pending pods to transition to Failed.
Ideally, this would be configurable based on container events or messages rather than a single catch-all timeout.
A possible API could be as follows:
- regexp: "Failed to pull image.*desc = failed to pull and unpack image" # Suggests genuine problem with image name, no point in waiting around too long.
gracePeriod: 90s
- regexp: "Failed to pull image.*code = Unknown desc = Error response from daemon" # Seen when image doesn't exist, no point in waiting around too long.
gracePeriod: 90s
This would allow pods with certain events/statuses to transition to Failed.
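To make the intent concrete, here is a rough sketch in Go (using the k8s.io/api types) of how such rules might be evaluated against a pending pod. The `pendingFailRule` type and `shouldFail` function are hypothetical names, not part of any existing or proposed API; matching against the container waiting reason/message and measuring time from the pod's creation timestamp are assumptions made for illustration.

```go
package pendingtimeout

import (
	"regexp"
	"time"

	corev1 "k8s.io/api/core/v1"
)

// pendingFailRule is a hypothetical in-memory form of the proposed config:
// a pattern to match against container state messages plus a grace period.
type pendingFailRule struct {
	Pattern     *regexp.Regexp
	GracePeriod time.Duration
}

// shouldFail reports whether a pending pod has a waiting container whose
// reason/message matches one of the rules and whether the pod has been
// pending longer than that rule's grace period.
func shouldFail(pod *corev1.Pod, rules []pendingFailRule) bool {
	if pod.Status.Phase != corev1.PodPending {
		return false
	}
	pendingFor := time.Since(pod.CreationTimestamp.Time)
	for _, cs := range pod.Status.ContainerStatuses {
		if cs.State.Waiting == nil {
			continue
		}
		msg := cs.State.Waiting.Reason + ": " + cs.State.Waiting.Message
		for _, r := range rules {
			if r.Pattern.MatchString(msg) && pendingFor > r.GracePeriod {
				return true
			}
		}
	}
	return false
}
```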
Why is this needed?
In the Armada project, we found that for batch users it is useful to control how long pods stay in Pending before being marked as failed. This allows our scheduler to remove those pods from the cluster and make room for new pods to be scheduled. It also lets users be notified of invalid pods so they can resubmit them with correct configurations.
We have discussed this idea in relation to the non-goals of https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures#non-goals and @alculquicondor requested that I create an issue.
About this issue
- State: open
- Created 2 years ago
- Comments: 49 (46 by maintainers)
I am assuming that this functionality is not available yet, right? Is there any workaround using kubelet configuration, perhaps?
Is it the same on k3s distributions?
Is there maybe some sort of workaround, such as a CronJob that regularly checks for pending pods and terminates the ones that have been there too long?
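For what it's worth, one rough workaround along those lines is a small program, run on a schedule (for example from a CronJob), that lists Pending pods and deletes any older than a threshold. This is only a sketch using client-go, assuming in-cluster credentials with permission to list and delete pods and an arbitrary 30-minute threshold:

```go
package main

import (
	"context"
	"log"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// Assumes the program runs inside the cluster with a ServiceAccount
	// allowed to list and delete pods. The 30-minute threshold is arbitrary.
	const pendingTimeout = 30 * time.Minute

	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// List only pods whose phase is Pending, across all namespaces.
	pods, err := client.CoreV1().Pods(metav1.NamespaceAll).List(context.Background(), metav1.ListOptions{
		FieldSelector: "status.phase=" + string(corev1.PodPending),
	})
	if err != nil {
		log.Fatal(err)
	}

	for _, pod := range pods.Items {
		if time.Since(pod.CreationTimestamp.Time) < pendingTimeout {
			continue
		}
		log.Printf("deleting pod %s/%s, pending since %s", pod.Namespace, pod.Name, pod.CreationTimestamp)
		if err := client.CoreV1().Pods(pod.Namespace).Delete(context.Background(), pod.Name, metav1.DeleteOptions{}); err != nil {
			log.Printf("failed to delete %s/%s: %v", pod.Namespace, pod.Name, err)
		}
	}
}
```

Note that deleting the pod is not the same as transitioning it to Failed, which is part of why a first-class mechanism is being requested here.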
Nice find. It is supposed to be False. I pushed a fix for that.
Although, an important question is how an out-of-tree controller can detect these failures. Is there a stable Pod condition type or reason that can be monitored?
If the controller has to do regex matching on error messages, that can easily break from one release to another. So at the very least, we should have consistent Pod conditions upstream.
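For context, if a stable condition existed, consuming it would be a simple lookup over pod.Status.Conditions. The sketch below only shows that lookup; the condition type and reason such an enhancement would publish are not defined here, so PodScheduled=False is used purely as a placeholder example.

```go
package podconditions

import corev1 "k8s.io/api/core/v1"

// hasCondition reports whether the pod carries a condition with the given
// type and status. With a stable condition (and reason) published upstream,
// an out-of-tree controller could key off this instead of regex-matching
// error messages.
func hasCondition(pod *corev1.Pod, condType corev1.PodConditionType, status corev1.ConditionStatus) bool {
	for _, c := range pod.Status.Conditions {
		if c.Type == condType && c.Status == status {
			return true
		}
	}
	return false
}

// Example (placeholder condition): hasCondition(pod, corev1.PodScheduled, corev1.ConditionFalse)
```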