kueue: WaitForPodsReady: Requeue at the back of the queue after timeout

What would you like to be added:

When WaitForPodsReady is enabled, and a workload hits the timeout, it should go to the back of the queue, potentially using the sorting logic https://github.com/kubernetes-sigs/kueue/blob/b2a5e386d7e9c0e3346660dd01001734f631d7fd/pkg/scheduler/scheduler.go#L322

This can be implemented in different ways:

  1. Removing the Workload object at the time of eviction, so that the job controller creates a new Workload object that will have a new creationTimestamp. However, this requires #518.
  2. Add a field in the WorkloadStatus that stores the time when the Workload was preempted, then use it in sorting.
  3. Use the LastTransitionTime of the Admitted condition for sorting. However, this requires #532 to be completed (I’m on it).
  4. [Chosen] Add a new condition Evicted that records when the Workload was evicted, and use its timestamp for sorting. Also requires #532.

Why is this needed:

We currently just remove the Admission field from the Workload object, essentially putting the Workload at the head of the queue, to be scheduled right after the head that was waiting to be admitted.

This could lead to constant waiting in the queue if the workload happens to be too big to fit or is simply badly configured with scheduling requirements that can’t be satisfied.

Eventually, we could make the behavior configurable, but for now, it’s more reasonable to put the workload that timed out at the back of the queue.

Completion requirements:

This enhancement requires the following artifacts:

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.

About this issue

  • State: closed
  • Created a year ago
  • Comments: 22 (22 by maintainers)

Most upvoted comments

Note that’s the Admitted condition.

Sorry, the page was stale and I didn’t see the latest comments on adding a new condition; I think that makes sense.

Yes, this will mean modifying the byCreationTime function (and renaming it). Even now the name is misleading, as it also takes the priority into account.

We will also probably want to modify the entry ordering in scheduler to break the tie between nominated queue heads: https://github.com/kubernetes-sigs/kueue/blob/b2a5e386d7e9c0e3346660dd01001734f631d7fd/pkg/scheduler/scheduler.go#L322.