scheduler-plugins: the states in PodGroup are not accurate

Area

  • Scheduler
  • Controller
  • Helm Chart
  • Documents

Other components

No response

What happened?

After migrating to controller-runtime, the podGroup.status.scheduled count is no longer updated by PostBind, and the phase transitions do not seem to work as expected.

What did you expect to happen?

The PodGroup should reflect pod state changes and update PodGroup.status.phase accordingly.

But it is currently not accurate and sometimes wrong. We need to discuss the expected state changes and make them always right.

Let's discuss the state flow before working on it.

Currently, we have the following states:

  • Pending: pod group has been accepted by the system
  • Running: minMember pods of the pod group are in running phase.
  • PreScheduling: all pods of the pod group have been enqueued and are waiting to be scheduled
  • Scheduling: some pods have been scheduled and are in running phase, but minMember is not met yet
  • Scheduled: minMember pods have been scheduled and are in running phase. @Huang-Wei, is this right? It seems to duplicate Running
  • Unknown: some of the pods are scheduled and some are not
  • Finished: minMember pods have finished successfully
  • Failed: at least one of the pods has failed

https://github.com/kubernetes-sigs/scheduler-plugins/blob/9701eb847a9d343a28b882e3c61ab7ea3a09eca7/apis/scheduling/v1alpha1/types.go#L87-L112

Please note that the following diagram only shows my understanding of the phases defined in the code. It may be a misunderstanding, or it could be discussed and improved.

stateDiagram-v2
    state if_minMember <<choice>>
    [*] --> Pending
    Pending --> PreScheduling: pods added
    PreScheduling --> Scheduling: some of the pods scheduled
    Scheduling --> Scheduled: minMember pods scheduled, but not running
    Scheduled --> Running: minMember pods scheduled and running
    Running --> Failed: at least one of the pods failed
    Failed --> if_minMember: failure fixed
    if_minMember --> Scheduling: minMember not met
    if_minMember --> Scheduled: minMember met
    Running --> Finished: all pods successfully finished
    Finished --> [*]
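
To make the flow above easier to discuss, here is a minimal Go sketch that recomputes the phase from pod counts instead of tracking each transition. It only illustrates my reading of the diagram and is not the code in scheduler-plugins: `Phase`, `Counts`, and `derivePhase` are hypothetical names, `Unknown` is left out because the diagram does not use it, and the sketch assumes `Scheduled` counts every pod bound to a node, including running ones.

```go
package main

import "fmt"

// Phase uses the phase names from this issue. The type and constant
// identifiers are illustrative, not the real API identifiers.
type Phase string

const (
	Pending       Phase = "Pending"
	PreScheduling Phase = "PreScheduling"
	Scheduling    Phase = "Scheduling"
	Scheduled     Phase = "Scheduled"
	Running       Phase = "Running"
	Finished      Phase = "Finished"
	Failed        Phase = "Failed"
)

// Counts is a hypothetical summary of the pods that belong to one pod group.
type Counts struct {
	Total     int // pods created for the group
	Scheduled int // pods bound to a node (assumed to include running pods)
	Running   int // pods in the running phase
	Succeeded int // pods that finished successfully
	Failed    int // pods that failed
}

// derivePhase follows the diagram: the first matching case wins, so a
// failure takes priority, and once the failure is fixed the result falls
// back to Scheduling or Scheduled depending on minMember.
func derivePhase(c Counts, minMember int) Phase {
	switch {
	case c.Total == 0:
		return Pending // accepted, no pods added yet
	case c.Failed > 0:
		return Failed // at least one of the pods failed
	case c.Succeeded == c.Total:
		return Finished // all pods successfully finished
	case c.Running >= minMember:
		return Running // minMember pods scheduled and running
	case c.Scheduled >= minMember:
		return Scheduled // minMember pods scheduled, but not running
	case c.Scheduled > 0:
		return Scheduling // some of the pods scheduled
	default:
		return PreScheduling // pods added, none scheduled
	}
}

func main() {
	// Counts resembling the reproduction below: minMember 3, 3 pods,
	// one of them not schedulable.
	fmt.Println(derivePhase(Counts{Total: 3, Scheduled: 2, Running: 2}, 3))
}
```

Deriving the phase from counts on every reconcile is one possible design. It avoids depending on the previous phase, which may help if the phase is updated from a controller rather than from PostBind.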

How can we reproduce it (as minimally and precisely as possible)?

  1. create a podGroup with minMember 3
  2. create 3 pods in podGroup
  3. change 1 of the pods to make it unschedulable
  4. The phase does not change as expected and the scheduled count is wrong.

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
1.25.7

Scheduler Plugins version

0.25.7

About this issue

  • State: closed
  • Created a year ago
  • Comments: 20 (20 by maintainers)

Most upvoted comments

I may delegate the review work to @denkensk as he was the original author. My 2 cents about the status revamp work:

  • if some status is transient, there is no point keeping it
  • if some status is tricky to calculate while exposing only trivial information to the end user, we should eliminate it as well.

/assign @denkensk as primary reviewer.

thanks @Gekko0114

/close

We can close it since I completed the PR.

@zwpaper, @denkensk Sure, I agree with you. Thanks for clarifying the discussion. I will implement it.

Hi @denkensk, I would like to hear your suggestions regarding this issue. Could you comment at your convenience?

  1. Since Running and Scheduled are redundant, we can remove Scheduled.

Could Scheduled mean scheduled but not yet running? That may have been the original design intention.

With the above definitions, PodGroupStatus.Scheduled becomes unnecessary and can be removed

If we keep the Scheduled phase, then status.scheduled would also be kept.

The question may be whether there is a phase in which pods are scheduled but not running, and whether we should expose this phase to users.

For example, pods that are scheduled but stuck in ContainerCreating or some other status.
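
As a rough illustration of that "scheduled but not running" case, here is a small Go sketch. A pod stuck in ContainerCreating already has spec.nodeName set while status.phase is still Pending, so the two fields are enough to tell it apart. The `Pod` struct and `countScheduledNotRunning` are hypothetical stand-ins for discussion, not types or helpers from Kubernetes or scheduler-plugins.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for the two pod fields that matter here:
// spec.nodeName and status.phase.
type Pod struct {
	Name     string
	NodeName string // empty until the pod is bound to a node
	Phase    string // "Pending", "Running", "Succeeded" or "Failed"
}

// countScheduledNotRunning counts pods that are bound to a node but whose
// phase is still Pending, for example pods stuck in ContainerCreating.
func countScheduledNotRunning(pods []Pod) int {
	n := 0
	for _, p := range pods {
		if p.NodeName != "" && p.Phase == "Pending" {
			n++
		}
	}
	return n
}

func main() {
	pods := []Pod{
		{Name: "a", NodeName: "node-1", Phase: "Running"},
		{Name: "b", NodeName: "node-2", Phase: "Pending"}, // bound, containers not started
		{Name: "c", Phase: "Pending"},                     // not scheduled yet
	}
	fmt.Println(countScheduledNotRunning(pods))
}
```

If this count is something users need to see, keeping the Scheduled phase and status.scheduled makes sense. If not, both could be dropped as suggested above.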