scheduler-plugins: the states in PodGroup are not accurate

Area

Scheduler
Controller
Helm Chart
Documents

Other components

No response

What happened?

After the migration to controller-runtime, podGroup.status.scheduled is no longer updated by PostBind, and the phase transitions do not seem to work as expected.

What did you expect to happen?

The PodGroup should react to pod state changes and reflect them in PodGroup.status.phase.

But it is currently inaccurate and sometimes wrong. We need to discuss the expected state changes and make them always right.

Let's discuss the state flow before working on it:

Currently, we have the following states:

Pending: the pod group has been accepted by the system
Running: minMember pods of the pod group are in the running phase
PreScheduling: all pods of the pod group have been enqueued and are waiting to be scheduled
Scheduling: some pods have been scheduled and are in the running phase, but fewer than minMember
Scheduled: minMember pods have been scheduled and are in the running phase. @Huang-Wei, is this right? It seems to duplicate Running.
Unknown: some pods are scheduled and some are not
Finished: minMember pods have finished successfully
Failed: at least one pod has failed

https://github.com/kubernetes-sigs/scheduler-plugins/blob/9701eb847a9d343a28b882e3c61ab7ea3a09eca7/apis/scheduling/v1alpha1/types.go#L87-L112

Please note that the following diagram only shows my understanding of the phases defined in the code; it may be a misunderstanding, or it could be discussed and improved.

stateDiagram-v2
    state if_minMember <<choice>>
    [*] --> Pending
    Pending --> PreScheduling: pods added
    PreScheduling --> Scheduling: some of the pods scheduled
    Scheduling --> Scheduled: minMember pods scheduled, but not running
    Scheduled --> Running: minMember pods scheduled and running
    Running --> Failed: at least one of the pods failed
    Failed --> if_minMember: failure fixed
    if_minMember --> Scheduling: minMember not met
    if_minMember --> Scheduled: minMember met
    Running --> Finished: all pods successfully finished
    Finished --> [*]
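
To make the diagram above easier to discuss, here is a minimal Go sketch that recomputes the phase from aggregate pod counts. It is not the controller's actual code: `podCounts` and `derivePhase` are hypothetical names, the phase constants simply mirror the list in this issue, and `Unknown` is left out because the diagram does not use it. The sketch follows the diagram ("all pods successfully finished") rather than the list ("minMember pods") for `Finished`.

```go
package main

// PodGroupPhase mirrors the phase names listed in this issue.
type PodGroupPhase string

const (
	Pending       PodGroupPhase = "Pending"
	PreScheduling PodGroupPhase = "PreScheduling"
	Scheduling    PodGroupPhase = "Scheduling"
	Scheduled     PodGroupPhase = "Scheduled"
	Running       PodGroupPhase = "Running"
	Finished      PodGroupPhase = "Finished"
	Failed        PodGroupPhase = "Failed"
)

// podCounts is a hypothetical aggregate over the pods of one PodGroup.
type podCounts struct {
	Total     int // pods created for the group
	Scheduled int // pods bound to a node, whatever their current phase
	Running   int // pods in the Running phase
	Succeeded int // pods that finished successfully
	Failed    int // pods in the Failed phase
}

// derivePhase recomputes the phase from scratch on every call, so the
// "failure fixed" branch of the diagram needs no special handling: once
// Failed drops back to zero, the remaining cases pick the right phase.
func derivePhase(c podCounts, minMember int) PodGroupPhase {
	switch {
	case c.Total == 0:
		return Pending // accepted, but no pods added yet
	case c.Failed > 0:
		return Failed // at least one pod has failed
	case c.Succeeded == c.Total:
		return Finished // all pods finished successfully
	case c.Running >= minMember:
		return Running
	case c.Scheduled >= minMember:
		return Scheduled // enough pods bound, but not enough running
	case c.Scheduled > 0:
		return Scheduling // some pods bound, minMember not met
	default:
		return PreScheduling // pods exist, none bound yet
	}
}
```

Deriving the phase from counts instead of tracking transitions is one possible design choice; it keeps the status correct even if an update is missed, at the cost of listing the group's pods on each reconcile.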

How can we reproduce it (as minimally and precisely as possible)?

Create a PodGroup with minMember 3.
Create 3 pods in the PodGroup.
Change 1 of the pods to make it unschedulable.
The phase does not change as expected, and the scheduled count is wrong.
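
As a rough illustration of what a correct scheduled count could look like for a scenario like this one, the Go sketch below tallies scheduled and running pods from a simplified pod list. The `pod` struct, `tally`, and `snapshot` are hypothetical stand-ins, not types from scheduler-plugins, and the snapshot is an invented example rather than observed cluster state. It assumes a pod counts as scheduled once `spec.nodeName` is set, which also covers pods that are bound but still in the Pending phase.

```go
package main

// pod is a simplified, hypothetical stand-in for the Pod fields used here.
type pod struct {
	Name     string
	NodeName string // spec.nodeName; empty until the pod is bound to a node
	Phase    string // status.phase: "Pending", "Running", "Succeeded" or "Failed"
}

// tally counts how many pods are bound to a node and how many are running.
func tally(pods []pod) (scheduled, running int) {
	for _, p := range pods {
		if p.NodeName != "" {
			scheduled++
		}
		if p.Phase == "Running" {
			running++
		}
	}
	return scheduled, running
}

// snapshot is an invented state for a group of three pods: one running,
// one bound but not yet running, and one that cannot be scheduled.
func snapshot() []pod {
	return []pod{
		{Name: "pod-0", NodeName: "node-a", Phase: "Running"},
		{Name: "pod-1", NodeName: "node-a", Phase: "Pending"}, // bound, containers still starting
		{Name: "pod-2", NodeName: "", Phase: "Pending"},       // unschedulable, never bound
	}
}
```

For this snapshot, `tally(snapshot())` yields 2 scheduled pods and 1 running pod, so with minMember 3 the group would not meet minMember under the definitions listed earlier.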

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
1.25.7

Scheduler Plugins version

0.25.7

About this issue

State: closed
Created a year ago
Comments: 20 (20 by maintainers)

Most upvoted comments

I may delegate the review work to @denkensk as he was the original author. My 2 cents about the status revamp work:

if some status is transient, there is no point keeping it
if some status is tricky to calculate while exposing only trivial information to end users, we should eliminate it as well.

/assign @denkensk as primary reviewer.

Huang-Wei on May 2, 2023

thanks @Gekko0114

/close

zwpaper on Jul 1, 2023

We can close this since I completed the PR.

Gekko0114 on Jul 1, 2023

https://github.com/kubernetes-sigs/scheduler-plugins/pull/574 Updated the PR.

Gekko0114 on May 17, 2023

@zwpaper, @denkensk Sure, I agree with you. Thanks for clarifying the discussion. I will implement it.

Gekko0114 on May 16, 2023

Hi @denkensk, I would like to hear your suggestions regarding this issue. Could you comment at your convenience?

Gekko0114 on May 10, 2023

Since Running and Scheduled are redundant, we can remove Scheduled.

Could Scheduled mean scheduled but not yet running? That may have been the original design intention.

With the above definitions, PodGroupStatus.Scheduled becomes unnecessary and can be removed

If we keep the Scheduled phase, then status.scheduled would also be kept.

The point may be whether there is a phase in which pods are scheduled but not running, and whether we should expose this phase to users.

For example, pods that are scheduled but stuck in ContainerCreating or some other status.

zwpaper on Apr 16, 2023