kubernetes: pod_scheduling_duration_seconds includes the time during which a Pod fails PreEnqueue
What happened?
pod_scheduling_duration_seconds includes the time during which a Pod is gated.
The metric's start time is the timestamp at which the scheduler inserts the Pod into the queue: https://github.com/kubernetes/kubernetes/blob/84c8abfb8bf900ce36f7ebfbc52794bad972d8cc/pkg/scheduler/internal/queue/scheduling_queue.go#L402
What did you expect to happen?
The period during which a Pod fails PreEnqueue (for example, while it is gated) shouldn’t be counted in pod_scheduling_duration_seconds.
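To illustrate the difference between the observed and the expected behavior, here is a minimal Go sketch. It is not the scheduler's code: queuedPod, enqueue, ungate, schedulingDuration and simulate are hypothetical names made up for this example, and the idea of leaving the start timestamp unset until the Pod first passes PreEnqueue is only one possible approach, shown under that assumption.

```go
package main

import "time"

// queuedPod is a hypothetical, simplified stand-in for the scheduler's
// per-Pod queue bookkeeping. It is not the real scheduler type.
type queuedPod struct {
	gated bool
	// attemptStart stays nil until the Pod first passes PreEnqueue.
	attemptStart *time.Time
}

// enqueue models the scheduler inserting a Pod into the queue at time now.
// A gated Pod does not get a start timestamp yet.
func enqueue(gated bool, now time.Time) *queuedPod {
	p := &queuedPod{gated: gated}
	if !gated {
		p.attemptStart = &now
	}
	return p
}

// ungate models the Pod passing PreEnqueue at time now. The start
// timestamp is set only if it was not set before.
func (p *queuedPod) ungate(now time.Time) {
	p.gated = false
	if p.attemptStart == nil {
		p.attemptStart = &now
	}
}

// schedulingDuration returns the time elapsed since the Pod first became
// eligible for scheduling, or zero if it never passed PreEnqueue.
func (p *queuedPod) schedulingDuration(now time.Time) time.Duration {
	if p.attemptStart == nil {
		return 0
	}
	return now.Sub(*p.attemptStart)
}

// simulate walks a gated Pod through insert, ungate and scheduling, and
// returns both durations in seconds: naive is measured from queue
// insertion (the reported behavior), adjusted from the ungate time.
func simulate(gatedSeconds, schedulingSeconds int) (naive, adjusted float64) {
	insertedAt := time.Unix(0, 0)
	p := enqueue(true, insertedAt)

	ungatedAt := insertedAt.Add(time.Duration(gatedSeconds) * time.Second)
	p.ungate(ungatedAt)

	scheduledAt := ungatedAt.Add(time.Duration(schedulingSeconds) * time.Second)
	naive = scheduledAt.Sub(insertedAt).Seconds()
	adjusted = p.schedulingDuration(scheduledAt).Seconds()
	return naive, adjusted
}
```

Calling simulate(300, 2) from a main function returns 302 for the naive duration and 2 for the adjusted one: a Pod gated for five minutes and scheduled two seconds after the gate is removed is reported as taking 302 seconds when the clock starts at queue insertion.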
How can we reproduce it (as minimally and precisely as possible)?
Create a Pod with scheduling gates.
Wait some time before removing the gate.
Observe the pod_scheduling_duration_seconds metric.
Anything else we need to know?
No response
Kubernetes version
1.26+
Cloud provider
Any
About this issue
- State: closed
- Created a year ago
- Comments: 23 (22 by maintainers)
We don’t have a conclusion yet, but as a bug, it should qualify for changes after the freezes.
@helayoty would you like to take a stab at this issue?