kubernetes: scheduler: add nodeAnnotationsChanged event to trigger rescheduling

What happened?

Due to limited character space, I store additional information in annotations on nodes. My custom scheduling plugins read these annotations to schedule Pods. When a Pod fails to be scheduled on a node, I want to be able to move it out of the unschedulable queue by modifying the node's annotations, so that the Pod is rescheduled.

Unfortunately, the eventHandler of the scheduler only supports five types of node events, and it does not include the nodeAnnotationsChanged event. Consequently, I am currently unable to trigger rescheduling by changing the annotations of nodes.
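
To illustrate the request, here is a minimal sketch of how a node update could be classified into change events, including an annotation change. It does not use the real scheduler code: the `Node` struct, the `NodeEvent` flags, and the `classifyNodeUpdate` helper are all hypothetical names introduced only for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

// Node is a simplified stand-in for a Kubernetes Node (assumption, not the real API type).
type Node struct {
	Labels      map[string]string
	Annotations map[string]string
}

// NodeEvent is a hypothetical bit flag describing which part of a node changed.
type NodeEvent uint8

const (
	NodeLabelChanged       NodeEvent = 1 << iota
	NodeAnnotationsChanged           // the event type this issue asks for
)

// sameMap compares two string maps, treating nil and empty maps as equal.
func sameMap(a, b map[string]string) bool {
	if len(a) == 0 && len(b) == 0 {
		return true
	}
	return reflect.DeepEqual(a, b)
}

// classifyNodeUpdate returns the set of events implied by an old/new node pair.
func classifyNodeUpdate(oldNode, newNode Node) NodeEvent {
	var ev NodeEvent
	if !sameMap(oldNode.Labels, newNode.Labels) {
		ev |= NodeLabelChanged
	}
	if !sameMap(oldNode.Annotations, newNode.Annotations) {
		ev |= NodeAnnotationsChanged
	}
	return ev
}

func main() {
	before := Node{Annotations: map[string]string{"example.com/extra-info": "v1"}}
	after := Node{Annotations: map[string]string{"example.com/extra-info": "v2"}}

	ev := classifyNodeUpdate(before, after)
	fmt.Println("annotations changed:", ev&NodeAnnotationsChanged != 0)
	fmt.Println("labels changed:", ev&NodeLabelChanged != 0)
}
```

With a flag like this, a plugin that reads annotations could ask to be notified only when annotations change, instead of on every node update.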

/sig scheduling

What did you expect to happen?

Unschedulable Pods should be rescheduled after a node's annotations change.

How can we reproduce it (as minimally and precisely as possible)?

None

Anything else we need to know?

No response

About this issue

  • State: open
  • Created a year ago
  • Comments: 28 (28 by maintainers)

Most upvoted comments

Sorry, the previous one was not right. Here is a more detailed one:

a. A Pod fails scheduling because of nodeAffinity and is placed into unschedulablePods.
b. The node autoscaler is triggered and adds a new node, which emits a node add event.
c. MoveAllToActiveOrBackoffQueue is called, which runs preCheckForNode.
d. The node is not ready yet and holds a notReady taint, so preCheckForNode fails and the Pod stays in unschedulablePods.
e. The node becomes ready and emits a node update event, but that event does not match the registered event (if the nodeAffinity plugin registers only UpdateNodeLabel).
f. The Pod stays in unschedulablePods.

I guess this is the right flow. Anyway, glad to see it become more effective in the future.
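
Steps a through f can be sketched as a small simulation. This is a simplified model under stated assumptions, not the scheduler's actual code: the `Event` constants only mirror names mentioned in the comment, and `Node`, `Pod`, `preCheck`, and `requeues` are hypothetical helpers for illustration.

```go
package main

import "fmt"

// Event is a local bit flag; the names only mirror those used in the comment above.
type Event uint8

const (
	NodeAdd Event = 1 << iota
	UpdateNodeLabel
	UpdateNodeTaint
)

// Node is a simplified node: a label set plus a flag for the notReady taint.
type Node struct {
	Labels   map[string]string
	NotReady bool
}

// Pod is a simplified pod whose nodeAffinity requires a single label key.
type Pod struct {
	AffinityKey string
	Registered  Event // events that are allowed to requeue this pod
}

// preCheck is a simplified stand-in for preCheckForNode: the node must
// satisfy the affinity and must not carry the notReady taint.
func preCheck(p Pod, n Node) bool {
	_, ok := n.Labels[p.AffinityKey]
	return ok && !n.NotReady
}

// requeues reports whether event ev on node n moves pod p out of unschedulablePods.
func requeues(p Pod, ev Event, n Node) bool {
	return p.Registered&ev != 0 && preCheck(p, n)
}

func main() {
	pod := Pod{AffinityKey: "zone", Registered: NodeAdd | UpdateNodeLabel}
	node := Node{Labels: map[string]string{"zone": "a"}, NotReady: true}

	// Steps b-d: the node add event arrives while the node is still not ready.
	fmt.Println("requeued on add:", requeues(pod, NodeAdd, node))

	// Step e: the taint is removed, which is a taint update, not a label update.
	node.NotReady = false
	fmt.Println("requeued on taint update:", requeues(pod, UpdateNodeTaint, node))

	// Registering for the taint update as well lets the pod leave the queue.
	pod.Registered |= UpdateNodeTaint
	fmt.Println("requeued after registering taint update:", requeues(pod, UpdateNodeTaint, node))
}
```

In this model the Pod is stuck because the only event that makes the node usable is one it never registered for. Missing annotation events would cause the same kind of gap for plugins that depend on annotations.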