kubernetes: Optimize scheduler's FitError

What would you like to be added?

The scheduler’s FitError wraps the information needed to diagnose a pod’s scheduling failure. The object is dumped to a system event and also injected into the pod’s status. However, the way FitError is currently handled is inefficient in the following respects:

1. Unnecessary compose/decompose

When a FitError gets wrapped into a Status, it’s achieved by:

framework.NewStatus(..., err.Error()), or
framework.AsStatus(err)

The first function decomposes the err and composes another error. The second function does inject the err into the Status, but it also calls err.Error(), which should be lazily evaluated.

The optimization should reuse the same FitError as much as possible, and also delay its evaluation (i.e., the call to FitError.Error()) until the message is actually needed.
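To illustrate the idea, here is a minimal sketch of a status that keeps a reference to the original error and only renders the message on demand. The types `fitError` and `status` and the helper `asStatusLazy` are hypothetical stand-ins invented for this example; they are not the scheduler framework's actual types or API, and the render counter exists only to make the cost visible.

```go
package main

import (
	"errors"
	"fmt"
)

// fitError is a hypothetical stand-in for the scheduler's FitError.
// renderCount records how many times Error() has been evaluated.
type fitError struct {
	podName     string
	numNodes    int
	renderCount int
}

// Error builds the message; in a real cluster this is the expensive part.
func (e *fitError) Error() string {
	e.renderCount++
	return fmt.Sprintf("pod %q: 0/%d nodes are available", e.podName, e.numNodes)
}

// status is a hypothetical stand-in for a framework status.
// It stores the error itself instead of a pre-rendered string.
type status struct {
	code int
	err  error
}

// asStatusLazy wraps err without calling err.Error().
func asStatusLazy(err error) *status {
	return &status{code: 1, err: err}
}

// Message renders the error text only when a caller asks for it.
func (s *status) Message() string {
	if s.err == nil {
		return ""
	}
	return s.err.Error()
}

// AsError returns the original error so callers can reuse it as-is.
func (s *status) AsError() error {
	return s.err
}

func main() {
	fe := &fitError{podName: "demo", numNodes: 3}
	st := asStatusLazy(fe)

	// The same FitError is recoverable from the status without re-composing it.
	var got *fitError
	fmt.Println(errors.As(st.AsError(), &got), got == fe)

	fmt.Println(fe.renderCount) // nothing has been rendered yet
	fmt.Println(st.Message())
	fmt.Println(fe.renderCount) // rendered once, on demand
}
```

With this shape, wrapping is just a pointer copy, and the caller that finally needs the text pays for the rendering.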

2. Duplicated calls to FitError.Error()

Suppose we have solved the problems mentioned above and hence have a unique FitError across one scheduling cycle. The current logic, which calls FitError.Error() multiple times, is still problematic. Currently it may be called ~4~ 3 times during a scheduling cycle:

~composing a PostFilter status (A)~
duplicated entries when dealing with the error (B, C, D)
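One possible way to make repeated calls cheap is to memoize the rendered message inside the error itself. The sketch below uses `sync.Once` from the Go standard library; `cachedFitError` and its fields are hypothetical names for illustration and do not reflect how the scheduler actually implements FitError. It assumes the error's fields are not mutated after the first call to Error(), otherwise the cached text would go stale.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// cachedFitError is a hypothetical error type that renders its message once.
// It must be used through a pointer because it embeds a sync.Once.
type cachedFitError struct {
	podName string
	reasons []string

	once    sync.Once
	msg     string
	renders int // counts real renderings, for demonstration only
}

// Error builds the message on the first call and returns the cached
// string on every later call, including calls from concurrent goroutines.
func (e *cachedFitError) Error() string {
	e.once.Do(func() {
		e.renders++
		e.msg = fmt.Sprintf("pod %q is unschedulable: %s",
			e.podName, strings.Join(e.reasons, ", "))
	})
	return e.msg
}

func main() {
	err := &cachedFitError{
		podName: "demo",
		reasons: []string{"Insufficient cpu", "node(s) had taints"},
	}

	// Simulate several consumers (status, event, log) asking for the text.
	for i := 0; i < 3; i++ {
		_ = err.Error()
	}
	fmt.Println(err.renders) // prints 1
}
```

Memoization does not remove the duplicated call sites, but it bounds their cost to a single rendering per FitError.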

Why is this needed?

FitError.Error() is not a trivial function, especially in a large cluster or when the failure reasons are diverse. We should eliminate unnecessary calls (explicit ones, or implicit ones made by klog) as much as possible.

About this issue

State: closed
Created 2 years ago
Comments: 17 (12 by maintainers)

Most upvoted comments

https://github.com/kubernetes/kubernetes/issues/103853#issuecomment-889370324 😃

alculquicondor on Jul 7, 2022