kubernetes: Optimize scheduler's FitError
What would you like to be added?
The scheduler’s FitError wraps the information needed to diagnose a pod’s scheduling failure. The object is dumped to a system event and also injected into the pod’s status. However, the way FitError is currently handled is inefficient in the following respects:
1. Unnecessary compose/decompose
When a FitError gets wrapped into a Status, it’s achieved by either framework.NewStatus(..., err.Error()) or framework.AsStatus(err).
The first function decomposes the err and composes another error. The second function, although it injects the err into the Status, calls err.Error(), which should be lazily evaluated.
The optimization should try to reuse the same FitError as much as possible, and also delay its evaluation (i.e., the call to FitError.Error()).
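As a rough illustration of the lazy-evaluation idea, the Go sketch below keeps the original error inside a wrapper and renders its message only on request. The types fitError and status, and the helpers asStatus, Message and AsError, are hypothetical stand-ins invented for this example; they are not the scheduler framework's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// fitError is a hypothetical, simplified stand-in for the scheduler's FitError.
type fitError struct {
	pod     string
	reasons map[string]string // node name -> failure reason
	calls   int               // how many times Error() has been evaluated
}

// Error builds the message on demand; with many nodes this is the costly part.
func (e *fitError) Error() string {
	e.calls++
	nodes := make([]string, 0, len(e.reasons))
	for n := range e.reasons {
		nodes = append(nodes, n)
	}
	sort.Strings(nodes) // deterministic order for the message
	parts := make([]string, 0, len(nodes))
	for _, n := range nodes {
		parts = append(parts, n+": "+e.reasons[n])
	}
	return fmt.Sprintf("pod %s does not fit: %s", e.pod, strings.Join(parts, "; "))
}

// status is a hypothetical wrapper that keeps the error itself, not its string.
type status struct {
	err error
}

// asStatus stores err as-is and does not call err.Error().
func asStatus(err error) *status { return &status{err: err} }

// Message renders the error text only when a caller actually asks for it.
func (s *status) Message() string {
	if s == nil || s.err == nil {
		return ""
	}
	return s.err.Error()
}

// AsError returns the wrapped error so callers can reuse the same object.
func (s *status) AsError() error { return s.err }

func main() {
	fe := &fitError{pod: "default/nginx", reasons: map[string]string{
		"node-a": "Insufficient cpu",
		"node-b": "untolerated taint",
	}}
	st := asStatus(fe)
	fmt.Println("Error() calls after wrapping:", fe.calls)

	// The original error is still reachable without any string round trip.
	var got *fitError
	fmt.Println("same object recovered:", errors.As(st.AsError(), &got) && got == fe)

	fmt.Println(st.Message())
	fmt.Println("Error() calls after Message():", fe.calls)
}
```

Because the wrapper holds the same error value, a caller can recover the original fitError with errors.As instead of rebuilding it from a string, and the message is built only if something asks for it.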
2. Duplicated calls to FitError.Error()
Suppose we have solved the problems mentioned above and hence have a unique FitError across one scheduling cycle. The current logic, which calls FitError.Error() multiple times, is still problematic. Currently it may be called 3 times during a scheduling cycle:
Why is this needed?
FitError.Error() is not a trivial function, especially in a large cluster or when the failure reasons are diverse. We should eliminate unnecessary calls (explicit ones, or implicit ones made by klog) as much as possible.
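One possible way to make repeated calls cheap, sketched below in Go, is to memoize the rendered message so that the expensive construction runs at most once per error value. This is only an assumption about how the duplication could be addressed, not necessarily the approach taken in Kubernetes. The type cachedFitError and its build field are hypothetical, and the three consumers in main merely simulate a log line, an event and a pod condition.

```go
package main

import (
	"fmt"
	"sync"
)

// cachedFitError is a hypothetical type, not the scheduler's actual FitError.
// It assumes the underlying failure data is not mutated after the first call.
type cachedFitError struct {
	build func() string // the expensive message construction
	once  sync.Once
	msg   string
}

// Error runs build at most once and returns the cached text afterwards.
func (e *cachedFitError) Error() string {
	e.once.Do(func() { e.msg = e.build() })
	return e.msg
}

func main() {
	builds := 0
	err := &cachedFitError{build: func() string {
		builds++ // count how often the expensive path actually runs
		return "0/3 nodes are available: 3 Insufficient cpu."
	}}

	// Three simulated consumers within one scheduling cycle.
	logLine := fmt.Sprintf("scheduling failed: %v", err) // %v calls Error() implicitly
	event := err.Error()
	condition := err.Error()

	fmt.Println(logLine)
	fmt.Println(event == condition)
	fmt.Println("message built", builds, "time(s)")
}
```

The trade-off is that the cached text goes stale if the error's contents change after the first call, so this sketch only fits an error that is treated as immutable once created.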
About this issue
- State: closed
- Created 2 years ago
- Comments: 17 (12 by maintainers)
https://github.com/kubernetes/kubernetes/issues/103853#issuecomment-889370324 😃