kubernetes: when the node restarts, the NUMA information of the device cannot be obtained from the checkpoint
What happened:
When the node restarts, the NUMA information for the devices cannot be obtained from the checkpoint (possibly because the device plugin starts slowly), resulting in an incorrect cpuset setting.
Tracing through the code shows that when the kubelet is restarted, the topology information is not retained in the checkpoint, so m.allDevices is nil: https://github.com/kubernetes/kubernetes/blob/ea0764452222146c47ec826977f49d7001b0ea8c/pkg/kubelet/cm/devicemanager/topology_hints.go#L131-L139
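To illustrate the failure mode, here is a minimal Go sketch (hypothetical, simplified types, not the actual kubelet source) of why a nil device map forces the hint provider to report no NUMA preference:

```go
package main

import "fmt"

// Device is a hypothetical, simplified stand-in for the device manager's
// per-device record; the real kubelet types differ.
type Device struct {
	ID   string
	NUMA []int // nil when topology was not restored from the checkpoint
}

// numaNodesFor collects the NUMA nodes of the allocated devices. If the
// device map was rebuilt from a checkpoint without topology (or is nil,
// as in this issue), no affinity can be computed, and the caller falls
// back to "no preference for NUMA affinity".
func numaNodesFor(allDevices map[string]Device, allocated []string) []int {
	var nodes []int
	for _, id := range allocated {
		dev, ok := allDevices[id]
		if !ok || dev.NUMA == nil {
			return nil // topology unknown: no hint can be generated
		}
		nodes = append(nodes, dev.NUMA...)
	}
	return nodes
}

func main() {
	var allDevices map[string]Device // nil after restart, before re-registration
	fmt.Println(numaNodesFor(allDevices, []string{"gpu-0"})) // prints []
}
```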
What you expected to happen:
m.allDevices should be restored from the checkpoint so that the NUMA information can be obtained correctly.
How to reproduce it (as minimally and precisely as possible):
- The device plugin reports the device information along with the associated NUMA nodes, and a pod is normally scheduled to the node (a sketch of the topology payload follows this list).
- Restart the node. The device plugin may not have started yet, or may start slowly; meanwhile, the original containers are gradually being restarted.
- The kubelet prints the message "[topologymanager] Hint Provider has no preference for NUMA affinity with any resource".
- The original container cannot get NUMA information from the device plugin.
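For reference, here is a minimal sketch of how a device plugin attaches NUMA topology to a device it advertises, using the v1beta1 device plugin API (the device ID and NUMA node are illustrative):

```go
package main

import (
	"fmt"

	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

func main() {
	// Each device advertised in a ListAndWatch response may carry its NUMA
	// node; this is the topology information the kubelet's device manager
	// feeds to the topology manager when generating hints.
	dev := &pluginapi.Device{
		ID:     "gpu-0", // hypothetical device ID
		Health: pluginapi.Healthy,
		Topology: &pluginapi.TopologyInfo{
			Nodes: []*pluginapi.NUMANode{{ID: 0}},
		},
	}
	fmt.Printf("device %s is on NUMA node %d\n", dev.ID, dev.Topology.Nodes[0].ID)
}
```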
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`):
- Cloud provider or hardware configuration:
- OS (e.g. `cat /etc/os-release`):
- Kernel (e.g. `uname -a`):
- Install tools:
- Network plugin and version (if this is a network-related bug):
- Others:
In the restart flow, the kubelet resets the allocatable device count to zero until the device plugin registers itself again (https://github.com/kubernetes/kubernetes/blob/release-1.21/pkg/kubelet/cm/devicemanager/manager.go#L517). That re-registration can indeed be slow, but pods should not reach the node until the devices are reported allocatable. When the device plugin registers itself again, it reports the topology information, so we're good.
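A minimal sketch (hypothetical types, not the actual manager.go) of the behavior described above: device counts restored from the checkpoint contribute zero allocatable capacity until the plugin re-registers with a fresh, topology-bearing device list.

```go
package main

import "fmt"

type endpoint struct{ registered bool }

type manager struct {
	endpoints map[string]*endpoint // resource name -> plugin endpoint
	devices   map[string]int       // resource name -> device count from checkpoint
}

// allocatable reports 0 for any resource whose plugin has not re-registered
// after a kubelet restart, so new pods requesting it should not land on the
// node until fresh topology information is available.
func (m *manager) allocatable(resource string) int {
	ep, ok := m.endpoints[resource]
	if !ok || !ep.registered {
		return 0
	}
	return m.devices[resource]
}

func main() {
	m := &manager{
		endpoints: map[string]*endpoint{},
		devices:   map[string]int{"vendor.com/gpu": 4}, // restored from checkpoint
	}
	fmt.Println(m.allocatable("vendor.com/gpu")) // 0: plugin not yet re-registered

	m.endpoints["vendor.com/gpu"] = &endpoint{registered: true}
	fmt.Println(m.allocatable("vendor.com/gpu")) // 4: plugin re-registered
}
```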
It seems to me that you may simply have hit https://github.com/kubernetes/kubernetes/issues/102880.