k8s-device-plugin: nvidia-device-plugin container CrashLoopBackOff error
I deployed the device-plugin container on k8s following the guide. However, the pod ends up in CrashLoopBackOff:
NAME READY STATUS RESTARTS AGE
nvidia-device-plugin-daemonset-zb8xn 0/1 CrashLoopBackOff 6 9m
And when I run
docker run -it -v /var/lib/kubelet/device-plugins:/var/lib/kubelet/device-plugins nvidia/k8s-device-plugin:1.8
I get an error like this:
2017/11/29 01:54:30 Loading NVML
2017/11/29 01:54:30 could not load NVML library
But I am pretty sure that I have installed the NVML library. Did I miss anything here? How can I check whether the NVML library is installed?
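One way to check this is sketched below. It assumes a Linux host, where the NVIDIA driver ships NVML as the shared library libnvidia-ml.so.1, and it simply asks the dynamic loader to open that file. The function name nvml_loadable is made up for this example and is not part of the device plugin.

```python
import ctypes


def nvml_loadable(soname: str = "libnvidia-ml.so.1") -> bool:
    """Return True if the dynamic loader can open the given shared library.

    The default soname is the NVML library name on Linux (assumption:
    a Linux host with the NVIDIA driver installed in a standard location).
    """
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    print("NVML loadable:", nvml_loadable())
```

Note that the result only describes the environment where the script runs. If it prints True on the host but the plugin still fails inside a container, the library is probably not visible inside that container; a plain docker run without the NVIDIA runtime typically does not expose the host's driver libraries.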
About this issue
- State: closed
- Created 7 years ago
- Comments: 15 (9 by maintainers)
Commits related to this issue
- Merge pull request #11 from Meoop/release-v0.7.1-er feat(*): sync device information to extended resources — committed to Meoop/k8s-device-plugin by Meoop 4 years ago
@flx42 Hello, I noticed that the GPU must meet the requirements below:
ref:https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)#prerequisites
My GPU is a GeForce GT 730 and the driver version is 384.130. It does not work and fails with this error:
2018/11/22 11:17:25 Warning: GPU-6e5aa18f-ba28-e70b-57cc-95f1be4b178b is too old to support healthchecking: nvml: Not Supported. Marking it unhealthy.
So I am wondering whether the architecture is the requirement my GPU fails to meet. But after a lot of trials I still don't know how to figure out the architecture of my GPU. Do you have any clue how to find out the architecture of an NVIDIA GPU? Thanks
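A rough way to answer this, assuming you already know the card's CUDA compute capability (for example from NVIDIA's published list of CUDA GPUs): the major version of the compute capability maps to an architecture name. The sketch below encodes that mapping. The function names architecture and newer_than_fermi are hypothetical, and the check assumes the linked prerequisites ask for an architecture newer than Fermi.

```python
# Compute capability major version -> architecture name.
# Majors 7 and 8 cover more than one architecture and are handled below.
_ARCH_BY_MAJOR = {
    1: "Tesla",
    2: "Fermi",
    3: "Kepler",
    5: "Maxwell",
    6: "Pascal",
    9: "Hopper",
}


def architecture(major: int, minor: int) -> str:
    """Map a CUDA compute capability (major.minor) to an architecture name."""
    if major == 7:
        # 7.0 and 7.2 are Volta, 7.5 is Turing.
        return "Turing" if minor >= 5 else "Volta"
    if major == 8:
        # 8.9 is Ada Lovelace, the other 8.x values are Ampere.
        return "Ada Lovelace" if minor == 9 else "Ampere"
    return _ARCH_BY_MAJOR.get(major, "unknown")


def newer_than_fermi(major: int, minor: int) -> bool:
    """Assumed prerequisite check: architecture must be newer than Fermi (2.x)."""
    return major >= 3


if __name__ == "__main__":
    for cc in [(2, 1), (3, 5)]:
        print(cc, architecture(*cc), newer_than_fermi(*cc))
```

As far as I know, the GT 730 was sold in both Fermi-based (compute capability 2.1) and Kepler-based (3.5) variants, so the model name alone may not settle the question; the compute capability of the specific card does.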