k8s-device-plugin: nvidia-device-plugin container CrashLoopBackOff error

I deployed the device-plugin container on k8s by following the guide, but the container ends up in CrashLoopBackOff:

NAME                                   READY     STATUS             RESTARTS   AGE
nvidia-device-plugin-daemonset-zb8xn   0/1       CrashLoopBackOff   6          9m

When I run the image directly with

docker run -it -v /var/lib/kubelet/device-plugins:/var/lib/kubelet/device-plugins nvidia/k8s-device-plugin:1.8

I get this error:

2017/11/29 01:54:30 Loading NVML
2017/11/29 01:54:30 could not load NVML library

But I am pretty sure that the NVML library is installed. Did I miss anything here? How can I check whether the NVML library is installed?
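One way to approach the "is NVML installed?" question is to ask the dynamic loader to open the library, which is roughly what a program has to do before it can use NVML. The sketch below is only an illustration, not the plugin's own code. It assumes a Linux system where the driver ships NVML under the soname libnvidia-ml.so.1; the helper name can_load_library is made up for this example.

```python
import ctypes

# Assumption: on Linux the NVIDIA driver installs NVML as libnvidia-ml.so.1.
NVML_SONAME = "libnvidia-ml.so.1"


def can_load_library(soname: str = NVML_SONAME) -> bool:
    """Return True if the dynamic loader can find and open `soname`.

    Hypothetical helper for illustration; it only checks that the shared
    object can be opened, not that the driver or GPU is usable.
    """
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library is missing or cannot be opened.
        return False
    return True


if __name__ == "__main__":
    status = "loadable" if can_load_library() else "NOT loadable"
    print(f"{NVML_SONAME} is {status} in this environment")
```

The result only describes the environment the script runs in. A container has its own filesystem, so a library that loads on the host is not necessarily visible inside the container started by the docker run command above; running the same check inside the container would show what the plugin itself can see.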

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 15 (9 by maintainers)

Most upvoted comments

@flx42 Hello, I noticed that the GPU must meet the requirements below:

    NVIDIA GPU with Architecture > Fermi (2.1)
    NVIDIA drivers ~= 361.93 (untested on older versions)

Ref: https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)#prerequisites

My GPU is a GeForce GT 730 and the driver version is 384.130. It doesn't work and fails with this error:

2018/11/22 11:17:25 Warning: GPU-6e5aa18f-ba28-e70b-57cc-95f1be4b178b is too old to support healthchecking: nvml: Not Supported. Marking it unhealthy.

So I am wondering whether the architecture is the problem. But after a lot of trials I still don't know how to find out the architecture of my GPU. Do you have any clue how to get the architecture of an NVIDIA GPU? Thanks.
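A possible way to reason about this: the architecture name follows from the GPU's CUDA compute capability, which NVIDIA lists per model and which the CUDA deviceQuery sample prints. The sketch below maps a compute capability to an architecture name and applies the "Architecture > Fermi (2.1)" requirement quoted above. The function names are hypothetical, and the table covers only the generations up to Hopper that I am confident about.

```python
# Compute capability major version -> architecture, where one major version
# corresponds to a single architecture.
_ARCH_BY_MAJOR = {
    1: "Tesla",
    2: "Fermi",
    3: "Kepler",
    5: "Maxwell",
    6: "Pascal",
    9: "Hopper",
}


def architecture_name(major: int, minor: int) -> str:
    """Map a CUDA compute capability to an architecture name (illustrative)."""
    if major == 7:
        # 7.0 and 7.2 are Volta, 7.5 is Turing.
        return "Turing" if minor >= 5 else "Volta"
    if major == 8:
        # 8.0, 8.6 and 8.7 are Ampere, 8.9 is Ada Lovelace.
        return "Ada Lovelace" if minor >= 9 else "Ampere"
    return _ARCH_BY_MAJOR.get(major, "unknown")


def is_newer_than_fermi(major: int, minor: int) -> bool:
    """Check the quoted requirement 'Architecture > Fermi (2.1)'."""
    return (major, minor) > (2, 1)


if __name__ == "__main__":
    # Example values only; look up the real compute capability of the card.
    for cc in [(2, 1), (3, 5)]:
        print(cc, architecture_name(*cc), is_newer_than_fermi(*cc))
```

The model name alone may not be enough here: as far as I know, the GeForce GT 730 was sold in both Fermi-based and Kepler-based variants, so the compute capability of the specific card is what decides which side of the requirement it falls on.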