gpustat: Failed to run "gpustat --debug": pynvml.NVMLError_LibraryNotFound: NVML Shared Library Not Found

Hi,

On Ubuntu 20.04 with Python 3.8.3, running gpustat --debug fails as shown below:

$ gpustat --debug
Error on querying NVIDIA devices. Use --debug flag for details
Traceback (most recent call last):
  File "/home/werner/.pyenv/versions/3.8.3/envs/socks5-haproxy/lib/python3.8/site-packages/pynvml.py", line 644, in _LoadNvmlLibrary
    nvmlLib = CDLL("libnvidia-ml.so.1")
  File "/home/werner/.pyenv/versions/3.8.3/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvidia-ml.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/werner/.pyenv/versions/3.8.3/envs/socks5-haproxy/lib/python3.8/site-packages/gpustat/__main__.py", line 19, in print_gpustat
    gpu_stats = GPUStatCollection.new_query()
  File "/home/werner/.pyenv/versions/3.8.3/envs/socks5-haproxy/lib/python3.8/site-packages/gpustat/core.py", line 281, in new_query
    N.nvmlInit()
  File "/home/werner/.pyenv/versions/3.8.3/envs/socks5-haproxy/lib/python3.8/site-packages/pynvml.py", line 608, in nvmlInit
    _LoadNvmlLibrary()
  File "/home/werner/.pyenv/versions/3.8.3/envs/socks5-haproxy/lib/python3.8/site-packages/pynvml.py", line 646, in _LoadNvmlLibrary
    _nvmlCheckReturn(NVML_ERROR_LIBRARY_NOT_FOUND)
  File "/home/werner/.pyenv/versions/3.8.3/envs/socks5-haproxy/lib/python3.8/site-packages/pynvml.py", line 310, in _nvmlCheckReturn
    raise NVMLError(ret)
pynvml.NVMLError_LibraryNotFound: NVML Shared Library Not Found
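
The first traceback shows that the failure comes from ctypes being unable to open libnvidia-ml.so.1, before any gpustat logic runs. One way to check this independently of gpustat and pynvml is to attempt the same CDLL call directly. The following is only a minimal sketch; the helper name can_load is made up for illustration and is not part of pynvml or gpustat.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library.

    Hypothetical helper for illustration; it repeats the CDLL call that
    appears in the traceback above.
    """
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False
    return True


if __name__ == "__main__":
    # "libnvidia-ml.so.1" is the library name taken from the traceback.
    print("libnvidia-ml.so.1 loadable:", can_load("libnvidia-ml.so.1"))
```

If this prints False, the loader cannot find the library at all, which suggests looking at the driver installation or the library search path rather than at gpustat itself.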

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 18 (4 by maintainers)

Most upvoted comments

This was my solution; I hope it helps someone:

pynvml looks for nvml.dll in “C:\Program Files\NVIDIA Corporation\NVSMI” and “C:\Windows\System32”, but the new installer puts the file in “C:\Windows\System32\DriverStore\FileRepository\nv_dispi.inf_amd64_aXXXXXXXXXXXXXX”. Just copy the DLL from “FileRepository” to the “Program Files” location.

If there is no “NVSMI” folder inside “C:\Program Files\NVIDIA Corporation”, create one and put the DLL inside.

The nvml.dll in System32 is 596 KB, while the one inside “FileRepository” is 1051 KB. If there is already an nvml.dll inside “Program Files” but it is the 596 KB version, replace it with the 1051 KB one.

Make sure to right-click and copy the file rather than dragging it. Dragging moves the original file out of “FileRepository”, and you will not have the privileges to copy it back or undo the move.

You can solve this issue as follows:

  1. Search for the “nvml.dll” file in “C:\Windows\System32\DriverStore\FileRepository”
  2. Copy the “nvml.dll” file to “C:\Program Files\NVIDIA Corporation\NVSMI” (create the NVSMI folder yourself if it does not exist)
  3. Done
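
The steps above can also be scripted. The following is a rough sketch rather than a tested fix: the function names find_nvml and copy_nvml are made up for illustration, picking the most recently modified nvml.dll when several are found is an assumption, and writing into “Program Files” will most likely require an elevated (administrator) prompt. It copies the file and never moves it, so the original stays in the DriverStore.

```python
import shutil
from pathlib import Path

# Paths taken from the comments in this thread.
DRIVER_STORE = Path(r"C:\Windows\System32\DriverStore\FileRepository")
NVSMI_DIR = Path(r"C:\Program Files\NVIDIA Corporation\NVSMI")


def find_nvml(root):
    """Return every nvml.dll found under root, newest first.

    Hypothetical helper. Sorting by modification time is an assumption
    about which copy is wanted when several driver folders exist.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        root.rglob("nvml.dll"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )


def copy_nvml(source, dest_dir):
    """Copy (not move) source into dest_dir as nvml.dll, creating the folder."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, dest_dir / "nvml.dll"))


if __name__ == "__main__":
    candidates = find_nvml(DRIVER_STORE)
    if candidates:
        print("Copied to", copy_nvml(candidates[0], NVSMI_DIR))
    else:
        print("No nvml.dll found under", DRIVER_STORE)
```

Searching recursively avoids hard-coding the driver folder name, which differs between installations.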

Thanks a ton, I was running into this issue earlier while working with some Pytorch/fastai models. Now it seems good. Thanks again.

This works for me with a slight change: nvml.dll is now located in C:\Windows\System32\DriverStore\FileRepository\nvrzui.inf_amd64_8df10ddaac270452