nvidia-container-toolkit: nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1
Hello,
I tried the various combinations of conda and pip packages that people suggest to get TensorFlow running on an RTX 30-series card. I thought it was working after the GPU was utilized by the Keras tutorial code, but when I moved to a different type of model something apparently broke.
Now I’m trying the docker route.
docker run --gpus all -it --rm nvcr.io/nvidia/tensorflow:22.11-tf2-py3
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
There seem to be a lot of missing libraries.
3. Information to attach (optional if deemed irrelevant)
- [x] Some nvidia-container information:
nvidia-container-cli -k -d /dev/tty info
- I1202 15:15:34.407243 26518 nvc.c:376] initializing library context (version=1.11.0, build=)
I1202 15:15:34.407353 26518 nvc.c:350] using root /
I1202 15:15:34.407365 26518 nvc.c:351] using ldcache /etc/ld.so.cache
I1202 15:15:34.407377 26518 nvc.c:352] using unprivileged user 1000:1000
I1202 15:15:34.407426 26518 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I1202 15:15:34.408137 26518 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
W1202 15:15:34.411623 26519 nvc.c:273] failed to set inheritable capabilities
W1202 15:15:34.411736 26519 nvc.c:274] skipping kernel modules load due to failure
I1202 15:15:34.412602 26520 rpc.c:71] starting driver rpc service
I1202 15:15:34.433974 26521 rpc.c:71] starting nvcgo rpc service
I1202 15:15:34.438005 26518 nvc_info.c:766] requesting driver information with ''
I1202 15:15:34.445181 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.520.56.06
I1202 15:15:34.445313 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.520.56.06
I1202 15:15:34.445952 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.520.56.06
I1202 15:15:34.446254 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.520.56.06
I1202 15:15:34.446554 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.520.56.06
I1202 15:15:34.446877 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.520.56.06
I1202 15:15:34.447241 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.520.56.06
I1202 15:15:34.447301 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.520.56.06
I1202 15:15:34.447405 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.520.56.06
I1202 15:15:34.447490 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.520.56.06
I1202 15:15:34.447550 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.520.56.06
I1202 15:15:34.447813 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.520.56.06
I1202 15:15:34.448099 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.520.56.06
I1202 15:15:34.448197 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.520.56.06
I1202 15:15:34.448693 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.520.56.06
I1202 15:15:34.448755 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.520.56.06
I1202 15:15:34.449075 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.520.56.06
I1202 15:15:34.449417 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.520.56.06
I1202 15:15:34.450211 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcudadebugger.so.520.56.06
I1202 15:15:34.450273 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.520.56.06
I1202 15:15:34.450625 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.520.56.06
I1202 15:15:34.450896 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.520.56.06
I1202 15:15:34.451174 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.520.56.06
I1202 15:15:34.451236 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.520.56.06
I1202 15:15:34.451580 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-tls.so.520.56.06
I1202 15:15:34.451929 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.520.56.06
I1202 15:15:34.452169 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.520.56.06
I1202 15:15:34.452413 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opencl.so.520.56.06
I1202 15:15:34.452680 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ml.so.520.56.06
I1202 15:15:34.452975 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glvkspirv.so.520.56.06
I1202 15:15:34.453288 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glsi.so.520.56.06
I1202 15:15:34.453571 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glcore.so.520.56.06
I1202 15:15:34.453833 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-fbc.so.520.56.06
I1202 15:15:34.454141 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-encode.so.520.56.06
I1202 15:15:34.454359 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-eglcore.so.520.56.06
I1202 15:15:34.455059 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-compiler.so.520.56.06
I1202 15:15:34.455764 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-allocator.so.520.56.06
I1202 15:15:34.456075 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvcuvid.so.520.56.06
I1202 15:15:34.456395 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libcuda.so.520.56.06
I1202 15:15:34.456750 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLX_nvidia.so.520.56.06
I1202 15:15:34.457050 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv2_nvidia.so.520.56.06
I1202 15:15:34.457314 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.520.56.06
I1202 15:15:34.457580 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libEGL_nvidia.so.520.56.06
W1202 15:15:34.457645 26518 nvc_info.c:399] missing library libnvidia-nscq.so
W1202 15:15:34.457659 26518 nvc_info.c:399] missing library libnvidia-fatbinaryloader.so
W1202 15:15:34.457678 26518 nvc_info.c:399] missing library libnvidia-pkcs11.so
W1202 15:15:34.457694 26518 nvc_info.c:399] missing library libvdpau_nvidia.so
W1202 15:15:34.457709 26518 nvc_info.c:399] missing library libnvidia-ifr.so
W1202 15:15:34.457722 26518 nvc_info.c:399] missing library libnvidia-cbl.so
W1202 15:15:34.457740 26518 nvc_info.c:403] missing compat32 library libnvidia-cfg.so
W1202 15:15:34.457753 26518 nvc_info.c:403] missing compat32 library libnvidia-nscq.so
W1202 15:15:34.457768 26518 nvc_info.c:403] missing compat32 library libcudadebugger.so
W1202 15:15:34.457780 26518 nvc_info.c:403] missing compat32 library libnvidia-fatbinaryloader.so
W1202 15:15:34.457792 26518 nvc_info.c:403] missing compat32 library libnvidia-pkcs11.so
W1202 15:15:34.457808 26518 nvc_info.c:403] missing compat32 library libnvidia-ngx.so
W1202 15:15:34.457828 26518 nvc_info.c:403] missing compat32 library libvdpau_nvidia.so
W1202 15:15:34.457843 26518 nvc_info.c:403] missing compat32 library libnvidia-ifr.so
W1202 15:15:34.457860 26518 nvc_info.c:403] missing compat32 library libnvidia-rtcore.so
W1202 15:15:34.457880 26518 nvc_info.c:403] missing compat32 library libnvoptix.so
W1202 15:15:34.457894 26518 nvc_info.c:403] missing compat32 library libnvidia-cbl.so
I1202 15:15:34.460121 26518 nvc_info.c:299] selecting /usr/bin/nvidia-smi
I1202 15:15:34.460197 26518 nvc_info.c:299] selecting /usr/bin/nvidia-debugdump
I1202 15:15:34.460243 26518 nvc_info.c:299] selecting /usr/bin/nvidia-persistenced
I1202 15:15:34.460336 26518 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-control
I1202 15:15:34.460409 26518 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-server
W1202 15:15:34.460616 26518 nvc_info.c:425] missing binary nv-fabricmanager
I1202 15:15:34.460810 26518 nvc_info.c:343] listing firmware path /usr/lib/firmware/nvidia/520.56.06/gsp.bin
I1202 15:15:34.460876 26518 nvc_info.c:529] listing device /dev/nvidiactl
I1202 15:15:34.460891 26518 nvc_info.c:529] listing device /dev/nvidia-uvm
I1202 15:15:34.460904 26518 nvc_info.c:529] listing device /dev/nvidia-uvm-tools
I1202 15:15:34.460915 26518 nvc_info.c:529] listing device /dev/nvidia-modeset
I1202 15:15:34.460980 26518 nvc_info.c:343] listing ipc path /run/nvidia-persistenced/socket
W1202 15:15:34.461036 26518 nvc_info.c:349] missing ipc path /var/run/nvidia-fabricmanager/socket
W1202 15:15:34.461083 26518 nvc_info.c:349] missing ipc path /tmp/nvidia-mps
I1202 15:15:34.461100 26518 nvc_info.c:822] requesting device information with ''
I1202 15:15:34.468056 26518 nvc_info.c:713] listing device /dev/nvidia0 (GPU-ba9fdcdb-8a2b-d2b6-f69c-5f2ac08dde8b at 00000000:01:00.0)
NVRM version: 520.56.06
CUDA version: 11.8
Device Index: 0
Device Minor: 0
Model: NVIDIA GeForce RTX 3090 Ti
Brand: GeForce
GPU UUID: GPU-ba9fdcdb-8a2b-d2b6-f69c-5f2ac08dde8b
Bus Location: 00000000:01:00.0
Architecture: 8.6
I1202 15:15:34.468151 26518 nvc.c:434] shutting down library context
I1202 15:15:34.468317 26521 rpc.c:95] terminating nvcgo rpc service
I1202 15:15:34.469397 26518 rpc.c:132] nvcgo rpc service terminated successfully
I1202 15:15:34.474156 26520 rpc.c:95] terminating driver rpc service
I1202 15:15:34.474599 26518 rpc.c:132] driver rpc service terminated successfully
- [x] Kernel version from
uname -a
- 5.15.0-53-generic #59-Ubuntu SMP Mon Oct 17 18:53:30 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
- [x] Driver information from
nvidia-smi -a
- ==============NVSMI LOG==============
Timestamp : Fri Dec 2 09:17:13 2022
Driver Version : 520.56.06
CUDA Version : 11.8
Attached GPUs : 1
GPU 00000000:01:00.0
  Product Name : NVIDIA GeForce RTX 3090 Ti
  Product Brand : GeForce
  Product Architecture : Ampere
  Display Mode : Enabled
  Display Active : Enabled
  Persistence Mode : Enabled
  MIG Mode
    Current : N/A
    Pending : N/A
  Accounting Mode : Disabled
  Accounting Mode Buffer Size : 4000
  Driver Model
    Current : N/A
    Pending : N/A
  Serial Number : N/A
  GPU UUID : GPU-ba9fdcdb-8a2b-d2b6-f69c-5f2ac08dde8b
  Minor Number : 0
  VBIOS Version : 94.02.A0.00.2D
  MultiGPU Board : No
  Board ID : 0x100
  GPU Part Number : N/A
  Module ID : 0
  Inforom Version
    Image Version : G002.0000.00.03
    OEM Object : 2.0
    ECC Object : 6.16
    Power Management Object : N/A
  GPU Operation Mode
    Current : N/A
    Pending : N/A
  GSP Firmware Version : N/A
  GPU Virtualization Mode
    Virtualization Mode : None
    Host VGPU Mode : N/A
  IBMNPU
    Relaxed Ordering Mode : N/A
  PCI
    Bus : 0x01
    Device : 0x00
    Domain : 0x0000
    Device Id : 0x220310DE
    Bus Id : 00000000:01:00.0
    Sub System Id : 0x88701043
    GPU Link Info
      PCIe Generation
        Max : 4
        Current : 1
      Link Width
        Max : 16x
        Current : 16x
    Bridge Chip
      Type : N/A
      Firmware : N/A
    Replays Since Reset : 0
    Replay Number Rollovers : 0
    Tx Throughput : 1000 KB/s
    Rx Throughput : 0 KB/s
  Fan Speed : 0 %
  Performance State : P8
  Clocks Throttle Reasons
    Idle : Active
    Applications Clocks Setting : Not Active
    SW Power Cap : Not Active
    HW Slowdown : Not Active
      HW Thermal Slowdown : Not Active
      HW Power Brake Slowdown : Not Active
    Sync Boost : Not Active
    SW Thermal Slowdown : Not Active
    Display Clock Setting : Not Active
  FB Memory Usage
    Total : 24564 MiB
    Reserved : 310 MiB
    Used : 510 MiB
    Free : 23742 MiB
  BAR1 Memory Usage
    Total : 256 MiB
    Used : 13 MiB
    Free : 243 MiB
  Compute Mode : Default
  Utilization
    Gpu : 6 %
    Memory : 5 %
    Encoder : 0 %
    Decoder : 0 %
  Encoder Stats
    Active Sessions : 0
    Average FPS : 0
    Average Latency : 0
  FBC Stats
    Active Sessions : 0
    Average FPS : 0
    Average Latency : 0
  Ecc Mode
    Current : Disabled
    Pending : Disabled
  ECC Errors
    Volatile
      SRAM Correctable : N/A
      SRAM Uncorrectable : N/A
      DRAM Correctable : N/A
      DRAM Uncorrectable : N/A
    Aggregate
      SRAM Correctable : N/A
      SRAM Uncorrectable : N/A
      DRAM Correctable : N/A
      DRAM Uncorrectable : N/A
  Retired Pages
    Single Bit ECC : N/A
    Double Bit ECC : N/A
    Pending Page Blacklist : N/A
  Remapped Rows
    Correctable Error : 0
    Uncorrectable Error : 0
    Pending : No
    Remapping Failure Occurred : No
    Bank Remap Availability Histogram
      Max : 192 bank(s)
      High : 0 bank(s)
      Partial : 0 bank(s)
      Low : 0 bank(s)
      None : 0 bank(s)
  Temperature
    GPU Current Temp : 36 C
    GPU Shutdown Temp : 97 C
    GPU Slowdown Temp : 94 C
    GPU Max Operating Temp : 92 C
    GPU Target Temperature : 83 C
    Memory Current Temp : N/A
    Memory Max Operating Temp : N/A
  Power Readings
    Power Management : Supported
    Power Draw : 32.45 W
    Power Limit : 480.00 W
    Default Power Limit : 480.00 W
    Enforced Power Limit : 480.00 W
    Min Power Limit : 100.00 W
    Max Power Limit : 516.00 W
  Clocks
    Graphics : 210 MHz
    SM : 210 MHz
    Memory : 405 MHz
    Video : 555 MHz
  Applications Clocks
    Graphics : N/A
    Memory : N/A
  Default Applications Clocks
    Graphics : N/A
    Memory : N/A
  Max Clocks
    Graphics : 2115 MHz
    SM : 2115 MHz
    Memory : 10501 MHz
    Video : 1950 MHz
  Max Customer Boost Clocks
    Graphics : N/A
  Clock Policy
    Auto Boost : N/A
    Auto Boost Default : N/A
  Voltage
    Graphics : 740.000 mV
  Processes
    GPU instance ID : N/A
    Compute instance ID : N/A
    Process ID : 2283
      Type : G
      Name : /usr/lib/xorg/Xorg
      Used GPU Memory : 259 MiB
    GPU instance ID : N/A
    Compute instance ID : N/A
    Process ID : 2441
      Type : G
      Name : /usr/bin/gnome-shell
      Used GPU Memory : 52 MiB
    GPU instance ID : N/A
    Compute instance ID : N/A
    Process ID : 3320
      Type : G
      Name : /opt/docker-desktop/Docker Desktop --type=gpu-process --enable-crashpad --enable-crash-reporter=46721d59-e3cc-4241-8f96-57bab71f8674,no_channel --user-data-dir=/home/kanaka/.config/Docker Desktop --gpu-preferences=WAAAAAAAAAAgAAAIAAAAAAAAAAAAAAAAAABgAAAAAAA4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAABAAAAAAAAAAgAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --shared-files --field-trial-handle=0,i,777493636119283380,17735576311253417080,131072 --disable-features=SpareRendererForSitePerProcess
      Used GPU Memory : 27 MiB
    GPU instance ID : N/A
    Compute instance ID : N/A
    Process ID : 4402
      Type : C+G
      Name : /opt/google/chrome/chrome --type=gpu-process --enable-crashpad --crashpad-handler-pid=4367 --enable-crash-reporter=, --change-stack-guard-on-fork=enable --gpu-preferences=WAAAAAAAAAAgAAAIAAAAAAAAAAAAAAAAAABgAAEAAAA4AAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAABAAAAAAAAAAgAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --shared-files --field-trial-handle=0,i,1352372760819385498,10632265477078674372,131072
      Used GPU Memory : 166 MiB
- [x] Docker version from
docker version
- Client: Docker Engine - Community Cloud integration: v1.0.29 Version: 20.10.21 API version: 1.41 Go version: go1.18.7 Git commit: baeda1f Built: Tue Oct 25 18:01:58 2022 OS/Arch: linux/amd64 Context: desktop-linux Experimental: true
Server: Docker Desktop 4.15.0 (93002) Engine: Version: 20.10.21 API version: 1.41 (minimum version 1.12) Go version: go1.18.7 Git commit: 3056208 Built: Tue Oct 25 18:00:19 2022 OS/Arch: linux/amd64 Experimental: false containerd: Version: 1.6.10 GitCommit: 770bd0108c32f3fb5c73ae1264f7e503fe7b2661 runc: Version: 1.1.4 GitCommit: v1.1.4-0-g5fd4c4d docker-init: Version: 0.19.0 GitCommit: de40ad0
- [x] NVIDIA packages version from
dpkg -l '*nvidia*'
or rpm -qa '*nvidia*'
- Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name  Version  Architecture  Description
un  libgldispatch0-nvidia  <none>  <none>  (no description available)
ii  libnvidia-cfg1-515:amd64  520.56.06-0lambda0.22.04.3  amd64  Transitional package for libnvidia-cfg1-520
ii  libnvidia-cfg1-520:amd64  520.56.06-0lambda0.22.04.3  amd64  NVIDIA binary OpenGL/GLX configuration library
un  libnvidia-cfg1-any  <none>  <none>  (no description available)
un  libnvidia-common  <none>  <none>  (no description available)
ii  libnvidia-common-515  520.56.06-0lambda0.22.04.3  all  Transitional package for libnvidia-common-520
ii  libnvidia-common-520  520.56.06-0lambda0.22.04.3  all  Shared files used by the NVIDIA libraries
un  libnvidia-compute  <none>  <none>  (no description available)
ii  libnvidia-compute-515:amd64  520.56.06-0lambda0.22.04.3  amd64  Transitional package for libnvidia-compute-520
ii  libnvidia-compute-515:i386  520.56.06-0lambda0.22.04.3  i386  Transitional package for libnvidia-compute-520
ii  libnvidia-compute-520:amd64  520.56.06-0lambda0.22.04.3  amd64  NVIDIA libcompute package
ii  libnvidia-compute-520:i386  520.56.06-0lambda0.22.04.3  i386  NVIDIA libcompute package
ii  libnvidia-container-tools  1.11.0+dfsg-0lambda0.22.04.1  amd64  Package for configuring containers with NVIDIA hardware (CLI tool)
ii  libnvidia-container1:amd64  1.11.0+dfsg-0lambda0.22.04.1  amd64  Package for configuring containers with NVIDIA hardware (shared library)
un  libnvidia-decode  <none>  <none>  (no description available)
ii  libnvidia-decode-515:amd64  520.56.06-0lambda0.22.04.3  amd64  Transitional package for libnvidia-decode-520
ii  libnvidia-decode-515:i386  520.56.06-0lambda0.22.04.3  i386  Transitional package for libnvidia-decode-520
ii  libnvidia-decode-520:amd64  520.56.06-0lambda0.22.04.3  amd64  NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-520:i386  520.56.06-0lambda0.22.04.3  i386  NVIDIA Video Decoding runtime libraries
ii  libnvidia-egl-wayland1:amd64  1:1.1.9-1.1  amd64  Wayland EGL External Platform library -- shared library
un  libnvidia-encode  <none>  <none>  (no description available)
ii  libnvidia-encode-515:amd64  520.56.06-0lambda0.22.04.3  amd64  Transitional package for libnvidia-encode-520
ii  libnvidia-encode-515:i386  520.56.06-0lambda0.22.04.3  i386  Transitional package for libnvidia-encode-520
ii  libnvidia-encode-520:amd64  520.56.06-0lambda0.22.04.3  amd64  NVENC Video Encoding runtime library
ii  libnvidia-encode-520:i386  520.56.06-0lambda0.22.04.3  i386  NVENC Video Encoding runtime library
un  libnvidia-encode1  <none>  <none>  (no description available)
un  libnvidia-extra  <none>  <none>  (no description available)
ii  libnvidia-extra-515:amd64  520.56.06-0lambda0.22.04.3  amd64  Transitional package for libnvidia-extra-520
ii  libnvidia-extra-520:amd64  520.56.06-0lambda0.22.04.3  amd64  Extra libraries for the NVIDIA driver
ii  libnvidia-extra-520:i386  520.56.06-0lambda0.22.04.3  i386  Extra libraries for the NVIDIA driver
un  libnvidia-fbc1  <none>  <none>  (no description available)
ii  libnvidia-fbc1-515:amd64  520.56.06-0lambda0.22.04.3  amd64  Transitional package for libnvidia-fbc1-520
ii  libnvidia-fbc1-515:i386  520.56.06-0lambda0.22.04.3  i386  Transitional package for libnvidia-fbc1-520
ii  libnvidia-fbc1-520:amd64  520.56.06-0lambda0.22.04.3  amd64  NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-520:i386  520.56.06-0lambda0.22.04.3  i386  NVIDIA OpenGL-based Framebuffer Capture runtime library
un  libnvidia-gl  <none>  <none>  (no description available)
un  libnvidia-gl-390  <none>  <none>  (no description available)
un  libnvidia-gl-410  <none>  <none>  (no description available)
un  libnvidia-gl-470  <none>  <none>  (no description available)
un  libnvidia-gl-495  <none>  <none>  (no description available)
ii  libnvidia-gl-515:amd64  520.56.06-0lambda0.22.04.3  amd64  Transitional package for libnvidia-gl-520
ii  libnvidia-gl-515:i386  520.56.06-0lambda0.22.04.3  i386  Transitional package for libnvidia-gl-520
ii  libnvidia-gl-520:amd64  520.56.06-0lambda0.22.04.3  amd64  NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-520:i386  520.56.06-0lambda0.22.04.3  i386  NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
un  libnvidia-legacy-390xx-egl-wayland1  <none>  <none>  (no description available)
un  libnvidia-ml1  <none>  <none>  (no description available)
un  nvidia-common  <none>  <none>  (no description available)
un  nvidia-compute-utils  <none>  <none>  (no description available)
ii  nvidia-compute-utils-515  520.56.06-0lambda0.22.04.3  amd64  Transitional package for nvidia-compute-utils-520
ii  nvidia-compute-utils-520  520.56.06-0lambda0.22.04.3  amd64  NVIDIA compute utilities
un  nvidia-contaienr-runtime  <none>  <none>  (no description available)
un  nvidia-container-runtime  <none>  <none>  (no description available)
un  nvidia-container-runtime-hook  <none>  <none>  (no description available)
ii  nvidia-container-toolkit  1.11.0-0lambda0.22.04.1  amd64  OCI hook for configuring containers for NVIDIA hardware
ii  nvidia-container-toolkit-base  1.11.0-0lambda0.22.04.1  amd64  OCI hook for configuring containers for NVIDIA hardware
ii  nvidia-dkms-515  520.56.06-0lambda0.22.04.3  amd64  Transitional package for nvidia-dkms-520
ii  nvidia-dkms-520  520.56.06-0lambda0.22.04.3  amd64  NVIDIA DKMS package
un  nvidia-dkms-kernel  <none>  <none>  (no description available)
un  nvidia-driver  <none>  <none>  (no description available)
ii  nvidia-driver-515  520.56.06-0lambda0.22.04.3  amd64  Transitional package for nvidia-driver-520
ii  nvidia-driver-520  520.56.06-0lambda0.22.04.3  amd64  NVIDIA driver metapackage
un  nvidia-driver-binary  <none>  <none>  (no description available)
un  nvidia-egl-wayland-common  <none>  <none>  (no description available)
un  nvidia-kernel-common  <none>  <none>  (no description available)
ii  nvidia-kernel-common-515  520.56.06-0lambda0.22.04.3  amd64  Transitional package for nvidia-kernel-common-520
ii  nvidia-kernel-common-520  520.56.06-0lambda0.22.04.3  amd64  Shared files used with the kernel module
un  nvidia-kernel-source  <none>  <none>  (no description available)
ii  nvidia-kernel-source-515  520.56.06-0lambda0.22.04.3  amd64  Transitional package for nvidia-kernel-source-520
ii  nvidia-kernel-source-520  520.56.06-0lambda0.22.04.3  amd64  NVIDIA kernel source package
un  nvidia-libopencl1-dev  <none>  <none>  (no description available)
un  nvidia-opencl-icd  <none>  <none>  (no description available)
un  nvidia-persistenced  <none>  <none>  (no description available)
ii  nvidia-prime  0.8.17.1  all  Tools to enable NVIDIA's Prime
ii  nvidia-settings  510.47.03-0ubuntu1  amd64  Tool for configuring the NVIDIA graphics driver
un  nvidia-settings-binary  <none>  <none>  (no description available)
un  nvidia-smi  <none>  <none>  (no description available)
un  nvidia-utils  <none>  <none>  (no description available)
ii  nvidia-utils-515  520.56.06-0lambda0.22.04.3  amd64  Transitional package for nvidia-utils-520
ii  nvidia-utils-520  520.56.06-0lambda0.22.04.3  amd64  NVIDIA driver support binaries
ii  xserver-xorg-video-nvidia-515  520.56.06-0lambda0.22.04.3  amd64  Transitional package for xserver-xorg-video-nvidia-520
ii  xserver-xorg-video-nvidia-520  520.56.06-0lambda0.22.04.3  amd64  NVIDIA binary Xorg driver
- [x] NVIDIA container library version from
nvidia-container-cli -V
- cli-version: 1.11.0
lib-version: 1.11.0
build date: 2022-10-25T22:10+00:00
build revision:
build compiler: x86_64-linux-gnu-gcc-11 11.3.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -Wdate-time -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -g -O2 -ffile-prefix-map=/build/libnvidia-container-956QFy/libnvidia-container-1.11.0+dfsg=. -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections -Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -flto=auto -Wl,-z,relro
About this issue
- Original URL
- State: open
- Created 2 years ago
- Reactions: 4
- Comments: 38 (4 by maintainers)
I had the same issue. For me, a reinstall of Docker fixed it.
I ran the following as a bash script:
Hi guys,
I hit the same issue on Ubuntu 22.04 LTS. I followed the instructions to reinstall as below (note that I had also installed docker-desktop initially).
And I was able to run this container from the comment above:
docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
and I was finally able to run RAPIDS:
docker run --gpus all --pull always --rm -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p 8888:8888 -p 8787:8787 -p 8786:8786 rapidsai/notebooks:23.12a-cuda11.2-py3.10
At the moment docker-desktop is uninstalled. I will try to install it again and run the tests.
@JosephKuchar try reinstalling docker - I had a similar problem, and the issue was a missing runtime (see `docker info`). The solution for me was to reinstall docker: https://github.com/NVIDIA/nvidia-docker/issues/1648#issuecomment-1785033393

The toolkit explicitly looks for `libnvidia-ml.so.1`, which should be symlinked to `libnvidia-ml.so.<DRIVER_VERSION>` after running `ldconfig` on your host. Since `nvidia-smi` works (and also uses `libnvidia-ml.so.1`), I would not expect this to be the case.

How is docker installed - could it be that it is installed as a snap and cannot load the system libraries because of this?
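One quick host-side check (a general suggestion, not a command from this thread) is to ask the dynamic linker whether the NVML library is registered at all:

```shell
# Query the host's linker cache for the NVML library. If nothing is listed,
# libnvidia-ml.so.1 is not known to ldconfig, which matches this error.
ldconfig -p | grep libnvidia-ml || echo "libnvidia-ml.so.1 not in linker cache"
```

On a healthy host this prints `libnvidia-ml.so.1` resolving into `/usr/lib/x86_64-linux-gnu`; if the file exists on disk but is missing from the cache, running `sudo ldconfig` rebuilds it.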
I actually managed to fix this. At some point we had uncommented the option root = "/run/nvidia/driver" in /etc/nvidia-container-runtime/config.toml (we must have seen directions on this somewhere). My best guess is that we later updated something on the system that made this no longer a viable option, and after a reboot everything stopped working. I commented the option back out and everything came back.
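For anyone comparing files: the stock /etc/nvidia-container-runtime/config.toml ships with that option commented out, roughly like this (abridged sketch; exact contents vary by toolkit version):

```toml
[nvidia-container-cli]
# Leave "root" commented out unless the driver really is installed under
# /run/nvidia/driver (e.g. when using the driver container); otherwise
# library lookups such as libnvidia-ml.so.1 will fail as in this issue.
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
ldconfig = "@/sbin/ldconfig.real"
```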
To find it, I created a wrapper around nvidia-container-cli:
That showed me the options being passed on a working and on a non-working system.
Not working:
Working:
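The wrapper script itself isn't preserved in this thread; a minimal sketch of the idea (hypothetical paths, assuming the real binary has been moved aside to `nvidia-container-cli.real`) might look like:

```shell
# Write a hypothetical logging wrapper to /tmp. In practice you would move
# /usr/bin/nvidia-container-cli to /usr/bin/nvidia-container-cli.real and
# install this script in its place, so the arguments of every hook
# invocation get appended to a log file before the real CLI runs.
cat > /tmp/nvidia-container-cli.wrapper <<'EOF'
#!/bin/sh
echo "$(date) nvidia-container-cli $*" >> /tmp/nvidia-container-cli.log
exec /usr/bin/nvidia-container-cli.real "$@"
EOF
chmod +x /tmp/nvidia-container-cli.wrapper
```

Comparing the logged argument lists from a working and a broken host then shows which options (such as `--root`) differ.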
UPDATE
reading through docs - for https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
this command works fine…
ok
it's basically a problem only when not using sudo…
UPDATE - FIXED. I don't know if this helps, but my installation had cudnn-local-repo-ubuntu2204-8.6.0.163_1.0-1_amd64.deb together with CUDA 11.8, which is incorrect. I was using cog, and it didn't surface the error - it just assumed everything was working. Updating to the latest cudnn, cudnn-local-repo-ubuntu2204-8.7.0.84_1.0-1_amd64.deb, resolved my original issue.
Same problem on Ubuntu 22.04:
Linux msi 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
with Docker Desktop.
can you unpack this?
The toolkit explicitly looks for libnvidia-ml.so.1, which should be symlinked to libnvidia-ml.so.<DRIVER_VERSION> after running ldconfig on your host. Since nvidia-smi works (and also uses libnvidia-ml.so.1), I would not expect this to be the case.
How is docker installed, could it be that it is installed as a snap and cannot load the system libraries because of this?
I installed it with `sudo apt-get install -y nvidia-docker2`, which succeeded: nvidia-docker2 is already the newest version (2.11.0-1).
This worked for me. Thank you so much.
Looks like this just doesn’t work with docker desktop.
When you run the script that @bkocis shared, you're installing docker-ce, most likely alongside Docker Desktop. So the `sudo` version of docker runs the CE engine, while the regular one uses your Docker Desktop engine. At least, this is what happens for me 😃
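One way to see which engine each invocation talks to (a general Docker CLI check, not a command from this thread) is to list the CLI contexts; the issue's `docker version` output above already shows `Context: desktop-linux`:

```shell
# Docker Desktop registers its own "desktop-linux" context, while the system
# docker-ce engine is normally the "default" context; the asterisk marks the
# one that plain `docker` commands will use.
docker context ls 2>/dev/null || echo "docker CLI not available"
```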
Before installing docker-ce, you’d get this error:
I have this issue unless I run as root. Using docker-desktop
It also seems to be reproducible with the PKGBUILD I created here: https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/issues/17#note_1530784413
here is my config.toml
I have the same error,
[nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1](https://github.com/NVIDIA/nvidia-container-toolkit/issues/154),
when running docker without sudo. Is there a way to get this working without sudo?
All the instructions were helpful, but I had to run docker, docker build, and docker run with root privileges to make it work. Even after repeated tries, I was unable to run with user-level permissions.