nvidia-container-toolkit: v1.13.5 - cdi generate fails to find matching libraries for driver version
Hi,
With version v1.12.0, this command used to work fine and generate the CDI config:
# nvidia-ctk cdi generate --nvidia-ctk-path "/snap/${SNAP_NAME}/current/usr/bin/nvidia-ctk"
After switching to v1.13.5, it fails with:
# nvidia-ctk cdi generate --nvidia-ctk-path "/snap/${SNAP_NAME}/current/usr/bin/nvidia-ctk"
INFO[0000] Auto-detected mode as "nvml"
INFO[0000] Selecting /dev/nvidia0 as /dev/nvidia0
INFO[0000] Selecting /dev/dri/card1 as /dev/dri/card1
WARN[0000] Could not locate /dev/dri/controlD65: pattern /dev/dri/controlD65 not found
INFO[0000] Selecting /dev/dri/renderD128 as /dev/dri/renderD128
INFO[0000] Using driver version 515.105.01
ERRO[0000] failed to generate CDI spec: failed to create edits common for entities: failed to create discoverer for common entities: failed to create discoverer for driver files: failed to create discoverer for driver libraries: failed to get libraries for driver version: failed to locate libcuda.so.515.105.01: pattern libcuda.so.515.105.01 not found
With both versions, the same LD_LIBRARY_PATH is used, which contains the correct path to a folder containing libcuda.so.515.105.01.
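A quick way to double-check that claim from inside the snap environment is to walk the LD_LIBRARY_PATH entries and look for the file by name. This is only a minimal sketch: the helper name `find_in_ld_library_path` is hypothetical, and it only confirms that the file is visible on disk at one of those paths. It says nothing about how nvidia-ctk itself searches for driver libraries.

```shell
# Hypothetical helper (not part of nvidia-ctk): print every LD_LIBRARY_PATH
# entry that contains the named library file. Exit status is 0 if at least
# one match was found, 1 otherwise.
# The body runs in a subshell so the IFS and globbing changes do not leak
# into the calling shell.
find_in_ld_library_path() (
    lib_name="$1"
    status=1
    IFS=':'
    set -f  # disable globbing while the path list is being split
    for dir in ${LD_LIBRARY_PATH:-}; do
        # Skip empty entries (e.g. from a leading or doubled ':').
        [ -n "$dir" ] || continue
        if [ -e "$dir/$lib_name" ]; then
            printf '%s\n' "$dir/$lib_name"
            status=0
        fi
    done
    exit "$status"
)

# Usage: the file name is taken from the error message in the log above.
find_in_ld_library_path "libcuda.so.515.105.01" \
    || echo "libcuda.so.515.105.01 not found in LD_LIBRARY_PATH" >&2
```

If this prints a path while `nvidia-ctk cdi generate` still reports "pattern ... not found", the file is present and the difference is more likely in where the tool looks than in the environment.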
Full disclosure: this is running within a snap on Ubuntu Core 22, which I'm sure is not officially supported, but it did work well with v1.12.0.
Did anything much change in v1.13 around library discovery?
Any guidance on troubleshooting, or other help, would be appreciated.
About this issue
- State: open
- Created a year ago
- Comments: 42 (16 by maintainers)
@jocado there are no concrete dates yet, but we will most likely have to release by the end of January due to some other features that are required.