podman: oci-nvidia-hook not working as expected with podman command
Is this a BUG REPORT or FEATURE REQUEST?:
Uncomment only one, leave it on its own line:
/kind bug
/kind feature
Description
Steps to reproduce the issue:
Set up the hook as per the Openshift guide: https://blog.openshift.com/use-gpus-with-device-plugin-in-openshift-3-9/
I have also tried the 1.0.0 Hook Schema Nvidia example documented in this repo for the oci-nvidia-hook.json file.
Run the test container suggested in the Openshift guide with podman:
sudo podman run -it --rm docker.io/mirrorgooglecontainers/cuda-vector-add:v0.1
Describe the results you received: None of the Nvidia or CUDA tools are mounted and the test fails:
Failed to allocate device vector A (error code CUDA driver version is insufficient for CUDA runtime version)!
Describe the results you expected: For the vector-add test to pass, as it does when using the Fedora docker command.
Additional information you deem important (e.g. issue happens only occasionally): A workaround is to use the nvidia-container-runtime with the podman --runtime option, but I would rather use the native runc with the Nvidia hook if possible.
Output of podman version:
Version: 0.6.1-dev
Go Version: go1.10.2
OS/Arch: linux/amd64
Output of podman info:
host:
  MemFree: 181026816
  MemTotal: 8314044416
  SwapFree: 8455450624
  SwapTotal: 8455712768
  arch: amd64
  cpus: 8
  hostname: kfworkstation
  kernel: 4.16.12-200.fc27.x86_64
  os: linux
  uptime: 4h 29m 19.44s (Approximately 0.17 days)
insecure registries:
  registries: []
registries:
  registries:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
store:
  ContainerStore:
    number: 1
  GraphDriverName: overlay
  GraphOptions:
  - overlay.override_kernel_check=true
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
  ImageStore:
    number: 13
  RunRoot: /var/run/containers/storage
Additional environment details (AWS, VirtualBox, physical, etc.): Running on physical hardware with Quadro 1000M.
About this issue
- State: closed
- Created 6 years ago
- Comments: 20 (11 by maintainers)
Commits related to this issue
- hooks/docs: Fix 1.0.0 Nvidia example (adding version, etc.) Reported by Gary Edwards [1]. Both typos are originally from 68eb128f (pkg/hooks: Version the hook structure and add 1.0.0 hooks, 2018-04-... — committed to wking/libpod by wking 6 years ago
- hooks/docs: Fix 1.0.0 Nvidia example (adding version, etc.) Reported by Gary Edwards [1]. Both typos are originally from 68eb128f (pkg/hooks: Version the hook structure and add 1.0.0 hooks, 2018-04-... — committed to containers/podman by wking 6 years ago
The reason for the hanging seemed to be the omission of the version information.
The command now runs, but the hook does not seem to be triggered, as none of the Nvidia tools are mounted.
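For anyone comparing their own file, here is a minimal sketch of what a 1.0.0-style hook file with the version field might look like, plus a quick check that the field is present. This is not the exact file from this report: the hook binary path, the args, and the "always" condition are assumptions for illustration, and the file is written to a scratch directory rather than the real hooks directory.

```shell
# Write a sketch hook config to a scratch directory (not the real hooks.d path).
hook_dir="$(mktemp -d)"
hook_file="$hook_dir/oci-nvidia-hook.json"

# Assumed layout of a 1.0.0 hook file; the binary path and args are illustrative.
cat > "$hook_file" <<'EOF'
{
  "version": "1.0.0",
  "hook": {
    "path": "/usr/bin/nvidia-container-runtime-hook",
    "args": ["nvidia-container-runtime-hook", "prestart"]
  },
  "when": {
    "always": true
  },
  "stages": ["prestart"]
}
EOF

# Report whether the "version" key is present with the expected value.
if grep -q '"version"[[:space:]]*:[[:space:]]*"1.0.0"' "$hook_file"; then
  version_check="present"
else
  version_check="missing"
fi
echo "version field: $version_check"
```

The grep is only a rough text match, not a schema validation, so it can confirm that the version line exists but not that the rest of the file is what podman expects.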