autoware: CUDA environment is broken when I run a docker container with rocker
Checklist
- I’ve read the contribution guidelines.
- I’ve searched other issues and no duplicate issues were found.
- I’m convinced that this is not my fault but a bug.
Description
When I run a Docker container built from this repository with rocker, nvidia-smi and the CUDA packages of autoware.universe don’t work.
Expected behavior
$ rocker --nvidia --x11 --user ghcr.io/autowarefoundation/autoware-universe:humble-latest nvidia-smi
returns the same result as
docker run --rm -it --gpus all -e DISPLAY -e TERM -e QT_X11_NO_MITSHM=1 -v /tmp/.X11-unix:/tmp/.X11-unix -v /etc/localtime:/etc/localtime:ro ghcr.io/autowarefoundation/autoware-universe:humble-latest
Actual behavior
$ rocker --nvidia --x11 --user --volume $PWD:$HOME/autoware -- ghcr.io/autowarefoundation/autoware-universe:humble-latest nvidia-smi
...
bash: nvidia-smi: command not found
or
# in the docker container
$ ros2 launch lidar_centerpoint lidar_centerpoint.launch.xml
[INFO] [launch]: All log files can be found below /home/yusuke/.ros/log/2022-05-27-18-24-13-889706-yusuke-desktop-14888
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [lidar_centerpoint_node-1]: process started with pid [14889]
[lidar_centerpoint_node-1] terminate called after throwing an instance of 'thrust::system::detail::bad_alloc'
[lidar_centerpoint_node-1] what(): std::bad_alloc: cudaErrorInsufficientDriver: CUDA driver version is insufficient for CUDA runtime version
[ERROR] [lidar_centerpoint_node-1]: process has died [pid 14889, exit code -6, cmd '/home/yusuke/autoware/install/lidar_centerpoint/lib/lidar_centerpoint/lidar_centerpoint_node --ros-args -r __node:=lidar_centerpoint --params-file /tmp/launch_params__7bviznb --params-file /tmp/launch_params_xhbpo0pj --params-file /tmp/launch_params_3cva1bu7 --params-file /tmp/launch_params_591wukuf --params-file /tmp/launch_params_binyjp_2 --params-file /tmp/launch_params_d_ubfmz8 --params-file /tmp/launch_params_3ciwtkcg --params-file /tmp/launch_params_cy1qmkld --params-file /home/yusuke/autoware/install/lidar_centerpoint/share/lidar_centerpoint/config/default.param.yaml -r ~/input/pointcloud:=/sensing/lidar/pointcloud -r ~/output/objects:=objects'].
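Both symptoms suggest that the NVIDIA driver utilities and libraries are not visible inside the container. A minimal sketch for checking this from a shell in the container follows. The helper name `check_tool` is hypothetical and not part of Autoware or rocker. `NVIDIA_VISIBLE_DEVICES` and `NVIDIA_DRIVER_CAPABILITIES` are the environment variables the NVIDIA container runtime consults. Whether they are related to this particular failure is an assumption to verify, not a confirmed diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is visible on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# nvidia-smi is normally made available by the NVIDIA container runtime.
check_tool nvidia-smi || echo "GPU driver utilities are not visible in this container"

# Print the variables the NVIDIA container runtime reads (assumption: they may be unset here).
echo "NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES:-<unset>}"
echo "NVIDIA_DRIVER_CAPABILITIES=${NVIDIA_DRIVER_CAPABILITIES:-<unset>}"
```

Running the same script in a container started with rocker and in one started with `docker run --gpus all` should make any difference between the two environments easy to spot.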
Steps to reproduce
- build a docker image in docker directory
- run a docker container with rocker
- nvidia-smi
Versions
No response
Possible causes
No response
Additional context
No response
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 20 (16 by maintainers)
Sent a PR: https://github.com/osrf/rocker/pull/182
If it’s not accepted, I’ll add the following block in the Dockerfile.
@kenji-miyake I’m sorry for the late reply, and thank you for identifying the cause of this issue! I will close this issue after creating a PR to add your suggestion to the documentation.
@angry-crab humble-latest doesn’t contain CUDA now. Please use humble-latest-cuda instead. https://github.com/autowarefoundation/autoware/pkgs/container/autoware-universe/26944787?tag=humble-latest-cuda
Yes indeed, my latest is older than galactic-latest: