Vitis-AI: Training with Caffe on Vitis-AI GPU fails
Hi,
I am getting the following error while training cf_refinedet_coco_360_480_0.96_5.08G_2.0:
(vitis-ai-caffe) Vitis-AI /workspace/models/AI-Model-Zoo/cf_refinedet_coco_360_480_0.96_5.08G_2.0/code/train > bash train.sh
../../../caffe-xilinx/build/tools/caffe.bin does not exist, try use path in pre-build docker
F0303 10:14:08.370003 394 gpu_memory.cpp:171] Check failed: error == cudaSuccess (10 vs. 0) invalid device ordinal
*** Check failure stack trace: ***
@ 0x7ff0e4aaf4dd google::LogMessage::Fail()
@ 0x7ff0e4ab7071 google::LogMessage::SendToLog()
@ 0x7ff0e4aaeecd google::LogMessage::Flush()
@ 0x7ff0e4ab076a google::LogMessageFatal::~LogMessageFatal()
@ 0x7ff0e3760145 caffe::GPUMemory::Manager::update_dev_info()
@ 0x7ff0e37606bf caffe::GPUMemory::Manager::init()
@ 0x55a72c9920ed train()
@ 0x55a72c98ba59 main
@ 0x7ff0e1ceac87 __libc_start_main
@ 0x55a72c98c6a8 (unknown)
train.sh: line 37: 394 Aborted (core dumped) $exec_path "$@"
Here is the output of nvidia-smi:
mhanuel@mhanuel-MSI:~$ nvidia-smi
Thu Mar 3 10:15:15 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 On | N/A |
| 0% 36C P8 24W / 170W | 386MiB / 12288MiB | 1% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 7372 G /usr/lib/xorg/Xorg 35MiB |
| 0 N/A N/A 7851 G /usr/lib/xorg/Xorg 235MiB |
| 0 N/A N/A 7976 G /usr/bin/gnome-shell 40MiB |
| 0 N/A N/A 8471 G ...520405909793494209,131072 23MiB |
| 0 N/A N/A 180023 G ...AAAAAAAAA= --shared-files 39MiB |
+-----------------------------------------------------------------------------+
What could I be missing?
About this issue
- State: closed
- Created 2 years ago
- Comments: 21 (8 by maintainers)
Hi @mhanuel26,
I noticed that you are using a GeForce RTX 3060. The RTX 3060 uses the Ampere architecture, which requires at least CUDA 11.0. Unfortunately, Caffe can only be built with CUDA 10.0 and is not compatible with CUDA 11.0. Is there a chance that you can try another NVIDIA GPU?
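The compatibility reasoning above can be sketched as a small lookup. This is only an illustration, not part of Vitis-AI or Caffe: the table `MIN_CUDA_FOR_CAPABILITY` and the function `toolkit_supports_gpu` are hypothetical names, and the version pairs are filled in from NVIDIA's published compute capability and toolkit support information as best recalled, so treat them as assumptions to verify. The RTX 3060 is assumed to report compute capability 8.6. Note also that the "CUDA Version: 11.6" shown by nvidia-smi is the highest version the host driver supports, not the toolkit that the Caffe binary inside the container was built with.

```python
# Hypothetical lookup: minimum CUDA toolkit release assumed to target each
# GPU generation natively. Keys are (major, minor) compute capabilities,
# values are (major, minor) CUDA toolkit versions. Verify against NVIDIA docs.
MIN_CUDA_FOR_CAPABILITY = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere (data center parts)
    (8, 6): (11, 1),  # Ampere (GeForce RTX 30 series, assumed for RTX 3060)
}


def toolkit_supports_gpu(compute_capability, cuda_version):
    """Return True if a toolkit of cuda_version can target the given GPU.

    Both arguments are (major, minor) tuples, so plain tuple comparison
    orders them correctly.
    """
    try:
        required = MIN_CUDA_FOR_CAPABILITY[compute_capability]
    except KeyError:
        raise ValueError(f"unknown compute capability: {compute_capability}")
    return cuda_version >= required


if __name__ == "__main__":
    rtx_3060 = (8, 6)
    # Caffe built against CUDA 10.0, as described in the reply above.
    print(toolkit_supports_gpu(rtx_3060, (10, 0)))
    # A CUDA 11.x toolkit, for comparison.
    print(toolkit_supports_gpu(rtx_3060, (11, 6)))
```

Under these assumptions the first check is false and the second is true, which matches the reply: a CUDA 10.0 build cannot target an Ampere card no matter how recent the host driver is.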