onnxruntime: Always getting "Failed to create CUDAExecutionProvider"

Describe the bug

When I try to create an InferenceSession in Python with providers=['CUDAExecutionProvider'], I get the warning:

2022-04-01 22:45:36.716353289 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:535 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.

And my CPU usage shoots up, while my GPU usage stays at 0.

Urgency

Not urgent.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04
  • ONNX Runtime installed from (source or binary): binary
  • ONNX Runtime version: 1.10.0
  • Python version: 3.7.13
  • Visual Studio version (if applicable):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version: CUDA 11.4 / cuDNN 8.2.4
  • GPU model and memory: RTX 3090 24GB

To Reproduce

Download and run any model, e.g. here’s one from PyTorch’s ONNX example:

super_resolution.zip

pip install onnxruntime-gpu==1.10.0

import onnxruntime as ort
import numpy as np

# Create a session on the CUDA provider and run one inference on random input.
sess = ort.InferenceSession('../../Downloads/super_resolution.onnx', providers=['CUDAExecutionProvider'])
sess.run(None, {
    sess.get_inputs()[0].name: np.random.rand(1, 1, 224, 224).astype(np.float32)
})
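(Side note, not part of the repro itself: a quick way to get a more detailed failure reason is to enable verbose ORT logging before creating the session. This is just a debugging sketch:)

import onnxruntime as ort

# Severity 0 = verbose; the provider-creation messages then usually include
# the underlying load error (for example a missing libcudnn/libcublas library).
ort.set_default_logger_severity(0)

sess = ort.InferenceSession(
    '../../Downloads/super_resolution.onnx',
    providers=['CUDAExecutionProvider'],
)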

Expected behavior

It should work, or at least print out a more informative warning.

Screenshots

Additional context

About this issue

  • Original URL
  • State: open
  • Created 2 years ago
  • Reactions: 8
  • Comments: 36 (13 by maintainers)

Most upvoted comments

I hate that this fixed it, but yes, importing torch before onnxruntime fixed it for me as well.

I have been facing the same problem for 3 days and couldn't find any solution. People have asked many questions and opened threads about similar issues, but unfortunately there has been no answer from the Microsoft team. Am I wrong?

For anyone who may be stuck on the same problem, here is a record of everything I did over a week to bring onnxruntime-gpu up in a container on a Debian system:

Install the NVIDIA driver on Debian from NVIDIA, following the instructions at https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#network-repo-installation-for-debian for your system. The cuda-toolkit or nvidia-cuda package is not necessary; only add the NVIDIA apt source and install nvidia-driver:

apt update && apt install nvidia-driver

Check the driver status with nvidia-smi. It should look like:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+

The driver version and CUDA version should both be non-empty. (I tried the nvidia-tesla-470-driver from the Debian apt source first, but it did not seem to work: the CUDA version showed up empty and ORT threw exceptions complaining that the driver version was too low.)

Install nvidia-container-toolkit following the instructions at https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html. After that I could bring up a python3.9 Docker image with this docker-compose config:

version: "2.2"
services:
  modelsrv:
    image: python:3.9.17-slim-bullseye
    container_name: modelsrv
    command: sleep 100000d
    restart: always
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

The configuration is from https://docs.docker.com/compose/gpu-support/.

Inside the container, the command nvidia-smi should output the same info as on the base system.

In the container, install the pip packages below:
pip install nvidia-pyindex
pip install onnxruntime-gpu==1.15.1 nvidia-cublas==11.5.1.101 nvidia-cublas-cu117==11.10.1.25 nvidia-cuda-runtime-cu114==11.4.148 nvidia-cudnn==8.2.0.51 nvidia-cudnn-cu11==8.5.0.96 nvidia-cufft==10.4.2.58 nvidia-curand==10.2.4.58

Then export LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=/usr/local/lib/python3.9/site-packages/nvidia/cublas/lib/:/usr/local/lib/python3.9/site-packages/nvidia/cuda_runtime/lib/:/usr/local/lib/python3.9/site-packages/nvidia/cudnn/lib/:/usr/local/lib/python3.9/site-packages/nvidia/cufft/lib/:/usr/local/lib/python3.9/site-packages/nvidia/curand/lib/

And it works fine now.
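To avoid typing those site-packages paths by hand, a small helper like the following can print them (my own sketch, not part of the steps above; it just lists the lib directories of whatever nvidia-* wheels are installed):

import glob
import os
import sysconfig

# Print the lib/ directory of every installed nvidia-* pip package, joined
# with ':' so it can be exported directly as LD_LIBRARY_PATH.
site_packages = sysconfig.get_paths()["purelib"]
lib_dirs = sorted(glob.glob(os.path.join(site_packages, "nvidia", "*", "lib")))
print(":".join(lib_dirs))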

For TensorRT, just install tensorrt and add tensorrt_libs to LD_LIBRARY_PATH:

pip install tensorrt
export LD_LIBRARY_PATH=/usr/local/lib/python3.9/site-packages/nvidia/cublas/lib/:/usr/local/lib/python3.9/site-packages/nvidia/cuda_runtime/lib/:/usr/local/lib/python3.9/site-packages/nvidia/cudnn/lib/:/usr/local/lib/python3.9/site-packages/nvidia/cufft/lib/:/usr/local/lib/python3.9/site-packages/nvidia/curand/lib/:/usr/local/lib/python3.9/site-packages/tensorrt_libs/
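Once the environment is set up, a quick sanity check (my own addition, not part of the write-up above) is to confirm that ORT both advertises the CUDA provider and actually uses it for a session:

import onnxruntime as ort

# Providers this onnxruntime-gpu build can load in the current environment.
print(ort.get_available_providers())

# Providers the session actually selected; if the CUDA provider failed to
# initialize, this quietly falls back to ['CPUExecutionProvider'].
sess = ort.InferenceSession(
    "model.onnx",  # replace with a real model path
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())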

I ran into this as well and had to look through a few different files before I found the imports at fault. Is the import order documented anywhere?

I’m now importing ORT through a module that also imports torch:

# this file exists to make sure torch is always imported before onnxruntime
# to work around https://github.com/microsoft/onnxruntime/issues/11092

import torch  # NOQA
from onnxruntime import *  # NOQA
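Assuming the file is saved as, say, ort_compat.py (the name here is arbitrary), usage elsewhere then looks like:

# Usage sketch; 'ort_compat' is whatever you named the wrapper module above.
import ort_compat as ort

sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])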

CUDA 12 is not fully supported. We noted build compatibility in the 1.14 release notes; however, it was partial (Linux only), as we found issues with Windows builds later.


Read this if you want to know why this issue happens:

The problem is that ONNX Runtime doesn't know how to search for CUDA on its own. PyTorch knows how to search for it and adds it to Python's internal search path, so that ONNX Runtime can later find it.

The bug/issue is with the ONNX Runtime library. I have coded a workaround here:

https://github.com/cubiq/ComfyUI_IPAdapter_plus/issues/238

@avinash-218, you can try running a Python script like the following on your machine:

import onnxruntime
import torch
import psutil
import os

# Create a session with the CUDA provider (CPU as fallback).
session = onnxruntime.InferenceSession(
    "model.onnx", providers=['CUDAExecutionProvider', 'CPUExecutionProvider']
)

# Print every shared library mapped into this process, to see which
# CUDA/cuDNN libraries were actually loaded and from where.
p = psutil.Process(os.getpid())
for lib in p.memory_maps():
    print(lib.path)

Then change the order of first two lines, and run again.
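If the full listing is too noisy, a filtered variant (my addition, same idea as the tail of the script above) prints only the CUDA-related entries, which are the paths that should change between the two import orders:

import os
import psutil

# Keep only the CUDA/cuDNN-related shared libraries mapped into this process.
p = psutil.Process(os.getpid())
for lib in p.memory_maps():
    path = lib.path.lower()
    if any(key in path for key in ("cuda", "cudnn", "cublas", "cufft", "curand")):
        print(lib.path)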

On Windows, I tested torch 2.0.1 and ORT 1.16.1, and it seems that import order does not matter. In both cases, the CUDA and cuDNN libraries from torch will be used. It is likely that torch loads those DLLs during import, while ORT delays loading CUDA and cuDNN until they are used.

On Linux, the result might be different, so you will need to try it yourself.

Importing torch is NOT necessary UNLESS you have not installed the required CUDA libraries and set up the environment correctly.

The Python torch GPU package directly includes all the CUDA libraries, which are very large; hence the size of that package is over 2.5 GB. When you import torch, it updates the environment the script is running in to point to the CUDA libraries included in that package.

Downloading https://download.pytorch.org/whl/cu118/torch-2.1.0%2Bcu118-cp311-cp311-win_amd64.whl (2722.7 MB)

For example, on Windows, the torch package's lib directory contains a number of CUDA-related libraries (screenshot omitted).
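To see the equivalent on your own install, something like this works (a sketch of mine; the exact file names differ between platforms and torch builds):

import os
import torch

# List the CUDA-related libraries bundled inside the installed torch package.
torch_lib = os.path.join(os.path.dirname(torch.__file__), "lib")
for name in sorted(os.listdir(torch_lib)):
    if any(key in name.lower() for key in ("cuda", "cudnn", "cublas", "cufft", "curand")):
        print(os.path.join(torch_lib, name))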

The ORT Python package does not include the CUDA libraries, which is more efficient and flexible, but it requires the user to install the required CUDA libraries and set up the environment correctly so that they can be found. Note that you need both CUDA and cuDNN to be installed.
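A rough way to check whether the dynamic loader can find them (my own sketch; the library names below assume CUDA 11.x / cuDNN 8.x and should be adjusted for your versions):

import ctypes

# Try to dlopen the CUDA/cuDNN libraries the onnxruntime-gpu CUDA provider
# needs; an OSError here means the library is not on the loader's search path.
for soname in ("libcublas.so.11", "libcudnn.so.8", "libcurand.so.10", "libcufft.so.10"):
    try:
        ctypes.CDLL(soname)
        print(f"found   {soname}")
    except OSError as err:
        print(f"MISSING {soname}: {err}")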

I worked with the NVIDIA NGC Docker container nvcr.io/nvidia/pytorch:23.01-py3, which has the whole environment installed out of the box, but it still failed.

I'd suggest running the script to see which libraries are not being loaded as expected. The locations of those libraries should be listed in the output of cat /etc/ld.so.conf.d/* (at least on Ubuntu; I'm not sure whether other Linux distributions differ).

There's also a known issue listed at the bottom of this page that mentions having to set LD_LIBRARY_PATH manually to resolve a problem with cuDNN. Not sure if that is a factor here.

ORT doesn’t officially support CUDA 12 yet either, so it may be better to try the 22.12 version of the container.

@fijipants, can you share the final working Dockerfile? I have been struggling with this problem for 2 days.

I was able to get it working inside a Docker container, nvidia/cuda:11.4.3-cudnn8-devel-ubuntu20.04.

Then I tried it again locally, and it worked! The only thing I did locally was apt install nvidia-container-toolkit after adding the sources.

Should that be added as a requirement?