onnxruntime: Inconsistency detected by ld.so: dl-version.c: 224: _dl_check_map_versions: Assertion `needed != NULL' failed!
Describe the bug I use onnxruntime-gpu to run inference on my own ONNX model. It works well when the input data is on the CPU device, but an error is thrown when the data is on the GPU device.
It works well by this code:
ortvalue = onnxruntime.OrtValue.ortvalue_from_numpy(img.numpy())
It will fail by this:
ortvalue = onnxruntime.OrtValue.ortvalue_from_numpy(img_lq.numpy(), device_type="cuda", device_id=0)
The error is:
Inconsistency detected by ld.so: dl-version.c: 224: _dl_check_map_versions: Assertion `needed != NULL' failed!
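For context, here is a minimal sketch of the GPU code path in question (not the exact script; the model path and input array are placeholders, and input/output names are taken from the session rather than hard-coded):
import numpy as np
import onnxruntime

# Create a session with the CUDA execution provider, falling back to CPU.
sess = onnxruntime.InferenceSession(
    "model.onnx", providers=["CUDAExecutionProvider", "CPUExecutionProvider"]
)

# Stand-in for img_lq.numpy(); shape and dtype must match the model's input.
img = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Allocate the input directly on GPU 0; this is the call that triggers the ld.so assertion.
ortvalue = onnxruntime.OrtValue.ortvalue_from_numpy(img, device_type="cuda", device_id=0)

# Bind the GPU-resident input and let ONNX Runtime allocate the output on the same device.
io_binding = sess.io_binding()
io_binding.bind_ortvalue_input(sess.get_inputs()[0].name, ortvalue)
io_binding.bind_output(sess.get_outputs()[0].name, device_type="cuda", device_id=0)

sess.run_with_iobinding(io_binding)
result = io_binding.copy_outputs_to_cpu()[0]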
Urgency If there are particularly important use cases blocked by this or strict project-related timelines, please share more information and dates. If there are no hard deadlines, please specify none.
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
- ONNX Runtime installed from (source or binary): pypi
- ONNX Runtime version: onnxruntime-gpu 1.9.0
- Python version: Python 3.6.13
- Visual Studio version (if applicable):
- GCC/Compiler version (if compiling from source): GCC 7.3.0
- CUDA/cuDNN version: cudatoolkit 10.1.243
- GPU model and memory: RTX 2080 Ti
To Reproduce
- Describe steps/code to reproduce the behavior.
- Attach the ONNX model to the issue (where applicable) to expedite investigation.
Expected behavior Any help with using the GPU package to run inference on an ONNX model?
Screenshots

About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 19 (8 by maintainers)
Commits related to this issue
- Avoid calling patchelf (#17365) ### Description Resolve #9754 — committed to microsoft/onnxruntime by snnn 10 months ago
- Cherry-picks pipeline changes to 1.16.0 release branch (#17577) ### Description 1. Delete Prefast tasks (#17522) 2. Disable yum update (#17551) 3. Avoid calling patchelf (#17365 and #17562) we tha... — committed to microsoft/onnxruntime by snnn 9 months ago
- Avoid calling patchelf (#17365) ### Description Resolve #9754 — committed to kleiti/onnxruntime by snnn 10 months ago
My apologies for the ramble; desperation tends to do that. I have since resolved the issue on my own. It turns out that PyInstaller was not including all of the necessary CUDA libraries. Including them manually allowed onnxruntime to start up (and then crash when it couldn't find the cudnn_*_infer libraries, but that error was transparent). I will say, though, that it is incredibly frustrating to have spent this much time on what ended up being a fairly simple issue. I understand that the import hack is done to avoid the eyes of the auditor, but the same hack made it much more difficult to realize that it was just a matter of a missing dependency. I know I'm barking up the wrong tree here, since I could simply not have used Python, but this type of error would have surfaced much sooner in the pipeline, and likely with a more useful error message, in any compiled language.
In my specific case, after looking at the ONNX Runtime requirements again, I noticed that I might be missing cuDNN. I tried installing libcudnn8 and libcudnn8-dev, and I was able to run my code successfully. Not quite sure if libcudnn8-dev was necessary or if libcudnn8 alone would have been sufficient.
If you are running with TensorrtExecutionProvider, reinstalling the libnvinfer libraries solved the issue in my case: run apt-cache policy libnvinfer8 to check the available versions. For cuda-11.4 and libnvinfer 8.2.5.1:
For cuda-11.6 and libnvinfer 8.4.3 (tested also with cuda-11.8):
Hope it helps.
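One way to see which of these GPU libraries the dynamic loader can actually resolve, before importing onnxruntime at all, is a small probe along these lines (a sketch; the sonames listed are examples and should be adjusted to the CUDA/cuDNN/TensorRT versions your build expects):
import ctypes

# Try to dlopen each runtime dependency and report which ones are missing.
for lib in ("libcudart.so.11.0", "libcublas.so.11", "libcudnn.so.8", "libnvinfer.so.8"):
    try:
        ctypes.CDLL(lib)
        print(f"OK      {lib}")
    except OSError as exc:
        print(f"MISSING {lib}: {exc}")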
First, the onnxruntime Python packages, “onnxruntime” and “onnxruntime-gpu”, follow the manylinux2014 (PEP 599) standard. But the GPU one, onnxruntime-gpu, isn’t fully compliant.
The PEP 599 policy says: “The wheel’s binary executables or shared objects may not link against externally-provided libraries except those in the following list”
But we need CUDA, and CUDA isn’t in that list. BTW, if you run ldd against the CPU-only package, “onnxruntime”, you won’t see the error.
The policy was designed so that any external dependency is bundled into the wheel file. However, we can’t do that with the CUDA libraries.
So we did a dirty hack: before packing the wheel, we patch the .so file to pretend it doesn’t depend on CUDA, to get past manylinux’s auditwheel tool. Then we pack the wheel and manually load the CUDA libraries at runtime. The error message you saw is caused by the tool we use for patching the *.so files: patchelf. If we didn’t use that tool, we wouldn’t have this issue.
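As a rough illustration only (this is not onnxruntime's actual loader code, and the sonames are placeholders), the "manually load the CUDA libraries" half of that hack boils down to dlopen-ing the hidden dependencies with RTLD_GLOBAL before the patched extension module is imported:
import ctypes

# Preload the CUDA libraries into the process with RTLD_GLOBAL so that the
# patched native module, whose NEEDED entries were stripped, can still resolve
# its CUDA symbols when it is loaded afterwards.
for lib in ("libcudart.so.11.0", "libcublas.so.11", "libcudnn.so.8"):
    try:
        ctypes.CDLL(lib, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        # In this sketch a missing library is silently skipped; the failure
        # would then only surface later, when the native module needs the symbol.
        pass

# import onnxruntime  # the patched .so would be imported after the preload step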
Alternatively, we could modify the policy: patch the auditwheel tool and add a custom policy file that whitelists the CUDA libraries. The file is: https://github.com/pypa/auditwheel/blob/main/src/auditwheel/policy/manylinux-policy.json. See #144 for more information.
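For illustration, such a whitelist tweak could be scripted roughly like this (a sketch that assumes the policy file is a JSON list of entries with "name" and "lib_whitelist" keys; the path and CUDA sonames are placeholders):
import json

# Placeholder path to a local checkout of auditwheel's policy file.
policy_path = "src/auditwheel/policy/manylinux-policy.json"

with open(policy_path) as f:
    policies = json.load(f)

# Append CUDA sonames to the manylinux2014 whitelist so auditwheel stops
# flagging them as disallowed external dependencies.
for policy in policies:
    if policy.get("name") == "manylinux2014":
        policy["lib_whitelist"].extend(["libcudart.so.11.0", "libcublas.so.11"])

with open(policy_path, "w") as f:
    json.dump(policies, f, indent=2)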
(Hi @adk9, the above answer applies only to the onnxruntime inference packages. The onnxruntime-training package is built in a special way that I’m not familiar with.)
Stumbled across this issue. FWIW, newer auditwheel has an option to exclude shared objects that will be provided by some other means. This is the PR that added the --exclude option, pypa/auditwheel#368, specifically for the use case described here.
I’m also running into the same issue with the onnxruntime-training (1.9.0) and onnxruntime-gpu (1.9.0) wheels installed from PyPI, trying to train a simple model using the CUDA EP. @snnn, the suggested fix above is not clear to me; can you elaborate on it?