tensorflow: Profiling: libcupti.so cannot be loaded

Environment info

Operating System: Linux Ubuntu 14.04 LTS (64bit)

Installed version of CUDA and cuDNN: CUDA 7.5.18 and CUDNN 4.0.7 (please attach the output of ls -l /path/to/cuda/lib/libcud*):

❯❯❯ ls -l /usr/local/cuda-7.5/lib/libcud*
-rw----r-- 1 root root 189170 Feb 24 18:12 /usr/local/cuda-7.5/lib/libcudadevrt.a
lrwxrwxrwx 1 root root     16 Feb 24 18:12 /usr/local/cuda-7.5/lib/libcudart.so -> libcudart.so.7.5
lrwxrwxrwx 1 root root     19 Feb 24 18:12 /usr/local/cuda-7.5/lib/libcudart.so.7.5 -> libcudart.so.7.5.18
-rwx---r-x 1 root root 311596 Feb 24 18:12 /usr/local/cuda-7.5/lib/libcudart.so.7.5.18
-rw----r-- 1 root root 558020 Feb 24 18:12 /usr/local/cuda-7.5/lib/libcudart_static.a
❯❯❯ ls -l /usr/local/cuda-7.5/lib64/libcud*
-rw----r-- 1 root root   322936 Feb 24 18:12 /usr/local/cuda-7.5/lib64/libcudadevrt.a
lrwxrwxrwx 1 root root       16 Feb 24 18:12 /usr/local/cuda-7.5/lib64/libcudart.so -> libcudart.so.7.5
lrwxrwxrwx 1 root root       19 Feb 24 18:12 /usr/local/cuda-7.5/lib64/libcudart.so.7.5 -> libcudart.so.7.5.18
-rwx---r-x 1 root root   383336 Feb 24 18:12 /usr/local/cuda-7.5/lib64/libcudart.so.7.5.18
-rw----r-- 1 root root   720192 Feb 24 18:12 /usr/local/cuda-7.5/lib64/libcudart_static.a
lrwxrwxrwx 1 root root       13 Mar  3 03:30 /usr/local/cuda-7.5/lib64/libcudnn.so -> libcudnn.so.4
lrwxrwxrwx 1 root root       17 Mar  3 03:30 /usr/local/cuda-7.5/lib64/libcudnn.so.4 -> libcudnn.so.4.0.7
-rwxr-xr-x 1 root root 61453024 Mar  3 03:30 /usr/local/cuda-7.5/lib64/libcudnn.so.4.0.7

If installed from binary pip package, provide:

  1. Which pip package you installed : Tensorflow 0.8.0 Nightly Python2.7 Linux (GPU) e.g. Build 118
  2. The output from python -c "import tensorflow; print(tensorflow.__version__)". : 0.8.0

If installed from sources, provide the commit hash:

Steps to reproduce

Although it is experimental, I am using the GPU profiling functionality with CUPTI.

  1. Run any tensorflow code that uses CUPTI or tf.RunOptions.FULL_TRACE.
  2. The following error occurs and the process aborts (core dumped).
I tensorflow/stream_executor/dso_loader.cc:102] Couldn't open CUDA library libcupti.so. LD_LIBRARY_PATH:
F tensorflow/core/common_runtime/gpu/cupti_wrapper.cc:57] Check failed: f != nullptr could not find cuptiActivityRegisterCallbacksin libcupti DSO; dlerror: /home/wookayin/.virtualenvs/tfdebug/local/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so: undefined symbol: cuptiActivityRegisterCallbacks
[1]    82604 abort (core dumped)  python 19-mnist-profiling.py

Any TF code that triggers loading of libcupti.so will hit the same error, but for convenience I am sharing a script that runs standalone: 19-mnist-profiling.py

What have you tried?

The problem is that the shared library libcupti.so cannot be loaded. However, it worked in some older nightly builds (such as Build 103).

UPD: I binary-searched the nightly builds to find the offending changeset. Build 103 works, but Build 104 (Failed) and Build 105 do not. I strongly suspect, though I am not sure, that the regression was introduced by commit 6bd964c:

  • The path to libcupti.so would be /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.
  • After this commit, the path used to look up libcupti.so seems to be wrong. (But why?)

What seems strange to me is that tensorflow already has a unit test for the CUPTI and GPU tracing functionality, so CI must have run this test as well. This bug might occur only in some environments (such as the nightly build I installed via pip), or it could be a bazel-related problem (when generating packages).

I have not investigated this problem in detail yet; with some troubleshooting I should be able to figure out the cause.

Thanks!

Logs or other output that would be helpful

N/A

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 3
  • Comments: 21 (9 by maintainers)

Most upvoted comments

Appending /usr/local/cuda/extras/CUPTI/lib64 to LD_LIBRARY_PATH environment variable solved the problem.

I’m not sure that the change to ‘with_the_same_user’ is relevant. (AFAIK, this is only used in our Jenkins test framework to ensure that the LD_LIBRARY_PATH from the Docker container is used when running GPU tests as the user ‘CI_BUILD_UID’.)

Note: the same Git commit also changed the interpretation of the trace level enum. Prior to this change, the FULL_TRACE option only recorded the host-side enqueueing of ops (i.e. what is now done if you specify ‘SOFTWARE_TRACE’).

Now ‘FULL_TRACE’ also tries to enable the GPUTracer on CUDA builds.
If you’re running a Python script, this requires libcupti to be on your LD_LIBRARY_PATH (since there’s no other way to find the NVIDIA library).

e.g. LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64

Please can you check your path, and see if this fixes the problem?
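One way to answer the question above is a small Python check. This is only a sketch: the helper name dirs_containing is made up for illustration, and it inspects LD_LIBRARY_PATH alone, whereas the dynamic loader may also consult other locations such as the ld.so cache.

```python
import os


def dirs_containing(lib_name, search_path):
    """Return the directories in a colon-separated search path that contain lib_name."""
    hits = []
    for directory in search_path.split(os.pathsep):
        # Empty entries (for example from a trailing ':') are skipped for simplicity.
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    found = dirs_containing("libcupti.so", path)
    if found:
        print("libcupti.so found in: " + ", ".join(found))
    else:
        print("libcupti.so is not in any LD_LIBRARY_PATH directory: " + (path or "(empty)"))
```

If the script reports that nothing was found, the CUPTI directory is probably missing from the path that the Python process actually sees.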

As said above, this solved my issue. In my case, I edited the .bashrc file in my home directory, appending /usr/local/cuda/extras/CUPTI/lib64 to LD_LIBRARY_PATH.

export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
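If editing .bashrc is not an option, the same setting can be applied to a single run by launching the script in a child process. The sketch below assumes the default CUPTI location mentioned in this thread; env_with_library_dir and run_script are hypothetical helper names. A child process is used because, on Linux, the loader generally reads LD_LIBRARY_PATH when the process starts, so changing os.environ inside an already running interpreter is usually too late.

```python
import os
import subprocess
import sys

# Assumed default location; adjust if CUDA is installed elsewhere.
CUPTI_DIR = "/usr/local/cuda/extras/CUPTI/lib64"


def env_with_library_dir(env, lib_dir):
    """Return a copy of env with lib_dir appended to LD_LIBRARY_PATH (no duplicates)."""
    new_env = dict(env)
    entries = [p for p in new_env.get("LD_LIBRARY_PATH", "").split(":") if p]
    if lib_dir not in entries:
        entries.append(lib_dir)
    new_env["LD_LIBRARY_PATH"] = ":".join(entries)
    return new_env


def run_script(script, lib_dir=CUPTI_DIR):
    """Run a Python script in a child process that sees the extended LD_LIBRARY_PATH."""
    env = env_with_library_dir(os.environ, lib_dir)
    return subprocess.call([sys.executable, script], env=env)
```

For example, run_script("mnist_with_summaries.py") would start the tutorial script with the CUPTI directory on its library path and return its exit code.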

I got this error while following the TensorBoard tutorial and running its mnist_with_summaries.py. It occurred at the first summary logging step (running with FULL_TRACE).

I’m facing this issue as well. I installed TF from source and hit it while trying to run mnist_with_summaries.py. Here’s the log.

I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.8.0 locally
W tensorflow/core/platform/cpu_feature_guard.cc:95] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:95] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:95] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:95] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Extracting /tmp/tensorflow/mnist/input_data/train-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/train-labels-idx1-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-labels-idx1-ubyte.gz
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.8475
pciBusID 0000:4b:00.0
Total memory: 7.92GiB
Free memory: 6.45GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:4b:00.0)
Accuracy at step 0: 0.0938
Accuracy at step 10: 0.6802
Accuracy at step 20: 0.818
Accuracy at step 30: 0.8561
Accuracy at step 40: 0.8777
Accuracy at step 50: 0.8767
Accuracy at step 60: 0.8853
Accuracy at step 70: 0.8833
Accuracy at step 80: 0.8885
Accuracy at step 90: 0.894
I tensorflow/stream_executor/dso_loader.cc:116] Couldn't open CUDA library LD_LIBRARY_PATH:
F tensorflow/core/platform/default/gpu/cupti_wrapper.cc:59] Check failed: ::tensorflow::Status::OK() == (::tensorflow::Env::Default()->GetSymbolFromLibrary( GetDsoHandle(), kName, &f)) (OK vs. Not found: /usr/local/lib/python3.5/dist-packages/tensorflow/python/_pywrap_tensorflow.so: undefined symbol: cuptiActivityRegisterCallbacks)could not find cuptiActivityRegisterCallbacksin libcupti DSO

What looks suspicious to me is that the error calls out libcupti.so.8.0 (I think my version is 7.5); I wonder why it’s doing that. All the other CUDA libraries are found locally without problems. Any pointers?

I have made the suggested changes to .bashrc with LD_LIBRARY_PATH. No luck.
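To narrow down a version mismatch like the one described in this comment, it may help to list which libcupti versions are actually present in the CUPTI directory. The following is a sketch only: installed_versions is a hypothetical helper, and the directory is the assumed default location from earlier in this thread. Note that the other libraries opened in the log carry an 8.0 suffix (for example libcublas.so.8.0), which would be consistent with the loader asking for libcupti.so.8.0 as well.

```python
import glob
import os


def installed_versions(lib_dir, base_name="libcupti.so"):
    """Map each version suffix found in lib_dir to the file that carries it.

    'libcupti.so' maps to the empty suffix, 'libcupti.so.8.0' to '8.0', and so on.
    """
    versions = {}
    for path in sorted(glob.glob(os.path.join(lib_dir, base_name + "*"))):
        suffix = os.path.basename(path)[len(base_name):].lstrip(".")
        versions[suffix] = path
    return versions


if __name__ == "__main__":
    # Assumed default location; adjust if CUDA is installed elsewhere.
    cupti_dir = "/usr/local/cuda/extras/CUPTI/lib64"
    wanted = "8.0"  # the suffix named in the comment above
    found = installed_versions(cupti_dir)
    print("suffixes found: %s" % sorted(found))
    print("libcupti.so.%s present: %s" % (wanted, wanted in found))
```

If the wanted suffix is missing, the installed CUDA toolkit and the one TensorFlow was built against probably differ, and adjusting LD_LIBRARY_PATH alone would not be expected to help.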