tensorflow: [1.1.0-gpu image] Can't open shared object file libcuda.so.1

Version info

GPU: NVIDIA K40 and K80
Docker: 1.12.6
Image tag: 1.1.0-gpu and latest-gpu

Reproduce

I pulled the tensorflow/tensorflow:1.1.0-gpu Docker image and got the following error when I started running it:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 51, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 52, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.
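
This is what happens when the image is started with plain docker rather than nvidia-docker: libcuda.so.1 belongs to the host's NVIDIA driver and is not baked into the image, so it is only present inside the container when the NVIDIA runtime mounts it in. A minimal reproduction (a sketch, not necessarily the reporter's exact command):

docker pull tensorflow/tensorflow:1.1.0-gpu
# Plain docker run: no host driver libraries are mounted, so the import fails
docker run --rm tensorflow/tensorflow:1.1.0-gpu python -c "import tensorflow"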

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 24 (11 by maintainers)

Most upvoted comments

Are you using nvidia-docker? e.g. https://hub.docker.com/r/tensorflow/tensorflow/

Make sure to run with nvidia-docker
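
For example (a minimal sketch; nvidia-docker v1 wraps docker run and bind-mounts the host's driver libraries, including libcuda.so.1, into the container):

# Sanity-check that the driver is visible inside the container
nvidia-docker run --rm tensorflow/tensorflow:1.1.0-gpu nvidia-smi
# Then the import should succeed
nvidia-docker run --rm tensorflow/tensorflow:1.1.0-gpu python -c "import tensorflow"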

Fair enough. The real question is whether the GPU install is working or whether this is a TensorFlow bug. Running a simpler GPU program such as nvidia-smi is a common way to isolate the problem. LD_DEBUG is a great debugging tool for seeing why shared libraries aren't loading:

LD_DEBUG=libs python -c "import tensorflow"

Pipe the output to a file and search for where it tries to load libcuda; it will show exactly which locations the loader probes for which libraries.
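
For example (ld_debug.log is an arbitrary file name; LD_DEBUG writes to stderr, hence the 2> redirect):

LD_DEBUG=libs python -c "import tensorflow" 2> ld_debug.log
grep libcuda ld_debug.log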

Maybe Google for others who have gotten TensorFlow running on Kubernetes, e.g. https://medium.com/jim-fleming/running-tensorflow-on-kubernetes-ca00d0e67539

Yes, just append --runtime=nvidia to your docker command. The reason is that nvidia-docker v1 uses the nvidia-docker alias, whereas v2 uses docker --runtime=nvidia.
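
Side by side, the two forms look like this (same image as in the original report):

# nvidia-docker v1: dedicated wrapper command
nvidia-docker run --rm tensorflow/tensorflow:1.1.0-gpu python -c "import tensorflow"
# nvidia-docker v2: plain docker with the NVIDIA runtime selected
docker run --runtime=nvidia --rm tensorflow/tensorflow:1.1.0-gpu python -c "import tensorflow"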


Are you using nvidia-docker? e.g. https://hub.docker.com/r/tensorflow/tensorflow/

docker run --runtime=nvidia -p xxxxxxx -t tensorflow/serving:1.12.0-gpu
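
With the placeholder port mapping made concrete (8500 is TF Serving's default gRPC port and 8501 its REST port; the model name and host path below are illustrative):

docker run --runtime=nvidia -p 8501:8501 \
  -v /path/to/my_model:/models/my_model \
  -e MODEL_NAME=my_model \
  -t tensorflow/serving:1.12.0-gpu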