tensorflow: [1.1.0-gpu image] Can't open shared object file libcuda.so.1
Version info
- GPU: NVIDIA K40 and K80
- Docker: 1.12.6
- Image tag: 1.1.0-gpu and latest-gpu
Reproduce
I pulled the tensorflow/tensorflow:1.1.0-gpu Docker image and got the error below when I started running it:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import *
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 51, in <module>
from tensorflow.python import pywrap_tensorflow
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 52, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
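For context, a minimal sketch (not taken verbatim from the issue) of the kind of invocation that typically produces this error: starting the GPU image with plain `docker run`, so the host driver's libcuda.so.1 is never mounted into the container.

```
# Hypothetical reproduction: plain `docker run` without the NVIDIA runtime.
# The host driver's libcuda.so.1 is not mounted into the container,
# so importing TensorFlow fails exactly as in the traceback above.
docker pull tensorflow/tensorflow:1.1.0-gpu
docker run -it --rm tensorflow/tensorflow:1.1.0-gpu \
  python -c "import tensorflow as tf; print(tf.__version__)"
```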
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Reactions: 1
- Comments: 24 (11 by maintainers)
Commits related to this issue
- PR #10642: Make configure.py command visible Imported from GitHub PR https://github.com/openxla/xla/pull/10642 Copybara import of the project: -- d5df8a14b8837c0980a8641a83b9d6d9e33577cc by prrathi... — committed to tensorflow/tensorflow by prrathi 3 months ago
Are you using nvidia-docker? e.g. https://hub.docker.com/r/tensorflow/tensorflow/
Make sure to run with nvidia-docker, for example:
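A minimal sketch, assuming the nvidia-docker v1 wrapper is installed; it injects the host driver volumes and devices so libcuda.so.1 is visible inside the container:

```
# nvidia-docker v1: the wrapper mounts the host driver libraries (including
# libcuda.so.1) and GPU devices into the container before running it.
nvidia-docker run -it --rm tensorflow/tensorflow:1.1.0-gpu \
  python -c "import tensorflow as tf; print(tf.__version__)"
```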
Fair enough. The real question is whether the GPU install is working or whether this is a TensorFlow bug. Running a simpler GPU program such as nvidia-smi is a common way to separate the two. LD_DEBUG is a great debugging tool for seeing why shared libraries aren't loading:
pipe it to a file and search for where it tries to load libcuda; it will tell you exactly which locations it tries for which libraries.
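A sketch of both checks; the nvidia/cuda image used for the nvidia-smi test is just an example, and LD_DEBUG=libs is the standard glibc loader trace:

```
# 1) Confirm the GPU/driver path works at all, independently of TensorFlow.
nvidia-docker run --rm nvidia/cuda nvidia-smi

# 2) Trace the dynamic loader while importing TensorFlow; the log records
#    every path tried for each library, including libcuda.so.1.
LD_DEBUG=libs python -c "import tensorflow" 2> ld_debug.log
grep -n "libcuda" ld_debug.log
```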
Maybe google for others who have gotten TensorFlow running on Kubernetes, e.g. https://medium.com/jim-fleming/running-tensorflow-on-kubernetes-ca00d0e67539
Yes, just append --runtime=nvidia to your docker command. The reason is that nvidia-docker v1 uses the nvidia-docker alias, whereas v2 uses docker --runtime=nvidia.
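A sketch of the equivalent invocations, assuming nvidia-docker2 is installed for the second form:

```
# nvidia-docker v1: dedicated wrapper command
nvidia-docker run -it --rm tensorflow/tensorflow:latest-gpu \
  python -c "import tensorflow"

# nvidia-docker v2: plain docker with the nvidia runtime selected
docker run --runtime=nvidia -it --rm tensorflow/tensorflow:latest-gpu \
  python -c "import tensorflow"
```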