tensorflow: ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

OS Platform and Distribution: Linux Ubuntu 17.10

TensorFlow installed using: pip
TensorFlow version: 1.6, with GPU support
Python version: 3.6.4
CUDA version: 9.1
GPU model and memory: NVIDIA GeForce 940MX, 2 GB
Command to reproduce:

~$ python3
import tensorflow as tf

(Basically, run any TensorFlow program to reproduce.)

Problem: Whenever you run a TensorFlow program, you get a huge error log, but the main error is this:

ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

This happens because TensorFlow wants CUDA 9.0, but I have CUDA 9.1. It can be fixed by installing CUDA 9.0, but I have a few requests. Seeing that a couple of people have this problem (see https://github.com/tensorflow/tensorflow/issues/15604 and https://github.com/tensorflow/tensorflow/issues/15817), I think that TensorFlow could be updated so that it works with CUDA 9.1 (though I think this issue only occurs on Ubuntu), or the following could be done:

  • Update the TensorFlow documentation to say that you specifically need CUDA 9.0 for TensorFlow 1.6, CUDA 8.0 for TensorFlow 1.4, and so on.
  • Include this error in the list at https://www.tensorflow.org/install/install_linux#common_installation_problems.
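A quick way to see which cuBLAS version the dynamic loader can actually open, without importing TensorFlow, is to try loading the library directly. This is only a sketch: the helper name `loadable_libs` is made up for illustration, and the two sonames are simply the ones for the CUDA releases discussed in this issue.

```python
import ctypes


def loadable_libs(names, loader=ctypes.CDLL):
    """Return {name: bool} saying whether the dynamic loader can open each library.

    `loader` is injectable so the function can be exercised without CUDA installed.
    """
    result = {}
    for name in names:
        try:
            loader(name)
            result[name] = True
        except OSError:
            # Same failure class that surfaces as the ImportError in TensorFlow.
            result[name] = False
    return result


if __name__ == "__main__":
    # Sonames for CUDA 9.0 and 9.1 (assumed from the versions in this report).
    for name, ok in loadable_libs(["libcublas.so.9.0", "libcublas.so.9.1"]).items():
        print(name, "found" if ok else "NOT found")
```

If only `libcublas.so.9.1` is reported as found, the pip binary of TensorFlow 1.6 will most likely fail with the error above.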

Edit: If a Pull Request is required to update the documentation, I am fine with doing that.

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 15
  • Comments: 41 (1 by maintainers)

Most upvoted comments

Please refer to the compatible combinations of CUDA, cuDNN and TensorFlow.

This error is mostly caused by an incompatible combination of NVIDIA driver, CUDA, cuDNN and tensorflow-gpu versions.
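As a rough illustration of checking a combination before installing, here is a small lookup sketch. The table contains only the two pairings stated in this issue (TensorFlow 1.6 with CUDA 9.0, TensorFlow 1.4 with CUDA 8.0); the function names are hypothetical, and other releases would have to be filled in from the official documentation.

```python
# Pairings taken from this issue only; other releases are deliberately left out.
REQUIRED_CUDA = {"1.6": "9.0", "1.4": "8.0"}


def required_cuda(tf_version):
    """Return the CUDA version the pip binary expects, or None if not listed."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return REQUIRED_CUDA.get(major_minor)


def describe(tf_version, installed_cuda):
    """Summarize whether the installed CUDA matches what TensorFlow expects."""
    needed = required_cuda(tf_version)
    if needed is None:
        return "no entry for TensorFlow %s" % tf_version
    if installed_cuda == needed:
        return "ok"
    return "mismatch: TensorFlow %s expects CUDA %s, found %s" % (
        tf_version, needed, installed_cuda)


# The setup from the original report: TensorFlow 1.6 with CUDA 9.1.
print(describe("1.6.0", "9.1"))
```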

This issue is related to Google’s protobuf compiler, which causes TensorFlow to fail to find the shared object file, in this instance libcublas.so.9.0. Even switching from CUDA 9.1 to 9.0 didn’t help, as TensorFlow was still unable to locate the file.

Building the latest version of protobuf (3.5.0) from source didn’t help either. What worked for me was to install the system-wide protobuf compiler through apt install protobuf-compiler on Ubuntu 16.04, and to install the Python package through pip3 install protobuf. I am using CUDA 9.0, as 9.1 is not yet compatible with TensorFlow’s pre-built binary.

You can check the system-wide protobuf version using protoc --version, which reports 2.6.1 on 16.04. The Python protobuf version is 3.5.2.post1. Hope this helps. I had a similar issue using earlier versions of TensorFlow and CUDA 8, and had documented this troubleshooting procedure. Using the same procedure, I am able to use TensorFlow 1.8.0 too.
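To compare the two protobuf versions mentioned above in one go, something like the following sketch could be used. It assumes Python 3.8 or newer for `importlib.metadata` (newer than the 3.6.4 in the original report), and the helper names are made up for illustration.

```python
import shutil
import subprocess
from importlib import metadata


def python_package_version(name):
    """Return the installed version of a pip package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def protoc_version():
    """Return the output of `protoc --version`, or None if protoc is not on PATH."""
    exe = shutil.which("protoc")
    if exe is None:
        return None
    out = subprocess.run([exe, "--version"], capture_output=True, text=True)
    return out.stdout.strip() or None


if __name__ == "__main__":
    print("system protoc:  ", protoc_version())
    print("python protobuf:", python_package_version("protobuf"))
```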

Thanks. Just a word to others trying to do all this. It’s obvious from watching this thread: if you don’t know what you’re doing from the get-go, i.e., you’re not an AI engineer who knows specifically why they need CUDA, take heed. CUDA is a mess with TensorFlow (and TF is a mess on its own).

You’re wasting time here if you just want to check out CUDA. Think about the ML/AI problem you’re trying to solve and work accordingly. Don’t just install this monster if you have no idea how to use it.

If you want to just learn, use docker for the love of god. There are a million images that can work. Try Kaggle’s AI image. Ready to go with no nonsense like this. Masochists you all are. Lol.

Recompile TensorFlow. @angeload @Parnia @agilebean @fay111101 @mesargent

Install Bazel:

sudo apt-get install openjdk-8-jdk
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
sudo apt-get install curl
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
sudo apt-get update && sudo apt-get install bazel

clone tf code:

git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git pull
git checkout r1.9 

./configure

Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: enter
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: enter
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: enter
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: enter
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: enter
Do you wish to build TensorFlow with XLA JIT support? [y/N]: enter
Do you wish to build TensorFlow with GDR support? [y/N]: enter
Do you wish to build TensorFlow with VERBS support? [y/N]: enter
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: enter
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.2
Please specify the location where CUDA 9.2 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-9.2
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1.4
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-9.2]: /usr/local/cuda-9.2
Do you wish to build TensorFlow with TensorRT support? [y/N]: enter
Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]: enter
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1]: enter
Do you want to use clang as CUDA compiler? [y/N]: enter
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: enter
Do you wish to build TensorFlow with MPI support? [y/N]: enter
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -march=native
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: enter

build:

bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow_pkg
cd tensorflow_pkg
sudo pip install tensorflow*.whl

testing:

import tensorflow as tf   
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

This should print Hello, TensorFlow! (under Python 3 it is shown as b'Hello, TensorFlow!').

ALL READ: -

https://github.com/awslabs/keras-apache-mxnet is your TF replacement. You won’t look back, trust me. I’ve been at this for 24 years (software engineering in general) and have built anything you’ve seen on the net. And guess what: without TF. I’ve worked at hedge funds building prop trading tools and algos, 100% autonomous search-and-rescue drones, iOS slot machines, surgical tools, etc.

All without TF. You are correct, no CUDA needed either.

Honestly, anything you need to do, you can do in PyTorch. Or,

  1. Python for data munging
  2. https://pytorch.org/docs/stable/index.html to build your model

OR - easier…

  1. Convert your TF models to MXNet -
  2. Find your converter: https://github.com/mechanicalAI/deep-learning-model-convertor https://github.com/Microsoft/MMdnn#conversion

OR, here is what sane people do who have to do this for a living…

  1. Data pro? Munge that data like no one’s business.
  2. Keras!!! OR https://github.com/mechanicalAI/autokeras, or toss the GPU and learn.

1. Find the model yourself.

OR -

https://github.com/mikewlange/KETTLE or https://github.com/mechanicalAI/tpot, OR your best bet: https://github.com/mechanicalAI/h2o4gpu, a drop-in replacement for scikit-learn with GPU support (requirements: https://github.com/mechanicalAI/h2o4gpu#requirements).

Good luck, and don’t let yourself get stuck for more than 1:30 min. Draw a line…

Hi @shivaniag you would really help the community if you re-assign this issue… To All: Who would be willing to solve this issue?

I’m on my phone so I have to be brief. When you install CUDA 9.1, the folder it creates is labeled cuda9.1. All the instructions say to set your $PATH to yadda/yadda/cuda/yadda/bin. Changing the path to yadda/yadda/cuda9.1 breaks things and you get this error. If you’ve installed 9.1, leave it alone and follow step 2 of the installation instructions here: https://github.com/mikewlange/tensorflow-gpu-install-ubuntu-16.04

On Apr 25, 2018, at 5:51 AM, sebma notifications@github.com wrote:

@mldm4 Hi, do you mean downgrading CUDA to v9.0?

Can you please be more specific and link directly to the comment with the proposed solution, to save people time?

I had this problem trying to use TensorFlow within a conda environment. TensorFlow worked fine with the standard install in my base Python 3.6, but not in the conda environment that I use for Python: on import tensorflow as tf I got the same error.

Here is what worked for me. I checked that the file did actually exist at /usr/local/cuda-9.0/lib64 and found that I have libcublas.so.9.0. Then I added the LD_LIBRARY_PATH to my ~/.bashrc:

echo "export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.0/lib64/" >> ~/.bashrc

Restarted my terminal, built conda environment:

conda create -n tf-gpu python=3.6 tensorflow-gpu
source activate tf-gpu
conda install ipykernel
python -m ipykernel install --user --name --display-name "tf-gpu"
source deactivate
jupyter notebook

and then made a new notebook that uses the tf-gpu kernel, and:

import tensorflow as tf
h = tf.constant('hellow')
sess = tf.Session()
print(sess.run(h))

Worked
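For anyone wanting to repeat the "does the file actually exist on my library path" check from this comment, a small sketch along these lines may help. The function name `dirs_with_library` is hypothetical, and it only inspects the directories listed in LD_LIBRARY_PATH, not the system loader cache.

```python
import os


def dirs_with_library(lib_name, search_path):
    """Return the directories in a PATH-style string that contain lib_name."""
    hits = []
    # os.pathsep is ":" on Linux, which is the LD_LIBRARY_PATH separator.
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    found = dirs_with_library("libcublas.so.9.0", ld_path)
    print("libcublas.so.9.0 found in:", found or "no LD_LIBRARY_PATH entry")
```

An empty result inside the conda environment, but not in the base shell, would suggest the environment was started without the updated ~/.bashrc.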

mikewlange, thanks for pointing me in the right direction. Your resources worked. It seems that "wonky" is the right term to describe this problem with the pip virtual environment.

I got the same error message when importing TensorFlow, and I have now solved the problem. Try apt-get install cuda-cublas-[cuda-version], where cuda-version is 9-0, 9-1, etc.
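The package name in that command uses a dash instead of a dot in the version. A tiny sketch of the naming pattern, assuming it holds for the version you need (the helper `cublas_package` is made up for illustration):

```python
def cublas_package(cuda_version):
    """Turn '9.0' into 'cuda-cublas-9-0', the naming pattern from the comment above."""
    parts = cuda_version.strip().split(".")
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        raise ValueError("expected a version like '9.0', got %r" % cuda_version)
    return "cuda-cublas-" + "-".join(parts)


# Print the command rather than running it, so nothing is installed by accident.
print("sudo apt-get install " + cublas_package("9.0"))
```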

Hi @shivaniag, it would be great if you could reassign this issue if you have other priorities at the moment. This issue affects many users and is therefore critical. Thank you for your consideration!

What worked for me was the process described at https://medium.com/@taylordenouden/installing-tensorflow-gpu-on-ubuntu-18-04-89a142325138 plus the protobuf bit @dashsd provided above. $PATH and $LD_LIBRARY_PATH all use ‘/usr/local/cuda-9.0’. The latter contains two separate entries, as described in https://github.com/tensorflow/tensorflow/issues/16750: one for /usr/local/cuda-9.0/extras/CUPTI/lib64 and another for /usr/local/cuda-9.0/lib64.

I use Ubuntu 18.04 on a Dell XPS 15" with NVIDIA GeForce GTX 1050 (GP107M), driver version 390.48. Tensorflow now runs on CUDA 9.0 and CUDNN 7.0.5.
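To double-check that both LD_LIBRARY_PATH entries described in this comment are really set in the current shell, a sketch like this could be run. The two directories are the ones named above; the function name `missing_entries` is hypothetical.

```python
import os


def missing_entries(required, ld_library_path):
    """Return the required directories that are absent from a PATH-style string."""
    # normpath makes "/x/lib64" and "/x/lib64/" compare as equal.
    present = {os.path.normpath(p) for p in ld_library_path.split(os.pathsep) if p}
    return [r for r in required if os.path.normpath(r) not in present]


# The two entries mentioned in the comment above.
REQUIRED = [
    "/usr/local/cuda-9.0/lib64",
    "/usr/local/cuda-9.0/extras/CUPTI/lib64",
]

if __name__ == "__main__":
    missing = missing_entries(REQUIRED, os.environ.get("LD_LIBRARY_PATH", ""))
    print("missing entries:", missing or "none")
```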

I faced the same problem with the same configuration. When I installed CUDA 9.0, the issue was solved. It seems this tensorflow-gpu version specifically requires CUDA 9.0.