tensorflow: failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED on an AWS p2.xlarge instance
Hi,
I have been running Docker images on a CentOS 7.0 AWS p2.xlarge instance. I have previously installed on it:
- CUDA: cuda-repo-rhel7-8.0.44-1.x86_64.rpm
- NVIDIA drivers: 361.42
I have also installed nvidia-docker following the instructions.
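For reference, a typical way to launch such a container with nvidia-docker 1.x looks like this (a sketch, not necessarily the exact invocation used here):

nvidia-docker run -it tensorflow/tensorflow:latest-gpu bash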
I have successfully run all notebooks from the Docker images (as far as I've tried, tensorflow/tensorflow:latest-devel-gpu and tensorflow/tensorflow:latest-gpu):
TensorFlow version within the Docker container: 0.11.0rc2
Bazel version: Build label: 0.3.2
root@de73edc73418:~# nvidia-smi -l
Wed Nov  2 12:02:54 2016
+------------------------------------------------------+
| NVIDIA-SMI 361.42     Driver Version: 361.42         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:00:1E.0     Off |                    0 |
| N/A   57C    P0    70W / 149W |  10948MiB / 11519MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
However, when I try to launch a single-GPU computation example with TensorFlow, I get the following error:
I tensorflow/core/common_runtime/simple_placer.cc:819] MatMul_3: /job:localhost/replica:0/task:0/gpu:0
MatMul_4: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] MatMul_4: /job:localhost/replica:0/task:0/gpu:0
MatMul_5: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] MatMul_5: /job:localhost/replica:0/task:0/gpu:0
MatMul_6: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] MatMul_6: /job:localhost/replica:0/task:0/gpu:0
MatMul_7: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] MatMul_7: /job:localhost/replica:0/task:0/gpu:0
MatMul_8: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] MatMul_8: /job:localhost/replica:0/task:0/gpu:0
MatMul_9: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] MatMul_9: /job:localhost/replica:0/task:0/gpu:0
AddN: /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] AddN: /job:localhost/replica:0/task:0/cpu:0
E tensorflow/stream_executor/cuda/cuda_blas.cc:367] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
W tensorflow/stream_executor/stream.cc:1390] attempting to perform BLAS operation using StreamExecutor without BLAS support
E tensorflow/stream_executor/cuda/cuda_blas.cc:367] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
W tensorflow/stream_executor/stream.cc:1390] attempting to perform BLAS operation using StreamExecutor without BLAS support
Not sure if it is something related to the NVIDIA drivers, the OS, or some library mismatch. Any ideas?
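For reference, a minimal single-GPU script in the spirit of that example (a sketch using the TF 0.11-era API, not the exact code that produced the log above):

# Places MatMul ops on the GPU; tf.matmul goes through cuBLAS, which is
# what fails with CUBLAS_STATUS_NOT_INITIALIZED here.
import numpy as np
import tensorflow as tf

a = np.random.rand(1000, 1000).astype(np.float32)

with tf.device('/gpu:0'):
    x = tf.constant(a)
    y = tf.matmul(x, x)

# log_device_placement=True produces the simple_placer lines shown above.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(y))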
About this issue
- State: closed
- Created 8 years ago
- Comments: 25 (6 by maintainers)
If you're still having trouble, try adding /usr/local/cuda/extras/CUPTI/lib64 to your LD_LIBRARY_PATH. I had the same error and this fixed it (I was on a Mac, though, so verify that directory on your system). On Mac it was /usr/local/cuda/extras/CUPTI/lib. Maybe the following command helps:
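A sketch of that command, assuming a bash shell and the Linux lib64 path (use the lib directory on a Mac):

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64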
Good luck.
I ran into this problem when running https://github.com/davidsandberg/facenet/ inside the Docker image tensorflow/tensorflow:latest-gpu-py3. However, Jupyter can run without any problem.
UPDATE:
After I set per_process_gpu_memory_fraction from 1 to 0.5, this error was gone. It seems that setting per_process_gpu_memory_fraction to 1 makes TensorFlow claim the entire GPU memory, which fails because my Xorg and chrome processes have already taken part of it.
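A minimal sketch of that setting, using the ConfigProto API from TF 1.x-era releases:

import tensorflow as tf

# Claim only half of the GPU memory instead of all of it.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.5
sess = tf.Session(config=config)

# (config.gpu_options.allow_growth = True is another common way to keep
# TensorFlow from grabbing the whole GPU up front.)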
@hzxie A very good suggestion! In my case, it is:

sudo rm -rf ~/.nv/
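# (Assumption on my part: ~/.nv/ holds NVIDIA's JIT compute cache, which is
# rebuilt automatically on the next run; clearing a stale or corrupted cache
# can fix cuBLAS initialization failures.)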