tensorflow: failed call to cuInit: CUDA_ERROR_UNKNOWN

When I run TensorFlow with GPU support, I get the error below. My GPU is a GT 620M and I am on Linux Mint 18.1.

E tensorflow/stream_executor/cuda/cuda_driver.cc:509] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: essam-goda
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: essam-goda
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 340.101.0
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:363] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module  340.101  Thu Dec  1 15:52:31 PST 2016
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) 
"""
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 340.101.0
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 340.101.0

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 17 (5 by maintainers)

Most upvoted comments

Issue resolved after an Ubuntu restart. I had installed updates and a restart was pending; after the restart, the GPU was detected.

Could you search for this error and try following some of the advice you find? For example, the same error appears in https://github.com/tensorflow/tensorflow/issues/2882

  1. You may not have permission to use the GPU as a regular user (you could try sudo).
  2. Make sure nvidia-smi works.
  3. Try some of the CUDA demos and make sure they work, to determine whether this is just a GPU install/configuration problem.

Good Luck!
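The checks above can be sketched in Python. This is only a rough diagnostic, not something taken from the thread: it runs nvidia-smi and then calls cuInit directly through ctypes, which is the same driver call TensorFlow reports as failing. The helper names are made up for illustration, and the library name libcuda.so.1 is an assumption about a typical Linux driver install. The numeric codes are the CUresult values for CUDA_ERROR_NO_DEVICE (100) and CUDA_ERROR_UNKNOWN (999).

```python
import ctypes
import shutil
import subprocess

# A few CUresult values from the CUDA driver API.
CUDA_ERRORS = {
    0: "CUDA_SUCCESS",
    100: "CUDA_ERROR_NO_DEVICE",
    999: "CUDA_ERROR_UNKNOWN",
}


def nvidia_smi_works():
    """Return True if nvidia-smi is on PATH and exits with status 0."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        completed = subprocess.run(
            [exe],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return completed.returncode == 0


def call_cuinit():
    """Call cuInit(0) directly.

    Returns the integer result code, or None if libcuda.so.1
    (an assumed library name) cannot be loaded at all.
    """
    try:
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    return libcuda.cuInit(0)


def describe(code):
    """Turn a cuInit result (or None) into a readable message."""
    if code is None:
        return "libcuda.so.1 could not be loaded"
    return CUDA_ERRORS.get(code, "CUresult %d" % code)


if __name__ == "__main__":
    print("nvidia-smi works:", nvidia_smi_works())
    print("cuInit:", describe(call_cuinit()))
```

If cuInit fails here as well, the problem is likely in the driver or CUDA setup rather than in TensorFlow itself. Running the script both with and without sudo may show whether permissions play a role.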

@aselle Hi, is there any alternative to make it work without running it as root?

nvidia-smi
Mon Aug 14 07:26:58 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.82                 Driver Version: 375.82                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P400         Off  | 0000:01:00.0      On |                  N/A |
| 34%   46C    P8    12W /  N/A |    116MiB /  1990MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K40c          Off  | 0000:03:00.0     Off |                    0 |
| 27%   50C    P8    21W / 235W |      0MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K40c          Off  | 0000:04:00.0     Off |                    0 |
| 28%   55C    P8    30W / 235W |      0MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2165    G   /usr/lib/xorg/Xorg                              60MiB |
|    0      2271    G   gnome-shell                                     54MiB |
+-----------------------------------------------------------------------------+

Without sudo (note the "failed call to cuInit: CUDA_ERROR_NO_DEVICE" line in the log below):

python simple_mlp_tensorflow.py
2017-08-14 07:28:29.374191: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-14 07:28:29.374211: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-14 07:28:29.374216: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-14 07:28:29.374219: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-14 07:28:29.374222: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-08-14 07:28:29.379866: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE
2017-08-14 07:28:29.379893: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: gonzo
2017-08-14 07:28:29.379898: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: gonzo
2017-08-14 07:28:29.379919: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 375.82.0
2017-08-14 07:28:29.379936: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.82  Wed Jul 19 21:16:49 PDT 2017
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
"""
2017-08-14 07:28:29.379948: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 375.82.0
2017-08-14 07:28:29.379953: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 375.82.0

As root with sudo:

sudo /nohome/jaan/abhishek/anaconda3/bin/python simple_mlp_tensorflow.py
2017-08-14 07:27:35.067365: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-14 07:27:35.067388: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-14 07:27:35.067392: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-14 07:27:35.067395: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-14 07:27:35.067398: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-08-14 07:27:35.294553: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.745
pciBusID 0000:03:00.0
Total memory: 11.17GiB
Free memory: 11.10GiB
2017-08-14 07:27:35.479863: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x16b60d0 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that.
2017-08-14 07:27:35.480328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 1 with properties:
name: Quadro P400
major: 6 minor: 1 memoryClockRate (GHz) 1.2525
pciBusID 0000:01:00.0
Total memory: 1.94GiB
Free memory: 1.80GiB
2017-08-14 07:27:35.660972: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x130fb80 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that.
2017-08-14 07:27:35.661598: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 2 with properties:
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.745
pciBusID 0000:04:00.0
Total memory: 11.17GiB
Free memory: 11.10GiB
2017-08-14 07:27:35.661622: I tensorflow/core/common_runtime/gpu/gpu_device.cc:832] Peer access not supported between device ordinals 0 and 1
2017-08-14 07:27:35.662012: I tensorflow/core/common_runtime/gpu/gpu_device.cc:832] Peer access not supported between device ordinals 1 and 0
2017-08-14 07:27:35.662023: I tensorflow/core/common_runtime/gpu/gpu_device.cc:832] Peer access not supported between device ordinals 1 and 2
2017-08-14 07:27:35.662037: I tensorflow/core/common_runtime/gpu/gpu_device.cc:832] Peer access not supported between device ordinals 2 and 1
2017-08-14 07:27:35.662051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0 1 2
2017-08-14 07:27:35.662059: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y N Y
2017-08-14 07:27:35.662066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 1:   N Y N
2017-08-14 07:27:35.662072: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 2:   Y N Y
2017-08-14 07:27:35.662085: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus id: 0000:03:00.0)
2017-08-14 07:27:35.662094: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1017] Ignoring gpu device (device: 1, name: Quadro P400, pci bus id: 0000:01:00.0) with Cuda multiprocessor count: 2. The minimum required count is 8. You can adjust this requirement with the env var TF_MIN_GPU_MULTIPROCESSOR_COUNT.
2017-08-14 07:27:35.662103: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:1) -> (device: 2, name: Tesla K40c, pci bus id: 0000:04:00.0)
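The thread does not say why the run succeeds as root and fails as a regular user. Two guesses that are cheap to rule out are (a) the user cannot open the NVIDIA device nodes under /dev, and (b) the user's shell sets CUDA_VISIBLE_DEVICES to an empty or invalid value, which hides the GPUs, while sudo typically starts from a reset environment. The sketch below checks both. The function names are hypothetical, and neither cause is confirmed by the logs above.

```python
import glob
import os


def inaccessible_nvidia_nodes(pattern="/dev/nvidia*"):
    """Return device nodes matching the pattern that the current
    user cannot open for both reading and writing."""
    return [
        path
        for path in sorted(glob.glob(pattern))
        if not os.access(path, os.R_OK | os.W_OK)
    ]


def visible_devices_setting(environ=None):
    """Return the value of CUDA_VISIBLE_DEVICES, or None if unset.

    An empty string means every GPU is hidden from CUDA programs.
    """
    env = os.environ if environ is None else environ
    return env.get("CUDA_VISIBLE_DEVICES")


if __name__ == "__main__":
    blocked = inaccessible_nvidia_nodes()
    if blocked:
        print("No read/write access to:", ", ".join(blocked))
    else:
        print("All matching device nodes are accessible (or none exist).")
    print("CUDA_VISIBLE_DEVICES =", repr(visible_devices_setting()))
```

If some nodes are reported as inaccessible, comparing their owner and group (for example with ls -l /dev/nvidia*) against the groups of the current user is a reasonable next step.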

Possible solution: I had this issue while nvidia-smi was working fine, and it turned out I was placing operations on a device that did not exist. I have two environments: one with many GPUs and one with just one. Running the code on the single-GPU machine while placing the device at position '1' (or anything other than '0') gave rise to this error.
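One way to avoid that mistake is to pick the device string based on how many GPUs are actually present, instead of hard-coding an index. The sketch below is an illustration only: pick_device and count_gpus are hypothetical helpers, and count_gpus assumes that device_lib.list_local_devices() from tensorflow.python.client is available in the installed TensorFlow.

```python
def pick_device(requested_index, gpu_count):
    """Return a device string that exists on this machine.

    Falls back to the first GPU when the requested index is out of
    range, and to the CPU when there is no GPU at all.
    """
    if gpu_count <= 0:
        return "/cpu:0"
    if 0 <= requested_index < gpu_count:
        return "/gpu:%d" % requested_index
    return "/gpu:0"


def count_gpus():
    """Count GPUs visible to TensorFlow; returns 0 if TensorFlow
    is not installed (assumption: device_lib is importable)."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return 0
    devices = device_lib.list_local_devices()
    return sum(1 for d in devices if d.device_type == "GPU")


if __name__ == "__main__":
    # Asking for GPU 1 on a single-GPU machine yields "/gpu:0".
    print(pick_device(1, count_gpus()))
```

The returned string can then be passed to tf.device(...), so the same script runs on both the multi-GPU and the single-GPU machine.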

Sometimes a simple reboot works.

"Issue resolved after Ubuntu restart. I installed updates and was pending a restart, after the restart GPU was detected." It was the same for me after creating a virtual environment and installing dependencies.

Adding the following line to .bashrc worked for me!

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64:/usr/local/cuda-8.0/targets/x86_64-linux/lib/"
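A fix like this only helps if the listed directories exist on the machine, and the CUDA version in the path may differ from install to install. As a quick sanity check, the sketch below lists LD_LIBRARY_PATH entries that do not point to an existing directory. The function name is hypothetical; open a new shell (or source .bashrc) before running it so the export has taken effect.

```python
import os


def missing_library_dirs(ld_library_path=None):
    """Return LD_LIBRARY_PATH entries that are not existing directories.

    Reads the environment variable when no value is passed in.
    Empty entries (from leading or doubled colons) are skipped.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    entries = [entry for entry in ld_library_path.split(":") if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]


if __name__ == "__main__":
    missing = missing_library_dirs()
    if missing:
        print("These LD_LIBRARY_PATH entries do not exist:")
        for entry in missing:
            print("  " + entry)
    else:
        print("Every LD_LIBRARY_PATH entry exists (or the variable is empty).")
```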

@essamgoda that sounds like CUDA isn't installed quite right. Try doing what it says; otherwise, search for methods of reinstalling the NVIDIA drivers (sometimes you have to work pretty hard to remove leftover parts of CUDA if you have used a mix of manual and automatic installation methods).