tensorflow: CUDA_ERROR_NO_DEVICE

I just installed TensorFlow for Ubuntu using the instructions on the website.

If I open a session in python:

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()

I get these warnings:

I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 8
E tensorflow/stream_executor/cuda/cuda_driver.cc:466] failed call to cuInit: CUDA_ERROR_NO_DEVICE
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:86] kernel driver does not appear to be running on this host (joao): /proc/driver/nvidia/version does not exist
I tensorflow/core/common_runtime/gpu/gpu_init.cc:112] DMA: 
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 8

What have I done wrong?

About this issue

  • Original URL
  • State: closed
  • Created 9 years ago
  • Comments: 19 (1 by maintainers)

Most upvoted comments

Do you have a GPU in your machine that you want to use?

If not:

  1. consider using the CPU binary, not the GPU binary
  2. you should be able to ignore these warnings. they are helpful for debugging for those who do want to use GPUs.
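The diagnostic in the original log already hints at how TensorFlow decides this: on Linux, a loaded NVIDIA kernel driver exposes /proc/driver/nvidia/version. A minimal sketch of that same check (the function name is my own, not TensorFlow's):

```python
import os

def nvidia_driver_loaded():
    # TensorFlow's cuda_diagnostics logs this same path: if the NVIDIA
    # kernel driver is loaded on Linux, /proc/driver/nvidia/version exists.
    return os.path.exists("/proc/driver/nvidia/version")

if nvidia_driver_loaded():
    print("NVIDIA kernel driver is loaded; the GPU binary may work.")
else:
    print("No NVIDIA driver detected; use the CPU binary or ignore the warnings.")
```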

I had the same problem: failed call to cuInit: CUDA_ERROR_NO_DEVICE. The root cause was the environment variable CUDA_VISIBLE_DEVICES=1. The value is a GPU ID, not a count, so on a single-GPU machine it must be 0.

Just to expand on what @Alek-dr mentioned: if CUDA_VISIBLE_DEVICES is empty or set to anything other than the IDs of GPUs actually present on your system, you can hit this issue. On my system echo $CUDA_VISIBLE_DEVICES printed NoDevFiles for some reason; whether that points to a bigger problem, I'm not sure.

I have two GPUs installed so setting export CUDA_VISIBLE_DEVICES=0,1 fixes the issue and TensorFlow is able to use both of them.
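A small sketch of the ID semantics described above. The variable lists zero-based GPU IDs, and the CUDA driver reads it at initialization, so it must be set before TensorFlow is imported (the import itself is left commented out here):

```python
import os

# CUDA_VISIBLE_DEVICES holds GPU *IDs* (zero-based), not a count, and the
# CUDA driver reads it once at initialization -- so set it before importing
# tensorflow.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"  # expose GPU 0 and GPU 1

visible_ids = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print(visible_ids)  # ['0', '1']

# import tensorflow as tf  # import only after the variable is set
```

On a single-GPU machine the value "1" would hide the only GPU (whose ID is 0) and reproduce CUDA_ERROR_NO_DEVICE.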

I think the nvidia driver is not working.

Check with the nvidia-smi command.

If that command fails, reboot your computer; after the reboot, nvidia-smi may work correctly.
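The driver check above can be scripted. A sketch, assuming a Linux box where nvidia-smi (which ships with the driver) is on PATH when installed:

```python
import shutil
import subprocess

# nvidia-smi ships with the NVIDIA driver. If the binary is missing, the
# driver was never installed; if it exists but exits non-zero, the kernel
# module is not loaded (a reboot after a driver update often fixes that).
smi = shutil.which("nvidia-smi")
if smi is None:
    status = "driver not installed"
elif subprocess.run([smi], capture_output=True).returncode == 0:
    status = "driver running"
else:
    status = "driver not loaded"
print(status)
```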

Just fixed the issue. Check that the script you are running doesn't set CUDA_VISIBLE_DEVICES to an empty string ("").
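A sketch of that failure mode and a guard against it; the first assignment simulates what a buggy launcher script might do:

```python
import os

# An empty CUDA_VISIBLE_DEVICES hides every GPU and triggers
# CUDA_ERROR_NO_DEVICE, so guard against a script clobbering it with "".
os.environ["CUDA_VISIBLE_DEVICES"] = ""          # what a buggy script might do

if os.environ.get("CUDA_VISIBLE_DEVICES") == "":
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"     # restore a valid GPU ID
print(os.environ["CUDA_VISIBLE_DEVICES"])  # 0
```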

so 2.1 capability devices won’t be much faster than CPU?

This doesn’t solve my issue. I ran tf.config.list_physical_devices() and the GPU still isn’t listed, even though the error message is gone.