ray: Ray is not finding GPUs, but TF, PyTorch and nvcc do
I have two NVIDIA Titan X GPUs, but Ray isn’t seeing either of them:
ray.init(num_gpus=2)
print(ray.get_gpu_ids())
# prints []
Ray also prints the following, indicating no GPUs:
2019-10-16 18:20:17,954 INFO multi_gpu_optimizer.py:93 -- LocalMultiGPUOptimizer devices ['/cpu:0']
But TensorFlow sees all devices:
import tensorflow
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
That prints:
[name: "/device:CPU:0"
device_type: "CPU"
...
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
...
, name: "/device:GPU:0"
device_type: "GPU"
...
, name: "/device:GPU:1"
device_type: "GPU"
...
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
...
, name: "/device:XLA_GPU:1"
device_type: "XLA_GPU"
...
]
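(Side note: on recent TensorFlow versions, the same check is available through the public config API without starting a session; a sketch:)

```python
import tensorflow as tf

# Lists physical GPUs directly; an empty list here would mean
# TensorFlow itself cannot see the devices either.
gpus = tf.config.list_physical_devices("GPU")
print(gpus)
```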
Similarly,
/usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
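One environment detail worth ruling out (an assumption, not something shown above): Ray workers inherit CUDA_VISIBLE_DEVICES from the shell that launched ray.init(), so a stale or empty value there masks the GPUs even though nvcc and a separately launched TensorFlow see them:

```python
import os

# None means no mask is set; "" or an ID list that omits your GPUs
# would hide them from any process that inherits this environment.
visible = os.environ.get("CUDA_VISIBLE_DEVICES")
print(repr(visible))
```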
Why doesn’t Ray see my GPUs?
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 9
- Comments: 16 (5 by maintainers)
I am having the same issue as @Wormh0-le. This is preventing me from training a torch policy without ray.tune, which I do not wish to use. I just want to call .train() on my agent.
I also explicitly set num_gpus=1, but Ray still can’t get a GPU, even though torch.cuda.is_available() is True. Why?

How would it know how many GPUs to give to each trial?
Please see https://ray.readthedocs.io/en/latest/tune-usage.html#trial-parallelism
On Thu, Oct 17, 2019, 5:58 PM Christian Herz notifications@github.com wrote: