tensorflow: TF 2.0 XLA JIT reporting error: "./bin/ptxas not found"
System information
- OS Platform and Distribution: Ubuntu 16.04.6 LTS
- TensorFlow installed from (source or binary): pip3 install tensorflow-gpu
- TensorFlow version (use command below): 2.0.0
- Python version: 3.5.2
- CUDA/cuDNN version: 10.0
- GPU model and memory: TITAN Xp
Describe the current behavior
The test code fails with the error below:
2019-12-26 22:02:59.166382: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-26 22:02:59.166422: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-12-26 22:02:59.166453: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-12-26 22:02:59.166482: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-12-26 22:02:59.166512: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-12-26 22:02:59.166541: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-12-26 22:02:59.166573: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-26 22:02:59.171144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-12-26 22:02:59.171311: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-26 22:02:59.174312: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-26 22:02:59.174418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2019-12-26 22:02:59.174508: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2019-12-26 22:02:59.179990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11439 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:06:00.0, compute capability: 6.1)
sleep
2019-12-26 22:02:59.923393: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-12-26 22:03:00.348503: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Not found: ./bin/ptxas not found
Relying on driver to perform ptx compilation. This message will be only logged once.
2019-12-26 22:03:00.355159: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at xla_compile_on_demand_op.cc:218 : Not found: ./bin/ptxas not found
Traceback (most recent call last):
  File "tf.py", line 8, in <module>
    c = tf.linalg.matmul(a, b)
  File "/home/thincal/.local/lib/python3.5/site-packages/tensorflow_core/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/home/thincal/.local/lib/python3.5/site-packages/tensorflow_core/python/ops/math_ops.py", line 2765, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/home/thincal/.local/lib/python3.5/site-packages/tensorflow_core/python/ops/gen_math_ops.py", line 6126, in mat_mul
    _six.raise_from(_core._status_to_exception(e.code, message), None)
  File "<string>", line 2, in raise_from
tensorflow.python.framework.errors_impl.NotFoundError: ./bin/ptxas not found [Op:MatMul] name: MatMul/
Describe the expected behavior
The test code runs successfully.
Code to reproduce the issue
import tensorflow as tf
try:
    with tf.device('device:XLA_GPU:0'):
        a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
        b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
        c = tf.linalg.matmul(a, b)
        print(c)
except RuntimeError as e:
    print(e)
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 22 (10 by maintainers)
Instead of looking for ./bin/ptxas, shouldn't it check to see if ptxas is available first? In my case:

The real ptxas path may differ on your system (just for info: I have a Debian 10 installation with NVIDIA packages from the buster-backports repository). As far as I understand, the tensorflow library looks for ptxas in the ./bin directory (note that the path is relative, i.e. it is resolved against the current working directory from which you start your Python script, so where you run ln -s ... matters). The commands are:
mkdir ./bin
ln -s /usr/bin/ptxas ./bin/ptxas

@cheshire, actually it matters: I have ptxas in my $PATH, but the error went away only after I created a symlink to ./bin/ptxas.

@cheshire
Summary:
FYI:
First try: it reports that some libraries are missing, so the GPU can't be used.
Second try: I just upgraded cuDNN to v7.6.1, and it reports the same error as above. The shared libraries do exist, but only as version 10.0.
Third try: after creating symbolic links for the missing libraries above, it now runs well.
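The mkdir / ln -s workaround described in the comments above can also be scripted. The following is only a minimal sketch, not part of TensorFlow: the helper name link_ptxas and its parameters are hypothetical, and it assumes (as the commenters report) that TF 2.0 resolves ./bin/ptxas against the current working directory. It looks ptxas up on PATH when no explicit path is given.

```python
import os
import shutil


def link_ptxas(workdir=".", ptxas_path=None):
    """Create <workdir>/bin/ptxas as a symlink to a real ptxas binary.

    Hypothetical helper mirroring the manual workaround from the thread.
    Returns the link path, or None if no ptxas binary could be located.
    """
    # Use the explicit path if given, otherwise search PATH for ptxas.
    target = ptxas_path or shutil.which("ptxas")
    if target is None:
        return None

    bin_dir = os.path.join(workdir, "bin")
    os.makedirs(bin_dir, exist_ok=True)  # equivalent of: mkdir ./bin

    link = os.path.join(bin_dir, "ptxas")
    if not os.path.lexists(link):
        # equivalent of: ln -s <target> ./bin/ptxas
        os.symlink(os.path.abspath(target), link)
    return link
```

Calling link_ptxas() from the directory where the script is started, before running the TensorFlow code, should have the same effect as the two shell commands. A None return means ptxas was not found on PATH, in which case the real location (for example /usr/bin/ptxas on the Debian setup mentioned above) has to be passed as ptxas_path.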