ROCm: 4.3.1 / gfx803 / tensorflow-rocm 2.6.0 - librocblas.so.0: cannot open shared object file: No such file or directory
I know gfx803 is only unofficially supported now, but any help would be great!
Setup: Ubuntu 20.04, gfx803 GPUs, EPYC (Zen 1) CPU.
I just ran:
sudo apt install rocm-dkms rocm-libs
rocm-smi shows all my GPUs (all of them are the same gfx803 model).
Then I installed tensorflow-rocm into a virtualenv from inside PyCharm.
When I try to run:
from ai_benchmark import AIBenchmark
benchmark = AIBenchmark(use_CPU=False, verbose_level=1)
results = benchmark.run()
I get the following error:
~/Projects/mlenv/lib/python3.8/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>
63 try:
---> 64 from tensorflow.python._pywrap_tensorflow_internal import *
65 # This try catch logic is because there is no bazel equivalent for py_extension.
ImportError: librocblas.so.0: cannot open shared object file: No such file or directory
If I install this unofficial build of rocblas (which is for 4.3.0, not 4.3.1) from https://github.com/xuhuisheng/rocm-gfx803
the benchmark runs, but it doesn’t see my GPUs and rocm-smi shows 0% activity…
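One way to check whether the dynamic loader can find librocblas.so.0 at all, independent of TensorFlow, is to try opening it with ctypes. This is only a minimal sketch: `can_load` is a hypothetical helper name, not part of ROCm or TensorFlow, and a successful load only shows that the file is on the loader's search path, not that it supports gfx803.

```python
import ctypes


def can_load(soname):
    """Return (True, None) if the dynamic loader can open soname,
    otherwise (False, error text from the loader)."""
    try:
        ctypes.CDLL(soname)
    except OSError as exc:
        # Same class of failure that surfaces as the ImportError above.
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    ok, err = can_load("librocblas.so.0")
    print("librocblas.so.0 loadable:", ok)
    if err:
        print("loader error:", err)
```

If this reports a loader error even though rocm-libs is installed, the library directory is probably not on the loader's search path (for example via LD_LIBRARY_PATH or ldconfig).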
About this issue
- State: closed
- Created 3 years ago
- Comments: 15 (7 by maintainers)
tensorflow_rocm-2.6.0 dropped gfx803 support. You can compile tensorflow_rocm-2.6.0 yourself.
In my environment, tensorflow_rocm-2.6.0 runs properly on gfx803.
:1:rocdevice.cpp :438 : 47612163925 us: hsa_init failed.
This log line says that device initialization failed: https://github.com/ROCm-Developer-Tools/ROCclr/blob/rocm-4.3.x/device/rocm/rocdevice.cpp#L438
hsa_init() just tries to acquire the Runtime, and that is what failed. https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/rocm-4.3.x/src/core/runtime/hsa.cpp#L206 https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/rocm-4.3.x/src/core/runtime/runtime.cpp#L94
My guess is that the reason is HSA_STATUS_ERROR_OUT_OF_RESOURCES. To confirm it, you would need to recompile ROCR-Runtime so that it prints more log output.
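Before rebuilding ROCR-Runtime, it may be worth ruling out a simpler cause. The ROCm runtime talks to the kernel driver through /dev/kfd, so a missing or inaccessible device node is one plausible reason for hsa_init to fail. That is an assumption, not something the log above confirms, and `check_device_node` is a hypothetical helper written for this sketch.

```python
import os


def check_device_node(path="/dev/kfd"):
    """Return "missing", "no-access", or "ok" for the given device node.

    "ok" only means the current user may read and write the node;
    it does not prove that hsa_init will succeed.
    """
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.R_OK | os.W_OK):
        return "no-access"
    return "ok"


if __name__ == "__main__":
    print("/dev/kfd:", check_device_node())
```

A "no-access" result often means the user is not in the group that owns the node (commonly video or render). A "missing" result points at the kernel driver not being loaded.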
tensorflow_rocm-2.4.3 had official gfx803 support. https://pypi.org/project/tensorflow-rocm/2.4.3/
I just uploaded a gfx803 build of tensorflow_rocm-2.6.0 to https://github.com/xuhuisheng/rocm-gfx803/releases/download/rocm43/tensorflow-2.6.0-cp38-cp38-linux_x86_64.whl, please try it.
The source code comes from this branch: https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/tree/r2.6-rocm-enhanced
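After installing a wheel, a short check like the one below shows whether TensorFlow imports cleanly and which GPUs it sees. It is a sketch, not a full benchmark: `visible_gpus` is a hypothetical helper name, while `tf.config.list_physical_devices` is the standard TensorFlow 2.x call.

```python
def visible_gpus():
    """Return (list of GPU device names, None) on success,
    or (None, error text) if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError as exc:
        # A missing librocblas.so.0 shows up here as an ImportError.
        return None, str(exc)
    return [d.name for d in tf.config.list_physical_devices("GPU")], None


if __name__ == "__main__":
    gpus, err = visible_gpus()
    if err is not None:
        print("TensorFlow import failed:", err)
    else:
        print("GPUs visible to TensorFlow:", gpus)
```

An empty list means TensorFlow loaded but did not detect any GPU, which matches the symptom of the benchmark running while rocm-smi shows 0% activity.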