tvm: TVM for ROCm 2.x is currently not working
Environment: Ubuntu 18.04 + ROCm 2.2 + TVM (built from current master with ROCM = ON)
I verified that the TVM library was successfully built and linked with ROCm, and the tuning procedure runs successfully; however, executing tvm.build(s, arg_bufs, 'rocm', name='matmul') fails with the following error:
WARNING:autotvm:Too many errors happen in the tuning. Now is in debug mode
Finish loading 500 records
DEBUG:autotvm:Finish loading 500 records
Cannot find config for target=rocm, workload=('tvm_matmul_tune_op', 4, 256, 256). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=rocm, workload=('tvm_matmul_tune_op', 4, 256, 256). A fallback configuration is used, which may bring great performance regression.
Best config:
,None,None
[14:47:54] /host/docker/matmul_tvm/tvm/src/pass/vectorize_loop.cc:362: Detect vector condition in Vectorized Loop, scalarizing...
[14:47:54] /host/docker/matmul_tvm/tvm/src/pass/vectorize_loop.cc:362: Detect vector condition in Vectorized Loop, scalarizing...
Traceback (most recent call last):
File "matmul_autotvm.py", line 260, in <module>
search_matmul_config(4, 256, 256, 500) # m, k, n, num_trials
File "matmul_autotvm.py", line 165, in search_matmul_config
func = tvm.build(s, arg_bufs, 'rocm', name='matmul')
File "/host/docker/matmul_tvm/tvm/python/tvm/build_module.py", line 617, in build
fhost, mdev = _build_for_device(flist, tar, target_host)
File "/host/docker/matmul_tvm/tvm/python/tvm/build_module.py", line 484, in _build_for_device
mdev = codegen.build_module(fdevice, str(target)) if fdevice else None
File "/host/docker/matmul_tvm/tvm/python/tvm/codegen.py", line 36, in build_module
return _Build(lowered_func, target)
File "/host/docker/matmul_tvm/tvm/python/tvm/_ffi/_ctypes/function.py", line 206, in __call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (2) /host/docker/matmul_tvm/tvm/build_rocm/libtvm.so(TVMFuncCall+0x61) [0x7f9598de3f01]
[bt] (1) /host/docker/matmul_tvm/tvm/build_rocm/libtvm.so(+0x14b2e9) [0x7f95986992e9]
[bt] (0) /host/docker/matmul_tvm/tvm/build_rocm/libtvm.so(+0x231aaa) [0x7f959877faaa]
File "/host/docker/matmul_tvm/tvm/src/codegen/codegen.cc", line 46
TVMError: Check failed: bf != nullptr: Target rocm is not enabled
About this issue
- State: closed
- Created 5 years ago
- Comments: 48 (45 by maintainers)
This is not clean, but you can modify this block for the rocm target. For rocm, max_shared_memory_per_block should be 48KB and max threads per block should be 256. Don't forget to add "if 'rocm' in self.task.target.keys".
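For reference, a minimal sketch of that kind of change is below. It assumes the block in question is the GPU-check kwargs construction in autotvm's measure code (e.g. a get_build_kwargs() method in python/tvm/autotvm/measure/measure_methods.py); that location, the method name, and the exact dictionary keys are assumptions and may differ across TVM versions.

```python
# Rough sketch, not the exact upstream code: hard-code conservative ROCm
# limits where autotvm builds its GPU verification kwargs.
def get_build_kwargs(self):  # hypothetical method name, see note above
    kwargs = {}
    if 'rocm' in self.task.target.keys:
        kwargs['check_gpu'] = {
            'max_shared_memory_per_block': 48 * 1024,  # 48KB of shared memory (LDS)
            'max_threads_per_block': 256,              # conservative thread limit
        }
    return kwargs
```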
@masahi
You can get more information here: https://llvm.org/docs/AMDGPUUsage.html#code-object-metadata
And I just simply added `-mattr=-code-object-v3` in `BuildAMDGPU` at codegen_amdgpu.cc:182.

Hi @masahi @ghostplant,
I had the same problem and found a solution that works for me, so I'd like to share my experience. The `hipErrorNotFound` error occurs when your environment is ROCm 2.3 (not compatible with LLVM 6 or 7) or uses LLVM 8+. The reason is that LLVM uses `-mattr=+code-object-v3` by default for HSACO generation, but the ROCm `hipModuleLoad` function only accepts a HSACO with code object v2. This means you need to pass `-mattr=-code-object-v3` as the option explicitly. Note the value given to `-mattr`: the plus sign `+` is replaced with a minus sign `-`.
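As an illustration of where that flag goes, here is a rough sketch of the kind of change described above inside BuildAMDGPU (src/codegen/llvm/codegen_amdgpu.cc). The surrounding code is paraphrased, not verbatim, and varies between TVM versions; the essential part is appending -mattr=-code-object-v3 to the string used to create the LLVM target machine, so the emitted HSACO uses code object v2 and hipModuleLoad can load it.

```cpp
// Sketch only -- names and surrounding code are paraphrased, not verbatim.
std::ostringstream config;
config << "-mtriple=amdgcn-amd-amdhsa-hcc"
       << " -mcpu=gfx900"               // normally detected from the device, gfx900 is just an example
       << " -mattr=-code-object-v3"     // force code object v2 so hipModuleLoad accepts the HSACO
       << target.substr(4, target.length() - 4);  // any extra options appended to the "rocm ..." target string
std::unique_ptr<llvm::TargetMachine> tm = GetLLVMTargetMachine(config.str());
```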