numba: ROCm 3.x is not supported by Numba

Reporting a bug

I’m on a Vega 20 (gfx906) trying to run the example code from https://numba.readthedocs.io/en/stable/roc/examples.html, but it errors out with the following full log:

warning: Linking two modules of different data layouts: '' is 'e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-A5' whereas '<string>' is 'e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5'

warning: Linking two modules of different target triples: ' is 'amdgcn-amd-amdhsa-amdgizcl' whereas '<string>' is 'amdgcn--amdhsa'

[the data-layout / target-triple warning pair above is repeated eight more times]

'gfx906' is not a recognized processor for this target (ignoring processor)
[the line above is repeated 26 times in total]
LLVM ERROR: Attempting to emit S_LOAD_DWORDX2_IMM_si instruction but the Feature_isGCN predicate(s) are not met

It looks like the bundled LLVM is outdated and so does not recognize newer GPU models? I installed Numba via conda install -c conda-forge -c numba numba roctools.

Here’s my Numba config:

$ numba -s
System info:
--------------------------------------------------------------------------------
__Time Stamp__
Report started (local time)                   : 2020-09-01 00:22:15.338933
UTC start time                                : 2020-09-01 04:22:15.338937
Running time (s)                              : 1.006525

__Hardware Information__
Machine                                       : x86_64
CPU Name                                      : skylake-avx512
CPU Count                                     : 20
Number of accessible CPUs                     : 20
List of accessible CPUs cores                 : 0-19
CFS Restrictions (CPUs worth of runtime)      : None

CPU Features                                  : 64bit adx aes avx avx2 avx512bw
                                                avx512cd avx512dq avx512f avx512vl
                                                bmi bmi2 clflushopt clwb cmov cx16
                                                cx8 f16c fma fsgsbase fxsr invpcid
                                                lzcnt mmx movbe pclmul popcnt
                                                prfchw rdrnd rdseed rtm sahf sse
                                                sse2 sse3 sse4.1 sse4.2 ssse3
                                                xsave xsavec xsaveopt xsaves

Memory Total (MB)                             : 64003
Memory Available (MB)                         : 62021

__OS Information__
Platform Name                                 : Linux-5.4.0-42-generic-x86_64-with-debian-buster-sid
Platform Release                              : 5.4.0-42-generic
OS Name                                       : Linux
OS Version                                    : #46~18.04.1-Ubuntu SMP Fri Jul 10 07:21:24 UTC 2020
OS Specific Version                           : ?
Libc Version                                  : glibc 2.10

__Python Information__
Python Compiler                               : GCC 7.5.0
Python Implementation                         : CPython
Python Version                                : 3.7.8
Python Locale                                 : en_US.UTF-8

__LLVM Information__
LLVM Version                                  : 10.0.1

__CUDA Information__
CUDA Device Initialized                       : True
CUDA Driver Version                           : 11000
CUDA Detect Output:
Found 1 CUDA devices
id 0    b'GeForce RTX 2080 Ti'                              [SUPPORTED]
                      compute capability: 7.5
                           pci device id: 0
                              pci bus id: 26
Summary:
	1/1 devices are supported

CUDA Librairies Test Output:
Finding cublas from CUDA_HOME
	named  libcublas.so.10.0.130
	trying to open library...	ok
Finding cusparse from CUDA_HOME
	named  libcusparse.so.10.0.130
	trying to open library...	ok
Finding cufft from CUDA_HOME
	named  libcufft.so.10.0.145
	trying to open library...	ok
Finding curand from CUDA_HOME
	named  libcurand.so.10.0.130
	trying to open library...	ok
Finding nvvm from CUDA_HOME
	named  libnvvm.so.3.3.0
	trying to open library...	ok
Finding cudart from CUDA_HOME
	named  libcudart.so.10.0.130
	trying to open library...	ok
Finding libdevice from CUDA_HOME
	searching for compute_20...	ok
	searching for compute_30...	ok
	searching for compute_35...	ok
	searching for compute_50...	ok


__ROC information__
ROC Available                                 : True
ROC Toolchains                                : ['librocmlite library', 'ROC command line tools']
HSA Agents Count                              : 2
HSA Agents:
	Agent id                                     : 0
	Vendor                                       : CPU
	Name                                         : Intel(R) Core(TM) i9-9820X CPU @ 3.30GHz
	Type                                         : CPU
	Agent id                                     : 1
	Vendor                                       : AMD
	Name                                         : gfx906
	Type                                         : GPU
HSA Discrete GPUs Count                       : 1
HSA Discrete GPUs                             : gfx906

__SVML Information__
SVML State, config.USING_SVML                 : False
SVML Library Loaded                           : False
llvmlite Using SVML Patched LLVM              : True
SVML Operational                              : False

__Threading Layer Information__
TBB Threading Layer Available                 : True
+-->TBB imported successfully.
OpenMP Threading Layer Available              : True
+-->Vendor: GNU
Workqueue Threading Layer Available           : True
+-->Workqueue imported successfully.

__Numba Environment Variable Information__
None found.

__Conda Information__
Conda Build                                   : not installed
Conda Env                                     : 4.8.3
Conda Platform                                : linux-64
Conda Python Version                          : 3.8.3.final.0
Conda Root Writable                           : True

__Installed Packages__
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
argon2-cffi               20.1.0           py37h8f50634_1    conda-forge
attrs                     19.3.0                     py_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.1                      py_0    conda-forge
bleach                    3.1.5              pyh9f0ad1d_0    conda-forge
ca-certificates           2020.6.20            hecda079_0    conda-forge
certifi                   2020.6.20        py37hc8dfbb8_0    conda-forge
cffi                      1.14.1           py37h2b28604_0    conda-forge
cupy                      8.0.0b5                   dev_0    <develop>
cython                    0.29.21          py37h3340039_0    conda-forge
dbus                      1.13.6               he372182_0    conda-forge
decorator                 4.4.2                      py_0    conda-forge
defusedxml                0.6.0                      py_0    conda-forge
entrypoints               0.3             py37hc8dfbb8_1001    conda-forge
expat                     2.2.9                he1b5a44_2    conda-forge
fastrlock                 0.5              py37h3340039_0    conda-forge
flake8                    3.7.9            py37hc8dfbb8_1    conda-forge
fontconfig                2.13.1            h86ecdb6_1001    conda-forge
freetype                  2.10.2               he06d7ca_0    conda-forge
gettext                   0.19.8.1          hc5be6a0_1002    conda-forge
glib                      2.65.0               h6f030ca_0    conda-forge
gst-plugins-base          1.14.5               h0935bb2_2    conda-forge
gstreamer                 1.14.5               h36ae1b5_2    conda-forge
icu                       64.2                 he1b5a44_1    conda-forge
importlib-metadata        1.7.0            py37hc8dfbb8_0    conda-forge
importlib_metadata        1.7.0                         0    conda-forge
iniconfig                 1.0.1              pyh9f0ad1d_0    conda-forge
ipykernel                 5.3.4            py37h43977f1_0    conda-forge
ipython                   7.17.0           py37hc6149b9_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                7.5.1                      py_0    conda-forge
jedi                      0.15.2                   py37_0    conda-forge
jinja2                    2.11.2             pyh9f0ad1d_0    conda-forge
jpeg                      9d                   h516909a_0    conda-forge
jsonschema                3.2.0            py37hc8dfbb8_1    conda-forge
jupyter                   1.0.0                      py_2    conda-forge
jupyter_client            6.1.6                      py_0    conda-forge
jupyter_console           6.1.0                      py_1    conda-forge
jupyter_core              4.6.3            py37hc8dfbb8_1    conda-forge
ld_impl_linux-64          2.34                 hc38a660_9    conda-forge
libblas                   3.8.0               17_openblas    conda-forge
libcblas                  3.8.0               17_openblas    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc-ng                 9.3.0               h24d8f2e_14    conda-forge
libgfortran-ng            7.5.0               hdf63c60_14    conda-forge
libgomp                   9.3.0               h24d8f2e_14    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.8.0               17_openblas    conda-forge
libllvm10                 10.0.1               he513fc3_3    conda-forge
libopenblas               0.3.10          pthreads_hb3c22a3_4    conda-forge
libpng                    1.6.37               hed695b0_2    conda-forge
libsodium                 1.0.18               h516909a_0    conda-forge
libstdcxx-ng              9.3.0               hdf63c60_14    conda-forge
libuuid                   2.32.1            h14c3975_1000    conda-forge
libxcb                    1.13              h14c3975_1002    conda-forge
libxml2                   2.9.10               hee79883_0    conda-forge
line_profiler             3.0.2            py37hc9558a2_0    conda-forge
llvmlite                  0.34.0           py37h5202443_1    conda-forge
markupsafe                1.1.1            py37h8f50634_1    conda-forge
mccabe                    0.6.1                      py_1    conda-forge
mistune                   0.8.4           py37h8f50634_1001    conda-forge
more-itertools            8.4.0                      py_0    conda-forge
nbconvert                 5.6.1            py37hc8dfbb8_1    conda-forge
nbformat                  5.0.7                      py_0    conda-forge
ncurses                   6.2                  he1b5a44_1    conda-forge
notebook                  6.1.3            py37hc8dfbb8_0    conda-forge
numba                     0.51.1           py37h9fdb41a_0    conda-forge
numpy                     1.19.1           py37h8960a57_0    conda-forge
openssl                   1.1.1g               h516909a_1    conda-forge
packaging                 20.4               pyh9f0ad1d_0    conda-forge
pandoc                    2.10.1               h516909a_0    conda-forge
pandocfilters             1.4.2                      py_1    conda-forge
parso                     0.5.2                      py_0  
pcre                      8.44                 he1b5a44_0    conda-forge
pexpect                   4.8.0            py37hc8dfbb8_1    conda-forge
pickleshare               0.7.5           py37hc8dfbb8_1001    conda-forge
pip                       20.2.2                     py_0    conda-forge
pluggy                    0.13.1           py37hc8dfbb8_2    conda-forge
prometheus_client         0.8.0              pyh9f0ad1d_0    conda-forge
prompt-toolkit            3.0.6                      py_0    conda-forge
prompt_toolkit            3.0.6                         0    conda-forge
pthread-stubs             0.4               h14c3975_1001    conda-forge
ptyprocess                0.6.0                   py_1001    conda-forge
py                        1.9.0              pyh9f0ad1d_0    conda-forge
pycodestyle               2.5.0                    py37_0  
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pyflakes                  2.1.1                    py37_0  
pygments                  2.6.1                      py_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyqt                      5.9.2            py37hcca6a23_4    conda-forge
pyrsistent                0.16.0           py37h8f50634_0    conda-forge
pytest                    6.0.1            py37hc8dfbb8_0    conda-forge
python                    3.7.8           h6f2ec95_1_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python_abi                3.7                     1_cp37m    conda-forge
pyzmq                     19.0.2           py37hac76be4_0    conda-forge
qt                        5.9.7                h0c104cb_3    conda-forge
qtconsole                 4.7.6              pyh9f0ad1d_0    conda-forge
qtpy                      1.9.0                      py_0    conda-forge
readline                  8.0                  he28a2e2_2    conda-forge
roctools                  0.0.0                hf484d3e_1    numba
scipy                     1.5.2            py37hb14ef9d_0    conda-forge
send2trash                1.5.0                      py_0    conda-forge
setuptools                49.6.0           py37hc8dfbb8_0    conda-forge
sip                       4.19.8           py37hf484d3e_0  
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sqlite                    3.32.3               h4cf870e_1    conda-forge
terminado                 0.8.3            py37hc8dfbb8_1    conda-forge
testpath                  0.4.4                      py_0    conda-forge
tk                        8.6.10               hed695b0_0    conda-forge
toml                      0.10.1             pyh9f0ad1d_0    conda-forge
tornado                   6.0.4            py37h8f50634_1    conda-forge
traitlets                 4.3.3            py37hc8dfbb8_1    conda-forge
wcwidth                   0.2.5              pyh9f0ad1d_1    conda-forge
webencodings              0.5.1                      py_1    conda-forge
wheel                     0.35.1             pyh9f0ad1d_0    conda-forge
widgetsnbextension        3.5.1            py37hc8dfbb8_1    conda-forge
xorg-libxau               1.0.9                h14c3975_0    conda-forge
xorg-libxdmcp             1.1.3                h516909a_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zeromq                    4.3.2                he1b5a44_3    conda-forge
zipp                      3.1.0                      py_0    conda-forge
zlib                      1.2.11            h516909a_1007    conda-forge

No errors reported.


__Warning log__
Warning (psutil): psutil cannot be imported. For more accuracy, consider installing it.
--------------------------------------------------------------------------------

On this machine I have ROCm 3.5.0 installed. I already verified its integrity as I am developing CuPy’s ROCm support on this machine.
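For reference, the kernels on that examples page follow the usual GPU pattern: fetch a global id, bounds-check, write one element. A plain-NumPy stand-in for such a vector-add kernel is sketched below; the serial loop mirrors what the kernel body computes, so it runs without any ROC target (the @roc.jit / roc.get_global_id details are only described in the comment):

```python
import numpy as np

# Serial stand-in for a ROC vector-add kernel. With a working ROCm target the
# body would be decorated with @roc.jit and the loop replaced by
# i = roc.get_global_id(0) plus a bounds check on c.size.
def vector_add(a, b, c):
    for i in range(c.size):
        c[i] = a[i] + b[i]

a = np.arange(16, dtype=np.float32)
b = np.arange(16, dtype=np.float32) * 2
c = np.empty_like(a)
vector_add(a, b, c)
assert np.allclose(c, a + b)
```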

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 27 (25 by maintainers)

Most upvoted comments

@abitrolly Yes, hopefully that’d be changed at some point. See https://leofang.github.io/assets/HPC_with_CuPy.pdf.

ROCm support was officially removed in https://github.com/numba/numba/commit/4670b4c0d4a1e14a68c93513639f552ecd6330bd

I suppose the way to bring it back is to contact somebody from AMD for sponsorship.

@leofang, the https://cupy.dev/#features page looks like a CUDA promo.

CuPy utilizes CUDA Toolkit libraries including cuBLAS, cuRAND, cuSOLVER, cuSPARSE, cuFFT, cuDNN and NCCL to make full use of the GPU architecture.

If it supports computing without NVIDIA, that would be great to see upfront.

Thanks for the pointer, @abitrolly! I missed that the ROCm target was completely removed from Numba in June (#6991). I think it’d be great to add a follow-up PR advertising CuPy in the deprecation notice, as it can target both CUDA & ROCm devices with a single JIT kernel implementation (cc: @stuartarchibald @kmaehashi), but due to a conflict of interest it should not come from me…

@stuartarchibald I changed the issue title. I think “new hardware targets are not supported” is just a symptom; the root cause is that Numba started its support around ROCm 2.7 (?), and since then ROCm has evolved so much that many changes are needed to keep up with the moving infrastructure. We at CuPy have also spent quite some time on ROCm/HIP support lately, and we decided to support only ROCm 3.5+ and abandon the 2.x series. The AMD people release a new version every 1 to 2 months, and we can’t possibly keep backward compatibility indefinitely…

Thanks @leofang, I also think ROCm 3.x might indeed not work OOTB. There are likely a couple of things that need major alterations, one of which is related to this…

I’ve started work on target-specific overloads, https://github.com/numba/numba/pull/6598, which will hopefully simplify the code a lot for all hardware targets, as the compiler pipeline, dispatchers and implementations could potentially be shared. I’m hoping to start discussing it a bit next week at the Numba development meeting (these are entirely open meetings now, you are welcome to join; details are here: https://numba.discourse.group/t/weekly-open-dev-meeting-2021/). Feedback is also welcome on the draft PR. Thanks!

@stuartarchibald @esc Thank you. I am intrigued to see how exactly the whole thing is built and set up, so I’m taking notes along the way.

First, https://github.com/RadeonOpenCompute/llvm is fully abandoned afaik. It has not been updated since late 2019, and no ROCm 3.x tags or branches can be found there. This matches my impression that recent ROCm compiler support has been fully integrated into upstream LLVM. I think the librocmlite recipe will need to be updated too.

I’d guess at this, https://github.com/RadeonOpenCompute/llvm-project/, as opposed to LLVM upstream?

Second, in ROCm 3.5.0 I can’t find LLVM’s headers (there’s no /opt/rocm/llvm/include), so it looks like it’s unavoidable to build my own copy of LLVM 11.0.0…

rocmlite statically links, so you’d likely need an LLVM build.

Looks like I was mistaken 😝

@stuartarchibald Sorry, I am confused now 😅 So do you mean it’s not sufficient to build llvmlite (assuming I have built a working, patched LLVM)?

Correct. llvmlite has nothing to do with the ROCm toolchain other than conceptual equivalence, llvmlite is to CPU what rocmlite is to AMDGCN GPUs.

Another way to ask this: from conda’s point of view, which packages will need to be updated? I don’t see any package named rocmlite in my conda list output. Is it part of roctools?

Yes, roctools contains rocmlite; the README at https://github.com/numba/roctools has a little bit about it.

Once you have compiled LLVM, compiling llvmlite is a breeze, you only need to point it at llvm-config and do the usual python setup.py build_ext dance with whatever variations you prefer:

https://llvmlite.readthedocs.io/en/latest/admin-guide/install.html#compiling-llvmlite
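That dance looks roughly like this (paths are hypothetical — point LLVM_CONFIG at wherever your patched LLVM build landed):

```shell
# Hypothetical paths; adjust LLVM_CONFIG to your own patched LLVM build.
export LLVM_CONFIG=$HOME/llvm-patched/bin/llvm-config

git clone https://github.com/numba/llvmlite
cd llvmlite
python setup.py build_ext
python setup.py install

# Sanity-check the result against the bundled test suite.
python -m llvmlite.tests
```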

As for building against a ROCm distribution, I really have no idea, as this is uncharted territory for me. You can try to use the existing llvm-config if one is included. However, be advised that Numba/llvmlite require some patches to be applied to the source/release distribution of LLVM for them to function correctly:

https://llvmlite.readthedocs.io/en/latest/admin-guide/install.html#compiling-llvm

So, I don’t think there is a way to circumvent compiling LLVM from scratch in this case (but of course, I may be mistaken).