cupy: [BUG] ROCm: the `cupy.cuda.cub` module cannot be built in v8

It looks like CuPy v8 assumes the cupy.cuda.cub module is always built, but ROCm support for it was not added until recently (#4027), so `import cupy` raises an error:

$ pip install -v cupy    # build from sdist
$ python -c "import cupy"
Traceback (most recent call last):
  File "/home/leofang/miniconda3/envs/cupy_test/lib/python3.7/site-packages/cupy/__init__.py", line 20, in <module>
    from cupy import core  # NOQA
  File "/home/leofang/miniconda3/envs/cupy_test/lib/python3.7/site-packages/cupy/core/__init__.py", line 1, in <module>
    from cupy.core import core  # NOQA
  File "cupy/core/core.pyx", line 1, in init cupy.core.core
  File "cupy/core/_routines_manipulation.pyx", line 1, in init cupy.core._routines_manipulation
  File "cupy/core/_routines_indexing.pyx", line 1, in init cupy.core._routines_indexing
  File "cupy/core/_routines_math.pyx", line 6, in init cupy.core._routines_math
  File "cupy/core/_reduction.pyx", line 1, in init cupy.core._reduction
  File "cupy/core/_cub_reduction.pyx", line 1, in init cupy.core._cub_reduction
ModuleNotFoundError: No module named 'cupy.cuda.cub'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/leofang/miniconda3/envs/cupy_test/lib/python3.7/site-packages/cupy/__init__.py", line 41, in <module>
    raise ImportError(_msg) from e
ImportError: CuPy is not correctly installed.

If you are using wheel distribution (cupy-cudaXX), make sure that the version of CuPy you installed matches with the version of CUDA on your host.
Also, confirm that only one CuPy package is installed:
  $ pip freeze

If you are building CuPy from source, please check your environment, uninstall CuPy and reinstall it with:
  $ pip install cupy --no-cache-dir -vvvv

Check the Installation Guide for details:
  https://docs.cupy.dev/en/latest/install.html

original error: No module named 'cupy.cuda.cub'

I think this can be fixed simply by backporting #4027.
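
For anyone hitting the same traceback, one way to confirm that the extension module is really absent from the installed package (rather than present but failing to load) is to look for its file on disk without importing cupy. This is a minimal sketch, assuming a regular on-disk installation. The helper name `has_module_file` is hypothetical and is not part of CuPy.

```python
import importlib.machinery
import importlib.util
from pathlib import Path


def has_module_file(package, relative):
    """Return True if `package.relative` exists on disk as a module or package.

    Only the top-level package is located, via find_spec, which does not
    execute the package's __init__.py. `relative` is a dotted path below the
    package, for example "cuda.cub".
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return False  # package not installed, or not a package

    *parents, name = relative.split(".")
    # Source, bytecode, and extension-module suffixes for this interpreter.
    suffixes = importlib.machinery.all_suffixes()

    for root in spec.submodule_search_locations:
        directory = Path(root).joinpath(*parents)
        # A subpackage counts as present if it has an __init__.py.
        if (directory / name / "__init__.py").is_file():
            return True
        # Otherwise look for name.py, name.pyc, name.<abi>.so, and so on.
        if any((directory / (name + suffix)).is_file() for suffix in suffixes):
            return True
    return False


if __name__ == "__main__":
    # Prints False when the cub extension was not built (or cupy is absent).
    print(has_module_file("cupy", "cuda.cub"))
```

Because the check never runs `cupy/__init__.py`, it works even when `import cupy` itself fails as shown above.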

ref: https://stackoverflow.com/questions/64613606/cupy-on-amd-gpu-causing-an-importerror

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 15 (9 by maintainers)

Most upvoted comments

@leofang, do you mean building CuPy in an environment that has only ROCm installed but has the NVCC environment variable exported? I tried it as follows:

cd ${cupy_source_dir}
git clean -fdx
export HCC_AMDGPU_TARGET=gfx900
export __HIP_PLATFORM_HCC__
export CUPY_INSTALL_USE_HIP=1
export ROCM_HOME=/opt/rocm-3.9.1
export NVCC=/usr/local/cuda 

pip install . --no-cache-dir -vvvv 

CuPy cannot be installed this way. Here is the log: log-ROCm39-NVCC.LOG
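
When comparing build attempts like this one, it can help to record what each build-related variable actually contains before running pip. The sketch below only reports whether each variable is unset, empty, an existing directory, an existing file, or some other value. It makes no claim about what CuPy's setup script expects, and the helper name `describe_build_env` is hypothetical.

```python
import os
from pathlib import Path


def describe_build_env(names, environ=None):
    """Classify each named environment variable for a quick sanity check."""
    environ = os.environ if environ is None else environ
    report = {}
    for name in names:
        value = environ.get(name)
        if value is None:
            report[name] = "unset"
        elif value == "":
            report[name] = "set but empty"
        elif Path(value).is_dir():
            report[name] = "directory"
        elif Path(value).is_file():
            report[name] = "file"
        else:
            report[name] = "value (not an existing path)"
    return report


if __name__ == "__main__":
    # Variable names taken from the commands above.
    names = ["HCC_AMDGPU_TARGET", "CUPY_INSTALL_USE_HIP", "ROCM_HOME", "NVCC"]
    for name, kind in describe_build_env(names).items():
        print(f"{name}: {kind}")
```

Attaching this output to a build log makes it easier to see, for instance, whether a variable meant to name a compiler points at a file or at a directory.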

@leofang, with ROCm 3.5 and CuPy v8.0.0, I have no issue with import cupy. So far I have tested CuPy with ROCm 3.5 and ROCm 3.9 on two different machines, both of which have more than one version of ROCm installed. I use module (an environment variable management tool) to switch among ROCm versions. I uninstalled the previous CuPy v8.0.0 on ROCm 3.5, reinstalled it, and got this log: log-3.5.LOG. There seem to be some major differences between installing v9 on ROCm 3.9 and v8.0.0 on ROCm 3.5. For example, in v8.0.0 the cub module is not configured; instead, cusolver is configured:

    Modules:
      cuda      : Yes
      cusolver  : Yes
      thrust    : No
        -> nvcc command could not be found in PATH.
        -> Check your PATH environment variable.

    WARNING: Some modules could not be configured.
    CuPy will be installed without these modules.
    Please refer to the Installation Guide for details:
    https://docs.cupy.dev/en/stable/install.html

Is this the reason that there is no cub issue in v8.0.0? Here are the outputs of hipconfig on my two machines, one configured for ROCm 3.5 and the other for ROCm 3.9.

hipconfig-rocm35.LOG hipconfig-rocm39.LOG

To be clear, the two logs are from two different machines, but both machines have multiple versions of ROCm, from ROCm 2.9 to 3.9, installed in different directories and managed with the module tool. On the machine that has both ROCm 3.5 and ROCm 3.9, CuPy v8.0.0 works with ROCm 3.5, but v8.1.0, v8.2.0, and v9 have the cub issue with ROCm 3.9.

@leofang, the machine I used before has only ROCm installed, no CUDA. Some of my test cases used to fail before I followed your instructions. We discussed those problems in #4216.

Thanks! Fixed by backporting #4027 (+ #4162). Will add a test in https://github.com/cupy/cupy/pull/4326.