llama-cpp-python: Installation via pip fails for ROCm / AMD cards

Expected Behavior

I have a machine with an AMD GPU (Radeon RX 7900 XT). I tried to install this library as described in the README by running

CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python

Current Behavior

The installation fails. However, when I simply run pip install llama-cpp-python, it works.
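
As a side note (standard pip flags, nothing specific to this project): re-running with -v and --no-cache-dir prints the full CMake invocation and stops pip from reusing a previously built wheel, which makes the failure easier to inspect:

CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install --no-cache-dir -v llama-cpp-python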

Environment and Context

To make the issue reproducible, I made a Docker container with this Dockerfile (adapted from the llama.cpp repo):

ARG UBUNTU_VERSION=22.04

# This needs to generally match the container host's environment.
ARG ROCM_VERSION=5.6

# Target the CUDA build image
ARG BASE_ROCM_DEV_CONTAINER=rocm/dev-ubuntu-${UBUNTU_VERSION}:${ROCM_VERSION}-complete

FROM ${BASE_ROCM_DEV_CONTAINER} as build

# Unless otherwise specified, we make a fat build.
# List from https://github.com/ggerganov/llama.cpp/pull/1087#issuecomment-1682807878
# This is mostly tied to rocBLAS supported archs.
ARG ROCM_DOCKER_ARCH=\
    gfx803 \
    gfx900 \
    gfx906 \
    gfx908 \
    gfx90a \
    gfx1010 \
    gfx1030 \
    # gfx1100 is my ROCm arch (RX 7900 XT); a trailing comment after the backslash would break the continuation
    gfx1100 \
    gfx1101 \
    gfx1102

# Set the ROCm GPU architectures to build for
ENV GPU_TARGETS=${ROCM_DOCKER_ARCH}
# Enable ROCm
ENV CC=/opt/rocm/llvm/bin/clang
ENV CXX=/opt/rocm/llvm/bin/clang++
ENV LLAMA_HIPBLAS=on

RUN apt-get update && apt-get -y install cmake protobuf-compiler aria2 git

System Info:

CPU: 13th Gen Intel® Core™ i5-13400F
GPU: Radeon RX 7900 XT

Ubuntu 22.04.1

Python 3.10.6
Make 4.3
g++ 11.3.0

Failure Information (for bugs)

The installation failed. Here is the output when running CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python:

root@8bebff5da3f1:/# CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
Collecting llama-cpp-python
  Downloading llama_cpp_python-0.1.82.tar.gz (1.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 3.0 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (1.25.2)
Collecting diskcache>=5.6.1
  Downloading diskcache-5.6.1-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.6/45.6 KB 7.1 MB/s eta 0:00:00
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [390 lines of output]
      
      
      --------------------------------------------------------------------------------
      -- Trying 'Ninja' generator
      --------------------------------
      ---------------------------
      ----------------------
      -----------------
      ------------
      -------
      --
      CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.
      
        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.
      
      Not searching for unused variables given on the command line.
      
      -- The C compiler identification is Clang 16.0.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- The CXX compiler identification is Clang 16.0.0
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Configuring done (0.6s)
      -- Generating done (0.0s)
      -- Build files have been written to: /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_cmake_test_compile/build
      --
      -------
      ------------
      -----------------
      ----------------------
      ---------------------------
      --------------------------------
      -- Trying 'Ninja' generator - success
      --------------------------------------------------------------------------------
      
      Configuring Project
        Working directory:
          /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
        Command:
          /tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.6 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython_EXECUTABLE:PATH=/usr/bin/python3 -DPython_ROOT_DIR:PATH=/usr -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython3_EXECUTABLE:PATH=/usr/bin/python3 -DPython3_ROOT_DIR:PATH=/usr -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja -DLLAMA_HIPBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_HIPBLAS=on
      
      Not searching for unused variables given on the command line.
      -- The C compiler identification is Clang 16.0.0
      -- The CXX compiler identification is Clang 16.0.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Found Git: /usr/bin/git (found version "2.34.1")
      fatal: not a git repository (or any of the parent directories): .git
      fatal: not a git repository (or any of the parent directories): .git
      CMake Warning at vendor/llama.cpp/CMakeLists.txt:118 (message):
        Git repository not found; to enable automatic generation of build info,
        make sure Git is installed and the project is a Git repository.
      
      
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
      -- Found Threads: TRUE
      CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.
      
        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.
      Call Stack (most recent call first):
        vendor/llama.cpp/CMakeLists.txt:366 (find_package)
      
      
      -- hip::amdhip64 is SHARED_LIBRARY
      -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
      -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
      CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.
      
        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.
      Call Stack (most recent call first):
        /tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
        /opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
        vendor/llama.cpp/CMakeLists.txt:367 (find_package)
      
      
      -- hip::amdhip64 is SHARED_LIBRARY
      -- HIP and hipBLAS found
      -- CMAKE_SYSTEM_PROCESSOR: x86_64
      -- x86 detected
      -- Configuring done (0.6s)
      -- Generating done (0.0s)
      -- Build files have been written to: /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
      [1/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o
      [2/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o
      [3/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o
      [4/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:186:11: warning: variable 'sum_x' set but not used [-Wunused-but-set-variable]
          float sum_x = 0;
                ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:187:11: warning: variable 'sum_x2' set but not used [-Wunused-but-set-variable]
          float sum_x2 = 0;
                ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:182:14: warning: unused function 'make_qkx1_quants' [-Wunused-function]
      static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min,
                   ^
      3 warnings generated.
      [5/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o
      [6/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2413:5: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
          GGML_F16_VEC_REDUCE(sumf, sum);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2045:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
      #define GGML_F16_VEC_REDUCE         GGML_F32Cx8_REDUCE
                                          ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2035:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
      #define GGML_F32Cx8_REDUCE      GGML_F32x8_REDUCE
                                      ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:1981:11: note: expanded from macro 'GGML_F32x8_REDUCE'
          res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1));                     \
              ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:3456:9: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
              GGML_F16_VEC_REDUCE(sumf[k], sum[k]);
              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2045:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
      #define GGML_F16_VEC_REDUCE         GGML_F32Cx8_REDUCE
                                          ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2035:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
      #define GGML_F32Cx8_REDUCE      GGML_F32x8_REDUCE
                                      ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:1981:11: note: expanded from macro 'GGML_F32x8_REDUCE'
          res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1));                     \
              ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:596:23: warning: unused function 'mul_sum_i8_pairs' [-Wunused-function]
      static inline __m128i mul_sum_i8_pairs(const __m128i x, const __m128i y) {
                            ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:627:19: warning: unused function 'hsum_i32_4' [-Wunused-function]
      static inline int hsum_i32_4(const __m128i a) {
                        ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:692:23: warning: unused function 'packNibbles' [-Wunused-function]
      static inline __m128i packNibbles( __m256i bytes )
                            ^
      5 warnings generated.
      [7/12] Linking C static library vendor/llama.cpp/libggml_static.a
      [8/12] Building CXX object vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o
      [9/12] Building CXX object vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      [10/12] Linking CXX shared library vendor/llama.cpp/libggml_shared.so
      FAILED: vendor/llama.cpp/libggml_shared.so
      : && /opt/rocm/llvm/bin/clang++ -fPIC -O3 -DNDEBUG   -shared -Wl,-soname,libggml_shared.so -o vendor/llama.cpp/libggml_shared.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o  -Wl,-rpath,/opt/rocm-5.6.0/lib:/opt/rocm/lib:  --hip-link  --offload-arch=gfx900  --offload-arch=gfx906  --offload-arch=gfx908  --offload-arch=gfx90a  --offload-arch=gfx1030  /opt/rocm-5.6.0/lib/libhipblas.so.1.0.50600  /opt/rocm-5.6.0/lib/librocblas.so.3.0.50600  /opt/rocm-5.6.0/llvm/lib/clang/16.0.0/lib/linux/libclang_rt.builtins-x86_64.a  /opt/rocm/lib/libamdhip64.so.5.6.50600 && :
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against symbol '__gxx_personality_v0'; recompile with -fPIC
      >>> defined in /usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7343)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7361)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7399)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      [11/12] Linking CXX shared library vendor/llama.cpp/libllama.so
      FAILED: vendor/llama.cpp/libllama.so
      : && /opt/rocm/llvm/bin/clang++ -fPIC -O3 -DNDEBUG   -shared -Wl,-soname,libllama.so -o vendor/llama.cpp/libllama.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o  -Wl,-rpath,/opt/rocm-5.6.0/lib:/opt/rocm/lib:  --hip-link  --offload-arch=gfx900  --offload-arch=gfx906  --offload-arch=gfx908  --offload-arch=gfx90a  --offload-arch=gfx1030  /opt/rocm-5.6.0/lib/libhipblas.so.1.0.50600  /opt/rocm-5.6.0/lib/librocblas.so.3.0.50600  /opt/rocm-5.6.0/llvm/lib/clang/16.0.0/lib/linux/libclang_rt.builtins-x86_64.a  /opt/rocm/lib/libamdhip64.so.5.6.50600 && :
      ld.lld: error: relocation R_X86_64_32 cannot be used against symbol '__gxx_personality_v0'; recompile with -fPIC
      >>> defined in /usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E3F)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E5D)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E95)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      ninja: build stopped: subcommand failed.
      Traceback (most recent call last):
        File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/setuptools_wrap.py", line 674, in setup
          cmkr.make(make_args, install_target=cmake_install_target, env=env)
        File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 697, in make
          self.make_impl(clargs=clargs, config=config, source_dir=source_dir, install_target=install_target, env=env)
        File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 742, in make_impl
          raise SKBuildError(msg)
      
      An error occurred while building with CMake.
        Command:
          /tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake --build . --target install --config Release --
        Install target:
          install
        Source directory:
          /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888
        Working directory:
          /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
      Please check the install target is valid and see CMake's output for more information.
      
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
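
Every linker error above ends with "recompile with -fPIC", which suggests ggml-cuda.cu.o is compiled for the HIP targets without position-independent code. A possible workaround (an untested guess on my part, using the standard CMake variable for this) would be to force it globally:

CMAKE_ARGS="-DLLAMA_HIPBLAS=on -DCMAKE_POSITION_INDEPENDENT_CODE=ON" FORCE_CMAKE=1 pip install --no-cache-dir llama-cpp-python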

For reference, here is what happens when I simply run pip install llama-cpp-python:

pip install llama-cpp-python
Collecting llama-cpp-python
  Using cached llama_cpp_python-0.1.82.tar.gz (1.8 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (1.25.2)
Collecting diskcache>=5.6.1
  Using cached diskcache-5.6.1-py3-none-any.whl (45 kB)
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (4.7.1)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... done
  Created wheel for llama-cpp-python: filename=llama_cpp_python-0.1.82-cp310-cp310-linux_x86_64.whl size=593844 sha256=5523b29af1720e7931b4ca3caee8ebb65b502a8640db4f1e6a633eb7d444dff5
  Stored in directory: /root/.cache/pip/wheels/d5/5a/02/e3a3e540045da967de35d1ac2220a194e26e57b120bb46b466
Successfully built llama-cpp-python
Installing collected packages: diskcache, llama-cpp-python
Successfully installed diskcache-5.6.1 llama-cpp-python-0.1.82
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

After installation with this second method, the code runs as expected and utilizes the GPU.
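
To check GPU utilization, I load a model with layers offloaded and watch rocm-smi in a second terminal (a minimal sketch; the model path is a placeholder):

python3 -c "from llama_cpp import Llama; llm = Llama(model_path='/models/model.gguf', n_gpu_layers=35)"
watch -n 1 rocm-smi  # run in a second terminal; VRAM usage and GPU% should rise while the model loads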

Steps to Reproduce

Make sure you have an AMD GPU

  1. Build a Docker container with the Dockerfile above: docker build --pull --rm -f "Dockerfile" -t llama-cpp-python-container:latest .
  2. Run it: docker run -it --device=/dev/kfd --device=/dev/dri llama-cpp-python-container bash
  3. Try the two installation methods, CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python and pip install llama-cpp-python (a GPU-visibility check follows below)
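
As a sanity check before step 3 (my own addition, using the standard ROCm tools shipped in the rocm/dev images), confirm the container actually sees the GPU; otherwise the library will fall back to the CPU at runtime:

rocminfo | grep gfx   # should list gfx1100 for an RX 7900 XT
rocm-smi              # should show the card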

Failure Logs

Environment info

llama-cpp-python$ pip list | egrep "uvicorn|fastapi|sse-starlette|numpy"
numpy              1.25.2

I’m not sure where to get the vendored llama.cpp version.
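
The Python binding version, at least, can be printed directly (assuming the package exposes __version__, as it does in recent releases):

python3 -c "import llama_cpp; print(llama_cpp.__version__)"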


Most upvoted comments

I have a similar but slightly different problem. I’m running ROCm 6.0 on Ubuntu 22.04 and installed PyTorch built against ROCm 5.7. I updated the install command above for llama-cpp-python with the correct references for ROCm 6.0 and a 7900 XT GPU (gfx1100):

CMAKE_ARGS="-D LLAMA_HIPBLAS=ON -D CMAKE_C_COMPILER=/opt/rocm/bin/amdclang -D CMAKE_CXX_COMPILER=/opt/rocm/bin/amdclang++ -D CMAKE_PREFIX_PATH=/opt/rocm -D AMDGPU_TARGETS=gfx1100" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.29 --upgrade --force-reinstall --no-cache-dir

This resulted in the following error:

ERROR: Failed building wheel for llama-cpp-python

which is due to this underlying error: error: unable to find library -lstdc++

I fixed it by installing the latest GCC 12 libstdc++ packages:

sudo apt install libstdc++-12-dev
sudo apt install libstdc++-12-doc
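
To confirm the fix took, a trivial link test with the same compiler the build uses should now succeed (my own sketch, assuming the ROCm 6.0 path /opt/rocm/bin/amdclang++ from the command above):

echo 'int main() { return 0; }' > /tmp/t.cpp
/opt/rocm/bin/amdclang++ /tmp/t.cpp -o /tmp/t && echo "link OK"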