llama-cpp-python: Installation via pip fails for ROCm / AMD cards
Expected Behavior
I have a machine with an AMD GPU (Radeon RX 7900 XT). I tried to install this library as described in the README by running
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
Current Behavior
The installation fails. However, when I simply run pip install llama-cpp-python
it works.
Environment and Context
To make the issue reproducible, I made a Docker container with this Dockerfile (adapted from the llama.cpp repo):
ARG UBUNTU_VERSION=22.04
# This needs to generally match the container host's environment.
ARG ROCM_VERSION=5.6
# Target the ROCm build image
ARG BASE_ROCM_DEV_CONTAINER=rocm/dev-ubuntu-${UBUNTU_VERSION}:${ROCM_VERSION}-complete
FROM ${BASE_ROCM_DEV_CONTAINER} as build
# Unless otherwise specified, we make a fat build.
# List from https://github.com/ggerganov/llama.cpp/pull/1087#issuecomment-1682807878
# This is mostly tied to rocBLAS supported archs.
ARG ROCM_DOCKER_ARCH=\
gfx803 \
gfx900 \
gfx906 \
gfx908 \
gfx90a \
gfx1010 \
gfx1030 \
# gfx1100 below is my ROCm arch
gfx1100 \
gfx1101 \
gfx1102
# Set the GPU architecture targets
ENV GPU_TARGETS=${ROCM_DOCKER_ARCH}
# Enable ROCm
ENV CC=/opt/rocm/llvm/bin/clang
ENV CXX=/opt/rocm/llvm/bin/clang++
ENV LLAMA_HIPBLAS=on
RUN apt-get update && apt-get -y install cmake protobuf-compiler aria2 git
System Info:
CPU: 13th Gen Intel® Core™ i5-13400F
GPU: Radeon RX 7900 XT
OS: Ubuntu 22.04.1
Python 3.10.6, Make 4.3, g++ 11.3.0
Failure Information (for bugs)
The installation failed. Here is the output of running CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python:
root@8bebff5da3f1:/# CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
Collecting llama-cpp-python
Downloading llama_cpp_python-0.1.82.tar.gz (1.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 3.0 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (1.25.2)
Collecting diskcache>=5.6.1
Downloading diskcache-5.6.1-py3-none-any.whl (45 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.6/45.6 KB 7.1 MB/s eta 0:00:00
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [390 lines of output]
--------------------------------------------------------------------------------
-- Trying 'Ninja' generator
--------------------------------
---------------------------
----------------------
-----------------
------------
-------
--
CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Not searching for unused variables given on the command line.
-- The C compiler identification is Clang 16.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- The CXX compiler identification is Clang 16.0.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done (0.6s)
-- Generating done (0.0s)
-- Build files have been written to: /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_cmake_test_compile/build
--
-------
------------
-----------------
----------------------
---------------------------
--------------------------------
-- Trying 'Ninja' generator - success
--------------------------------------------------------------------------------
Configuring Project
Working directory:
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
Command:
/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.6 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython_EXECUTABLE:PATH=/usr/bin/python3 -DPython_ROOT_DIR:PATH=/usr -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython3_EXECUTABLE:PATH=/usr/bin/python3 -DPython3_ROOT_DIR:PATH=/usr -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja -DLLAMA_HIPBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_HIPBLAS=on
Not searching for unused variables given on the command line.
-- The C compiler identification is Clang 16.0.0
-- The CXX compiler identification is Clang 16.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.34.1")
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
CMake Warning at vendor/llama.cpp/CMakeLists.txt:118 (message):
Git repository not found; to enable automatic generation of build info,
make sure Git is installed and the project is a Git repository.
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
vendor/llama.cpp/CMakeLists.txt:366 (find_package)
-- hip::amdhip64 is SHARED_LIBRARY
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
/opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
vendor/llama.cpp/CMakeLists.txt:367 (find_package)
-- hip::amdhip64 is SHARED_LIBRARY
-- HIP and hipBLAS found
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (0.6s)
-- Generating done (0.0s)
-- Build files have been written to: /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
[1/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o
[2/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o
[3/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o
[4/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:186:11: warning: variable 'sum_x' set but not used [-Wunused-but-set-variable]
float sum_x = 0;
^
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:187:11: warning: variable 'sum_x2' set but not used [-Wunused-but-set-variable]
float sum_x2 = 0;
^
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:182:14: warning: unused function 'make_qkx1_quants' [-Wunused-function]
static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min,
^
3 warnings generated.
[5/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o
[6/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2413:5: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
GGML_F16_VEC_REDUCE(sumf, sum);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2045:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
#define GGML_F16_VEC_REDUCE GGML_F32Cx8_REDUCE
^
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2035:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
#define GGML_F32Cx8_REDUCE GGML_F32x8_REDUCE
^
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:1981:11: note: expanded from macro 'GGML_F32x8_REDUCE'
res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1)); \
~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:3456:9: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
GGML_F16_VEC_REDUCE(sumf[k], sum[k]);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2045:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
#define GGML_F16_VEC_REDUCE GGML_F32Cx8_REDUCE
^
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2035:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
#define GGML_F32Cx8_REDUCE GGML_F32x8_REDUCE
^
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:1981:11: note: expanded from macro 'GGML_F32x8_REDUCE'
res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1)); \
~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:596:23: warning: unused function 'mul_sum_i8_pairs' [-Wunused-function]
static inline __m128i mul_sum_i8_pairs(const __m128i x, const __m128i y) {
^
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:627:19: warning: unused function 'hsum_i32_4' [-Wunused-function]
static inline int hsum_i32_4(const __m128i a) {
^
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:692:23: warning: unused function 'packNibbles' [-Wunused-function]
static inline __m128i packNibbles( __m256i bytes )
^
5 warnings generated.
[7/12] Linking C static library vendor/llama.cpp/libggml_static.a
[8/12] Building CXX object vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o
[9/12] Building CXX object vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
[10/12] Linking CXX shared library vendor/llama.cpp/libggml_shared.so
FAILED: vendor/llama.cpp/libggml_shared.so
: && /opt/rocm/llvm/bin/clang++ -fPIC -O3 -DNDEBUG -shared -Wl,-soname,libggml_shared.so -o vendor/llama.cpp/libggml_shared.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o -Wl,-rpath,/opt/rocm-5.6.0/lib:/opt/rocm/lib: --hip-link --offload-arch=gfx900 --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 /opt/rocm-5.6.0/lib/libhipblas.so.1.0.50600 /opt/rocm-5.6.0/lib/librocblas.so.3.0.50600 /opt/rocm-5.6.0/llvm/lib/clang/16.0.0/lib/linux/libclang_rt.builtins-x86_64.a /opt/rocm/lib/libamdhip64.so.5.6.50600 && :
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against symbol '__gxx_personality_v0'; recompile with -fPIC
>>> defined in /usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7343)
ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
>>> defined in /lib/x86_64-linux-gnu/libc.so.6
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7361)
ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
>>> defined in /lib/x86_64-linux-gnu/libc.so.6
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7399)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
>>> defined in /lib/x86_64-linux-gnu/libc.so.6
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
[11/12] Linking CXX shared library vendor/llama.cpp/libllama.so
FAILED: vendor/llama.cpp/libllama.so
: && /opt/rocm/llvm/bin/clang++ -fPIC -O3 -DNDEBUG -shared -Wl,-soname,libllama.so -o vendor/llama.cpp/libllama.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o -Wl,-rpath,/opt/rocm-5.6.0/lib:/opt/rocm/lib: --hip-link --offload-arch=gfx900 --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 /opt/rocm-5.6.0/lib/libhipblas.so.1.0.50600 /opt/rocm-5.6.0/lib/librocblas.so.3.0.50600 /opt/rocm-5.6.0/llvm/lib/clang/16.0.0/lib/linux/libclang_rt.builtins-x86_64.a /opt/rocm/lib/libamdhip64.so.5.6.50600 && :
ld.lld: error: relocation R_X86_64_32 cannot be used against symbol '__gxx_personality_v0'; recompile with -fPIC
>>> defined in /usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E3F)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E5D)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E95)
ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
>>> defined in /lib/x86_64-linux-gnu/libc.so.6
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
>>> defined in /lib/x86_64-linux-gnu/libc.so.6
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
>>> defined in /lib/x86_64-linux-gnu/libc.so.6
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/setuptools_wrap.py", line 674, in setup
cmkr.make(make_args, install_target=cmake_install_target, env=env)
File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 697, in make
self.make_impl(clargs=clargs, config=config, source_dir=source_dir, install_target=install_target, env=env)
File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 742, in make_impl
raise SKBuildError(msg)
An error occurred while building with CMake.
Command:
/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake --build . --target install --config Release --
Install target:
install
Source directory:
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888
Working directory:
/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
Please check the install target is valid and see CMake's output for more information.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
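Two things stand out in the output above. First, every linker error says "recompile with -fPIC": ggml-cuda.cu.o (the HIP kernel object) was compiled without position-independent code but is then linked into the shared libraries libggml_shared.so and libllama.so. Second, the --offload-arch list on the failing link command (gfx900 through gfx1030) does not include gfx1100, so the GPU_TARGETS environment variable from the Dockerfile apparently isn't reaching the pip build; AMDGPU_TARGETS is the CMake variable that usually controls this for llama.cpp's HIP build. A workaround worth trying (untested here; flags may need adjusting for your setup) is to force both settings through CMAKE_ARGS:
CMAKE_ARGS="-DLLAMA_HIPBLAS=on -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DAMDGPU_TARGETS=gfx1100" FORCE_CMAKE=1 pip install llama-cpp-python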
For reference, here is what happens when I simply run pip install llama-cpp-python:
pip install llama-cpp-python
Collecting llama-cpp-python
Using cached llama_cpp_python-0.1.82.tar.gz (1.8 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (1.25.2)
Collecting diskcache>=5.6.1
Using cached diskcache-5.6.1-py3-none-any.whl (45 kB)
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (4.7.1)
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... done
Created wheel for llama-cpp-python: filename=llama_cpp_python-0.1.82-cp310-cp310-linux_x86_64.whl size=593844 sha256=5523b29af1720e7931b4ca3caee8ebb65b502a8640db4f1e6a633eb7d444dff5
Stored in directory: /root/.cache/pip/wheels/d5/5a/02/e3a3e540045da967de35d1ac2220a194e26e57b120bb46b466
Successfully built llama-cpp-python
Installing collected packages: diskcache, llama-cpp-python
Successfully installed diskcache-5.6.1 llama-cpp-python-0.1.82
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
After installation with this second method, the code runs as expected and utilizes the GPU.
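A quick way to check which build you actually got (a sketch; ./model.gguf is a placeholder for any local model file) is to load a model with layers offloaded and look for BLAS = 1 in the verbose startup output:
python3 -c 'from llama_cpp import Llama; Llama(model_path="./model.gguf", n_gpu_layers=32, verbose=True)'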
Steps to Reproduce
- Make sure you have an AMD GPU
- Build a Docker container from the Dockerfile above
docker build --pull --rm -f "Dockerfile" -t llama-cpp-python-container:latest .
- Run it
docker run -it --device=/dev/kfd --device=/dev/dri llama-cpp-python-container bash
- Try the two installation methods
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
and pip install llama-cpp-python
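Before running the install commands, you can confirm the container actually sees the card; rocminfo ships with the ROCm base image, and for a 7900 XT it should report gfx1100:
rocminfo | grep gfx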
Failure Logs
Environment info
llama-cpp-python$ pip list | egrep "uvicorn|fastapi|sse-starlette|numpy"
numpy 1.25.2
I’m not sure where to get the llama-cpp version
About this issue
- State: open
- Created 10 months ago
- Comments: 15 (6 by maintainers)
I have a similar but slightly different problem. I'm running ROCm 6.0 on Ubuntu 22 and I installed PyTorch with ROCm 5.7. I then updated the above install command for llama-cpp-python with the correct references for ROCm 6.0 and a 7900 XT GPU (gfx1100).
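(The updated command itself isn't quoted in the comment; a plausible shape of it, with the flags here being an assumption rather than the commenter's exact invocation, would be:)
CMAKE_ARGS="-DLLAMA_HIPBLAS=on -DAMDGPU_TARGETS=gfx1100" CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ FORCE_CMAKE=1 pip install llama-cpp-python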
This resulted in a build error, which ultimately comes down to this underlying linker error:
error: unable to find library -lstdc++
I fixed it by installing the latest g++ libraries.
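(The specific command isn't shown; on Ubuntu 22.04 a typical way to provide a libstdc++ for clang to link against would be the following, with the exact package names being an assumption:)
sudo apt-get install g++-12 libstdc++-12-dev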