llama.cpp: aarch64 CUDA build: ggml.h(309): error: identifier "half" is undefined
Steps To Reproduce
- Attempt a native aarch64 build with CUDA support (only tested as nix build .#packages.aarch64-linux.jetson-xavier)
Build log
CI: https://github.com/ggerganov/llama.cpp/actions/runs/7514510149/job/20457461738#step:8:1499 Cleaner logs: https://gist.github.com/SomeoneSerge/33008b08b7bd887e994b7e52cd432af0
[14/105] Building CUDA object CMakeFiles/ggml.dir/ggml-cuda.cu.o
FAILED: CMakeFiles/ggml.dir/ggml-cuda.cu.o
/nix/store/69di7mgz1c5864ghppzzidwv3vy1r3p7-cuda_nvcc-11.8.89/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/nix/store/a4vw7jhihwkh7zp6vj3cn8375phb31ds-gcc-wrapper-11.4.0/bin/c++ -DGGML_CUDA_DMMV_X=32 -DGGML_>
nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
/build/source/ggml.h(309): error: identifier "half" is undefined
1 error detected in the compilation of "/build/source/ggml-cuda.cu".
Additional context
❯ git rev-parse HEAD
f172de03f11465dc6c5a0fc3a22f8ec254c6832c
❯ nix path-info --derivation .#packages.aarch64-linux.jetson-xavier --recursive | rg -- '(gcc|nvcc)-\d.*\d\.drv'
/nix/store/9zr86pkcj6cbba7g3kkqzg2smx3q74fc-xgcc-12.3.0.drv
/nix/store/px2vi9df2z1zk5qi2ql7phnbp8i0v011-gcc-12.3.0.drv
/nix/store/w9w0pii96jp5fjxafzky7bybyrdcr7bx-gcc-11.4.0.drv
/nix/store/y17s03wj6lzbp7rfrk87gvmp5sslwcgy-cuda_nvcc-11.8.89.drv
❯ # ^^^ uses gcc11 and cuda 11.8 for the build, and gcc12's libstdc++ for the link
Previous work and related issues
https://github.com/ggerganov/llama.cpp/issues/1455 faced a related problem, and https://github.com/ggerganov/llama.cpp/pull/2670/ introduced the typedef at line 309.
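For reference, the typedef sits in a preprocessor branch roughly like this (paraphrased from ggml.h of that era; on aarch64 the first arm is taken when compiling CUDA sources, and it only compiles if "half" has already been declared by <cuda_fp16.h>):

#if defined(__ARM_NEON) && defined(__CUDACC__)
    typedef half ggml_fp16_t;    // "half" comes from <cuda_fp16.h>
#elif defined(__ARM_NEON)
    typedef __fp16 ggml_fp16_t;  // ARM's built-in 16-bit float
#else
    typedef uint16_t ggml_fp16_t;
#endif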
The failure is at most three weeks old; a successful Xavier build was confirmed e.g. in https://github.com/ggerganov/llama.cpp/pull/4605#issuecomment-1869676842.
I’ll run a bisect if/when I get access to an aarch64 builder.
About this issue
- State: closed
- Created 6 months ago
- Comments: 17 (4 by maintainers)
I think the problem comes from https://github.com/ggerganov/llama.cpp/issues/4766: that patch adds #include "ggml-cuda.h" before #include <cuda_fp16.h>, so the solution is to move #include "ggml-cuda.h" after #include <cuda_fp16.h>. The patch to fix this problem is below.
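A minimal sketch of that reordering (the actual patch may differ in details):

// ggml-cuda.cu, before (breaks the aarch64 build):
#include "ggml-cuda.h"   // transitively includes ggml.h, which names "half" under __CUDACC__ && __ARM_NEON
#include <cuda_fp16.h>   // too late: this is what declares "half"

// ggml-cuda.cu, after (builds):
#include <cuda_fp16.h>   // declare "half" first
#include "ggml-cuda.h"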
@planform’s patch is sufficient and seems to be minimal (ggml.h already has a preprocessor branch for __CUDACC__). @planform @KyL0N, would either of you like to open a PR, or should I?
The more general answer is “use the correct protobuf version from any source that ships it” (e.g. another distribution, conda, a prebuilt multi-gigabyte docker image, or 🙃 Nixpkgs). I’ll stop here and abstain from being a shill.
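As a quick sanity check (hypothetical output shown), compare which protoc binary and which Python protobuf runtime actually end up being used; a mismatch between the two is a common source of such errors:

❯ protoc --version
libprotoc 3.20.3
❯ python -c 'import google.protobuf; print(google.protobuf.__version__)'
4.25.1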
Thank you very much, I will have a look at it. Yes, I know, but since I haven’t worked with Docker and the Jetson in a long time, I hesitated to use Docker. Most probably I have some dependency wrong; I just didn’t understand why I get the error when the installed protoc is the same version as the one used in the build. I will definitely check it, thanks for the advice and the help.
@ark626 in https://github.com/dusty-nv/jetson-containers you have https://github.com/oobabooga/text-generation-webui working out of the box on Jetson; this web UI has an OpenAI-compatible API which you can use in your HA extension. I’m also working on my own extension 😉
I strongly recommend using Docker for these experiments. It’s easier to manage dependencies and configurations.
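If I remember the jetson-containers README correctly, spinning up the web UI is roughly the following (verify against the repo; helper scripts and image names may have changed):

❯ git clone https://github.com/dusty-nv/jetson-containers
❯ bash jetson-containers/install.sh
❯ jetson-containers run $(autotag text-generation-webui)   # autotag picks a prebuilt image matching your JetPack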