bitsandbytes: Building on Jetson AGX Xavier Development Kit fails
Hi,
I am trying to build bitsandbytes on an NVIDIA Jetson AGX Xavier Kit, but the build fails because emmintrin.h cannot be found:
/home/g/bitsandbytes# CUDA_VERSION=114 make cuda11x_nomatmul
ENVIRONMENT
============================
CUDA_VERSION: 114
============================
NVCC path: /usr/local/cuda/bin/nvcc
GPP path: /usr/bin/g++ VERSION: g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
CUDA_HOME: /usr/local/cuda
CONDA_PREFIX:
PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/cuda/bin/
LD_LIBRARY_PATH:
============================
/usr/local/cuda/bin/nvcc -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -Xcompiler '-fPIC' --use_fast_math -Xptxas=-v -dc /home/g/bitsandbytes/csrc/ops.cu /home/g/bitsandbytes/csrc/kernels.cu -I /home/g/sse2neon -I /usr/local/cuda/include -I /home/g/bitsandbytes/csrc -I /include -I /home/g/bitsandbytes/include -L /usr/local/cuda/lib64 -lcudart -lcublas -lcublasLt -lcurand -lcusparse -L /lib --output-directory /home/g/bitsandbytes/build -D NO_CUBLASLT
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
In file included from /home/g/bitsandbytes/include/BinSearch.h:5,
                 from /home/g/bitsandbytes/csrc/ops.cu:10:
/home/g/bitsandbytes/include/SIMD.h:32:10: fatal error: emmintrin.h: No such file or directory
   32 | #include <emmintrin.h>
      |          ^~~~~~~~~~~~~
compilation terminated.
make: *** [Makefile:83: cuda11x_nomatmul] Error 1
I did a bit of research and, not really knowing what I am doing, changed SIMD.h to include sse2neon.h instead of emmintrin.h. Now the build fails again, catastrophically, because the NEON builtin types and functions are undefined:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(38): error: identifier "__Int8x8_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(39): error: identifier "__Int16x4_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(40): error: identifier "__Int32x2_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(41): error: identifier "__Int64x1_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(42): error: identifier "__Float16x4_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(43): error: identifier "__Float32x2_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(44): error: identifier "__Poly8x8_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(45): error: identifier "__Poly16x4_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(46): error: identifier "__Uint8x8_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(47): error: identifier "__Uint16x4_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(48): error: identifier "__Uint32x2_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(49): error: identifier "__Float64x1_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(50): error: identifier "__Uint64x1_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(51): error: identifier "__Int8x16_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(52): error: identifier "__Int16x8_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(53): error: identifier "__Int32x4_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(54): error: identifier "__Int64x2_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(55): error: identifier "__Float16x8_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(56): error: identifier "__Float32x4_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(57): error: identifier "__Float64x2_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(58): error: identifier "__Poly8x16_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(59): error: identifier "__Poly16x8_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(60): error: identifier "__Poly64x2_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(61): error: identifier "__Poly64x1_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(62): error: identifier "__Uint8x16_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(63): error: identifier "__Uint16x8_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(64): error: identifier "__Uint32x4_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(65): error: identifier "__Uint64x2_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(67): error: identifier "__Poly8_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(68): error: identifier "__Poly16_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(69): error: identifier "__Poly64_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(70): error: identifier "__Poly128_t" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(72): error: identifier "__fp16" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(795): error: identifier "__builtin_aarch64_saddlv8qi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(802): error: identifier "__builtin_aarch64_saddlv4hi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(809): error: identifier "__builtin_aarch64_saddlv2si" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(816): error: identifier "__builtin_aarch64_uaddlv8qi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(824): error: identifier "__builtin_aarch64_uaddlv4hi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(832): error: identifier "__builtin_aarch64_uaddlv2si" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(840): error: identifier "__builtin_aarch64_saddl2v16qi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(847): error: identifier "__builtin_aarch64_saddl2v8hi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(854): error: identifier "__builtin_aarch64_saddl2v4si" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(861): error: identifier "__builtin_aarch64_uaddl2v16qi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(869): error: identifier "__builtin_aarch64_uaddl2v8hi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(877): error: identifier "__builtin_aarch64_uaddl2v4si" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(885): error: identifier "__builtin_aarch64_saddwv8qi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(892): error: identifier "__builtin_aarch64_saddwv4hi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(899): error: identifier "__builtin_aarch64_saddwv2si" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(906): error: identifier "__builtin_aarch64_uaddwv8qi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(914): error: identifier "__builtin_aarch64_uaddwv4hi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(922): error: identifier "__builtin_aarch64_uaddwv2si" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(930): error: identifier "__builtin_aarch64_saddw2v16qi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(937): error: identifier "__builtin_aarch64_saddw2v8hi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(944): error: identifier "__builtin_aarch64_saddw2v4si" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(951): error: identifier "__builtin_aarch64_uaddw2v16qi" is undefined
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h(959): error: identifier "__builtin_aarch64_uaddw2v8hi" is undefined
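For context on the two failures above: emmintrin.h is an x86 SSE2 header, so it does not exist on an aarch64 toolchain, and the second batch of errors looks like nvcc's own front end parsing GCC's arm_neon.h without knowing GCC's AArch64 builtin types. One possible way around both is to select the SIMD header per architecture and keep it out of the CUDA pass entirely. The following is only a minimal sketch under those assumptions. The macro SKETCH_SIMD_BACKEND and the function add4_i32 are hypothetical names that are not part of bitsandbytes, and the sketch includes arm_neon.h directly instead of sse2neon.h so that it stays self-contained.

```cpp
#include <cstdint>

// Hypothetical guard: pick a SIMD header per target, and pick none at all
// when nvcc is compiling the file (__CUDACC__ is defined in that case).
#if defined(__CUDACC__)
  #define SKETCH_SIMD_BACKEND 0   // CUDA pass: scalar fallback only
#elif defined(__SSE2__) || defined(_M_X64)
  #include <emmintrin.h>          // x86 SSE2 intrinsics
  #define SKETCH_SIMD_BACKEND 1
#elif defined(__aarch64__)
  #include <arm_neon.h>           // AArch64 NEON intrinsics
  #define SKETCH_SIMD_BACKEND 2
#else
  #define SKETCH_SIMD_BACKEND 0   // unknown target: scalar fallback
#endif

// Adds four 32-bit integers lane by lane using whichever backend was selected.
inline void add4_i32(const int32_t* a, const int32_t* b, int32_t* out) {
#if SKETCH_SIMD_BACKEND == 1
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_add_epi32(va, vb));
#elif SKETCH_SIMD_BACKEND == 2
    vst1q_s32(out, vaddq_s32(vld1q_s32(a), vld1q_s32(b)));
#else
    for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];
#endif
}
```

With a guard like this, host-only .cpp files compiled by g++ get the vectorised path, while any .cu file that pulls in the header through nvcc only sees the scalar loop. Whether that split fits the way SIMD.h is used from ops.cu is not something this thread confirms.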
SETUP:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Sun_Oct_23_22:16:07_PDT_2022
Cuda compilation tools, release 11.4, V11.4.315
Build cuda_11.4.r11.4/compiler.31964100_0
Flashed using JetPack 5.1 (Ubuntu 20.04)
R35 (release), REVISION: 2.1, GCID: 32413640, BOARD: t186ref, EABI: aarch64, DATE: Tue Jan 24 23:38:33 UTC 2023
Linux ubuntu 5.10.104-tegra #1 SMP PREEMPT Tue Jan 24 15:09:44 PST 2023 aarch64 aarch64 aarch64 GNU/Linux
Any help would be greatly appreciated, thank you!
About this issue
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 50 (4 by maintainers)
Support for Apple silicon (#252) shows another AArch64 approach. It would be a good idea to merge these efforts.
You were right! I systematically replaced every char with int8_t and it works now; the problem was somewhere in kernels.cu. I will find out exactly which change did it and update the repository later.
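A plausible reason the char to int8_t replacement matters, though the thread does not confirm it, is that plain char is unsigned by default with GCC on AArch64 Linux but signed on x86, so any code that stores negative values in a char behaves differently on the Jetson. The sketch below illustrates the difference; the helper names widen_plain_char and widen_int8 are hypothetical and do not come from kernels.cu.

```cpp
#include <cstdint>
#include <limits>

// True on typical x86 Linux builds, false on typical AArch64 Linux builds.
constexpr bool plain_char_is_signed = std::numeric_limits<char>::is_signed;

// Stores a small integer in a plain char and widens it back to int.
// If char is unsigned, a negative input comes back as a value in 0..255.
inline int widen_plain_char(int v) {
    char c = static_cast<char>(v);
    return static_cast<int>(c);
}

// Same round trip through int8_t, which is signed on every platform.
inline int widen_int8(int v) {
    int8_t c = static_cast<int8_t>(v);
    return static_cast<int>(c);
}
```

On a platform with unsigned char, widen_plain_char(-5) returns 251 while widen_int8(-5) still returns -5. GCC also has a -fsigned-char option that forces plain char to be signed, which might be an alternative to editing the sources, but that has not been tried here.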
Sure, here's the fork: https://github.com/g588928812/bitsandbytes_jetsonX