llama-cpp-python: Error running on M1 Mac

Hi!

I am having issues using it on an M1 Mac:

from llama_cpp import Llama produces this error:

zsh: illegal hardware instruction

Best, Benjamin

Most upvoted comments

@abetlen I can import now without a problem. Thank you so much!

@tilo-schroeder @tgcandido I’ve pushed a potential fix to this repo that builds the package using make on Linux and macOS. Would one of you mind testing it to confirm it works? You should be able to test it by just following the development instructions in the README.

@tilo-schroeder I’m working on a potential fix and would appreciate it if you could test something for me: add set(CMAKE_SYSTEM_PROCESSOR "arm64") to the top of the CMakeLists.txt at the root of this repo, re-install with python3 setup.py develop, and let me know if that works.
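
For reference, a sketch of where that line would go (the lines below it are placeholders standing in for the repo's actual CMakeLists.txt; only the set() directive is the suggested change):

# top of CMakeLists.txt
set(CMAKE_SYSTEM_PROCESSOR "arm64")  # force an arm64 target instead of x86_64

cmake_minimum_required(VERSION 3.4)
project(llama_cpp)
add_subdirectory(vendor/llama.cpp)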

CMake Error at CMakeLists.txt:7 (add_subdirectory):
  The source directory

    /Users/ben/Downloads/llama-cpp-python/vendor/llama.cpp

  does not contain a CMakeLists.txt file.

My bad, I forgot that vendor/llama.cpp is a git submodule so it doesn’t get cloned automatically (shouldn’t be an issue with the PyPI version though). Running git submodule init and git submodule update after cloning the repo should work.
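
Concretely, the sequence after cloning would look like this (assuming the default clone directory name; the python3 setup.py develop step repeats the earlier suggestion):

cd llama-cpp-python
git submodule init
git submodule update
python3 setup.py develop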

main: seed = 1679966146
llama_model_load: loading model from 'models/models/7B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: invalid model file 'models/models/7B/ggml-model-q4_0.bin' (too old, regenerate your model files!)
llama_init_from_file: failed to load model
main: error: failed to load model 'models/models/7B/ggml-model-q4_0.bin'

The error you’re getting here is because llama.cpp recently updated the binary format for the model weights. I believe dalai is using an older version of llama.cpp, which is why it’s not giving you an error. You’ll need to run the model conversion and quantization steps on the original weights, documented here:

python3 convert-pth-to-ggml.py models/7B/ 1
python3 quantize.py 7B
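
Once the weights are regenerated, a minimal sanity check that the model loads (the model path mirrors the log above; the prompt is just an illustration):

from llama_cpp import Llama

# load the re-quantized weights
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")

# run a short completion to confirm inference works end to end
output = llm("Q: Name the planets in the solar system. A: ", max_tokens=32)
print(output["choices"][0]["text"])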

Finally, back to our original error:

(mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64')),

I’ve actually tested the install (through pip, not conda) on an M1 MacBook Air and it all worked, so this seems to be an issue with the M1’s x86_64 compatibility mode. I think it has to do with installing it through conda, but I’m not sure. I’ll try conda-installing the repo tomorrow and see what happens.

In the meantime, can you try prefixing the command with arch -x86_64 when running the script?
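
A quick way to check which mode your Python interpreter is actually running in (a diagnostic sketch, not part of the package):

import platform

# prints 'arm64' for a native Apple Silicon build of Python,
# 'x86_64' for one running under Rosetta 2
print(platform.machine())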

Ok, I learned that I need to put the llama.cpp files into vendor/llama.cpp. Then the build worked, but it now gives me this error:

OSError: dlopen(/Users/ben/opt/miniconda3.10/miniconda3/lib/python3.10/site-packages/llama_cpp/libllama.dylib, 0x0006): tried: '/Users/ben/opt/miniconda3.10/miniconda3/lib/python3.10/site-packages/llama_cpp/libllama.dylib' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/ben/opt/miniconda3.10/miniconda3/lib/python3.10/site-packages/llama_cpp/libllama.dylib' (no such file), '/Users/ben/opt/miniconda3.10/miniconda3/lib/python3.10/site-packages/llama_cpp/libllama.dylib' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64'))
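
For anyone debugging the same mismatch, one way to confirm which architecture the bundled libllama.dylib was built for is to locate the package and run the macOS file tool on it, without triggering the failing import (a diagnostic sketch, not from the original thread):

import importlib.util
import os
import subprocess

# find the installed package without importing it, since importing
# is what triggers the failing dlopen in the first place
spec = importlib.util.find_spec("llama_cpp")
lib = os.path.join(os.path.dirname(spec.origin), "libllama.dylib")

# `file` prints the Mach-O architecture, e.g. arm64 vs x86_64;
# it should match platform.machine() for the running interpreter
subprocess.run(["file", lib])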