alpaca.cpp: Segmentation fault (only) with 13B model.
```
~/alpaca# ./chat -m ggml-alpaca-13b-q4.bin
main: seed = 1679150968
llama_model_load: loading model from 'ggml-alpaca-13b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 10959.49 MB
Segmentation fault
```
I just downloaded the 13B model (ggml-alpaca-13b-q4.bin) from the torrent, pulled the latest master, and compiled. It works absolutely fine with the 7B model, but with the 13B model I just get the segmentation fault.
Checksum (md5) of the 13B model: 66f3554e700bd06104a4a5753e5f3b5b
I’m running Ubuntu under WSL on Windows.
About this issue
- State: closed
- Created a year ago
- Reactions: 5
- Comments: 17 (1 by maintainers)
It has nothing to do with converting. main.cpp thinks this is a multi-part file: the 13B model is usually split into two files, but here we have only one. In main.cpp of the llama.cpp upstream I changed (hacked) line 130 to

```
n_parts = 1; // LLAMA_N_PARTS.at(hparams.n_embd);
```

which let me load the model.

I have the same result. I also ran it under WSL on Windows: it works with the 7B model but not with the 13B model, same md5sum. Same result, by the way, with ggerganov/llama.cpp, from which this project is forked; it gives a more detailed error message:
I did not use the 7B model from the torrent, but the one from the download URL in this repo; did you do the same? Perhaps the 13B model has to be converted to an appropriate format before it can be used with this project?