llama.cpp: runtime error in example/server

To try the newly released example/server executable, I built it with cmake, adding the option -DLLAMA_BUILD_SERVER=ON.

Then I followed the ReadMe.md and ran the following command.

./build/bin/server -m models/ggml-vicuna-13b-1.1/ggml-vicuna-13b-1.1-q4_1.bin --ctx_size 2048

The following error occurred.

On Mac

main: seed = 1684723159
llama.cpp: loading model from models/ggml-vicuna-13b-1.1/ggml-vicuna-13b-1.1-q4_1.bin
libc++abi: terminating due to uncaught exception of type std::runtime_error: unexpectedly reached end of file
zsh: abort      ./build/bin/server -m models/ggml-vicuna-13b-1.1/ggml-vicuna-13b-1.1-q4_1.bin

On Ubuntu (with cuBLAS)

main: seed = 1684728245
llama.cpp: loading model from models/ggml-vicuna-13b-1.1/ggml-vicuna-13b-1.1-q4_1.bin
terminate called after throwing an instance of 'std::runtime_error'
  what():  unexpectedly reached end of file
Aborted (core dumped)

The same runtime error occurs on both. What else do I need to do?

About this issue

  • State: closed
  • Created a year ago
  • Comments: 15

Most upvoted comments

I was able to solve this for gpt4all by running convert and then quantize.

python3 convert.py models/gpt4all-7B/gpt4all-lora-quantized.bin --outtype f16

./quantize models/gpt4all-7B/ggml-model-f16.bin models/gpt4all-7B/ggml-model-q4_0.bin q4_0

Judging by the output of convert, it should already have been doing this.
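
To tell whether a model file predates a format change and needs this reconversion, one option is to look at its header before loading it. The following is a minimal sketch, not part of llama.cpp: it assumes the file starts with a little-endian uint32 magic, followed by a uint32 version for the versioned formats, and that the magic values match the ones defined in llama.cpp at the time. The name read_header is hypothetical.

```python
import struct

# Magic values as defined in llama.cpp at the time of this issue
# (assumption: your checkout uses the same constants).
MAGIC_GGML = 0x67676D6C  # old unversioned format, no version field
MAGIC_GGMF = 0x67676D66  # versioned
MAGIC_GGJT = 0x67676A74  # versioned
NAMES = {MAGIC_GGML: "ggml", MAGIC_GGMF: "ggmf", MAGIC_GGJT: "ggjt"}


def read_header(path):
    """Return (format_name, version); version is None when no version field applies."""
    with open(path, "rb") as f:
        raw = f.read(8)
    if len(raw) < 4:
        raise ValueError("file too short to contain a magic number")
    (magic,) = struct.unpack("<I", raw[:4])
    name = NAMES.get(magic, "unknown (0x%08x)" % magic)
    # Only the versioned formats store a uint32 version right after the magic.
    if magic not in (MAGIC_GGMF, MAGIC_GGJT) or len(raw) < 8:
        return name, None
    (version,) = struct.unpack("<I", raw[4:8])
    return name, version
```

Calling read_header("models/gpt4all-7B/ggml-model-q4_0.bin") on the original and the reconverted file should show whether the version changed. Which version a given build accepts depends on the checkout, so compare against the version check in its llama.cpp source.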

I get the same error with gpt2_ggml_model when running ./quantize ./gpt2_13b/ggml-model-f16.bin ./gpt2_13b/ggml-model-f16.bin:

terminate called after throwing an instance of 'std::runtime_error'
  what():  unexpectedly reached end of file
Aborted (core dumped)

I hit a similar issue. Mine was caused by https://github.com/ggerganov/llama.cpp/commit/2d5db48371052087a83974abda3767d1aedec598. That commit changes the delta field in each quantization block from fp32 to fp16, so model files written before it no longer load. The commit also bumps the file version and adds a version check, but the check runs too late. Once any tensor's data is read with the wrong size, every later read is misaligned, so length fields such as name_len pick up garbage values and the loader ends up reporting an unexpected end of file.
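
The failure mode described above can be reproduced with a toy file format. This is only an illustrative sketch and not the real ggml layout: the record structure (name_len, name, n_blocks, block data) and the helpers write_toy, read_exact and read_toy are hypothetical. The block sizes of 20 and 18 bytes stand for a block of 32 4-bit weights (16 bytes) plus an fp32 or fp16 delta.

```python
import io
import struct


def write_toy(block_size, tensors):
    """Serialize each tensor as: name_len (uint32), name, n_blocks (uint32), block data."""
    buf = io.BytesIO()
    for name, n_blocks in tensors:
        encoded = name.encode()
        buf.write(struct.pack("<I", len(encoded)))
        buf.write(encoded)
        buf.write(struct.pack("<I", n_blocks))
        buf.write(b"\xff" * (block_size * n_blocks))  # dummy tensor data
    return buf.getvalue()


def read_exact(f, n):
    """Read exactly n bytes or fail the way the loader in this thread does."""
    data = f.read(n)
    if len(data) != n:
        raise RuntimeError("unexpectedly reached end of file")
    return data


def read_toy(data, block_size):
    """Parse the toy format, assuming every block is block_size bytes long."""
    f = io.BytesIO(data)
    names = []
    while f.tell() < len(data):
        (name_len,) = struct.unpack("<I", read_exact(f, 4))
        names.append(read_exact(f, name_len).decode(errors="replace"))
        (n_blocks,) = struct.unpack("<I", read_exact(f, 4))
        read_exact(f, block_size * n_blocks)  # skip over the tensor data
    return names


# File written with 20-byte blocks (fp32 delta).
old_file = write_toy(20, [("a.weight", 4), ("b.weight", 4)])

print(read_toy(old_file, 20))  # reader agrees with the writer on block size
try:
    read_toy(old_file, 18)     # reader expects 18-byte blocks (fp16 delta)
except RuntimeError as err:
    print("error:", err)
```

With the 18-byte assumption the reader consumes too little of the first tensor, interprets leftover tensor bytes as the next name_len, and then asks for far more bytes than the file holds. Checking the file version before reading any tensor would turn this into a clear "unsupported version" message instead.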