llama.cpp: runtime error in example/server
To build and run the just-released example/server executable, I built the server with CMake (adding the option -DLLAMA_BUILD_SERVER=ON; the build steps are sketched below), then followed the README.md and ran the following command.
```sh
./build/bin/server -m models/ggml-vicuna-13b-1.1/ggml-vicuna-13b-1.1-q4_1.bin --ctx_size 2048
```
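For completeness, the build itself was roughly the following. This is a sketch assuming the standard out-of-tree CMake build from the README of the time; flags other than -DLLAMA_BUILD_SERVER=ON are illustrative and may differ by version:

```sh
# sketch of the cmake build; only -DLLAMA_BUILD_SERVER=ON is the option in question
mkdir build && cd build
cmake .. -DLLAMA_BUILD_SERVER=ON
cmake --build . --config Release
```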
Running the server then produced the following error.
On macOS:

```
main: seed = 1684723159
llama.cpp: loading model from models/ggml-vicuna-13b-1.1/ggml-vicuna-13b-1.1-q4_1.bin
libc++abi: terminating due to uncaught exception of type std::runtime_error: unexpectedly reached end of file
zsh: abort      ./build/bin/server -m models/ggml-vicuna-13b-1.1/ggml-vicuna-13b-1.1-q4_1.bin
```
On Ubuntu (with cuBLAS):

```
main: seed = 1684728245
llama.cpp: loading model from models/ggml-vicuna-13b-1.1/ggml-vicuna-13b-1.1-q4_1.bin
terminate called after throwing an instance of 'std::runtime_error'
  what():  unexpectedly reached end of file
Aborted (core dumped)
```
The same runtime error on both systems. What more do I need to do?
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 15
I was able to solve this for gpt4all by doing convert + quantization, though, looking at the output of convert, it should have been doing that already.
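Roughly, the convert + quantize steps look like the following. This is a sketch that assumes the convert.py script at the repo root and the quantize tool built alongside the server; the model paths and the exact quantize arguments are illustrative and may differ by version:

```sh
# sketch: re-create the model file in the current ggml format (paths illustrative)
python3 convert.py models/ggml-vicuna-13b-1.1/   # writes an f16 ggml model file
./build/bin/quantize models/ggml-vicuna-13b-1.1/ggml-model-f16.bin \
    models/ggml-vicuna-13b-1.1/ggml-model-q4_1.bin q4_1
```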
I hit a similar issue. Mine was caused by https://github.com/ggerganov/llama.cpp/commit/2d5db48371052087a83974abda3767d1aedec598. The delta field in each quantized block was changed from fp32 to fp16, so the model file fails to load. There is a file version bump and a check of the file version, but the check happens too late: once some tensor data has been loaded with the incorrect size, subsequent length-related fields such as `name_len` may load corrupted data, causing the unexpected end of file.
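To illustrate the failure mode, here is a minimal standalone sketch, not llama.cpp's actual loader code (field order and header layout are simplified), of how reading quantized blocks with the wrong per-block size shifts the stream offset so that a later `name_len` read returns garbage and drives the reader past end of file:

```cpp
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <vector>

// Read exactly n bytes or throw -- this is where the
// "unexpectedly reached end of file" message comes from.
static void read_exact(std::FILE * f, void * dst, size_t n) {
    if (std::fread(dst, 1, n, f) != n) {
        throw std::runtime_error("unexpectedly reached end of file");
    }
}

// Illustrative block sizes for 32 values per q4_1 block:
constexpr size_t OLD_Q4_1_BLOCK = 4 + 4 + 16; // fp32 d, fp32 m, 16 quant bytes
constexpr size_t NEW_Q4_1_BLOCK = 2 + 2 + 16; // fp16 d, fp16 m, 16 quant bytes

void load_next_tensor(std::FILE * f, size_t n_blocks) {
    // A post-commit reader expects n_blocks * 20 bytes of quantized data...
    std::vector<uint8_t> data(n_blocks * NEW_Q4_1_BLOCK);
    read_exact(f, data.data(), data.size());

    // ...but a pre-commit file stores n_blocks * 24 bytes here
    // (OLD_Q4_1_BLOCK), so the stream is now 4 * n_blocks bytes short of
    // the next tensor header, and name_len is read out of leftover
    // quantized data instead.
    uint32_t name_len = 0;
    read_exact(f, &name_len, sizeof(name_len)); // corrupted length

    std::vector<char> name(name_len);        // bogus, possibly huge
    read_exact(f, name.data(), name.size()); // throws: end of file
}
```

A version check before any tensor data is read would turn this into a clean "unsupported file version" error instead of an offset drift that only surfaces much later as EOF.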