root@fd13db08a5d7:/usr/src/app# python ./privateGPT.py
llama.cpp: loading model from models/ggml-model-q4_0.bin
llama.cpp: can't use mmap because tensors are not aligned; convert to new format to avoid this
llama_model_load_internal: format = 'ggml' (old version with low tokenizer quality and no mmap support)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1000
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 4113748.20 KB
llama_model_load_internal: mem required = 5809.33 MB (+ 2052.00 MB per state)
...................................................................................................
.
llama_init_from_file: kv self size = 1000.00 MB
AVX = 1 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Using embedded DuckDB with persistence: data will be stored in: db
Illegal instruction (core dumped)
I just followed the instructions in the README, and all the steps went well:

pip install
python ingest.py

… but when I try the last step, python ./privateGPT.py, I get the output above. I am running on an old Xeon CPU (2012) with 8 cores and 16 GB of RAM, inside Docker with the python:3 image. Could something be missing on the CPU?

I ran into this on an older machine too, but when I tried it on a newer CPU, it worked. I think the difference is whether the CPU supports the AVX2 instruction set; note that the capability line in the log above shows AVX2 = 0. To quickly check for this:
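One quick way, assuming a Linux host or container (the python:3 image qualifies), is to grep the CPU flags the kernel reports:

grep avx2 /proc/cpuinfo

If this prints nothing, the CPU does not support AVX2, and any native code compiled with AVX2 instructions will abort with exactly this kind of "Illegal instruction" error when it runs.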