llama.cpp: Segmentation fault when running llava

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

./llava <arguments> should not crash with a segmentation fault, or should at least print some context for the segmentation fault.

Current Behavior

I run ./llava -m models/llava/ggml-model-q5_k.gguf --mmproj models/llava/mmproj-model-f16.gguf --image example.jpg and it outputs:

clip_model_load: model name:   openai/clip-vit-large-patch14-336
clip_model_load: description:  image encoder for LLaVA
clip_model_load: GGUF version: 2
clip_model_load: alignment:    32
clip_model_load: n_tensors:    377
clip_model_load: n_kv:         18
clip_model_load: ftype:        f16

clip_model_load: text_encoder:   0
clip_model_load: vision_encoder: 1
clip_model_load: llava_projector:  1
clip_model_load: model size:     595.61 MB
clip_model_load: metadata size:  0.13 MB
clip_model_load: total allocated memory: 201.27 MB
zsh: segmentation fault  ./llava -m models/llava/ggml-model-q5_k.gguf --mmproj  --image example.jpg

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware you are using, e.g. for Linux:
❯ neofetch
Software information
OS: macOS 13.2 22D49 arm64
Kernel: 22.3.0
DE: Aqua
WM: Quartz Compositor
WM Theme: Blue (Light)
Terminal: iTerm2
Terminal Font: Hack-Regular 12 (normal) / HackNerdFontComplete-Italic 12 (non-ascii)
Shell: zsh 5.8.1
Packages: 3 (port), 182 (brew)

Hardware information
Host: MacBookPro18,1
CPU: Apple M1 Pro
GPU: Apple M1 Pro
Disk: /
Memory: 3109MiB / 16384MiB
Resolution: 1728x1117
  • Operating System, e.g. for Linux:
❯ uname -a
Darwin MacBook-Pro-von-Michael.fritz.box 22.3.0 Darwin Kernel Version 22.3.0: Thu Jan  5 20:48:54 PST 2023; root:xnu-8792.81.2~2/RELEASE_ARM64_T6000 arm64
  • SDK version, e.g. for Linux:
❯ make --version
GNU Make 3.81
Copyright (C) 2006  Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.

This program built for i386-apple-darwin11.3.0

Failure Information (for bugs)

Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

git clone https://github.com/ggerganov/llama.cpp.git
make
mkdir models/llava && cd models/llava
wget https://huggingface.co/mys/ggml_llava-v1.5-7b/resolve/main/mmproj-model-f16.gguf && wget https://huggingface.co/mys/ggml_llava-v1.5-7b/resolve/main/ggml-model-q5_k.gguf
cd ../..
wget -O example.jpg https://i.redd.it/j2ex7z8tyqf21.jpg
./llava -m models/llava/ggml-model-q5_k.gguf --mmproj models/llava/mmproj-model-f16.gguf --image example.jpg

Failure Logs

❯ git log | head -1
commit 11bff290458f12f020b588792707f76ec658a27a

❯ sysctl -a | grep machdep.cpu
machdep.cpu.cores_per_package: 10
machdep.cpu.core_count: 10
machdep.cpu.logical_per_package: 10
machdep.cpu.thread_count: 10
machdep.cpu.brand_string: Apple M1 Pro

❯ make --version | head -1
GNU Make 3.81

❯ md5sum ./models/llava/ggml-model-q5_k.gguf
01878e0b413786b3a2e7845689c999da  ./models/llava/ggml-model-q5_k.gguf

Plain run

❯ ./llava -m models/llava/ggml-model-q5_k.gguf --mmproj models/llava/mmproj-model-f16.gguf --image example.jpg
clip_model_load: model name:   openai/clip-vit-large-patch14-336
clip_model_load: description:  image encoder for LLaVA
clip_model_load: GGUF version: 2
clip_model_load: alignment:    32
clip_model_load: n_tensors:    377
clip_model_load: n_kv:         18
clip_model_load: ftype:        f16

clip_model_load: text_encoder:   0
clip_model_load: vision_encoder: 1
clip_model_load: llava_projector:  1
clip_model_load: model size:     595.61 MB
clip_model_load: metadata size:  0.13 MB
clip_model_load: total allocated memory: 201.27 MB
zsh: segmentation fault  ./llava -m models/llava/ggml-model-q5_k.gguf --mmproj  --image example.jpg

Edit:

Output of lldb:

❯ lldb -- ./llava -m models/llava/ggml-model-q5_k.gguf --mmproj "models/llava/mmproj-model-f16.gguf" --image example.jpg
(lldb) target create "./llava"
Current executable set to '/Users/bogdan/git/llama.cpp/llava' (arm64).
(lldb) settings set -- target.run-args  "-m" "models/llava/ggml-model-q5_k.gguf" "--mmproj" "models/llava/mmproj-model-f16.gguf" "--image" "example.jpg"
(lldb) run
Process 23062 launched: '/Users/bogdan/git/llama.cpp/llava' (arm64)
clip_model_load: model name:   openai/clip-vit-large-patch14-336
clip_model_load: description:  image encoder for LLaVA
clip_model_load: GGUF version: 2
clip_model_load: alignment:    32
clip_model_load: n_tensors:    377
clip_model_load: n_kv:         18
clip_model_load: ftype:        f16

clip_model_load: text_encoder:   0
clip_model_load: vision_encoder: 1
clip_model_load: llava_projector:  1
clip_model_load: model size:     595.61 MB
clip_model_load: metadata size:  0.13 MB
clip_model_load: total allocated memory: 201.27 MB
Process 23062 stopped
* thread #4, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000000000000
error: memory read failed for 0x0
Target 0: (llava) stopped.
(lldb) bt
* thread #4, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000000000000
    frame #1: 0x000000010004f638 llava`ggml_compute_forward_mul_mat + 816
    frame #2: 0x0000000100039570 llava`ggml_graph_compute_thread + 428
    frame #3: 0x000000019caee06c libsystem_pthread.dylib`_pthread_start + 148
(lldb)


Most upvoted comments

It seems that surgery.py deletes the corresponding keys from the original checkpoint.

This is needed because the PyTorch checkpoint should contain only the LLaMA weights when running convert.py, so I save the projector to a file named llava.projector, which is then appended to the CLIP model.
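For reference, a minimal sketch of what that surgery step could look like. This is an illustration, not the actual surgery.py: the checkpoint filename pattern, the model path, and the tensor-name prefix model.mm_projector are assumptions.

import glob
import torch

model_path = "llava-v1.5-7b"  # hypothetical path to the original LLaVA checkpoint

# Load the checkpoint shard assumed to hold the multimodal projector weights.
ckpt_file = sorted(glob.glob(f"{model_path}/pytorch_model*.bin"))[-1]
checkpoint = torch.load(ckpt_file, map_location="cpu")

# Collect the projector tensors (assumed prefix: "model.mm_projector").
mm_keys = [k for k in checkpoint if k.startswith("model.mm_projector")]

# Save them to llava.projector so they can be appended to the CLIP model later.
torch.save({k: checkpoint[k].float() for k in mm_keys}, f"{model_path}/llava.projector")

# Delete them from the checkpoint so convert.py sees only the LLaMA weights.
for k in mm_keys:
    del checkpoint[k]
torch.save(checkpoint, ckpt_file)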