llama.cpp: Docker error: Cannot access '/models//7B/ggml-model-f16.bin*': No such file or directory

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [*] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [*] I carefully followed the README.md.
  • [*] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [*] I reviewed the Discussions, and have a new bug or useful enhancement to share.

llama
└── models
    ├── 13B
    │   ├── checklist.chk
    │   ├── consolidated.00.pth
    │   ├── consolidated.01.pth
    │   └── params.json
    ├── 30B
    │   ├── checklist.chk
    │   ├── consolidated.00.pth
    │   ├── consolidated.01.pth
    │   ├── consolidated.02.pth
    │   ├── consolidated.03.pth
    │   └── params.json
    ├── 65B
    │   ├── checklist.chk
    │   ├── consolidated.00.pth
    │   ├── consolidated.01.pth
    │   ├── consolidated.02.pth
    │   ├── consolidated.03.pth
    │   ├── consolidated.04.pth
    │   ├── consolidated.05.pth
    │   ├── consolidated.06.pth
    │   ├── consolidated.07.pth
    │   └── params.json
    ├── 7B
    │   ├── checklist.chk
    │   ├── consolidated.00.pth
    │   └── params.json
    ├── ggml-vocab.bin
    ├── llama.sh
    ├── tokenizer_checklist.chk
    └── tokenizer.model

Using:

docker run -v /llama/models:/models ghcr.io/ggerganov/llama.cpp:full --all-in-one "/models/" 7B

Output:

Converting PTH to GGML...
ls: cannot access '/models//7B/ggml-model-f16.bin*': No such file or directory
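One quick way to check whether the conversion step actually produced anything is to list the mounted model directory from inside the container. This is just a debugging sketch, assuming the same volume mapping as in the command above:

docker run -v /llama/models:/models --entrypoint ls ghcr.io/ggerganov/llama.cpp:full -la /models/7B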

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 4
  • Comments: 15

Most upvoted comments

Hello, I had the same error on Windows 10. Try:

docker run -v /c/llama/models:/models ghcr.io/ggerganov/llama.cpp:full --all-in-one "/models/" 7B

It solved the issue for me. Regards.

Do it in 3 steps, not --all-in-one, like this:

Convert it:

docker run -v /home/hola/llama/models:/models ghcr.io/ggerganov/llama.cpp:full \
  --convert "/models/7B" 1

Quantize it:

docker run -v /home/hola/llama/models:/models ghcr.io/ggerganov/llama.cpp:full \
  --quantize "/models/7B/ggml-model-f16.bin" "/models/7B/ggml-model-q4_0.bin" 2

Run it:

docker run -v /home/hola/llama/models:/models --entrypoint '/app/main' ghcr.io/ggerganov/llama.cpp:full \
  -m /models/7B/ggml-model-q4_0.bin -n 512 -p "Building a website can be done in 10 simple steps"

Same for the other models.
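If you want to run those three steps over every model size in one go, here is a minimal shell sketch. It assumes the same /home/hola/llama/models layout used above; the list of sizes and the paths are assumptions you should adjust:

#!/usr/bin/env bash
# Sketch: convert and quantize each model size with the three-step flow above.
# MODELS_DIR and the size list are assumptions; adjust to your own setup.
set -euo pipefail

MODELS_DIR=/home/hola/llama/models
IMAGE=ghcr.io/ggerganov/llama.cpp:full

for SIZE in 7B 13B 30B 65B; do
  docker run -v "$MODELS_DIR":/models "$IMAGE" --convert "/models/$SIZE" 1
  # Multi-part models may emit additional f16 shards; check the output dir
  # and quantize each one if so.
  docker run -v "$MODELS_DIR":/models "$IMAGE" \
    --quantize "/models/$SIZE/ggml-model-f16.bin" "/models/$SIZE/ggml-model-q4_0.bin" 2
done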

Hi, I have the same error. Is there any news about this bug? Thanks in advance 😃

This all reads like users were not providing the actual path to the models, but just copy-pasting /llama/models from the README. In the meantime, the README has been updated to use /path/to/models as the path, and there is a note that this path should be replaced by the actual path. Closing.

That's definitely not the case. When I run the --all-in-one command, it does not work. When I use the commands given by @mrsipan, the conversion and the creation of the ggml-model-f16.bin file work with the same volume mapping.

Hello, do you have a link where we can download the ggml model files?

I was able to get around this issue by making the same changes mentioned in #408 and building the full Docker image myself. For those interested in getting started fast, I copied the entire contents of my modified .devops/tools.sh into this gist. You should be able to copy/paste it into a local checkout of llama.cpp to get up and running.

I built the Docker image by running:

docker build -f .devops/full.Dockerfile -t llama-cpp:full .

I then ran the following command to convert the LLaMA model to ggml (my copy of LLaMA lives in a dir called llama/):

docker run -v "$(pwd)/llama/:/models" llama-cpp:full --all-in-one "/models" 7B

I then tried to run inference using the example in the README, but unfortunately it looks like there’s a bug that prevents passing more than one word to the -p option (I think whatever mechanism splits args is ignoring quotes). Fortunately llama.cpp allows sticking the prompt into a file, so I copy/pasted the prompt example from the README into llama/prompt.txt and executed this command to run the inference:

docker run -v "$(pwd)/llama/:/models" llama-cpp:full --run -m "/models/7B/ggml-model-q4_0.bin" -f /models/prompt.txt -n 512

And it worked 🎉 It's a little slow on my basic ol' x86 MacBook, but it worked 🎉
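For anyone reproducing this, creating that prompt file is a single extra step; the path matches the volume mount used above, and the text is just the README example quoted earlier:

echo "Building a website can be done in 10 simple steps" > llama/prompt.txt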