llama.cpp: Docker error: Cannot access '/models//7B/ggml-model-f16.bin*': No such file or directory

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [*] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [*] I carefully followed the README.md.
  • [*] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [*] I reviewed the Discussions, and have a new bug or useful enhancement to share.

llama
└── models
    ├── 13B
    │   ├── checklist.chk
    │   ├── consolidated.00.pth
    │   ├── consolidated.01.pth
    │   └── params.json
    ├── 30B
    │   ├── checklist.chk
    │   ├── consolidated.00.pth
    │   ├── consolidated.01.pth
    │   ├── consolidated.02.pth
    │   ├── consolidated.03.pth
    │   └── params.json
    ├── 65B
    │   ├── checklist.chk
    │   ├── consolidated.00.pth
    │   ├── consolidated.01.pth
    │   ├── consolidated.02.pth
    │   ├── consolidated.03.pth
    │   ├── consolidated.04.pth
    │   ├── consolidated.05.pth
    │   ├── consolidated.06.pth
    │   ├── consolidated.07.pth
    │   └── params.json
    ├── 7B
    │   ├── checklist.chk
    │   ├── consolidated.00.pth
    │   └── params.json
    ├── ggml-vocab.bin
    ├── llama.sh
    ├── tokenizer_checklist.chk
    └── tokenizer.model

Using:

docker run -v /llama/models:/models ghcr.io/ggerganov/llama.cpp:full --all-in-one "/models/" 7B

Output:

Converting PTH to GGML...
ls: cannot access '/models//7B/ggml-model-f16.bin*': No such file or directory
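One quick way to check whether the conversion step actually produced anything is to list the mounted model directory from inside the container. This is just a debugging sketch, assuming the same volume mapping as in the command above:

docker run -v /llama/models:/models --entrypoint ls ghcr.io/ggerganov/llama.cpp:full -la /models/7B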

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 4
  • Comments: 15

Most upvoted comments

Hello, I had the same error on Windows 10. Try:

docker run -v /c/llama/models:/models ghcr.io/ggerganov/llama.cpp:full --all-in-one "/models/" 7B

It solved the issue for me. Regards.

Do it in 3 steps, not --all-in-one, like this:

Convert it:

docker run -v /home/hola/llama/models:/models ghcr.io/ggerganov/llama.cpp:full \
  --convert "/models/7B" 1

Quantize it:

docker run -v /home/hola/llama/models:/models ghcr.io/ggerganov/llama.cpp:full \
  --quantize "/models/7B/ggml-model-f16.bin" "/models/7B/ggml-model-q4_0.bin" 2

Run it:

docker run -v /home/hola/llama/models:/models --entrypoint '/app/main' ghcr.io/ggerganov/llama.cpp:full \
  -m /models/7B/ggml-model-q4_0.bin -n 512 -p "Building a website can be done in 10 simple steps"

Same for the other models.
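If you want to run those three steps over every model size in one go, here is a minimal shell sketch. It assumes the same /home/hola/llama/models layout used above; the list of sizes and the paths are assumptions you should adjust:

#!/usr/bin/env bash
# Sketch: convert and quantize each model size with the three-step flow above.
# MODELS_DIR and the size list are assumptions; adjust to your own setup.
set -euo pipefail

MODELS_DIR=/home/hola/llama/models
IMAGE=ghcr.io/ggerganov/llama.cpp:full

for SIZE in 7B 13B 30B 65B; do
  docker run -v "$MODELS_DIR":/models "$IMAGE" --convert "/models/$SIZE" 1
  # Multi-part models may emit additional f16 shards; check the output dir
  # and quantize each one if so.
  docker run -v "$MODELS_DIR":/models "$IMAGE" \
    --quantize "/models/$SIZE/ggml-model-f16.bin" "/models/$SIZE/ggml-model-q4_0.bin" 2
done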

Hi, I have the same error. Is there any news about this bug? Thanks in advance 😃

This all reads like users were not providing the actual path to the models, but just copy-pasting /llama/models from the README. In the meantime, the README has been updated to use /path/to/models as the path, and there is a note that this path should be replaced by the actual path. Closing.

That's definitely not the case. When I run the --all-in-one command, it does not work. When I use the commands given by @mrsipan, the conversion and the creation of the ggml-model-f16.bin file work with the same volume mapping.

Hello, do you have a link where we can download the ggml model files?

I was able to get around this issue by making the same changes mentioned in #408 and building the full Docker image myself. For those interested in getting started fast, I copied the entire contents of my modified .devops/tools.sh into this gist. You should be able to copy/paste it into a local checkout of llama.cpp to get up and running.

I built the Docker image by running:

docker build -f .devops/full.Dockerfile -t llama-cpp:full .

I then ran the following command to convert the LLaMA model to ggml (my copy of LLaMA lives in a dir called llama/):

docker run -v "$(pwd)/llama/:/models" llama-cpp:full --all-in-one "/models" 7B

I then tried to run inference using the example in the README, but unfortunately it looks like there’s a bug that prevents passing more than one word to the -p option (I think whatever mechanism splits args is ignoring quotes). Fortunately llama.cpp allows sticking the prompt into a file, so I copy/pasted the prompt example from the README into llama/prompt.txt and executed this command to run the inference:

docker run -v "$(pwd)/llama/:/models" llama-cpp:full --run -m "/models/7B/ggml-model-q4_0.bin" -f /models/prompt.txt -n 512

And it worked 🎉 It's a little slow on my basic ol' x86 MacBook, but it worked 🎉
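For anyone reproducing this, creating that prompt file is a single extra step; the path matches the volume mount used above, and the text is just the README example quoted earlier:

echo "Building a website can be done in 10 simple steps" > llama/prompt.txt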