localGPT: Running localGPT from the provided Docker image does not work.

hi @KonradHoeffner and everyone,

I followed the Docker instructions to build the Docker image and then start the container. However, it failed with the error below.

Steps

  1. docker build . -t localgpt
  2. docker run -it --mount src="$HOME/.cache",target=/root/.cache,type=bind --gpus=all localgpt

Error message

localGPT ›› docker run -it --mount src="$HOME/.cache",target=/root/.cache,type=bind --gpus=all localgpt                                                        

==========
== CUDA ==
==========

CUDA Version 11.7.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

2023-09-17 02:43:59,219 - INFO - run_localGPT.py:212 - Running on: cuda
2023-09-17 02:43:59,219 - INFO - run_localGPT.py:213 - Display Source Documents set to: False
2023-09-17 02:43:59,219 - INFO - run_localGPT.py:214 - Use history set to: False
2023-09-17 02:43:59,497 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: hkunlp/instructor-large
Downloading (…)c7233/.gitattributes: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 1.48k/1.48k [00:00<00:00, 12.9MB/s]
Downloading (…)_Pooling/config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [00:00<00:00, 2.45MB/s]
Downloading (…)/2_Dense/config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 116/116 [00:00<00:00, 943kB/s]
Downloading pytorch_model.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 3.15M/3.15M [00:00<00:00, 9.00MB/s]
Downloading (…)9fb15c7233/README.md: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 66.3k/66.3k [00:00<00:00, 24.6MB/s]
Downloading (…)b15c7233/config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 1.53k/1.53k [00:00<00:00, 12.0MB/s]
Downloading (…)ce_transformers.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 122/122 [00:00<00:00, 990kB/s]
Downloading pytorch_model.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 1.34G/1.34G [02:39<00:00, 8.41MB/s]
Downloading (…)nce_bert_config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 53.0/53.0 [00:00<00:00, 433kB/s]
Downloading (…)cial_tokens_map.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 2.20k/2.20k [00:00<00:00, 19.1MB/s]
Downloading spiece.model: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 792k/792k [00:00<00:00, 9.28MB/s]
Downloading (…)c7233/tokenizer.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 2.42M/2.42M [00:00<00:00, 3.14MB/s]
Downloading (…)okenizer_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 2.41k/2.41k [00:00<00:00, 19.2MB/s]
Downloading (…)15c7233/modules.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 461/461 [00:00<00:00, 4.13MB/s]
load INSTRUCTOR_Transformer
max_seq_length  512
2023-09-17 02:46:51,770 - INFO - posthog.py:16 - Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
2023-09-17 02:46:51,888 - INFO - run_localGPT.py:50 - Loading Model: TheBloke/Llama-2-7b-Chat-GGUF, on: cuda
2023-09-17 02:46:51,888 - INFO - run_localGPT.py:51 - This action can take a few minutes!
2023-09-17 02:46:51,888 - INFO - load_models.py:38 - Using Llamacpp for GGUF/GGML quantized models
Downloading (…)-7b-chat.Q4_K_M.gguf: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 4.08G/4.08G [08:14<00:00, 8.25MB/s]
Traceback (most recent call last):
  File "//run_localGPT.py", line 250, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "//run_localGPT.py", line 220, in main
    qa = retrieval_qa_pipline(device_type, use_history, promptTemplate_type="llama")
  File "//run_localGPT.py", line 139, in retrieval_qa_pipline
    qa = RetrievalQA.from_chain_type(llm=llm, 
  File "/usr/local/lib/python3.10/dist-packages/langchain/chains/retrieval_qa/base.py", line 100, in from_chain_type
    combine_documents_chain = load_qa_chain(
  File "/usr/local/lib/python3.10/dist-packages/langchain/chains/question_answering/__init__.py", line 249, in load_qa_chain
    return loader_mapping[chain_type](
  File "/usr/local/lib/python3.10/dist-packages/langchain/chains/question_answering/__init__.py", line 73, in _load_stuff_chain
    llm_chain = LLMChain(
  File "/usr/local/lib/python3.10/dist-packages/langchain/load/serializable.py", line 74, in __init__
    super().__init__(**kwargs)
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for LLMChain
llm
  none is not an allowed value (type=type_error.none.not_allowed)

Has anyone else encountered this problem?
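
For what it's worth, the ValidationError at the end of the traceback means LLMChain was handed llm=None, i.e. the model failed to load further up the stack and the failure was swallowed. Below is a minimal sketch of that failure mode, assuming load_models.py catches the exception from a missing llama-cpp-python backend and returns None; the function name and model path are illustrative, not the exact source:

  # Hypothetical sketch: how a missing llama-cpp-python backend can turn into
  # the pydantic error above. Assumes the loader returns None on any failure.
  from langchain.chains import LLMChain
  from langchain.llms import LlamaCpp
  from langchain.prompts import PromptTemplate

  def load_gguf_model(model_path):
      try:
          # Instantiation raises if the llama-cpp-python package is missing.
          return LlamaCpp(model_path=model_path)
      except Exception:
          return None  # silent failure: the caller gets None instead of an LLM

  llm = load_gguf_model("llama-2-7b-chat.Q4_K_M.gguf")
  prompt = PromptTemplate(input_variables=["question"], template="{question}")
  # pydantic rejects llm=None:
  #   1 validation error for LLMChain -> llm: none is not an allowed value
  chain = LLMChain(llm=llm, prompt=prompt)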

About this issue

  • Original URL
  • State: closed
  • Created 9 months ago
  • Comments: 20 (8 by maintainers)

Most upvoted comments

Oh I see, that used to be inside requirements.txt. Commit 16f949ed93818ead1201eb294bd40f9e8a819b72 contains:

  • Removed llamacpp from requirement.txt file. It needs to be installed separately to ensure it supports GPU

I will add it back to the Dockerfile.
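
For anyone patching their own image in the meantime, the fix presumably amounts to moving the GPU-enabled install (the same pip command quoted in the comment below) into the Dockerfile, roughly like this sketch (not the actual commit):

  # Sketch of the described Dockerfile fix: build llama-cpp-python with
  # cuBLAS enabled so GGUF models can use the GPU (version pin taken from
  # the comment below; adjust as needed).
  RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 \
      pip install llama-cpp-python==0.1.83 --no-cache-dir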

Yes, I get the same error message. I am running this on Arch Linux. For reference, here is my installation history:

python3.10 -m venv /mnt/yhteinen/docker-localgpt
source bin/activate
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir
sudo docker build . -t localgpt
sudo docker run -it --mount src="$HOME/.cache",target=/root/.cache,type=bind --gpus=all localgpt

2023-09-17 05:54:52,701 - INFO - run_localGPT.py:212 - Running on: cuda
2023-09-17 05:54:52,701 - INFO - run_localGPT.py:213 - Display Source Documents set to: False
2023-09-17 05:54:52,701 - INFO - run_localGPT.py:214 - Use history set to: False
2023-09-17 05:54:52,933 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: hkunlp/instructor-large
load INSTRUCTOR_Transformer
max_seq_length  512
2023-09-17 05:54:57,061 - INFO - posthog.py:16 - Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
2023-09-17 05:54:57,154 - INFO - run_localGPT.py:50 - Loading Model: TheBloke/Llama-2-7b-Chat-GGUF, on: cuda
2023-09-17 05:54:57,154 - INFO - run_localGPT.py:51 - This action can take a few minutes!
2023-09-17 05:54:57,154 - INFO - load_models.py:38 - Using Llamacpp for GGUF/GGML quantized models
Downloading (…)-7b-chat.Q4_K_M.gguf: 100%|████████████████████████████████████████████| 4.08G/4.08G [01:02<00:00, 65.0MB/s]
Traceback (most recent call last):
  File "//run_localGPT.py", line 250, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "//run_localGPT.py", line 220, in main
    qa = retrieval_qa_pipline(device_type, use_history, promptTemplate_type="llama")
  File "//run_localGPT.py", line 139, in retrieval_qa_pipline
    qa = RetrievalQA.from_chain_type(llm=llm,
  File "/usr/local/lib/python3.10/dist-packages/langchain/chains/retrieval_qa/base.py", line 100, in from_chain_type
    combine_documents_chain = load_qa_chain(
  File "/usr/local/lib/python3.10/dist-packages/langchain/chains/question_answering/__init__.py", line 249, in load_qa_chain
    return loader_mapping[chain_type](
  File "/usr/local/lib/python3.10/dist-packages/langchain/chains/question_answering/__init__.py", line 73, in _load_stuff_chain
    llm_chain = LLMChain(
  File "/usr/local/lib/python3.10/dist-packages/langchain/load/serializable.py", line 74, in __init__
    super().__init__(**kwargs)
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for LLMChain
llm
  none is not an allowed value (type=type_error.none.not_allowed)