localGPT: AssertionError: Torch not compiled with CUDA enabled

I have PyTorch with CUDA enabled:

# Name                    Version                   Build  Channel
pytorch                   2.0.1           py3.11_cuda11.8_cudnn8_0    pytorch
pytorch-cuda              11.8                 h24eeafa_5    pytorch
pytorch-mutex             1.0                        cuda    pytorch

This error message needs improvement. What is the actual problem?

Does requirements.txt need to be updated to include the correct PyTorch version?

OS Name Microsoft Windows 10 Pro Version 10.0.19045 Build 19045
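A quick way to confirm which build the failing script actually sees (a minimal check, not from the thread; torch.version.cuda is None on CPU-only builds):

python -c "import torch, sys; print(sys.executable); print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"

If torch.version.cuda prints None, the interpreter running localGPT is picking up a CPU-only build, regardless of what conda list shows for some other environment.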


Most upvoted comments

What worked for me (Windows 10, RTX 3080, CUDA 12.1, Python 3.10, pip 23.1.2, running inside a virtual env in PowerShell): I installed everything as told in the tutorial: pip install -r requirements.txt

-- check that CUDA is installed; the CUDA version should be visible in the top right corner of the output

nvidia-smi.exe

–check if torch is installed

pip list | findstr torch
-- if you see something like "torch 2.0.1" and "torchvision 0.15.2", then torch was probably compiled without CUDA
-- check with another method:
python -c "import torch; print(torch.cuda.is_available())"
-- if the output is "False", then torch wasn't compiled with CUDA

–clean up

pip uninstall torch torchvision torchaudio
pip cache purge
pip list | findstr torch

-- go to this site and get the proper command for your system and CUDA installation: https://pytorch.org/get-started/locally/

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
-- this downloaded and installed torch with CUDA

–check again

pip list | findstr torch
python -c "import torch; print(torch.cuda.is_available())"
-- the output should be "True"

-- try to ingest:
python ingest.py
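For reference, the same checks can be collected in one short Python snippet (a minimal sketch using only standard torch calls):

import torch

print(torch.__version__)              # e.g. 2.0.1, or a +cu121 nightly
print(torch.version.cuda)             # None on a CPU-only build
print(torch.cuda.is_available())      # should print True after the reinstall
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. NVIDIA GeForce RTX 3080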

Hi @adjiap! Were you able to solve the memory problem? Thanks in advance

I’m rather new to the intricacies of machine-learning models and embeddings, but I’m getting to learn a few things here 😃

In the end I was able to run the project by using the base instructor model (on my GPU) for ingest.py, but running run_localGPT.py on the CPU.

This brings me to the next problem of having too little RAM, as Vicuna-7B takes up 30 GB of my 32 GB of RAM (not GPU VRAM, btw). Though it works, answering questions is really slow. I haven’t tried PromtEngineer’s comment about setting the RAM there (as I’m not sure yet what the argument actually does), because Vicuna-7B afaik is 30 GB, and if it’s limited to something smaller, like 5 GB, it would probably not work as intended.
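The ~30 GB figure is consistent with a 7B-parameter model held in 32-bit floats; a back-of-the-envelope estimate (weights only, ignoring activations and overhead):

params = 7e9  # Vicuna-7B parameter count
print(f"fp32: {params * 4 / 1e9:.0f} GB")  # ~28 GB, matching the ~30 GB RAM load
print(f"fp16: {params * 2 / 1e9:.0f} GB")  # ~14 GB
print(f"int8: {params * 1 / 1e9:.0f} GB")  # ~7 GB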

A colleague of mine helped me using his machine with dual RTX 2080 Ti, with 12 GB VRAM each, and he was able to run ingest.py and run_localGPT.py with no issue, though he did show me that when run_localGPT.py was running, both of his GPUs maintained a 9 GB load.

tl;dr: Vicuna-7B and the large instructor model don’t work without at least 20 GB of VRAM in total. Embedding (ingest.py) still works if I use the base instructor model, but the actual model execution doesn’t.
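The dual-GPU behavior described above (roughly 9 GB on each card) is what you would see when the weights are sharded across devices; with Hugging Face transformers this is typically done via device_map="auto", which requires the accelerate package. A minimal sketch, with an illustrative checkpoint name:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/vicuna-7B-1.1-HF"  # illustrative checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halves the footprint vs fp32
    device_map="auto",          # shards layers across the available GPUs
)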

Yeah, same here. Conda on Windows, CPU.

@endolith I will have a look at the code and see what is causing this.


I tried the steps above and it worked, thanks for laying them out step by step!

But now I have the next problem: I was able to ingest the document, but I couldn’t run the model due to

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 124.00 MiB. GPU 0 has a total capacty of 8.00 GiB of which 0 bytes is free. Of the allocated memory 6.95 GiB is allocated by PyTorch, and 289.00 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I tried changing the EMBEDDING_MODEL_NAME to instructor-base, but it seems it’s not yet small enough.

I have an RTX 3070, so only 8 GB of VRAM, unfortunately. Does anyone know which instructor model would work, or something else that might help?
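The error message itself suggests setting max_split_size_mb via PYTORCH_CUDA_ALLOC_CONF to reduce fragmentation. A minimal sketch (the 128 MiB value is illustrative, and the variable must be set before CUDA is initialized):

import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"  # illustrative value

import torch  # import after the environment variable is set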

BTW @PromtEngineer: the current code checks for CUDA explicitly for full models, which makes it unusable for MPS: https://github.com/PromtEngineer/localGPT/blob/main/run_localGPT.py#L68
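For reference, a device-selection pattern that also covers Apple Silicon might look like this (a sketch, not the repo’s code; torch.backends.mps.is_available() exists since PyTorch 1.12):

import torch

if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"  # Apple Silicon (M1/M2) via the Metal backend
else:
    device = "cpu"

print(f"Using device: {device}")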

@ChristianWeyer I finally got an M2 and just tested it; that is the case. Need to figure out if there is another way.

@ChristianWeyer this seems to be a bug, thanks for highlighting it. I am not sure if auto_gptq supports M1/M2. Will need to test that.

I fixed this issue by installing requirements.txt through conda. I’ll summarize the steps I followed below:

Install conda for your platform here: https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html

Then I ran the following commands from the YouTube video:

conda create -n localGPT
conda activate localGPT
# Your terminal line should now start with (localGPT)
git clone https://github.com/PromtEngineer/localGPT.git
cd localGPT
pip install -r .\requirements.txt
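
One variant worth noting (an assumption on my part, not from the video): pinning the Python version when creating the env ensures pip resolves to the env’s own interpreter rather than the base one:

conda create -n localGPT python=3.10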

If it worked, the output of torch.cuda.is_available() should be True.

python3
Python 3.11.4 (tags/v3.11.4:d2340ef, Jun  7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True