gisting: Running out of memory when trying to compress
Basically the title: my setup is 6 GB of VRAM on a 1070.

I used a gist model, specifically your flan-t5-gist model on Hugging Face, along with bf16 precision as suggested in the `compress` script, but I keep running into a CUDA out-of-memory error. Is there a minimum amount of VRAM a system needs before it can make use of gisting? (In another issue you mentioned that 12 GB could work, so I'm guessing my only option is to use Accelerate.)
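For reference, here is a minimal sketch of what a bf16 load like this looks like through transformers, with rough memory math in the comments. The hub id and loading call are assumptions for illustration; the repo's actual entry point is its compress script:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hub id is an assumption for illustration; use the id from the gisting README.
model_id = "jayelm/flan-t5-xxl-gist-1"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# bf16 stores each parameter in 2 bytes, so an 11B-parameter T5 needs
# roughly 22 GB for the weights alone, before activations, which is
# far beyond a 6 GB card and explains the CUDA OOM here.
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires accelerate; offloads layers that don't fit on GPU
)
```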
About this issue
- State: closed
- Created a year ago
- Comments: 25 (12 by maintainers)
Did you try

```
pip install -r requirements.txt
```

? Does it throw an error?

Assuming you cloned my repository, you should alternatively be able to clone the huggingface transformers repository as well, check out the relevant commit, then do

```
pip install -e .
```

in the repo directory to install the package locally.

yep, at long last haha
See #10, your transformers version is likely wrong.
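If it helps anyone debugging this, here is a quick way to confirm which transformers build is actually being imported. This is a generic sanity check, not something from the repo:

```python
# Print the installed transformers version and where it is imported from.
# For the editable install described above, __file__ should point inside
# the locally cloned transformers repo, not site-packages.
import transformers

print(transformers.__version__)
print(transformers.__file__)
```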
Hi, the FLAN-T5-gist model on Hugging Face has 11B parameters and needs around 20-30 GB of VRAM for bf16 inference. If you have less GPU VRAM, you have two options: (1) look into lower-precision inference, e.g. https://github.com/TimDettmers/bitsandbytes, or (2) train a smaller gist model from scratch (the training commands in the README support this, but unfortunately I don't have checkpoints for smaller gist models).
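To make option (1) concrete, here is a rough sketch of quantized loading via transformers and bitsandbytes. The hub id is assumed as above, `BitsAndBytesConfig` requires a fairly recent transformers release (the pinned gist commit may predate it), and whether the gist model's custom code tolerates quantization is untested here:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, BitsAndBytesConfig

model_id = "jayelm/flan-t5-xxl-gist-1"  # assumed id; see the repo README

# 8-bit weights cut an 11B model to ~11 GB, still over 6 GB, so 4-bit
# (~5.5 GB for weights) plus CPU offload is the more realistic target here.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 on top of 4-bit weights
)

model = AutoModelForSeq2SeqLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # needs accelerate; spills layers to CPU if VRAM runs out
)
```

Even at 4 bits the weights barely fit in 6 GB, so expect `device_map="auto"` to offload some layers to CPU, with a corresponding slowdown.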