diffusers: [examples/text_to_image] CUDA out of memory, though I followed the instructions of train_text_to_image.py
Describe the bug
When I ran the script examples/text_to_image/train_text_to_image.py with the following command:
```shell
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export dataset_name="lambdalabs/pokemon-blip-captions"

python train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$dataset_name \
  --use_ema \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir="sd-pokemon-model" \
  --mixed_precision="fp16"
```
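For comparison, a memory-reduced variant of the same command (a sketch: `--enable_xformers_memory_efficient_attention` is a real flag of this script, but it requires the xformers package, which the System Info below shows is not installed; the `accelerate launch` entry point follows the example README):

```shell
# Sketch of a lower-memory run; xformers must be installed first.
pip install xformers

accelerate launch train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$dataset_name \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --mixed_precision="fp16" \
  --enable_xformers_memory_efficient_attention \
  --output_dir="sd-pokemon-model"
```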
I tried decreasing the resolution and removing --center_crop --random_flip, but it did not help.
Hardware: V100 (32 GB), PyTorch 1.11.
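A back-of-the-envelope estimate of why vanilla attention is the memory bottleneck here (a sketch: the 8-head count and the presence of self-attention at the 64x64 latent level are assumptions about the SD v1 UNet, not values read from the model config):

```python
# Rough memory estimate for one materialized self-attention score
# matrix in the SD v1 UNet at 512x512 input (assumed config values).
def attn_scores_bytes(tokens: int, heads: int, bytes_per_el: int = 2) -> int:
    """Bytes for the full (tokens x tokens) score matrix across all heads."""
    return heads * tokens * tokens * bytes_per_el

# 512x512 image -> 64x64 latent -> 4096 tokens at the widest attention block.
tokens = (512 // 8) ** 2
mib = attn_scores_bytes(tokens, heads=8) / 2**20
print(f"~{mib:.0f} MiB per attention layer for the fp16 score matrix alone")
# Memory-efficient attention (xformers, PyTorch 2 SDPA) avoids
# materializing this matrix, which is why it saves so much RAM.
```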
Reproduction
Logs
No response
System Info
- diffusers version: 0.17.0.dev0
- Platform: Linux-5.4.0-60-generic-x86_64-with-debian-buster-sid
- Python version: 3.7.5
- PyTorch version (GPU?): 1.11.0+cu102 (True)
- Huggingface_hub version: 0.14.1
- Transformers version: 4.29.1
- Accelerate version: 0.19.0
- xFormers version: not installed
- Using GPU in script?: yes
- Using distributed or parallel set-up in script?: tried both single-GPU and 8-GPU runs; neither works
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 16 (8 by maintainers)
My tests:
Running with --enable_xformers_memory_efficient_attention works fine and takes only ~14 GB of GPU RAM. When using PyTorch 2, I verified that we are using AttnProcessor2_0 here: https://github.com/huggingface/diffusers/blob/c6ae8837512d0572639b9f57491d4482fdc8948c/src/diffusers/models/attention_processor.py#L161. I'm not sure what the reason is for it no longer fitting in 24 GB.
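A quick way to check whether the fused PyTorch 2 attention path is available at all (a sketch: per the linked source, diffusers selects `AttnProcessor2_0` based on whether `F.scaled_dot_product_attention` exists, which is absent on the reporter's PyTorch 1.11):

```python
import torch
import torch.nn.functional as F

# On PyTorch < 2.0 this attribute does not exist, so diffusers falls
# back to the classic AttnProcessor, which materializes the full
# attention matrix and needs far more memory.
has_sdpa = hasattr(F, "scaled_dot_product_attention")
print(f"torch {torch.__version__}: fused SDPA available -> {has_sdpa}")
```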