diffusers: IndexError on num_inference_steps values: 3, 9, 27, ...?
Describe the bug
Some `num_inference_steps` values seem to be "problematic" and throw an IndexError.
Also, the iteration count shown in the progress output does not end on that value, nor bear any obvious relation to it. If that is on purpose, it is still confusing:
num_inference_steps: 5 6it [00:49, 8.27s/it]
num_inference_steps: 6 8it [01:02, 7.87s/it]
num_inference_steps: 7 9it [01:10, 7.88s/it]
num_inference_steps: 8 9it [01:13, 8.19s/it]
Reproduction
```python
from diffusers import StableDiffusionPipeline

TOKEN = "(censored)"
MODEL_ID = "CompVis/stable-diffusion-v1-4"

pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, use_auth_token=TOKEN)
res = pipe("a photo of an astronaut riding a horse on mars", num_inference_steps=3, guidance_scale=7.5)
```
Tested on the command line too; the issue is unrelated to the PyCharm environment.
Logs
/usr/bin/python3 /home/as/.config/JetBrains/PyCharm2022.2/scratches/scratch.py
0it [00:07, ?it/s]
Traceback (most recent call last):
File "/home/as/.config/JetBrains/PyCharm2022.2/scratches/scratch.py", line 7, in <module>
res = pipe("a photo of an astronaut riding a horse on mars", num_inference_steps=3, guidance_scale=7.5)
File "/home/as/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/usr/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 148, in __call__
latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs)["prev_sample"]
File "/usr/lib/python3.10/site-packages/diffusers/schedulers/scheduling_pndm.py", line 136, in step
return self.step_plms(model_output=model_output, timestep=timestep, sample=sample)
File "/usr/lib/python3.10/site-packages/diffusers/schedulers/scheduling_pndm.py", line 212, in step_plms
prev_sample = self._get_prev_sample(sample, timestep, prev_timestep, model_output)
File "/usr/lib/python3.10/site-packages/diffusers/schedulers/scheduling_pndm.py", line 230, in _get_prev_sample
alpha_prod_t = self.alphas_cumprod[timestep + 1 - self._offset]
IndexError: index 1000 is out of bounds for dimension 0 with size 1000
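For context, the failing index can be reproduced without running the model. The snippet below is a reconstruction (an assumption, not verbatim library code) of the scheduler logic: the timestep grid is assumed to be an integer-stride range over the 1000 training timesteps shifted by `offset` (the Stable Diffusion pipeline uses `offset=1`), and `_get_prev_sample` then reads `alphas_cumprod[timestep + 1 - offset]`:

```python
NUM_TRAIN_TIMESTEPS = 1000
OFFSET = 1  # assumption: the Stable Diffusion pipeline calls set_timesteps with offset=1

def max_index_accessed(num_inference_steps: int) -> int:
    """Largest index read from alphas_cumprod during inference (sketch)."""
    # Reconstruction (assumption) of the scheduler's timestep grid:
    # an integer-stride range over the training timesteps, shifted by OFFSET.
    step = NUM_TRAIN_TIMESTEPS // num_inference_steps
    last_timestep = max(range(0, NUM_TRAIN_TIMESTEPS, step)) + OFFSET
    # _get_prev_sample reads self.alphas_cumprod[timestep + 1 - self._offset]
    return last_timestep + 1 - OFFSET

for n in (3, 5, 9, 27):
    idx = max_index_accessed(n)
    status = "IndexError" if idx >= NUM_TRAIN_TIMESTEPS else "ok"
    print(f"num_inference_steps={n}: max index {idx} -> {status}")
```

Under this reconstruction, 3, 9, and 27 all push the largest index to 1000 (out of bounds for the size-1000 `alphas_cumprod`) because the stride `1000 // n` divides 999 exactly, while a value like 5 stays in range. That matches both the values listed in the title and `IndexError: index 1000` in the traceback.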
System Info
- diffusers version: 0.4.0.dev0
- Platform: Linux-5.19.7-arch1-1-x86_64-with-glibc2.36
- Python version: 3.10.6
- PyTorch version (GPU?): 1.12.1+cu102 (True)
- Huggingface_hub version: 0.9.1
- Transformers version: 4.21.3
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 20 (18 by maintainers)
I opened #465 to keep this on track about the indexing error / confusion.
Okay, I’ve tracked down what `offset` is for: it seems to be used to make the final timestep of inference in stable diffusion 1 rather than 0 (as you said). The option is just a more elegant approach that makes it future proof, but I see your point that making it a `bool` may be better (if you want that, open a new issue). You can see where it is used in diffusers with this search (the stable diffusion pipelines). This matches the stable diffusion implementation here (with a weird comment from the authors).
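To make the role of `offset` concrete, here is a minimal sketch of the timestep grid. The range-based construction is an assumption mirroring the `set_timesteps` logic, not verbatim library code:

```python
NUM_TRAIN_TIMESTEPS = 1000

def make_timesteps(num_inference_steps: int, offset: int) -> list[int]:
    # Assumed arange-style grid mirroring set_timesteps; not verbatim library code.
    step = NUM_TRAIN_TIMESTEPS // num_inference_steps
    return [t + offset for t in range(0, NUM_TRAIN_TIMESTEPS, step)]

# Denoising iterates from the largest timestep down to the smallest,
# so the first grid point (plus offset) is the final inference timestep.
print(make_timesteps(5, offset=0)[0])  # final inference timestep is 0
print(make_timesteps(5, offset=1)[0])  # final inference timestep is 1
```

So the entire grid just shifts by one: with `offset=1` inference ends on timestep 1 instead of 0, which is why it is equivalent to a boolean toggle in practice.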
Now for the harder one: checking the index error. I remember encountering this in the past; I’ll likely add a test to the schedulers to cover it.
@natolambert Thanks for fixing this quickly. I will test it tomorrow too. Yesterday I also found some other problematic iteration counts besides 3.
@natolambert sure I will!
I’m also looking through openai/glide-text2im and I’m almost sure they don’t use the 1 offset there, so maybe that’s the reason for making it configurable.
@natolambert yeah, my confusion was mostly that in the original implementation it is set to 1, and the same in the stable diffusion pipeline, so I wondered if `offset=0` is ever used. Now I looked for `set_timesteps`, and it looks like in most cases offset is not specified. Interestingly, in CompVis/latent-diffusion they also use an offset of 1 (ref), which is not the case in the latent diffusion pipeline (ref); not sure whether that’s expected or not.
Another possible inconsistency I found is this:
https://github.com/huggingface/diffusers/blob/8eaaa546d89f836b716e92348786d878f883ee86/src/diffusers/schedulers/scheduling_pndm.py#L338-L339
https://github.com/huggingface/diffusers/blob/8eaaa546d89f836b716e92348786d878f883ee86/src/diffusers/schedulers/scheduling_ddim.py#L204-L205
It’s equivalent for `offset=1`, but not for `offset=0`.
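To spell out that off-by-one, here is a minimal check. The two index expressions are reconstructed from the linked lines (PNDM reading `alphas_cumprod[timestep + 1 - self._offset]`, DDIM reading `alphas_cumprod[timestep]`), so treat this as a sketch:

```python
def pndm_alpha_index(timestep: int, offset: int) -> int:
    # Index used on the linked PNDM line: alphas_cumprod[timestep + 1 - self._offset]
    return timestep + 1 - offset

def ddim_alpha_index(timestep: int) -> int:
    # Index used on the linked DDIM line: alphas_cumprod[timestep]
    return timestep

t = 500
assert pndm_alpha_index(t, offset=1) == ddim_alpha_index(t)      # identical for offset=1
assert pndm_alpha_index(t, offset=0) == ddim_alpha_index(t) + 1  # off by one for offset=0
```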
Thanks for all the details on this, I’m going to take a look.