diffusers: Potential regression in deterministic outputs
Describe the bug
I’ve started noticing different outputs in recent versions of diffusers, starting from 0.4.0, compared with 0.3.0. This is my test code (extracted from a notebook):
import diffusers
from diffusers import StableDiffusionPipeline, DDIMScheduler
import torch
from IPython.display import display

def run_tests(pipe):
    torch.manual_seed(1000)
    display(pipe("A photo of Barack Obama smiling with a big grin").images[0])
    torch.manual_seed(1000)
    display(pipe("Labrador in the style of Vermeer").images[0])

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe = pipe.to("cuda")
run_tests(pipe)
The first prompt produces identical results. The second one, however, results in different outputs:
[Image: output with 0.3.0]

[Image: output with main@a3efa433eac5feba842350c38a1db29244963fb5]

Using DDIM, both prompts generate different images.
scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", num_train_timesteps=1000)
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", scheduler=scheduler)
pipe = pipe.to("cuda")
run_tests(pipe)
[Images: DDIM 0.3.0 vs. DDIM main, one pair per prompt]
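When comparing versions like this, one way to rule out leftover global RNG state between calls is to give each prompt its own freshly seeded generator instead of calling torch.manual_seed. The following is only a minimal sketch: it assumes the pipeline accepts a `generator` argument (as the Stable Diffusion pipeline does), and `run_seeded` and `generator_factory` are hypothetical names introduced here for illustration, not part of diffusers.

```python
def run_seeded(pipe, prompts, seed=1000, device="cuda", generator_factory=None):
    """Call `pipe` once per prompt, each time with a freshly seeded generator.

    `run_seeded` and `generator_factory` are illustrative names, not diffusers API.
    """
    if generator_factory is None:
        # Imported lazily so the helper can be exercised with a stand-in pipeline.
        import torch

        def generator_factory(s):
            # torch.Generator.manual_seed returns the generator itself.
            return torch.Generator(device=device).manual_seed(s)

    images = []
    for prompt in prompts:
        # A new generator per prompt, so no call depends on the previous one.
        generator = generator_factory(seed)
        images.append(pipe(prompt, generator=generator).images[0])
    return images
```

With the pipeline from the test code, this would be called as run_seeded(pipe, ["A photo of Barack Obama smiling with a big grin", "Labrador in the style of Vermeer"]). It does not by itself explain the differences between versions; it only removes one possible source of variation.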

In addition, a forum user posted about very different results in the img2img pipeline: https://discuss.huggingface.co/t/notable-differences-between-other-implementations-of-stable-diffusion-particularly-in-the-img2img-pipeline/24635/5. They also recently opened another issue, #901. I’m cross-referencing it here, although it may or may not be related to this one.
Reproduction
As explained above.
Logs
No response
System Info
diffusers: main @ a3efa433eac5feba842350c38a1db29244963fb5 vs v0.3.0
About this issue
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 20 (17 by maintainers)
Commits related to this issue
- replace new model_db.json (#902) — committed to nod-ai/diffusers by dan-garvey a year ago
- Revert "replace new model_db.json (#902)" (#904) This reverts commit 842adef29c642fde636cb5cd6e0e276d44aba65d. — committed to nod-ai/diffusers by powderluv a year ago
Once the pipeline tests are fully updated we should also make a doc explaining the problem with reproducibility in general with diffusion models. cc @anton-l
Small update here:
between 0.3.0 and 0.7.0dev using a V100
Overall, this issue now seems much less severe to me than it did originally, and a big part of it is probably simply due to “uncontrollable” randomness.
Next:
Just a bit curious:
What kind of difference are you checking/looking for here, @patrickvonplaten?
Well, if you mean there is no visual difference, there would still be a numerical difference, as I found in my analysis. I think it would still be a good idea to record when such differences occur between commits (or on a daily basis), so we can track them easily. But that’s just a suggestion.
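To make the suggestion concrete, here is a minimal sketch of how numerical differences between two generated images could be recorded per commit. It works on raw pixel bytes using only the standard library; `fingerprint` and `diff_stats` are hypothetical helper names, not something that exists in diffusers or its test suite.

```python
import hashlib


def fingerprint(pixels: bytes) -> str:
    """Return a SHA-256 hex digest of the raw pixel bytes (exact-match check)."""
    return hashlib.sha256(pixels).hexdigest()


def diff_stats(a: bytes, b: bytes) -> dict:
    """Summarize the per-byte difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same size and mode")
    # Iterating over bytes yields integers in the range 0..255.
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return {
        "identical": a == b,
        "max_abs_diff": max(diffs, default=0),
        "mean_abs_diff": sum(diffs) / len(diffs) if diffs else 0.0,
    }
```

For PIL images such as the ones returned by the pipeline, the inputs could be obtained with img.tobytes(). Storing the fingerprint alongside the commit hash would flag any change at all, while the statistics give a rough idea of whether a change is likely to be visible.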
Update:
0.4.0 seems to suffer from the same behavior as 0.6.0.