peft: `torch.compile()` fails on PEFT LoRA in `diffusers`

Unsupported: call_method ListIteratorVariable() __contains__ [ConstantVariable(str)] {}

from user code:
   File "/usr/local/lib/python3.10/dist-packages/diffusers/models/unet_2d_condition.py", line 1047, in forward
    scale_lora_layers(self, lora_scale)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/utils/peft_utils.py", line 114, in scale_lora_layers
    module.scale_layer(weight)
  File "/usr/local/lib/python3.10/dist-packages/peft/tuners/lora/layer.py", line 165, in scale_layer
    if active_adapter not in self.lora_A.keys():

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information


You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True

Code:

from diffusers import DiffusionPipeline
import torch

pipe_id = "stabilityai/stable-diffusion-xl-base-1.0"
pipe = DiffusionPipeline.from_pretrained(pipe_id, torch_dtype=torch.float16).to("cuda")

pipe.load_lora_weights("CiroN2022/toy-face", weight_name="toy_face_sdxl.safetensors", adapter_name="toy")

pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

prompt = "toy_face of a hacker with a hoodie"

lora_scale = 0.9
image = pipe(
    prompt, num_inference_steps=30, cross_attention_kwargs={"scale": lora_scale}, generator=torch.manual_seed(0)
).images[0]

Reproduction Colab: https://colab.research.google.com/gist/sayakpaul/986f8b0b70945a77a3ba301adc4c2969/scratchpad.ipynb

Cc: @patrickvonplaten @apolinario

About this issue

  • State: closed
  • Created 7 months ago
  • Comments: 15 (11 by maintainers)

Most upvoted comments

Agreed. I thought documenting it and raising awareness would be beneficial, since we field questions about this from the community all the time.

Hello,

These numbers are expected: with PEFT there are graph breaks due to the conditional logic needed to handle multiple adapters. Without PEFT there are no such breaks, at the cost of multiple adapters not being supported.
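
For anyone who wants to see where those breaks happen, below is a small, hypothetical sketch using torch._dynamo.explain (PyTorch 2.1+ signature) on a toy module that imitates the adapter-membership check from the traceback. The real PEFT layers are more involved, and whether this exact toy reproduces a break depends on your PyTorch version.

import torch
import torch._dynamo as dynamo

class ToyLoraLayer(torch.nn.Module):
    # Toy stand-in for a PEFT LoRA layer; not the actual PEFT code.
    def __init__(self):
        super().__init__()
        self.lora_A = torch.nn.ModuleDict({"toy": torch.nn.Linear(8, 8)})
        self.active_adapter = "toy"

    def forward(self, x):
        # Adapter-membership check, analogous to
        # `if active_adapter not in self.lora_A.keys():` in peft/tuners/lora/layer.py.
        if self.active_adapter in self.lora_A.keys():
            x = x + self.lora_A[self.active_adapter](x)
        return x

model = ToyLoraLayer()
x = torch.randn(2, 8)

# explain() reports the graphs Dynamo captured and any graph-break reasons.
explanation = dynamo.explain(model)(x)
print(explanation)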

Hello, mode="reduce-overhead", fullgraph=True shouldn’t be used with PEFT. Becuase of support for multiple adapters there is conditional logic present such as checking if an adapter is active. Because of this fullgraph is never possible and the error you are seeing is because of these options which break during the first encounter of conditional logic. Removing these options and then compiling should work as per my last checks.

It does.

My 2 cents here is that this issue should have very low priority: why would one not just fuse the LoRA and then apply torch.compile to the whole UNet? I don’t see a use case where one would torch.compile the UNet + LoRA without merging the LoRA.
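
For reference, a minimal sketch of that fuse-then-compile approach, assuming the diffusers fuse_lora() API and its lora_scale argument; once the LoRA is merged, the scale no longer needs to be passed via cross_attention_kwargs.

import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights(
    "CiroN2022/toy-face", weight_name="toy_face_sdxl.safetensors", adapter_name="toy"
)

# Merge the LoRA weights into the base UNet so no adapter-specific
# conditional logic is left at inference time.
pipe.fuse_lora(lora_scale=0.9)

# With the LoRA merged, the stricter compile options can be tried again.
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

image = pipe(
    "toy_face of a hacker with a hoodie",
    num_inference_steps=30,
    generator=torch.manual_seed(0),
).images[0]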

Keeping it open for awareness then?

Indeed, PEFT currently doesn’t work with torch.compile (maybe some methods do, but not, for instance, LoRA). We have reported back to PyTorch and are waiting for a fix. There isn’t really anything on our side we could do, short of completely rewriting PEFT, so 🤞 that the PyTorch devs resolve some of the torch.compile limitations.

We have a GH action that runs all PEFT tests with torch.compile enabled and allows us to see what does and doesn’t work right now. It has to be triggered manually.