diffusers: xformers error

Describe the bug

WARNING:xformers:WARNING: /usr/local/lib/python3.8/dist-packages/xformers/_C.so: undefined symbol: _ZNK3c104impl13OperatorEntry20reportSignatureErrorENS0_12CppSignatureE
Need to compile C++ extensions to get sparse attention support. Please run python setup.py build develop
/usr/local/lib/python3.8/dist-packages/xformers/_C.so: undefined symbol: _ZNK3c104impl13OperatorEntry20reportSignatureErrorENS0_12CppSignatureE
/usr/local/lib/python3.8/dist-packages/diffusers/models/attention.py:435: UserWarning: Could not enable memory efficient attention. Make sure xformers is installed correctly and a GPU is available: No such operator xformers::efficient_attention_forward_cutlass - did you forget to build xformers with `python setup.py develop`?
  warnings.warn(
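The undefined-symbol warning above usually means the prebuilt xformers binary was compiled against a different PyTorch version than the one installed in the Colab runtime (an ABI mismatch), so the compiled attention operators never get registered. A minimal sketch for checking the installed versions, assuming a GPU runtime:

    # Sketch: print the torch / CUDA / xformers versions to spot a mismatch.
    # An "undefined symbol" in _C.so typically means the xformers wheel was
    # built against a different torch version than the one currently installed.
    import torch
    import xformers

    print("torch   :", torch.__version__, "| CUDA:", torch.version.cuda)
    print("xformers:", xformers.__version__)
    print("GPU     :", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none")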

Reproduction

No response

Logs

RuntimeError                              Traceback (most recent call last)
<ipython-input-6-6235797efbee> in <module>
     10 
     11 with autocast("cuda"), torch.inference_mode():
---> 12     images = pipe(
     13         prompt,
     14         height=height,

15 frames
/usr/local/lib/python3.8/dist-packages/xformers/ops/common.py in no_such_operator(*args, **kwargs)
      9 def get_xformers_operator(name: str):
     10     def no_such_operator(*args, **kwargs):
---> 11         raise RuntimeError(
     12             f"No such operator xformers::{name} - did you forget to build xformers with `python setup.py develop`?"
     13         )

RuntimeError: No such operator xformers::efficient_attention_forward_cutlass - did you forget to build xformers with `python setup.py develop`?

System Info

Colab notebook

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 26 (1 by maintainers)

Most upvoted comments

I’ve put the xformers wheels compiled by facebookresearch here (at least the ones that work on Google Colab for Tesla T4 and A100, the only ones I’ve tested):

https://github.com/brian6091/xformers-wheels/releases

@ShivamShrirao is there any update? Google Colab is still not working.

You have to comment out line 1 (with a #), uncomment line 5, and then wait about 42 minutes for a wheel to compile. I did that, ran a quick test, and it worked. I’m sure the link will be updated soon.
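For anyone unsure what that notebook edit boils down to, building xformers from source in a Colab cell is roughly the following (a sketch only; the actual notebook lines may differ, and the compile really does take on the order of 40+ minutes):

    # Sketch of a from-source xformers build on Colab (slow: roughly 40-90 min).
    # This follows the error message's own suggestion to build the C++/CUDA
    # extensions; the exact notebook lines being commented/uncommented may differ.
    !git clone --recursive https://github.com/facebookresearch/xformers.git
    %cd xformers
    !pip install -e .     # or: !python setup.py develop, as the warning suggests
    %cd ..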

Problem on Google Colab:

The following values were not passed to `accelerate launch` and had defaults used instead:
    --num_processes was set to a value of 1
    --num_machines was set to a value of 1
    --mixed_precision was set to a value of 'no'
    --num_cpu_threads_per_process was set to 1 to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
/usr/local/lib/python3.8/dist-packages/xformers/_C.so: undefined symbol: _ZNK3c104impl13OperatorEntry20reportSignatureErrorENS0_12CppSignatureE
WARNING: /usr/local/lib/python3.8/dist-packages/xformers/_C.so: undefined symbol: _ZNK3c104impl13OperatorEntry20reportSignatureErrorENS0_12CppSignatureE
Need to compile C++ extensions to get sparse attention support. Please run python setup.py build develop

Fetching 15 files: 100% 15/15 [00:50<00:00, 3.38s/it]
/usr/local/lib/python3.8/dist-packages/diffusers/models/attention.py:435: UserWarning: Could not enable memory efficient attention. Make sure xformers is installed correctly and a GPU is available: No such operator xformers::efficient_attention_forward_cutlass - did you forget to build xformers with python setup.py develop?
  warnings.warn(
Generating class images: 0% 0/13 [00:06<?, ?it/s]
Traceback (most recent call last):
  File "train_dreambooth.py", line 822, in <module>
    main(args)
  File "train_dreambooth.py", line 475, in main
    images = pipeline(example["prompt"]).images
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 532, in __call__
    noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/unet_2d_condition.py", line 341, in forward
    sample, res_samples = downsample_block(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/unet_2d_blocks.py", line 644, in forward
    hidden_states = attn(hidden_states, encoder_hidden_states=encoder_hidden_states).sample
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/attention.py", line 221, in forward
    hidden_states = block(hidden_states, context=encoder_hidden_states, timestep=timestep)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/attention.py", line 479, in forward
    hidden_states = self.attn1(norm_hidden_states) + hidden_states
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/attention.py", line 568, in forward
    hidden_states = self._memory_efficient_attention_xformers(query, key, value)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/attention.py", line 624, in _memory_efficient_attention_xformers
    hidden_states = xformers.ops.memory_efficient_attention(query, key, value, attn_bias=None)
  File "/usr/local/lib/python3.8/dist-packages/xformers/ops/memory_efficient_attention.py", line 967, in memory_efficient_attention
    return op.forward_no_grad(
  File "/usr/local/lib/python3.8/dist-packages/xformers/ops/memory_efficient_attention.py", line 343, in forward_no_grad
    return cls.FORWARD_OPERATOR(
  File "/usr/local/lib/python3.8/dist-packages/xformers/ops/common.py", line 11, in no_such_operator
    raise RuntimeError(
RuntimeError: No such operator xformers::efficient_attention_forward_cutlass - did you forget to build xformers with `python setup.py develop`?
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--pretrained_vae_name_or_path=stabilityai/sd-vae-ft-mse', '--output_dir=/content/drive/MyDrive/stable_diffusion_weights/zwx', '--revision=fp16', '--with_prior_preservation', '--prior_loss_weight=1.0', '--seed=1337', '--resolution=512', '--train_batch_size=1', '--train_text_encoder', '--mixed_precision=fp16', '--use_8bit_adam', '--gradient_accumulation_steps=1', '--learning_rate=1e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=50', '--sample_batch_size=4', '--max_train_steps=1000', '--save_interval=10000', '--save_sample_prompt=kayswe', '--concepts_list=concepts_list.json']' returned non-zero exit status 1.
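Before re-running `accelerate launch`, a quick way to confirm whether the xformers kernels are usable at all (independent of the training script) is to call the attention op directly; just a sketch, assuming a GPU runtime and fp16 tensors:

    # Sketch: call the xformers attention op directly on tiny tensors.
    # On a broken install this raises the same "No such operator
    # xformers::efficient_attention_forward_cutlass" error as the traceback above.
    import torch
    import xformers.ops

    q = torch.randn(2, 16, 40, device="cuda", dtype=torch.float16)
    k = torch.randn(2, 16, 40, device="cuda", dtype=torch.float16)
    v = torch.randn(2, 16, 40, device="cuda", dtype=torch.float16)

    try:
        out = xformers.ops.memory_efficient_attention(q, k, v)
        print("memory_efficient_attention OK:", tuple(out.shape))
    except RuntimeError as e:
        print("xformers kernels unavailable:", e)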

Thanks for steering the wheel for us, Brian.

I’ve put the xformers wheels compiled by facebookresearch here (at least the ones that work on Google Colab for Tesla T4 and A100, the only ones I’ve tested):

https://github.com/brian6091/xformers-wheels/releases

It worked! I’ve changed the xformers install line to:

%pip install -q https://github.com/brian6091/xformers-wheels/releases/download/0.0.15.dev0%2B4c06c79/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl
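One caveat (an assumption on my part): if xformers was already imported in the session, Colab generally needs a runtime restart after the %pip install before the new wheel is picked up. A quick check afterwards:

    # After restarting the runtime, confirm the newly installed wheel is the one in use.
    import xformers
    print(xformers.__version__)  # should report the 0.0.15.dev0+4c06c79 build installed above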

Thanks @brian6091, I have updated it in 58a03e9aafaeeacadb4392f21f5571cc8da8fd6d

I’ve done it a few times before; the wheel gets outdated a lot. It takes about 40 minutes to an hour or more, but it shouldn’t go much over 60.

I’ve just done a compile that took 1.5 hours 😬 Hope there’ll be a fix soon 🙏🏽 Colab: A100-SXM4-40GB, 40536 MiB, 40536 MiB

I’ve done it a few times before; the wheel gets outdated a lot. It takes about 40 minutes to an hour or more, but it shouldn’t go much over 60.

You have to comment out line 1 (with a #), uncomment line 5, and then wait about 42 minutes for a wheel to compile. I did that, ran a quick test, and it worked. I’m sure the link will be updated soon.

Thank you for this tip. It seems to work now; it got past the training step. Compiling the wheel did take way longer than 40 minutes… 🫠

Thank you so much. I hope it will be updated quickly.