diffusers: DreamBooth Diffusers tutorial failed
Describe the bug
Issue with the DreamBooth Diffusers tutorial. After executing the accelerate launch command, the script breaks with the following RuntimeError:
RuntimeError: Given groups=1, weight of size [320, 4, 3, 3], expected input[1, 3, 512, 512] to have 4 channels, but got 3 channels instead
I’ve not been able to locate the source of the issue, and I haven’t seen any other reports of this particular error, so I’d appreciate any guidance on what steps I could take to debug this.
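For reference, here is a minimal sketch (my own illustration, not code taken from the training script) of what the message means: the UNet’s first convolution has a weight of shape [320, 4, 3, 3], i.e. it expects 4-channel VAE latents, so passing it a 3-channel 512x512 pixel batch reproduces exactly this error.

import torch
import torch.nn as nn

# Stand-in for the UNet's conv_in layer: Conv2d(4, 320, 3, padding=1) has a weight of shape [320, 4, 3, 3]
conv_in = nn.Conv2d(4, 320, kernel_size=3, padding=1)

rgb_batch = torch.randn(1, 3, 512, 512)   # raw 3-channel image batch
latent_batch = torch.randn(1, 4, 64, 64)  # 4-channel VAE latent batch (512 / 8 = 64)

try:
    conv_in(rgb_batch)  # raises the same RuntimeError as in the logs below
except RuntimeError as err:
    print(err)

print(conv_in(latent_batch).shape)  # torch.Size([1, 320, 64, 64])

In the training script the 4-channel input should come from encoding the instance images with the VAE, so the traceback below suggests the UNet received raw pixel values rather than latents.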
Here’s the initial screenshot, where the script successfully downloaded the necessary files such as the safetensors.
I’m attaching a log of the result from executing the script again (without the download traces, for clearer visualization of the warnings/errors).
The command I’m trying to execute is the following:
accelerate launch train_dreambooth.py --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" --instance_data_dir="./dog" --output_dir="./test" --instance_prompt="a photo of sks dog" --resolution=512 --train_batch_size=1 --gradient_accumulation_steps=1 --learning_rate=5e-6 --lr_scheduler="constant" --lr_warmup_steps=0 --max_train_steps=400
Reproduction
To reproduce this error, all you need is to follow the steps here:
https://huggingface.co/docs/diffusers/training/dreambooth
Logs
D:\Anaconda\lib\site-packages\accelerate\accelerator.py:258: FutureWarning: `logging_dir` is deprecated and will be removed in version 0.18.0 of 🤗 Accelerate. Use `project_dir` instead.
warnings.warn(
05/27/2023 17:07:31 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cpu
Mixed precision type: no
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'variance_type', 'dynamic_thresholding_ratio', 'prediction_type', 'sample_max_value', 'clip_sample_range', 'thresholding'} was not found in config. Values will be initialized to default values.
{'addition_embed_type_num_heads', 'class_embeddings_concat', 'resnet_skip_time_act', 'encoder_hid_dim', 'time_cond_proj_dim', 'time_embedding_dim', 'encoder_hid_dim_type', 'upcast_attention', 'only_cross_attention', 'dual_cross_attention', 'use_linear_projection', 'class_embed_type', 'projection_class_embeddings_input_dim', 'resnet_out_scale_factor', 'mid_block_only_cross_attention', 'cross_attention_norm', 'conv_in_kernel', 'addition_embed_type', 'timestep_post_act', 'conv_out_kernel', 'time_embedding_type', 'time_embedding_act_fn', 'mid_block_type', 'num_class_embeds', 'resnet_time_scale_shift'} was not found in config. Values will be initialized to default values.
05/27/2023 17:07:37 - INFO - __main__ - ***** Running training *****
05/27/2023 17:07:37 - INFO - __main__ - Num examples = 5
05/27/2023 17:07:37 - INFO - __main__ - Num batches each epoch = 5
05/27/2023 17:07:37 - INFO - __main__ - Num Epochs = 80
05/27/2023 17:07:37 - INFO - __main__ - Instantaneous batch size per device = 1
05/27/2023 17:07:37 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 1
05/27/2023 17:07:37 - INFO - __main__ - Gradient Accumulation steps = 1
05/27/2023 17:07:37 - INFO - __main__ - Total optimization steps = 400
Steps: 0%| | 0/400 [00:00<?, ?it/s]Traceback (most recent call last):
File "D:\stableDiffusionFolder\modelTraining\diffusers\examples\dreambooth\train_dreambooth.py", line 1323, in <module>
main(args)
File "D:\stableDiffusionFolder\modelTraining\diffusers\examples\dreambooth\train_dreambooth.py", line 1194, in main
model_pred = unet(noisy_model_input, timesteps, encoder_hidden_states).sample
File "D:\Anaconda\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Anaconda\lib\site-packages\diffusers\models\unet_2d_condition.py", line 807, in forward
sample = self.conv_in(sample)
File "D:\Anaconda\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Anaconda\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "D:\Anaconda\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [320, 4, 3, 3], expected input[1, 3, 512, 512] to have 4 channels, but got 3 channels instead
Steps: 0%| | 0/400 [00:01<?, ?it/s]
Traceback (most recent call last):
File "D:\Anaconda\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "D:\Anaconda\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "D:\Anaconda\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "D:\Anaconda\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "D:\Anaconda\lib\site-packages\accelerate\commands\launch.py", line 918, in launch_command
simple_launcher(args)
File "D:\Anaconda\lib\site-packages\accelerate\commands\launch.py", line 580, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\\Anaconda\\python.exe', 'train_dreambooth.py', '--pretrained_model_name_or_path=CompVis/stable-diffusion-v1-4', '--instance_data_dir=./dog', '--output_dir=./test', '--instance_prompt=a photo of sks dog', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--learning_rate=5e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=400']' returned non-zero exit status 1.
System Info
Windows, Python 3.9.13
All of the libraries were installed from source, from this repository, via build.py and requirements.txt, but if needed I can provide the versions of any libraries you think could be causing this issue.
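If it helps, here is a quick snippet to collect those versions (assuming the packages are importable from the same environment the script runs in):

import accelerate
import diffusers
import torch
import transformers

# Print the versions of the libraries most relevant to this report
print("torch:", torch.__version__)
print("diffusers:", diffusers.__version__)
print("transformers:", transformers.__version__)
print("accelerate:", accelerate.__version__)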
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 2
- Comments: 19 (9 by maintainers)
@sayakpaul @CamooCodee I was successful at running the training on the Windows setup, but with the LoRA variant.
I changed the PyTorch version to 1.13.1 as well; I haven’t tried 2.1 because I’m too scared I’m going to break it all again. I also reduced the resolution to 256. Here are the versions:
diffusers version: 0.17.0.dev0
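For reference, the LoRA run was launched with a command roughly along these lines (a sketch only; the flags mirror the non-LoRA script, train_dreambooth_lora.py sits in the same examples/dreambooth folder, and the 1e-4 learning rate is the value typically used for LoRA; double-check the arguments against your local copy):
accelerate launch train_dreambooth_lora.py --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" --instance_data_dir="./dog" --output_dir="./test" --instance_prompt="a photo of sks dog" --resolution=256 --train_batch_size=1 --gradient_accumulation_steps=1 --learning_rate=1e-4 --lr_scheduler="constant" --lr_warmup_steps=0 --max_train_steps=400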