imaginAIry: Error when using Custom models (i.e. DReambooth)

I am using diffusers_sd_v2/scripts from https://github.com/ShivamShrirao/diffusers

I created my own model via:

accelerate launch train_dreambooth.py

Then ran conversion via:

python3 convert_diffusers_to_original_stable_diffusion.py --model_path ...

Reshaping encoder.mid.attn_1.q.weight for SD format
Reshaping encoder.mid.attn_1.k.weight for SD format
Reshaping encoder.mid.attn_1.v.weight for SD format
Reshaping encoder.mid.attn_1.proj_out.weight for SD format
Reshaping decoder.mid.attn_1.q.weight for SD format
Reshaping decoder.mid.attn_1.k.weight for SD format
Reshaping decoder.mid.attn_1.v.weight for SD format
Reshaping decoder.mid.attn_1.proj_out.weight for SD format

Finishes without errors.

When loading model into imaginAIry I get this:

imagine --model /home/ubuntu/.cache/model.ckpt  "horse in the forest" 

🤖🧠 imaginAIry received 1 prompt(s) and will repeat them 1 times to create 1 images.
Generating 🖼  1/1: "horse in the forest" 512x512px seed:324180400 prompt-strength:7.5 steps:15 sampler-type:k_dpmpp_2m
Loading model /home/ubuntu/.cache/ima.ckpt onto cuda backend...
/usr/lib/python3/dist-packages/apport/report.py:13: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import fnmatch, glob, traceback, errno, sys, atexit, locale, imp, stat
Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/imagine", line 8, in <module>
    sys.exit(imagine_cmd())
  File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/imaginairy/cmds.py", line 266, in imagine_cmd
    imagine_image_files(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/imaginairy/api.py", line 75, in imagine_image_files
    for result in imagine(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/imaginairy/api.py", line 134, in imagine
    model = get_diffusion_model(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/imaginairy/model_manager.py", line 115, in get_diffusion_model
    return _get_diffusion_model(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/imaginairy/model_manager.py", line 174, in _get_diffusion_model
    model.num_timesteps_cond  # noqa
  File "/home/ubuntu/.local/lib/python3.8/site-packages/imaginairy/model_manager.py", line 50, in __getattr__
    model = load_model_from_config(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/imaginairy/model_manager.py", line 91, in load_model_from_config
    missing_keys, unexpected_keys = model.load_state_dict(state_dict, strict=False)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1667, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
	size mismatch for model.diffusion_model.input_blocks.1.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
	size mismatch for model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
	size mismatch for model.diffusion_model.input_blocks.1.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.2.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
	size mismatch for model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
	size mismatch for model.diffusion_model.input_blocks.2.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.4.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
	size mismatch for model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
	size mismatch for model.diffusion_model.input_blocks.4.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.5.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
	size mismatch for model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
	size mismatch for model.diffusion_model.input_blocks.5.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.7.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.input_blocks.7.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.8.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.input_blocks.8.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.middle_block.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.middle_block.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.3.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.output_blocks.3.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.4.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.output_blocks.4.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.5.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
	size mismatch for model.diffusion_model.output_blocks.5.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.6.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
	size mismatch for model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
	size mismatch for model.diffusion_model.output_blocks.6.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.7.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
	size mismatch for model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
	size mismatch for model.diffusion_model.output_blocks.7.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.8.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
	size mismatch for model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
	size mismatch for model.diffusion_model.output_blocks.8.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.9.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
	size mismatch for model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
	size mismatch for model.diffusion_model.output_blocks.9.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.10.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
	size mismatch for model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
	size mismatch for model.diffusion_model.output_blocks.10.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.11.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
	size mismatch for model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
	size mismatch for model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
	size mismatch for model.diffusion_model.output_blocks.11.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
	size mismatch for cond_stage_model.transformer.text_model.embeddings.token_embedding.weight: copying a param with shape torch.Size([49408, 1024]) from checkpoint, the shape in current model is torch.Size([49408, 768]).
	size mismatch for cond_stage_model.transformer.text_model.embeddings.position_embedding.weight: copying a param with shape torch.Size([77, 1024]) from checkpoint, the shape in current model is torch.Size([77, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.0.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.1.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.2.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.3.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.4.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.5.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.v_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.out_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.out_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc1.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc2.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm2.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.final_layer_norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for cond_stage_model.transformer.text_model.final_layer_norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).

About this issue

Original URL
State: closed
Created 2 years ago
Comments: 19 (10 by maintainers)

Most upvoted comments

I’m working on this but I got a little distracted… working on allowing imaginairy to train models by itself.

brycedrennan on Jan 10, 2023

docs for training: https://github.com/brycedrennan/imaginAIry/blob/master/docs/concept-training.md

brycedrennan on Jan 16, 2023

I do have a branch where I got a parallax effect working: https://www.youtube.com/shorts/GslIUQ3KS2M

brycedrennan on Jan 16, 2023

I also added model training (similar to dreambooth). It works for sure but I can’t say I ever got results trained on my own face that felt as consistent as what I see people getting online.

I tried to make it an easy workflow at least. It uses face detection to auto-crop the provided images and will upscale any images that are not big enough.

brycedrennan on Jan 16, 2023