diffusers: Unmasked areas are degraded during inpainting.

Describe the bug

Unmasked areas are degraded during inpainting.

init_image

It is easy to see the difference under the bench. (This is a GIF animation.)

output

Is this a bug? Or is there a way to generate an image while keeping everything outside the mask identical to the original?

Reproduction

I tried inpainting according to the following documentation: Text-guided image-inpainting.

The same results were obtained both by running the code myself and via the HuggingFace Space.
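For reference, the inpainting call I followed looks roughly like this (a minimal sketch based on the linked documentation page; the file names are placeholders, and running it requires a GPU and a model download):

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

# Checkpoint and prompt are the ones used in the linked documentation.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

# Placeholder file names; white areas of the mask are inpainted.
init_image = Image.open("init_image.png").convert("RGB").resize((512, 512))
mask_image = Image.open("mask_image.png").convert("RGB").resize((512, 512))

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
result = pipe(prompt=prompt, image=init_image, mask_image=mask_image).images[0]
# The unmasked areas of `result` differ slightly from `init_image`.
```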

Logs

No response

System Info

HuggingFace Space

or

Google Colab (!pip install --upgrade diffusers[torch] transformers)

- `diffusers` version: 0.17.1
- Platform: Linux-5.15.107+-x86_64-with-glibc2.31
- Python version: 3.10.12
- PyTorch version (GPU?): 2.0.1+cu118 (True)
- Huggingface_hub version: 0.15.1
- Transformers version: 4.30.2
- Accelerate version: 0.20.3
- xFormers version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

Who can help?

@patrickvonplaten @sayakpaul @williamberman

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 16 (8 by maintainers)

Most upvoted comments

@patrickvonplaten Thanks!

I could see the difference in your sample images. I think it is important to explain why keeping unmasked areas unchanged is a necessary feature.

For example, if someone uses inpainting in their work, keeping the unmasked areas unchanged matters, because real work often requires that those areas not be modified at all. In addition, if the inpainting process has to be repeated multiple times, the changes accumulate, and the result drifts further and further from the original image, which such requirements cannot tolerate.

I’d like to help with a PR for that, as I believe it’s a necessary feature, but it seemed difficult to understand the relevant code right away.

So for now I enforce it myself with a post-processing step that replaces the unmasked areas with the original image. I realize this is a crude workaround, of course.

from PIL import Image
import cv2
import numpy as np

# The unmasked areas of an image generated by diffusers are changed a little bit,
# so we paste the original pixels back over them.
def fix_converted_image(
    original_image: Image.Image, generated_image: Image.Image, mask_image: Image.Image
) -> Image.Image:
    # PIL.Image to numpy (force RGB so all arrays have the same shape)
    original_image_array = np.array(original_image.convert("RGB"))
    generated_image_array = np.array(generated_image.convert("RGB"))

    # invert mask so the unmasked (kept) areas become white; convert to RGB
    # so the comparison broadcasts against the 3-channel image arrays
    mask_image_array = np.array(mask_image.convert("RGB"))
    mask_image_inverted = cv2.bitwise_not(mask_image_array)

    # keep original pixels wherever the inverted mask is white
    converted_image_array = np.where(
        mask_image_inverted == 255, original_image_array, generated_image_array
    )

    # numpy back to PIL.Image
    converted_image = Image.fromarray(converted_image_array)
    return converted_image
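To see the patching step in isolation, here is an equivalent NumPy-only sketch on tiny synthetic images (the colors, sizes, and single "inpainted" pixel are made up for illustration; it avoids the cv2 dependency by selecting on the un-inverted mask directly):

```python
from PIL import Image
import numpy as np

# Hypothetical tiny example: a red "original", a blue "generated" image,
# and a mask that is white (255) where generation is allowed.
original = Image.new("RGB", (4, 4), (255, 0, 0))
generated = Image.new("RGB", (4, 4), (0, 0, 255))
mask = Image.new("L", (4, 4), 0)
mask.putpixel((0, 0), 255)  # only this one pixel is "inpainted"

orig_a = np.array(original)
gen_a = np.array(generated)
mask_a = np.array(mask.convert("L"))[..., None]  # (H, W, 1) for broadcasting

# Keep original pixels wherever the mask is black (the unmasked region).
fixed = np.where(mask_a == 0, orig_a, gen_a)
result = Image.fromarray(fixed.astype(np.uint8))
# result is red everywhere except the single blue pixel at (0, 0)
```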

@dg845 that sounds great to me!