diffusers: Allow safety checker pipeline configuration that returns boolean array but does not black out images
Is your feature request related to a problem? Please describe.
See this PR comment: https://github.com/huggingface/diffusers/pull/815#discussion_r994418216
TLDR: with the recent changes in #815, developers have the ability to disable the safety checker. Currently, the only options available to devs are to have the safety checker or not have it at all. While this is useful, many applications of NSFW content require opt-in access from end users. For example, consider the Reddit NSFW model: the end user is shown an 'NSFW' overlay that they have to manually click through. Currently, the diffusers library does not make it easy to support such a use case.
Describe the solution you’d like
I think the best approach is to add a flag to the SafetyChecker class called black_out_images. This flag would then modify the if statement on this line: https://github.com/huggingface/diffusers/blob/797b290ed09a84091a4c23884b7c104f8e94b128/src/diffusers/pipelines/stable_diffusion/safety_checker.py#L74
for idx, has_nsfw_concept in enumerate(has_nsfw_concepts):
    if has_nsfw_concept and black_out_images:
        images[idx] = np.zeros(images[idx].shape)  # black image
The flag would then be passed into the SafetyChecker from the top level pipeline config.
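For illustration, here is roughly how that could look from user code. The black_out_images flag is the proposal above and does not exist in diffusers today; the model IDs and pipeline calls are otherwise standard usage:

```python
from diffusers import StableDiffusionPipeline
from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker

# Load the stock checker, then opt out of blacking flagged images out.
# NOTE: black_out_images is hypothetical; it is the flag proposed above.
safety_checker = StableDiffusionSafetyChecker.from_pretrained(
    "CompVis/stable-diffusion-safety-checker"
)
safety_checker.black_out_images = False

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", safety_checker=safety_checker
)

out = pipe("a photo of an astronaut riding a horse")
# Images would come back untouched; the flags tell the application what to overlay or blur.
images, nsfw_flags = out.images, out.nsfw_content_detected
```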
Describe alternatives you’ve considered
Another alternative is to do this at the pipeline level. For example, we could pass in a flag to the Pipeline class called black_out_nsfw_images. This flag would then modify the safety_checker call here: https://github.com/huggingface/diffusers/blob/797b290ed09a84091a4c23884b7c104f8e94b128/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L335
safety_checker_input = self.feature_extractor(self.numpy_to_pil(image), return_tensors="pt").to(self.device)
cleaned_image, has_nsfw_concept = self.safety_checker(
    images=image, clip_input=safety_checker_input.pixel_values.to(text_embeddings.dtype)
)
if black_out_nsfw_images:
    image = cleaned_image
Additional context
In both cases, I believe the config can default to 'nsfw images will be blacked out'. Having the option is critical, however.
My solution for how to get the NSFW status and blur the images:
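A sketch of that approach (the wrapper below is my assumption about how the checker gets modified so it only reports flags; flagged images are then blurred with torchvision's GaussianBlur):

```python
from torchvision import transforms
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe = pipe.to("cuda")

orig_checker = pipe.safety_checker

def flag_only_checker(images, clip_input):
    # Run the stock checker on a copy (it blacks out flagged images in place),
    # keep its flags, but return the original, untouched images.
    _, has_nsfw = orig_checker(images=images.copy(), clip_input=clip_input)
    return images, has_nsfw

pipe.safety_checker = flag_only_checker

blur = transforms.GaussianBlur(kernel_size=51, sigma=20.0)

out = pipe("a photo of an astronaut riding a horse")
safe_images = []
for img, is_nsfw in zip(out.images, out.nsfw_content_detected):
    if is_nsfw:
        # Blur instead of blacking out, so the user can still judge the composition.
        img = blur(img)  # GaussianBlur accepts PIL images directly
    safe_images.append(img)
```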
I was ok with @patrickvonplaten's proposal, and thought it neatly solved everyone's concerns. See: https://github.com/huggingface/diffusers/issues/845#issuecomment-1280541219
Patrick offered implementing a blurred image as the default, but I would rather the options be "enabled", "warn_only", and "none".
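A minimal sketch of how such a switch could behave, assuming "enabled" keeps the current black-out, "warn_only" just passes images through and returns the flags, and "none" means the checker never runs (that reading and the helper below are mine, not an agreed design):

```python
import numpy as np

def apply_safety_mode(images, has_nsfw_concepts, mode="enabled"):
    """Hypothetical helper, not part of diffusers."""
    if mode == "none":
        # Checker output is ignored entirely (or the checker never ran).
        return images
    if mode == "warn_only":
        # Images pass through unchanged; the caller decides what to do with the flags.
        return images
    if mode == "enabled":
        # Current behavior: flagged images are blacked out.
        for idx, flagged in enumerate(has_nsfw_concepts):
            if flagged:
                images[idx] = np.zeros(images[idx].shape)
        return images
    raise ValueError(f"unknown safety mode: {mode!r}")
```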
Thanks so much for this discussion, and just popping in to say I completely agree that Hugging Face should create, and maintain, code for post-processing generated content (not limited to blurring, but that’s definitely one). IIRC, the current safety/censoring approach came from discussions with Stability AI – it definitely wasn’t one I had recommended.
From my perspective, there just haven’t been enough engineers around at HF with “free time” to do the more nuanced work needed here. We’re creating a job description for this task (and others), but it’s not approved yet + not up yet + not hired-for yet.
In the meantime, any community work on this would be beneficial for everyone (I think), and hopefully makes sense since Stable Diffusion is a community contribution. We'd want to have Stability AI agree to the approach too, since we'd be changing from what had been worked through earlier.
@patrickvonplaten
I think it should at least make sense… Right now, it returns a boolean for whether an image is NSFW, and then also returns a black image. So even from the highest-level API consumer's standpoint, the implementation just doesn't make sense. If you were going to return a boolean for a developer to act on (a message, blurring the image, etc.), you'd also need to give them something to act on.
It's not usable in any API sense, high, low, or whatever. It just doesn't make sense. You even advertise the use of safety_checker.py in this repo's documentation, but it has no use beyond brute-force censoring. There's no API aspect to it.
Is this really more about liabilities, shoving it off to “community” pipelines?
Considering that basically everyone who isn't running a service has just straight up disabled this with a dummy function or None, it does seem to be not so much about what's right for the usage, but something ulterior. If someone wants to censor, they can, and they should be able to use the safety checker as it seems inherently implied to be used: as a warning to the developer, who then handles it as they please. If you do keep it as is, please think about renaming it so it makes sense, like "safety_censor". It ain't checking Jack (from a usage and implementation standpoint).
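For reference, the workarounds alluded to above typically look something like this (a common community pattern; the dummy function's exact shape is my assumption, it just has to match the (images, clip_input) -> (images, flags) contract shown earlier in this issue):

```python
from diffusers import StableDiffusionPipeline

# Variant 1: drop the checker entirely at load time (diffusers warns about this).
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", safety_checker=None
)

# Variant 2: keep the pipeline intact but swap in a no-op checker.
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

def dummy_checker(images, clip_input):
    # Report nothing as NSFW and return the images untouched.
    return images, [False] * len(images)

pipe.safety_checker = dummy_checker
```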
Wouldn’t it be far more beneficial to return Gaussian blurred images, like in the Discord beta? If the image is black, you don’t know if this is something to refine in your prompt, or a prompt to scrap.
The way I have edited the safety checker is to simply return the image regardless. Now it's up to the developer to make use of the has_nsfw_concepts flags and decide what to do with them. In my case, I have chosen the Gaussian blur, so the user can see whether it was a false positive.
It actually doesn't make much sense to return a has_nsfw_concepts type deal if you're just returning a black image… clearly, they'll get it. Lol
@alexisrolland as far as I remember, it is from Torchvision
No, I have not had the time to do more with this. I removed the checker altogether for my use case.
The false positives made it challenging, and unless individuals are specifically requesting NSFW prompts, the bulk of the outputs seem to be kosher.
I won’t have time to look into this anytime soon. @justinmerrell feel free to open a PR for this one (should be quite simple) if you need it urgently
@meg-huggingface can you comment more on the discussion with Stability AI? I was under the impression that the Stable Diffusion model is tangentially related to Stability AI at best (they don’t seem to be the maintainers on the original SD repo, nor are they authors on the paper), so I’m curious why Stability AI would be involved in any discussions around usage of the model
I don't think this library exists to safely and easily enable many people to use SD. It's an API that allows developers to build services, which then let many people use SD. Installing Python and dependencies and copying/pasting and customizing a script is in no way "many people" (or easy), and it inherently reaches a minuscule fraction of them compared to the user bases of services like mine and others.
To that end, an arbitrary safety "checker" that doesn't do what it describes, or even do what it's programmed to do correctly, is hardly beneficial to an API.
Having an option for black-out / heavy Gaussian blur was my initial idea (and I have the Gaussian blur implemented so a user can actually screen what's going on in a safe, non-explicit manner, as I described above).
And let's be honest, the safety checker is weird. It censors random stuff and you don't know why. Was it actually explicit (did it just throw in a random nude person or something? Happens.)? You'd likely be able to tell through a non-descriptive Gaussian blur, and then make use of negative prompts.