black: Black does not honor exclude regex when files are explicitly listed on the command line

Operating system: OSX
Python version: 3.6.2
Black version: black, version 18.6b4

The problem: certain directories in our repo contain generated python code that we don’t want black to change. We’ve configured our repo to run black via pre-commit. Pre-commit invokes black with a list of changed files on the command line, and black’s exclude regex is not applied to those files and paths.

i.e.

black --exclude "/migrations/" content/migrations/0049_publicationstore_is_test.py
reformatted content/migrations/0049_publicationstore_is_test.py
All done! ✨ 🍰 ✨
1 file reformatted.
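
A quick check, as a sketch, suggests that the pattern itself does match this path, so the reformatting is not down to a regex mistake. It assumes the pattern is applied with `re.search` to a forward-slash path with a leading `/`; that normalization and the helper name `matches_exclude` are assumptions for illustration, not something stated in the report.

```python
import re

# Pattern and path taken from the command above.
EXCLUDE = re.compile(r"/migrations/")
PATH = "content/migrations/0049_publicationstore_is_test.py"


def matches_exclude(path: str) -> bool:
    """Return True if the exclude pattern matches the path.

    Assumption: the path is prefixed with "/" before matching, so that a
    pattern like "/migrations/" can also match a top-level directory.
    """
    return EXCLUDE.search("/" + path) is not None


print(matches_exclude(PATH))                 # the migration file
print(matches_exclude("content/models.py"))  # an ordinary module
```

If the first call reports a match, the exclusion would have applied had the file been found by recursive search rather than named explicitly.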

This makes us sad, since we’ve carefully put exclusion regexes into our pyproject.toml and black doesn’t honor them when pre-commit calls it. Instead, we’re having to work around it by configuring pre-commit to skip that path:

repos:
-   repo: https://github.com/ambv/black
    rev: stable
    hooks:
    - id: black
      language_version: python3.6
      exclude: migrations

The behavior we’d like to see is that black’s exclude regex would apply even when full file paths are listed on the command line. I’d be happy to try for a PR if this seems like desirable behavior to anyone else…

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 25
  • Comments: 48 (20 by maintainers)

Most upvoted comments

When using an editor integration that calls black with a changed file’s path (which it could do automatically, such as on a post save action), this behaviour also means that it would reformat the file even if it’s excluded by the config.

So I would be in favor of either changing the default to always consider the ignore/exclude rules, or to include an option to do so even when a full path is provided.

From a prior art perspective, flake8 has this same issue (marked: WONTFIX) – that said, I personally think the decision in flake8 is incorrect/inconsistent and that implementing this is a good idea 😃

it has a merge conflict because it is a month old, and I wouldn’t keep rebasing if there was no interest. For tests, as I mention in the PR:

I have not added or amended tests yet, but if we agree on the approach, I can work on them

so, yes, it was a way to put forward a more concrete proposal and discussion. Sometimes it is better to actually open a PR than to discuss on the issue. you say that there hasn’t been a concrete proposal, I’d say it doesn’t get more concrete than a PR 😄

so, yes, what’s missing is an agreement, and for that we need core maintainers and contributors to chime in. I am just trying to bring attention to an issue that seems fairly popular.

Thanks!

asking again. I think the ball is in the core contributors’ court to decide what to do, or at least give an update. Any answer is legitimate (won’t fix, no time for this, alternative proposal) but at least an answer will help people set expectations ✌️ This is something that needs to come from core contributors. What we can do is raise awareness that this is a problem for many people, but we need guidance in order to solve this.

I have added a PR following @chebee7i’s proposal, which I think is the most sensible: https://github.com/psf/black/pull/1032. I think that black needs to work well when integrating with other tools (editors, pre-commit hooks) for larger adoption. These concerns were probably less of a thing some years ago when flake8 was developed, but if we look at other modern tools and ecosystems (node: prettier, eslint, husky) this is the expected behavior.

--include= and --exclude= are only consulted for recursive search, not for files passed on the command line.

How is the decision of flake8 inconsistent here? The rationale is rather simple: by default, we exclude some paths and only include some file extensions in recursive search. But if you specifically give us a file path on the command line which doesn’t match the file extension or would otherwise belong to the exclusion list, you probably know what you’re doing.
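
The rule described here can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Black's or flake8's actual code: `collect_sources` is a hypothetical helper, and matching against a `/`-prefixed, forward-slash path is an assumption.

```python
import os
import re
from typing import Iterable, Iterator


def collect_sources(args: Iterable[str], include: str, exclude: str) -> Iterator[str]:
    """Yield the files to format (hypothetical helper for illustration)."""
    include_re = re.compile(include)
    exclude_re = re.compile(exclude)
    for arg in args:
        if os.path.isdir(arg):
            # Recursive search: both patterns are consulted.
            for root, _dirs, files in os.walk(arg):
                for name in files:
                    path = os.path.join(root, name)
                    normalized = "/" + path.replace(os.sep, "/")
                    if exclude_re.search(normalized):
                        continue
                    if include_re.search(normalized):
                        yield path
        else:
            # Explicit file: taken as-is, neither pattern is consulted.
            yield arg
```

Under this sketch, passing `content/migrations/0049_publicationstore_is_test.py` directly yields the file even with `exclude=r"/migrations/"`, while passing `content/` skips it. That is the asymmetry the issue is about.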

I do agree that interfacing with CI and editors is surprising in this case so I’m not downright rejecting changing this. But I need to carefully consider whether there are backwards compatibility disasters hiding in changing this. And even if we agree to do this, what does it mean for defaults in --include=? Should I reject non-.py/.pyi files from now on unless somebody clears --include= on their call?

This is not going to be straightforward to change but let’s try to figure something out. Anthony, what would you suggest?

Hi everyone,

I also ran into this problem with multiple editors and this behavior was surprising to me, as a user.

IMHO you don’t want to have multiple configurations stating the same files to exclude: pyproject.toml, pre-commit (not everyone uses the tool named pre-commit), VIM, PyDev, IntelliJ etc.

We have many tools in the team so it’s essential that a single configuration is used by everyone.

I think that in terms of usability, surprising the user is never a good idea. We can add two flags (tentative naming) to help black behave nicely when an excluded file is given to black as an argument:

  1. -f/--force to force formatting an excluded file, return 0 (or whatever we return from the formatting)
  2. --ignore-excluded which means black silently ignores the excluded files, formats the other files (if any) and returns 0 (assuming the rest was fine)
  3. Without flags, return an error (e.g. -1) and warn that excluded files were given explicitly
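
The three cases above could be sketched like this. It is only an illustration of the proposal: `handle_explicit_files` and its parameters are hypothetical names mirroring the tentative flags, the error exit code `1` stands in for "an error (e.g. -1)", and formatting nothing in the error case is an assumption the proposal does not spell out.

```python
import re
import sys
from typing import List, Tuple


def handle_explicit_files(
    paths: List[str],
    exclude: str,
    force: bool = False,
    ignore_excluded: bool = False,
) -> Tuple[List[str], int]:
    """Return (files_to_format, exit_code) for explicitly passed files."""
    exclude_re = re.compile(exclude)
    # Assumption: match against a "/"-prefixed path.
    excluded = [p for p in paths if exclude_re.search("/" + p)]
    if force or not excluded:
        # Case 1: --force formats everything, excluded or not.
        return list(paths), 0
    if ignore_excluded:
        # Case 2: silently drop excluded files, format the rest.
        return [p for p in paths if p not in excluded], 0
    # Case 3: no flag given, so warn and return an error.
    for p in excluded:
        print(f"error: {p} is excluded but was given explicitly", file=sys.stderr)
    return [], 1
```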

If this change of behavior is agreed upon, I can try and create the PR.

Can we get the above PR reviewed? The change seems pretty small and it seems like it would benefit a fair number of folks.

weird I thought I mentioned this but I guess not, if you’re only invoking through pre-commit it’s usually better to use pre-commit’s exclude: ... pattern:

repos:
-   repo: ...
    rev: ...
    hooks:
    -   id: black
        exclude: ^testing/test_data/

or if you’re globally excluding


exclude: ^vendor/
repos:
-   repo: ...
    rev: ...
    hooks:
    -   id: black
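
How the two levels combine can be sketched as below. This is a simplified model, assuming both patterns are applied with `re.search` to repository-relative paths and that an unset pattern defaults to `^$` (matching nothing); `filter_for_hook` is a hypothetical name, not pre-commit's API.

```python
import re
from typing import Iterable, List


def filter_for_hook(
    filenames: Iterable[str],
    global_exclude: str = "^$",
    hook_exclude: str = "^$",
) -> List[str]:
    """Return the filenames that would be passed to a hook.

    A file is dropped if either the global or the per-hook pattern matches.
    """
    global_re = re.compile(global_exclude)
    hook_re = re.compile(hook_exclude)
    return [
        name
        for name in filenames
        if not global_re.search(name) and not hook_re.search(name)
    ]


changed = ["vendor/lib.py", "testing/test_data/bad.py", "src/app.py"]
print(filter_for_hook(changed, global_exclude=r"^vendor/"))
print(filter_for_hook(changed, hook_exclude=r"^testing/test_data/"))
```

Since the filtering happens before the tool is invoked, black never sees the excluded paths and its own exclude setting does not come into play.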

ah let me clarify – black and flake8 currently have the same behaviour here (they are consistent).

Maybe it’s a bit snowflakey but intuitively it makes a lot of sense for me to apply --exclude even if passed on the command line. This simplifies a lot of editor configurations, pre-commit, and even just black foo/*.py. I don’t think --include should be applied except in the recursive case however.

That said, there are certainly arguments in both directions – it may very well be simpler to only apply it during the recursive routines.

the one thing I usually point at here is “pre-commit is better at running your linter than your linter is” because it can take advantage of a few things:

  • “recursive” doesn’t really matter, pre-commit knows which files are part of your version control
  • “exclusion” is configurable in multiple ways (global exclusion, per-linter exclusion)
  • pre-commit knows how to find files by shebangs thanks to identify

Though for tools that often means you have to configure both pre-commit exclusion and tool exclusion and keep them in sync (or a superset / subset of each other). It would be nice to only configure this in one place and for most that usually means “configure it using the tool’s configuration”. But then you run into OP’s issue 😃

Not to muddy the waters any. I see it looks like a consensus is forming around having the exclusion rule also apply to files passed in via the command line (and I hope this, too, will go for flake8).

But thought I’d point out isort “fixed” this by adding another flag filter_files, which says

Tells isort to filter files even when they are explicitly passed in as part of the command. This is especially useful to get skip and skip_glob to work when running isort through pre-commit.

I don’t think @ambv has weighed in on his opinion of another command-line flag (which I see has been brought up), which solves this problem in a backwards-compatible way (since @ambv did mention backwards-compatibility is something to not take lightly).
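
The backwards-compatible, opt-in variant can be sketched as follows. This is an illustration only: `select_explicit_files` is a hypothetical helper, the switch is modeled as a boolean as it is discussed in this thread, and matching against a `/`-prefixed path is an assumption.

```python
import re
from typing import Iterable, List


def select_explicit_files(
    paths: Iterable[str], exclude: str, force_exclude: bool = False
) -> List[str]:
    """Pick which explicitly passed files get formatted.

    With force_exclude off (the default), nothing changes for existing
    users: every explicit file is kept. With it on, the exclude pattern
    is applied to explicit files as well.
    """
    if not force_exclude:
        return list(paths)
    exclude_re = re.compile(exclude)
    return [p for p in paths if not exclude_re.search("/" + p)]


passed = ["content/models.py", "content/migrations/0049_publicationstore_is_test.py"]
print(select_explicit_files(passed, r"/migrations/"))
print(select_explicit_files(passed, r"/migrations/", force_exclude=True))
```

Because the default leaves behavior untouched, only callers that opt in (for example a pre-commit hook or an editor integration) would see excluded files filtered out.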

it’s really sad to see no answer at all from maintainers. we simply forked with the least intrusive modification and moved on with our lives…

please don’t suggest that per above, thanks

even with black’s cache it’s going to be the wrong thing during merge conflicts, and you’re back to linting files that aren’t checked in, and you have to do filesystem traversals which themselves are pretty slow

given how frequently this comes up I’m considering changing flake8’s behaviour to honor flake8 foo.py --exclude foo.py to be a noop (as silly as such a command is) and perhaps if that goes well black should do the same

@itajaja, thanks for persevering with this issue. If you update your pull request to use --force-exclude, I will gladly merge it.

first, it’s possible to enable this behaviour but I don’t think it’s desirable (as it sidesteps the benefits of the framework):

    -   id: black
        args: [.]
        pass_filenames: false
        always_run: true

pre-commit is (generally) better at running linters than linters themselves are, here’s a couple of reasons why:

  • pre-commit knows about what files are in version control and will never run against files which aren’t. There’s no need to parse / worry about .gitignore / exclude .git|.tox|venv|..., etc. / recurse / etc.
  • pre-commit knows how to lint extensionless executables with a conventional shebang (#!)
  • pre-commit knows when or when not to run the linter (only passing filenames which change, not even executing at all when there are zero files which change)

An example of a case where always running black on all the files is (very) wrong is during a merge conflict resolution. pre-commit will only execute a linter on files which conflict or are manually changed, avoiding the headache of waiting for black (or other linters) to run across every file that changed in the upstream (and potentially dealing with other people’s mistakes as a punishment for merging).

(gotta run, hope this succinct reply is enough, if not I can elaborate / link some more prose on this – hope it helps!)

uhm…

I like the idea of --filter-files. Let’s name it --force-exclude.

Hello all, I’m using VSCode and would also like to see this option come to Black for a more correct format-on-save behaviour. Maybe there could be an optional boolean flag like --force-hard-exclude (similar to isort’s --filter-files) which flips the behaviour to unconditionally honoring the exclude list regardless of the command line arguments.

And thanks for the fantastic tool! 🙂👍

🙏 I’ll take a stab!

@ambv and here’s the stab 🔪 https://github.com/psf/black/pull/1032 All tests pass, so we are pretty confident it’s backward compat. If this looks good, I’ll add tests, but let me know what tests you think would be useful

@c0state for the life of me I couldn’t find a way to do that, or docs from jetbrains on it.

See https://www.jetbrains.com/help/idea/settings-scopes.html. All file watchers have a scope. The default is “Project Files” I believe which will probably format stuff you don’t want. I usually set one up that includes my project files and then explicitly excludes venv/.venv, build, etc. dirs and such. If all your projects look the same (structurally), you can make this scope global and use it in all your projects, otherwise you can set this up manually for each project.

yep

this is the 4th most-commented issue in the repo, it would help a lot of people if some attention could be spared on this. There is an open PR, so my question is: why is this not moving forward? a) simply no time to look into this? b) won’t fix? c) something else?

They are all acceptable answers 😃 but I’d like to set some expectations for myself

I agree with Anthony that pre-commit is in a better position to figure out when to run Black on what, so if we could somehow get away with only ever running Black through pre-commit, that would provide a way to solve this problem.

I don’t think it’s OK to assume Black is always going to be called through pre-commit though, so we do need a separate include/exclude mechanism (and unfortunately we do need to worry about parsing .gitignore et al) for when it’s called directly from the command line or through editors.

As for the performance concerns around running black . on the entire repo every time you commit: after the first run, it should be O(changed files) instead of O(repo) because of Black’s cache.
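
One way such a cache could work is sketched below, keyed on each file's modification time and size. It is an illustration under that assumption, not Black's actual code, and the helper names are hypothetical.

```python
import os
from typing import Dict, Iterable, List, Tuple

FileInfo = Tuple[float, int]  # (modification time, size in bytes)


def file_info(path: str) -> FileInfo:
    """Return the cheap fingerprint used to detect changes."""
    stat = os.stat(path)
    return (stat.st_mtime, stat.st_size)


def filter_changed(cache: Dict[str, FileInfo], paths: Iterable[str]) -> List[str]:
    """Return the paths whose fingerprint differs from the cached one."""
    return [p for p in paths if cache.get(p) != file_info(p)]


def mark_formatted(cache: Dict[str, FileInfo], paths: Iterable[str]) -> None:
    """Record the current fingerprint of files that were just processed."""
    for p in paths:
        cache[p] = file_info(p)
```

Note that in this sketch every candidate file is still found and `stat`-ed on each run; only the reformatting work scales with the number of changed files.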

So I think all in all we should encourage people to configure Black with the following in their .pre-commit-config.yaml

    -   id: black
        args: [.]
        pass_filenames: false
        always_run: true

Any progress/decisions on this?

I’d like to wait for what Anthony thinks about it. If he agrees that we should change the hook setup so that Black is run on the entire repo every commit, then we will do just that and the README then remains fine.

Otherwise, yeah, we will have to explicitly call out in the README that --exclude= in pyproject.toml is not used by pre-commit. I would like to avoid this.

Would a PR with an update to the readme be warranted then?

Avoid using args in the hook. Instead, store necessary configuration in pyproject.toml so that editors and command-line usage of Black all behave consistently for your project. See Black's own pyproject.toml for an example.

This heavily implies that pre-commit should not have any config (although it admittedly just calls out args specifically, as a first time user this was confusing). Some clarification that pre-commit passes files explicitly, and that black therefore ignores its exclude list for them, would be nice.