fd: Discussion: show Git-ignored files by default?
Since fd was first published, the feature to hide Git-ignored files by default has always been controversial. It’s the number one pitfall for new users, as witnessed by the numerous issues that have been opened over time (even though this is the first point in the Troubleshooting section). Even experienced users will likely run into this from time to time.
We have had past discussions about this (see #179, #220, #18), but I’m not so sure anymore if this default is the best possible option for the “average user”.
I thought it might make sense to discuss this again and see what others think. Whatever we choose as the default, it will always be easy for users to select a different default via an alias.
Pro current behavior (do not show .gitignored entries by default):
- Most searches are faster if we take `.gitignore` files into account. `.gitignore`d directories tend to contain huge amounts of automatically generated build artifacts or downloaded dependency files. Pruning these directories from the search tree typically results in a faster search overall. There are counterexamples to this where the parsing of long `.gitignore` files takes longer than actually traversing these directories.
- Most of the time, `.gitignore`d results are not “interesting” to the user (however, see counterpart below).
- When running `fd` without any arguments, I typically don’t want to see `.gitignore`d files.
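To illustrate the pruning argument in the first point, here is a minimal Python sketch. It is not how fd is implemented: fd matches real `.gitignore` patterns, while this example only compares plain directory names. The names `walk` and `demo` and the toy tree (`src`, `target`) are made up for the example.

```python
import os
import tempfile


def walk(root, ignored_dirs=frozenset()):
    """Return (files_found, directories_visited) under root.

    Directories whose name is in ignored_dirs are pruned, so nothing
    beneath them is ever visited.
    """
    files, visited = [], 0
    for dirpath, dirnames, filenames in os.walk(root):
        visited += 1
        # Editing dirnames in place tells os.walk not to descend into them.
        dirnames[:] = [d for d in dirnames if d not in ignored_dirs]
        files.extend(os.path.join(dirpath, f) for f in filenames)
    return files, visited


def demo():
    """Build a toy project with one source file and 50 build artifacts."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "src"))
        open(os.path.join(root, "src", "main.rs"), "w").close()
        for i in range(50):
            dep = os.path.join(root, "target", f"dep{i}")
            os.makedirs(dep)
            open(os.path.join(dep, "artifact.o"), "w").close()
        all_files, all_visited = walk(root)
        kept_files, kept_visited = walk(root, ignored_dirs={"target"})
        return len(all_files), all_visited, len(kept_files), kept_visited
```

On this toy tree, pruning `target` reduces the traversal from 53 visited directories to 2. The counterexample mentioned above (expensive parsing of long ignore files) is not modelled here.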
Cons:
- It can be very confusing to (new) users. If 10% of users go so far as to create a ticket on GitHub to ask about their problem, there must be hundreds of users that ran into this problem at some point.
- Even if you know about the default, it can be annoying to repeat the search because you forgot to add `-I` or `-u`. There are a lot of valid use cases where users are - in fact - interested in results from ignored directories or files. Personally, I would estimate that I use `-uu` or `-HI` in roughly 20% of my searches, which is quite high.
About this issue
- State: open
- Created 4 years ago
- Reactions: 25
- Comments: 63 (5 by maintainers)
I want to add that the nature of the files in `.gitignore` depends a lot on the nature of our projects. In my case, for example, most of the time the ignored files are files with sensitive information (not crap) that I want to be able to search with `fd`.
But I understand that for other people the files in `.gitignore` often have to be ignored by `fd` as well.
For this reason the desired default behavior is not the same for everyone.
In my opinion, the default behavior should be “search all”, because it’s easier to figure out why there are too many results than it is to figure out why there are missing results.
But such a change in behavior will not be backward compatible, which is never good. To overcome this, people must be allowed to easily return to the old default behavior. Hence the need to be able to configure the default behavior of `fd` (#362).
Just to throw my opinion into the ring. I’m in favor of changing the default.
When I use fd with no options/arguments other than a pattern, 99% of the time I’m just using it to quickly narrow down the list of files I need to look at to find what I want. In that case I’m okay if I get some things that I don’t care about in the search, but I’m much more annoyed if I don’t find something that’s actually there because I forgot to specify that I wanted to search .gitignored files as well. @sharkdp said that he adds `-H` or `-u` around 20% of the time, meaning those flags mattered 20% of the time. But I’m willing to bet that if those flags were enabled by default then they would have to be disabled much less than 20% of the time.
Also, from a scripting/reducing noise perspective, normally when I’m doing something more precise than just quickly narrowing down a list of files, I’m more willing to add flags and check the documentation in order to narrow down the search results to be only what I care about.
And concerning adding an `fdg` binary (or symlink), I don’t see how that’s better than just adding an optional flag. It feels like it would complicate CI and packaging a lot for something that essentially just flips a flag on by default.
I think the point of having a tool like this is that it’s opinionated. If I have to start adding flags to reach the default behavior/length of `find`, I might as well use `find`.
Like `rg` and the rest of the modern tools, what makes them great is their defaults. If the only advantage is a very minor speed bump, people would just use the preinstalled tools they already know.
The fact that it ignores hidden and `gitignore`d files is in the main bullet point feature list. If one doesn’t bother to read that…
The dilemma
I think that for this issue and for #362 the question is:
Should `fd` behave as a “general” or a “git-style” tool?
What I mean by this is summarized in the following table
| “General” tool | “Git-style” tool |
| --- | --- |
| `find`, `grep` | `git`, `rg` |
Actually, `fd` is in between the two worlds. And respecting the ignore files without the configuration feature is bad, IMO. So `fd` should choose the red pill or the blue pill 😉
My proposal
Or maybe we don’t have to choose and we can have both.
I think that:
- `fd` should be a general tool (by default)
- […] `--git` that makes it act in a “git style”
- `fdg` should be another compiled version from the same code but with different default behavior, that is equivalent to `fd --git` and having a `--no-git` flag to switch it to “general tool”.
My arguments for creating the additional `fdg` are the following:
- […] `--no-git` option as default, the other with `--git` as default.
- […] `powershell` and `cmd` included)
- […] `fd` we should set up the aliases, so the advantage of “no install, just download and use it” is not valid any more
- […] `fd`
- […] `fdg` or even to rename `fdg` to `fd` and continue using it as before (respecting the ignore files by default)
- […] `fd` (this would be done using the configuration file in `fdg`)
I would add my vote to search all files except hidden files by default.
Just adding my experience here that this caught me off guard multiple times. Most users install fd as a replacement for find, so it can be surprising when it doesn’t show certain files by default.
In case it helps, I thought `fd` was broken while searching for something I knew was in my `node_modules` directory due to this.
I think this is a really good point and I am seriously considering a switch of the default behavior in fd version 9. This would be a major breaking change. I know for a fact that people are using `fd` in scripts and pipelines. They will have to adapt (check) their code when upgrading to fd 9.
One practical problem is that we have a set of (short) command-line options that are designed to work with the current default, like `-I`/`--no-ignore`. We have a (somewhat hidden) `--ignore` counterpart, but no suitable short option. We would also have to figure out what to do with `--no-ignore-vcs`, `--no-ignore-parent`, unrestricted, etc.
If this default changes, I would humbly request that an inverse CLI flag allows us to override previous CLI flags.
For example…
This way folks can easily choose their default via an interactive shell alias, but still have the option e.g.
FWIW, ripgrep (which I imagine a lot of fd users also use) also respects gitignore by default.
Hello.
I’d like to make a point that `fd` is a general-purpose file-searching utility that is not git specific, so having it take into account `.gitignore` files lying around in the filesystem does not feel right. In fact, I stumbled upon this issue the very first time I tried `fd`: I tried to find something, starting from a non-git directory, in subdirectories which happened to be git repos, and found nothing, although I knew it was there. After that, the very next thing I did was patch `fd` locally so it wouldn’t read `.gitignore` files by default.
My suggestion would be to not change the default, so we don’t break anyone’s workflow. Instead, how about something like this:
With #595 implemented, users could make `fd` an alias to `fd --no-hidden --ignore` to keep the current behaviour and suppress the warning, or to `fd -HI` to show everything by default.
I’d be okay with always printing that warning, even if there are matches. But especially if there are no matches it might be handy.
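A rough Python sketch of the warning idea, covering only the "no matches" variant. Everything in it is an assumption made for illustration: the hint wording, the helper names `search` and `warning_for`, the `interactive` switch, and the use of simple `fnmatch` globs in place of real `.gitignore` semantics. It records only whether anything was skipped, not which paths or how many.

```python
from fnmatch import fnmatchcase

# Hypothetical wording; the thread does not fix the text of the warning.
HINT = "note: no results, but some entries were skipped by ignore rules"


def search(paths, needle, ignore_globs):
    """Return (matches, ignored_any) for a substring search over paths."""
    matches, ignored_any = [], False
    for path in paths:
        if any(fnmatchcase(path, glob) for glob in ignore_globs):
            # Remember only that something was skipped.
            ignored_any = True
            continue
        if needle in path:
            matches.append(path)
    return matches, ignored_any


def warning_for(matches, ignored_any, interactive):
    """Return the hint text when a warning seems useful, else None."""
    if interactive and ignored_any and not matches:
        return HINT
    return None
```

A caller would typically derive `interactive` from something like `sys.stdout.isatty()` and print the hint to stderr, so that scripts and pipelines see no change in output.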
IMO, all of this discussion about what are the best defaults points to the following conclusions:
- […] `fd` (aliases, wrappers, environment variables are not, they are shell dependent), so having a config file is IMO mandatory to solve this issue (see #362).
- […] `fd` runs reproducible, a `--ignore-config` flag should be added if the config file is introduced (see #362).
My personal opinion is that the “best” defaults should be discussed from a newbie’s point of view. More experienced users will know how to tweak the tool to their own needs.
I also like @tavianator’s proposal, possibly with the following caveats:
- `fd` should print a warning only if outputting to `stdout` in an interactive shell
Thought dump:
- `git` is primarily concerned about the contents of files, their state, but not their presence. This means that `.gitignore` files are also about the state of the files, but not their presence.
- `ripgrep`, just like `git`, is also primarily concerned about the contents of files, and this shared concern makes its choice to consider `.gitignore` files understandable, although it could also be opt-in.
- `fd`, on the other hand, is not concerned about the state of the files, but is concerned about their presence, which differs from the concerns of `git` and `ripgrep` and makes its consideration of `.gitignore` files slightly less fitting.
@jchook This should be done already by https://github.com/sharkdp/fd/pull/822
Yeah, might require a patch to `ignore`. We don’t need to know what paths they were, or even how many, just whether it ignored anything.
I don’t think we need to show the warning when `-I` is passed, at least about ignored files. We could still warn about hidden files unless `-H` is passed.
Adding this to the “fd 9” milestone because I would like to settle this discussion and introduce the (possible) breaking change in that version (see #613).
If you do make this change, please consider using separate flags for .gitignore, .fdignore, etc. I have run into valid use cases for (observe .fdignore, ignore .gitignore) and vice versa.
Examples:
- `-I`/`--ignore` -- Ignore file patterns in .gitignore and .fdignore
- `-Ig`/`--ignore-gitignore` -- Ignore file patterns in .gitignore
- `-If`/`--ignore-fdignore` -- Ignore file patterns in .fdignore
- `-N`/`--no-ignore` -- do Not ignore file patterns in .gitignore and .fdignore
- `-Ng`/`--no-ignore-gitignore` -- do Not ignore file patterns in .gitignore
- `-Nf`/`--no-ignore-fdignore` -- do Not ignore file patterns in .fdignore
Unfortunately these all use double-negatives, and there is a potential confusion about the double-meaning of “ignore .gitignore” (ignoring the .gitignore file and ignoring the files within it have opposite meaning). Other terms that may be less confusing:
There is precedent for fine-grained ignore params in ripgrep (`--no-ignore-dot`, `--no-ignore-global`, `--no-ignore-vcs`, etc.).
[ If the above comment about supporting non-git repositories is implemented, then `Ig` might become `Iv` (for vcs) ]
I have an additive suggestion, which could leverage or make the suggestion obsolete: Add an according description to tldr.
`rg`/`ripgrep` has a description `rg -uu pattern`, which is the second result and thus searchable in 1s.
20% typing the thing would then overall still mean less time. Bonus is that `-uu` could be established as use hidden github stuff or “do more work”.
One client for `tldr` is tealdeer.
We already have ~/.fdignore. Maybe this file could somehow ‘include’ gitignore (via something like @~/.gitignore or some other character/directive)? With this approach, showing git-ignored files could be enabled by default, while also allowing users to add their git-ignored entries that are already in ~/.gitignore (or ./.git/ignore when inside a repository) in an easy way?
Personally, I wasn’t even aware that git-ignored files are omitted: https://imgur.com/a/UlLD8ED For now my .fdignore contains mostly 100% of .gitignore + other patterns. It would be great to be able to ‘include’ the file as a whole, not to copy its content manually.
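To make the ‘include’ idea concrete, here is a small Python sketch of how such a directive could be expanded. The `@path` syntax is only the commenter's suggestion and is not an existing fd feature; `load_patterns`, `demo` and the file names in the demo are hypothetical. Relative include paths are resolved against the including file, which is an assumption of this sketch.

```python
import os
import tempfile


def load_patterns(path, _seen=None):
    """Read ignore patterns, expanding hypothetical '@file' include lines."""
    seen = set() if _seen is None else _seen
    real = os.path.realpath(os.path.expanduser(path))
    # Skip missing files and guard against include cycles.
    if real in seen or not os.path.isfile(real):
        return []
    seen.add(real)
    patterns = []
    with open(real, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue  # blank line or comment
            if line.startswith("@"):
                target = os.path.expanduser(line[1:])
                if not os.path.isabs(target):
                    target = os.path.join(os.path.dirname(real), target)
                patterns.extend(load_patterns(target, seen))
            else:
                patterns.append(line)
    return patterns


def demo():
    """An fdignore file that pulls in a gitignore file plus one extra pattern."""
    with tempfile.TemporaryDirectory() as home:
        with open(os.path.join(home, "gitignore"), "w", encoding="utf-8") as fh:
            fh.write("node_modules/\n*.log\n")
        fdignore = os.path.join(home, "fdignore")
        with open(fdignore, "w", encoding="utf-8") as fh:
            fh.write("@gitignore\n# extra patterns\n*.tmp\n")
        return load_patterns(fdignore)
```

One caveat of this particular choice: a pattern that legitimately starts with `@` would need some escape, which is presumably why the comment leaves the exact character open.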
I like that idea, with one caveat. Instead of having a separately compiled version of fd, fdg should just be a symlink to fd, and fd should check the name that it is called with: if it is “fd”, use the general behavior, and if it is “fdg”, use the “git” behavior. Or alternatively distribute OS-specific wrapper scripts for fdg (for example, one that does something like `exec fd --ignore`, or whatever the Windows equivalent is).
I’m not entirely opposed to a change in the default, as long as there is an easy way for users to keep the current behavior if they want. Which could be as simple as being able to do `alias fd="fd --ignore-vcs"` (or `--ignore-git`), as long as I can still use `--no-ignore-vcs`, `-u`, `-I`, etc. to turn off the previous `--ignore-vcs` (which is how it currently works).
That brings up the question, should the new default be the equivalent of current `fd --no-ignore`, or `fd --no-ignore-vcs`?
Personally, I think it would be a little surprising if fd doesn’t respect .fdignore files by default. .ignore is more questionable. OTOH, in the case that you don’t use any ignore files, bypassing the ignore machinery could improve performance.
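The symlink variant suggested above (one binary that checks the name it was invoked as) could look roughly like this in Python. This is a sketch of the general technique, not anything fd does today; the function name is made up, and `fdg` is just the name proposed in this thread.

```python
import os


def default_respects_vcs_ignore(argv0):
    """Pick the default from the name the program was invoked as.

    Invoked as "fdg": respect VCS ignore files by default ("git" behavior).
    Any other name, such as "fd": show everything by default.
    """
    name = os.path.basename(argv0)
    # Drop an extension such as ".exe" and compare case-insensitively.
    name = os.path.splitext(name)[0].lower()
    return name == "fdg"
```

A real program would pass `sys.argv[0]` here. Explicit flags on the command line would still need to take precedence over whatever default the name selects.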
If we changed the default to `fd --no-ignore-vcs`, then the `-I`, `--no-ignore` option would still be meaningful, since it excludes the .fdignore and .ignore files. Although perhaps not needed quite as often.
For the long option, I think we would probably reverse the importance in the documentation (although maybe make the --no-ignore more prominent than it currently is).
As for the short option, that depends on what direction we went with for --no-ignore vs --no-ignore-vcs as the default.
If we went with `--no-ignore` as the default, `-i` would be a good choice as an alias for `--ignore`, except that it is already taken for “case-insensitive”, but maybe we could change that, although that increases the potential breakage. Or we could invert the meaning of `-I`, which also would increase the scope of the breakage, and would be inconvenient for anyone who aliases fd to `fd --ignore`, since there isn’t a short option to re-disable it, but maybe we could add a new short option for that as well. Or we could do something like `-I` means `--ignore` and `+I` means `--no-ignore`, but I don’t think clap supports that convention, and it isn’t a terribly common convention for CLIs.
If we went with `--no-ignore-vcs` as the default, there isn’t currently a short option for `--no-ignore-vcs`, but it might be worth adding a short `--ignore-vcs`, perhaps `-G` for git? Although if we ever added support for additional VCSs that would make less sense. `-v` is currently available, but I worry about that being confused for “version” or “verbose” (and possibly we would want to use that for an alias to `--verbose` at some point?).
I think those could probably stay the same as they are. Although make `--ignore-vcs` the main option documented instead of `--no-ignore-vcs`.
Where/when would we show this deprecation warning? Every time fd ran without a `--no-ignore(-vcs)` flag? That would be incredibly annoying IMO.
No, we should keep them. Because I think we should support the use case of using an alias (or wrapper script) that passes `--ignore-*`, but allow negating it by `--no-ignore-*` later in the arg list. Just as we currently allow passing `--ignore` to undo a previous `--no-ignore`.
For scripts or aliases, I absolutely agree. However, for interactive use, I think that having short names for commonly used options is very valuable. And I think that turning the ignore functionality back on would be a pretty common usage, at least for me.
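The alias-then-override pattern described above boils down to "the last flag wins". A minimal Python sketch of that resolution rule follows. It models only `--ignore` and `-I`/`--no-ignore`, the flags named in this thread, and is not fd's actual argument parser; `resolve_ignore` and `ALIAS` are names made up for the example.

```python
def resolve_ignore(args, default=True):
    """Return whether ignore files are respected; the last flag wins."""
    respect = default
    for arg in args:
        if arg == "--ignore":
            respect = True
        elif arg in ("-I", "--no-ignore"):
            respect = False
    return respect


# Flags contributed by a shell alias such as: alias fd="fd --ignore"
ALIAS = ["--ignore"]
```

With a hypothetical new default of `default=False`, `resolve_ignore(ALIAS + ["pattern"], default=False)` restores the old behavior, while `resolve_ignore(ALIAS + ["-I", "pattern"], default=False)` lets a single interactive call override the alias again.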
fd uses the same code for determining which files to ignore as rg. Some of `fd`’s options were designed specifically to match options in `rg`. I generally view `fd` as being to find what `rg` is to `grep`. And I strongly suspect that there is a large overlap between users of `rg` and users of `fd`. I do think it is relevant to the conversation. Maybe for searching for files based on their names, respecting .gitignore is less important than it is for ripgrep. But if so, I think it is worth asking why that is.
`rg` is not part of the standard command set and isn’t really relevant to this conversation.
My argument isn’t I (Steven) have this particular use case vs you (tmccombs or matu3ba) have a particular use case. I just gave those as an example.
My argument is “which default yields the lowest entropy”
The reasoning to my argument is “follow same set of defaults as the standard system.”
Personally, I’m just throwing `alias fd='fd --no-ignore'` into my rc and calling it a day. From a design perspective, I strongly believe more confusion is created by a default that excludes files that one would produce via standard commands such as `find`, `ls`, `grep`, `locate`, and so on. We’re talking about default options. If you’re using `rg` I assume you can be like me and throw `alias fd='fd --git'` into your rc and call it a day too. The question is not “what do you find useful” but “what behavior is most expected from a new user”. Let’s just make sure we get the framing right and let’s also not forget that `alias` exists. I mean we all have dotfiles, right?
The frequent issues are explicit evidence that such default behavior does create huge surprisal to users. So is the fact that it’s in the first line of the documentation and in the feature list. When it doesn’t, those users probably read the documentation closely. If a user reads the documentation closely they can easily throw an alias into their rc file (because that’s what those files are for, personalization) and go on about their day and the github issues will go away. We can even think about this from another perspective if the several I have given aren’t enough. Which would be a larger breaking change: if the default is to ignore the `*foo` glob and you remove that default filter, or the default is to have no filter and you introduce a `*foo` glob? Obviously the latter results in a higher surprisal to the user.
The arguments for the default filter are arguments of personal use case, which is why I said the desired behavior depends on what type of developer you are. Providing customization options is fantastic and I’m super happy `fd` has these. That still doesn’t change the issue that the current default creates higher surprisal. If you want to convince the `--no-ignore` crowd that we’re wrong you have to convince us that this default creates less surprisal.
You’re not going to go over to `exa` or `lsd` and find tons of issues “command outputs files that pattern match gitignore, this is unexpected behavior.” It would be silly to think so and that’s why it feels weird to even be having this discussion. I am surprised that you are surprised that people are surprised that the “better find” tool filters out files that aren’t hidden.
I would also be in favour of this. Also inform the user how they can make sure not to skip these files.
Windows 7 user here. I have a dedicated folder with CLI tools added to the PATH variable. For example, there is `ripgrep` with `.ripgreprc` next to it, which contains settings I need with every launch like `--smart-case` or coloring preferences. I like the portability of this approach instead of polluting %UserProfile%, the registry or creating more env variables. So it would be nice to have similar configuration here. I would use it to make `-H` permanent, because I always forget to add it (`find` shows all).
Did you ever run `grep` on repos with huge binary files (>5 GB) or a large number of files ignored by `.gitignore`? Binary data (without newlines) in particular takes linear time, and that is why `ripgrep` has a different default than `grep`. The same argument can be made for using `fd` on many, many files inside `.gitignore`, for example when compiling the Linux kernel.
An argument from authority is not a technical argument about usage. And you can’t make everyone happy with the tool. Here is a short catalogue for decision making:
How should this be maintained and name clashes prevented?
`cfdisk`, `df`, `efi`, `rfkill` are already used. Do you have specific names in mind?