nushell: Provide a string that does not expand globs
Related problem
Motivating example:
Recently I downloaded a video with yt-dlp. This tool puts YouTube’s video id into the file name in brackets, like so: video_title [video_id].webm. I wanted to mv the file and used tab completion to get its name, but the returned string did not match the file. This was because tab completion didn’t escape the glob characters [ and ]. Hence, I had to fire up zsh to move the file.
There is currently no simple way to define a string which does not expand glob characters when working with file paths.
#6014 disabled glob expansion inside " and ' strings, but only when an external command is executed.
I think this is confusing, see here:
touch a *
ls '*' # shows both files
^ls '*' # shows only one file
I believe that having behaviour differ for externals and built-ins like that is confusing and counterintuitive.
As far as I know, the only way right now to circumvent glob expansion for commands that are not external is to wrap each glob character in a character class (inside which there can be no nested globs), so that * and friends are not parsed as glob characters, like so:
touch "hello [world] *"
rm `hello [[]world[]] [*]` # this works
This is very hard to read and I don’t even know if it is the intended way to deal with this kind of problem.
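For comparison, POSIX-style shells accept the same character-class trick, but they also let plain quoting switch globbing off, which is roughly the behaviour requested here. The following is a minimal sketch in POSIX sh rather than Nushell. The scratch directory and the variable names are assumptions made for illustration; only the file name comes from the example above.

```shell
# Work in a throwaway directory so nothing real is touched (assumed setup).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch "hello [world] *"

# Character-class escaping, as in the workaround above:
# [[] matches a literal '[', []] a literal ']', and [*] a literal '*'.
set -- hello\ [[]world[]]\ [*]
class_match=$1

# Quoting disables globbing, so the name can simply be written as-is.
set -- "hello [world] *"
quoted_match=$1

# Both variables now hold the same file name.
printf '%s\n' "$class_match" "$quoted_match"

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```

Both forms resolve to the same file, but the quoted form stays readable, which is the point of the proposal that follows.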
Describe the solution you’d like
- Have single-quoted (') strings not expand globs, regardless of where they’re used.
- Make glob characters escapable in double-quoted (") strings.
- Remove the different treatment of glob expansion for external commands (as introduced in #6014).
Describe alternatives you’ve considered
An alternative could be to do it like bash and have globs only be expanded in bare strings, but this would require cumbersome notation when mixed with spaces:
ls *.rs # expand
ls "*.rs" # don't expand
ls ("the folder/" + *.rs) # expand, mixed with space -> need concatenation
Though, maybe this could in fact be an alternative if string concatenation didn’t require the + operator and parentheses. (i.e. if adjacent strings were concatenated like in bash.)
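To make the bash comparison concrete, here is a small sketch of how adjacent quoted and bare words behave in a POSIX-style shell (it runs in bash as well as sh). The directory layout and the file names a.rs, b.rs and notes.txt are hypothetical, chosen only for illustration.

```shell
# Build a throwaway tree containing a folder with a space in its name.
scratch=$(mktemp -d)
mkdir "$scratch/the folder"
touch "$scratch/the folder/a.rs" "$scratch/the folder/b.rs" "$scratch/the folder/notes.txt"
cd "$scratch" || exit 1

# Quoted prefix directly followed by a bare suffix: the two parts form
# one word, and only the unquoted *.rs part is glob-expanded.
set -- "the folder/"*.rs
expanded_count=$#
printf '%s\n' "$@"

# Fully quoted: the pattern stays a literal string and is not expanded.
set -- "the folder/*.rs"
literal_count=$#
literal_value=$1
printf '%s\n' "$literal_value"

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```

Because adjacent words are joined implicitly, no + operator or parentheses are needed to mix a path containing spaces with an expanding glob.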
Additional context and details
There’s a discussion in #4631 about this. The requirements @jntrnr mentioned there would be satisfied by a non-expanding single-quoted string:
- A glob that expands automatically: "*.rs"
- A glob looking thing that doesn’t expand automatically: '*.rs'
- A path with a space and a glob that expands automatically: "my files/*.rs"
- A path with a space and a glob that doesn’t expand automatically: 'my files/*.rs'

With Windows path syntax you could write the third bullet as "my files\\*.rs".
Also, this is a list of some issues this would fix:
- #9222, if tab completion returned single-quoted strings. (This is the same issue I had and that made me write this issue, btw.)
- #9310, because strings such as mentioned there could be enclosed in single quotes.
- #5196, because '...' and '~' wouldn’t expand.
- #4631, because strings such as mentioned there could be enclosed in single quotes. (Although this issue is outdated it seems, as globs aren’t expanded on externals right now.)
Lastly, I noticed that glob behaviour is different in ls to how it is with mv, cp and rm. Here’s an example:
touch a b c aa bb cc
ls ? # does not work, "directory not found"
mkdir somedir
mv ? somedir # moves the three shorter files as expected
Another example:
touch "[]" # "escaped" file name is "[[][]]"
ls "[[][]]" # does not work, "Pattern, file or folder not found"
mv "[[][]]" nice_name # works
When doing the above, ls returns this error: Pattern, file or folder not found.
I believe this difference in behaviour may possibly (partially?) be caused by recent pull request #9416, as doing the same with an asterisk works fine:
touch "*"
ls "[*]" # works fine, expected output
Maybe this last part about ls belongs in its own issue, but it is somewhat related.
About this issue
- State: closed
- Created a year ago
- Reactions: 3
- Comments: 20 (5 by maintainers)
Commits related to this issue
- Allow filesystem commands to access files with glob metachars in name (#10694) (squashed version of #10557, clean commit history and review thread) Fixes #10571, also potentially: #10364, #10211, ... — committed to nushell/nushell by bobhy 8 months ago
- Unify glob behavior on `open`, `rm`, `cp-old`, `mv`, `umv`, `cp` and `du` commands (#11621) # Description This pr is a follow up to [#11569](https://github.com/nushell/nushell/pull/11569#issuecomme... — committed to nushell/nushell by WindSoilder 5 months ago
totally agreed, and that makes me change my mind compared to my comment from last August. If we want to implement that without breaking changes though, we’re still left with the problem of what to do with glob patterns fed to these commands in the oldshell way. C.f. my comment above with the files ab, a*b and acb. What should rm a*b do?

I think it would be elegant to have only the glob command expand glob patterns into lists. This would simplify @bobhy’s PR (#10557) into having ls, mv, rm & co unambiguously accept only literal file names, as piped lists or arguments:

That has the added benefit of easily merging several glob patterns for these commands:
I totally follow @Ghoughpteighbteau’s line of thinking about having a canonical way of doing things, which is get-data | action coupled with a 1 command <=> 1 task principle (#10650). Accepting glob patterns in other commands for old time’s sake seems to be a headache (c.f. all the issues and PRs on this problem). After solving #9116 and related issues, users could still easily redefine their own ls command if they want it to work in the oldshell style. One could even imagine including glob flavors (i.e. the current versions) of all these commands in the standard library.

I know exactly how you could fix this, and avoid any breaking changes on top.
cd, cp, ls, mv, rm, touch, and watch should all support piped inputs of strings and lists of strings. You have the glob command and this produces a list of strings; the most natural thing after executing this command to verify its output would be to pipe it into an action. I’ve personally just assumed rm or ls would do this and been surprised that they did not, then I awkwardly fudged them into an each. (Sometimes I would get weird path errors. This issue was why. I only realized that just now.)

glob *thing | action feels like nushell. action *thing* feels like oldshell.

Having the globs is of course fine, we need our human conveniences for a shell. But if you allow these commands to accept piped inputs and only interpret piped values as raw strings, that solves all the problems, enhances the workflows, and avoids any 1.0 breaking changes.
This is just me thinking aloud but a big thing with nushell is that it’s not just passing strings around. You pass around data, not strings. Therefore it would make sense to me to have a glob type that can be passed to anything (also a regex type, etc). Then a string is a string and a glob is a glob. There’s no ambiguity and all commands are consistent, even if a glob is being piped through something else.
Admittedly, the problem with this line of thought is that people will want a shorthand syntax for globbing in some contexts.
agreed, this needs to be fixed prior to 1.0. nice write up! thanks.