go: io/fs,os: fs.ReadDir with an os.DirFS can produce invalid paths

What version of Go are you using (go version)?

$ go version
go version devel +c7494cf477 Fri Feb 5 10:32:19 2021 -0500 linux/amd64

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/usr/local/google/home/bcmills/.cache/go-build"
GOENV="/usr/local/google/home/bcmills/.config/go/env"
GOEXE=""
GOFLAGS="-benchtime=1x"
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/usr/local/google/home/bcmills/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/usr/local/google/home/bcmills"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/google/home/bcmills/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/google/home/bcmills/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="devel +c7494cf477 Fri Feb 5 10:32:19 2021 -0500"
GCCGO="/usr/local/google/home/bcmills/bin/gccgo"
AR="ar"
CC="gcc"
CXX="c++"
CGO_ENABLED="1"
GOMOD="/tmp/tmp.K8mIsoRRRV/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3900520840=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Ran the test in https://play.golang.org/p/HY3-ANaECG-.

package main

import (
	"io/fs"
	"os"
	"path/filepath"
	"testing"
)

func TestDirFSPathsValid(t *testing.T) {
	if filepath.Separator == '\\' {
		t.Skipf(`Can't create filename containing a backslash because path.Separator is '\'.`)
	}

	d := t.TempDir()
	if err := os.WriteFile(filepath.Join(d, "control.txt"), []byte(string("Hello, world!")), 0644); err != nil {
		t.Fatal(err)
	}
	if err := os.WriteFile(filepath.Join(d, `experi\ment.txt`), []byte(string("Hello, backslash!")), 0644); err != nil {
		t.Fatal(err)
	}

	fsys := os.DirFS(d)
	err := fs.WalkDir(fsys, ".", func(path string, e fs.DirEntry, err error) error {
		if fs.ValidPath(e.Name()) {
			t.Logf("%q ok", e.Name())
		} else {
			t.Errorf("%q INVALID", e.Name())
		}
		return nil
	})
	if err != nil {
		t.Fatal(err)
	}
}

What did you expect to see?

All paths returned by fs.ReadDir (and, by extension, fs.WalkDIr) satisfy fs.ValidPath.

What did you see instead?

--- FAIL: TestDirFSPathsValid (0.00s)
    dirfs_test.go:26: "." ok
    dirfs_test.go:26: "control.txt" ok
    dirfs_test.go:28: "experi\\ment.txt" INVALID
FAIL

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 16 (13 by maintainers)

Most upvoted comments

Hmm. But if, say, someone implements a file browser on top of io/fs in Go, I would rather they be able to show the existence (and contents) of the erroneous Application\ Support directory than not.

If we disallow the path, then being able to successfully traverse that path would require a custom fs.FS implementation and some kind of additional path-escaping scheme, plus the insight that that workaround is needed in the first place.

While I am very sympathetic to the argument of not wanting to model every possible file system behavior and then having to maintain the complexity forever, if someone were to build antivirus software in Go and relied on the standard library for file system access as stdlib is a certain ‘guarantee’ of quality, presumably it wouldn’t be ideal if all it took for malware to hide from a scan is to have a backslash in its name. Certainly not if the headline would be along the lines of ‘due to a bug in the Go standard library’.

Two more options:

  • relax ValidPath such that it allows backslashes (and report a “file not found” error instead of an “invalid path” error for names containing backslashes on Windows)
  • define some kind of backslash-escaping for the paths reported by (and accepted by) the os.DirFS implementation of fs.FS.

It seems like the two options are:

  • leave things as they are, returning a name that can’t be opened
  • OR drop that name from the directory listing, pretending it’s not there at all

In general I would err on the side of showing that there’s an inaccessible file than pretending it doesn’t exist at all, but I’d be happy to listen to good arguments for the other.

The reason to reject backslashes is not quite to cater to Windows users but instead to define a simpler subset of possible names for all implementations to handle. Backslashes are a perennial special case in portable file system implementations, because of Windows, and it simplifies all future implementations to remove them from the required interface entirely.

It seems clear that approximately no one actually uses files named experi\ment.txt in their day-to-day activity on Unix. And since FS has not existed to date, there is certainly no actual usage of that name with existing FS implementations. This is our chance to keep it that way and save all future implementors a specific headache.