go: runtime: netpoll port_getn() fails with impossible errno on illumos
What version of Go are you using (go version)?
$ go version
go version go1.16.3 illumos/amd64
Does this issue reproduce with the latest release?
I do not know. I found the issue while running Navidrome (https://github.com/navidrome/navidrome), which enforces building with Go 1.16 only.
The relevant parts of netpoll_solaris.go do not appear to have changed in the latest release.
What operating system and processor architecture are you using (go env)?
go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/whorfin/.cache/go-build"
GOENV="/home/whorfin/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="illumos"
GOINSECURE=""
GOMODCACHE="/home/whorfin/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="illumos"
GOPATH="/home/whorfin/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/ooce/go-1.16"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/ooce/go-1.16/pkg/tool/illumos_amd64"
GOVCS=""
GOVERSION="go1.16.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/whorfin/navidrome/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3292860960=/tmp/go-build -gno-record-gcc-switches"
What did you do?
After building the latest Navidrome (master from https://github.com/navidrome/navidrome), the binary sporadically crashes with:
runtime: port_getn on fd 4 failed (errno=0)
fatal error: runtime: netpoll failed
runtime stack:
runtime.throw(0xe01def, 0x17)
/opt/ooce/go-1.16/src/runtime/panic.go:1117 +0x72
runtime.netpoll(0x71c11d, 0x12202ce5e8f688)
/opt/ooce/go-1.16/src/runtime/netpoll_solaris.go:249 +0x505
runtime.findrunnable(0xc000034000, 0x0)
/opt/ooce/go-1.16/src/runtime/proc.go:2879 +0x3ee
runtime.schedule()
/opt/ooce/go-1.16/src/runtime/proc.go:3125 +0x2d7
runtime.park_m(0xc000001080)
/opt/ooce/go-1.16/src/runtime/proc.go:3274 +0x9d
runtime.mcall(0x57eac8)
/opt/ooce/go-1.16/src/runtime/asm_amd64.s:327 +0x64
goroutine 1 [chan receive, 26 minutes]:
github.com/oklog/run.(*Group).Run(0xc000052d20, 0xc0003fa0b0, 0x1)
/home/whorfin/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:43 +0xed
github.com/navidrome/navidrome/cmd.runNavidrome()
/home/whorfin/navidrome/cmd/root.go:59 +0x107
github.com/navidrome/navidrome/cmd.glob..func2(0x25e8540, 0x266d098, 0x0, 0x0)
/home/whorfin/navidrome/cmd/root.go:31 +0x25
github.com/spf13/cobra.(*Command).execute(0x25e8540, 0xc000092020, 0x0, 0x0, 0x25e8540, 0xc000092020)
/home/whorfin/go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:856 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0x25e8540, 0xc000000180, 0x200000003, 0xc000000180)
/home/whorfin/go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:960 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
/home/whorfin/go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:897
github.com/navidrome/navidrome/cmd.Execute()
/home/whorfin/navidrome/cmd/root.go:39 +0x65
main.main()
/home/whorfin/navidrome/main.go:11 +0x2b
That (errno=0) appears to be impossible.
What did you expect to see?
Ideally, no crashes. If there were a crash, I’d expect a meaningful errno.
What did you see instead?
A crash with errno=0
About this issue
- State: closed
- Created 3 years ago
- Comments: 27 (12 by maintainers)
Here is a script that will trace some of the port_getn() calls for a process:

e.g., some output from an acme-dns process:

Critically this script looks at two things:
- libc-level port_getn() call, including the per-thread errno value that Go should be collecting via errno()
- portfs system call made by port_getn() which is what truss is showing you as well, including the errno value that the kernel put in place for the system call return (this is separate but related to the libc errno value)

There is a conditional print if we detect a failure return (-1) where errno is 0, which should print something like:

It would be good to capture this sort of trace output while the problem is occurring, so that we can determine whether the C library is mishandling the error numbers here, or if it is some part of the Go machinery for fetching thread local errno after a C library call, or something else.