go: runtime: nil-dereference panic refers to addr=0x8
What version of Go are you using (go version
)?
$ go version go version go1.18.7 linux/amd64
Does this issue reproduce with the latest release?
Yes, confirmed via playground.
What operating system and processor architecture are you using (go env
)?
go env
Output
$ go env GO111MODULE="" GOARCH="amd64" GOBIN="" GOEXE="" GOEXPERIMENT="" GOFLAGS="" GOHOSTARCH="amd64" GOHOSTOS="linux" GOINSECURE="" GONOPROXY="" GONOSUMDB="" GOOS="linux" GOPRIVATE="" GOPROXY="https://proxy.golang.org,direct" GOROOT="/snap/go/9981" GOSUMDB="sum.golang.org" GOTMPDIR="" GOTOOLDIR="/snap/go/9981/pkg/tool/linux_amd64" GOVCS="" GOVERSION="go1.18.7" GCCGO="gccgo" GOAMD64="v1" AR="ar" CC="gcc" CXX="g++" CGO_ENABLED="1" GOMOD="/dev/null" GOWORK="" CGO_CFLAGS="-g -O2" CGO_CPPFLAGS="" CGO_CXXFLAGS="-g -O2" CGO_FFLAGS="-g -O2" CGO_LDFLAGS="-g -O2" PKG_CONFIG="pkg-config" GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build662566094=/tmp/go-build -gno-record-gcc-switches"
What did you do?
Run the following program:
package main
type A struct {
F int
}
func (a *A) Foo() {
if a != nil {
a.F++
}
}
type B struct {
A
}
func main() {
var a *A
a.Foo()
var b *B
b.Foo()
}
Link to a playground.
What did you expect to see?
No panic for both method calls.
What did you see instead?
Panic on the b.Foo()
method call:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x454ea2]
goroutine 1 [running]:
main.main()
/tmp/sandbox1079324543/prog.go:21 +0x2
Discussion
Technically, I would expect b.Foo()
to be equivalent to (*A).Foo(nil)
in this case, at least when the struct offset of embedded A
field is zero.
It is clear that the following would case panics in any case (struct offset is not zero, equivalent to (*A).Foo(0x8)
):
type B struct {
F2 int
A
}
As well as this code (always requires pointer de-reference):
type B struct {
*A
}
If I understand correctly, as of now the behavior of b.Foo()
is similar to (*A).Foo(&b.A)
, where first b.A
causes an operation analogous to pointer de-reference. I think this is a surprising behavior that makes it hard to embed structs (promoting methods) and make the new type nil-safe.
The only non-error-prone way to fix it currently, is to make a named type instead of embedding and manually expose all the methods:
type B A
func (b *B) Foo() {
(*A)(b).Foo()
}
As I mentioned previously, I see no technical reason why the compiler cannot call (*A).Foo(nil)
instead of requiring an additional boilerplate. Otherwise it defeats the purpose of struct composition, when nil-safety is required for promoted methods.
I believe this change doesn’t break the backward compatibility promise, only makes more programs safe(er).
Open questions are:
- Can this be implemented without slowing down existing programs?
- Should the case of
(*A).Foo(0x8)
be translated to(*A).Foo(nil)
as well, so that the struct offset doesn’t matter? This complicates the implementation, so likely is unnecessary.
I admit this is not a bug per-se and works “as intended”, however it’s surprising enough to be worth fixing. Also, when using unsafe
/cgo, I was able to somehow confuse the compiler to emit case similar to (*A).Foo(0x8)
(thus skipping the initial nil check). I will try to provide a reproducer for it as well.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 16 (15 by maintainers)
Yeah, the fix for #42673 fixes this one as well.
Ideally, if the compiler/runtime can enforce
<nil ptr> + <any off> -> nil
for method calls, it will make the original use case I described much safer (where methods of embedded struct atoff=0
can test for a nil pointer, which is propagated from a parent struct being nil).I hope that these cases with constants can start a larger discussion for possible spec change/clarification.
Well, with the CL above it still panic with addr=0x8. The SSA dump before scheduling is
Basically, after inlining and constant folding, the compiler is smart enough to know that B is at constant address 0 and A is at constant address 8. We issue a nil check for B, but there is no scheduling edge between that nil check and the constant 8.
Usually if a pointer is derived from another pointer, that derivation (pointer addition, or folded into a load with offset) is scheduled later than the nil check. But in this case the pointer is really a constant, which can be scheduled arbitrarily (and happened to be before the nil check in this case).
I don’t see an easy way to enforce scheduling order for nil check vs. constants. If we know statically a pointer is constant nil, maybe we can make derived pointers all constant nil, i.e.
(OffPtr [n] (ConstNil)) -> (ConstNil)
? Not sure if that is too much…That second case does seem like a potential quality of implementation issue. gccgo reports
That is, I think we agree that the program should crash, but the crash report from the gc compiler could be clearer as to where the bad nil dereference occurs.
Go 1.4 reports
but after that it gets more confusing due to the fact that older gc releases were not as reliable at reporting failures in inlined functions. At least I think that’s the reason, I didn’t look too deeply. The reference to
addr=0x8
starts in Go 1.10.Reopening in case this is something we want to fix in the gc compiler.
I’d expect
b.Foo()
be equivalent tob.A.Foo()
(the embedding makes the.A
implicit). And per spec, for a nilb
,b.A
panics.