go: runtime: fatal error: found bad pointer in Go heap
What version of Go are you using (go version)?
$ go version
go version go1.11.2 darwin/amd64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/force/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/force/.golang"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/Cellar/go/1.11.2/libexec"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.11.2/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/n5/c1hcw05942s2pfbdf3pgjnh80000gq/T/go-build447375688=/tmp/go-build -gno-record-gcc-switches -fno-common"
We build with GOOS=linux GOARCH=amd64.
We run the program on a different machine from the one it is built on.
Running machine info
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
$ uname -a
Linux (hostname) 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64 GNU/Linux
What did you do?
We’re building an online game server with Go. We ran into random crashes like this one.
crash report
runtime: pointer 0xc009a038ca to unused region of span span.base()=0xc0035c2000 span.limit=0xc0035c4000 span.state=1
fatal error: found bad pointer in Go heap (incorrect use of unsafe or cgo?)
runtime stack:
runtime.throw(0xc046cf, 0x3e)
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/panic.go:608 +0x72 fp=0xc0000abf00 sp=0xc0000abed0 pc=0x42bf02
runtime.findObject(0xc009a038ca, 0x0, 0x0, 0xc0024b5380, 0x7f67d219edc0, 0xd)
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/mbitmap.go:399 +0x3b6 fp=0xc0000abf50 sp=0xc0000abf00 pc=0x413bf6
runtime.wbBufFlush1(0xc000086a00)
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/mwbbuf.go:252 +0xd1 fp=0xc0000abfb8 sp=0xc0000abf50 pc=0x428121
runtime.wbBufFlush.func1()
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/mwbbuf.go:195 +0x3a fp=0xc0000abfd0 sp=0xc0000abfb8 pc=0x457e1a
runtime.systemstack(0x0)
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/asm_amd64.s:351 +0x66 fp=0xc0000abfd8 sp=0xc0000abfd0 pc=0x459af6
runtime.mstart()
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/proc.go:1229 fp=0xc0000abfe0 sp=0xc0000abfd8 pc=0x4307d0
goroutine 143 [running]:
runtime.systemstack_switch()
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/asm_amd64.s:311 fp=0xc0053e7d38 sp=0xc0053e7d30 pc=0x459a80
runtime.wbBufFlush(0x0, 0x0)
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/mwbbuf.go:194 +0x4e fp=0xc0053e7d58 sp=0xc0053e7d38 pc=0x427fbe
runtime.typeBitsBulkBarrier(0xaf3640, 0xc006019b80, 0xc0053e7f28, 0x10)
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/mbitmap.go:737 +0x111 fp=0xc0053e7dc0 sp=0xc0053e7d58 pc=0x4145c1
runtime.sendDirect(0xaf3640, 0xc00298c600, 0xc0053e7f28)
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/chan.go:312 +0x50 fp=0xc0053e7df8 sp=0xc0053e7dc0 pc=0x4053d0
runtime.send(0xc0023b4f60, 0xc00298c600, 0xc0053e7f28, 0xc0053e7e88, 0x3)
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/chan.go:283 +0xde fp=0xc0053e7e28 sp=0xc0053e7df8 pc=0x40533e
runtime.chansend(0xc0023b4f60, 0xc0053e7f28, 0x1, 0x755280, 0x1)
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/chan.go:191 +0x4df fp=0xc0053e7ea8 sp=0xc0053e7e28 pc=0x40514f
runtime.chansend1(0xc0023b4f60, 0xc0053e7f28)
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/chan.go:125 +0x35 fp=0xc0053e7ee0 sp=0xc0053e7ea8 pc=0x404c65
(snip)
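The first line of the report gives a pointer value together with the bounds of a span. As a rough illustration only (the runtime's real check in findObject involves more state than this), the numbers can be compared as a half-open range test. The helper name inSpan is made up for this sketch; the three constants are copied from the crash report above.

```go
package main

import "fmt"

// inSpan reports whether p lies in the half-open range [base, limit).
// Hypothetical helper for illustration; not a runtime function.
func inSpan(p, base, limit uint64) bool {
	return p >= base && p < limit
}

func main() {
	// Values copied from the first line of the crash report.
	const (
		p     = 0xc009a038ca
		base  = 0xc0035c2000
		limit = 0xc0035c4000
	)
	fmt.Printf("span size: %d bytes\n", uint64(limit-base))
	fmt.Println("pointer inside span:", inSpan(p, base, limit))
}
```

With these values the span covers 8192 bytes and the reported pointer falls outside it, which matches the "unused region of span" wording.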
Other info:
- We don’t use CGO.
- Crash happens even with the -race flag, without any race report.
- We don’t use many “unsafe” libraries. This is our glide.lock. Some packages including go-sqlite3 in the glide.lock are not used in the game server.
- span.state is always 1.
- Stacktrace is always runtime.wbBufFlush() … runtime.findObject(). But the caller of wbBufFlush() varies.
- Crash happens more often when GOMAXPROCS=1. It happens within 30 minutes.
- Crash happens with Go 1.11.2, 1.11.3, 1.11.4, and 1.12beta1.
- Crash doesn’t happen with GOMAXPROCS=1 GODEBUG=invalidptr=0, even after more than 6 hours (false positive?)
- Crash doesn’t happen with Go 1.10.5 and GOMAXPROCS=1, even after more than 9 hours. (Go 1.11 regression?)
- No crash with GODEBUG=gcstoptheworld=1
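When comparing runs across settings like the ones listed above, it can help to log the effective configuration at startup so each crash report can be matched to the Go version, GOMAXPROCS, and GODEBUG value in effect. The following is a minimal sketch, not part of the game server; describeRun is a hypothetical helper name, and it uses only standard library calls.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// describeRun summarizes the settings that were varied between test runs.
// Hypothetical helper for illustration.
func describeRun() string {
	return fmt.Sprintf("go=%s os/arch=%s/%s GOMAXPROCS=%d GODEBUG=%q",
		runtime.Version(), runtime.GOOS, runtime.GOARCH,
		runtime.GOMAXPROCS(0), // an argument of 0 queries without changing the value
		os.Getenv("GODEBUG"))
}

func main() {
	fmt.Println(describeRun())
}
```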
I don’t think this issue is the same as #26243, because the stack trace and environment are different.
Can we do anything to investigate this issue?
About this issue
- State: closed
- Created 6 years ago
- Comments: 30 (26 by maintainers)
Commits related to this issue
- runtime: add test for go function argument scanning Derived from Naoki's reproducer. Update #29362 Change-Id: I1cbd33b38a2f74905dbc22c5ecbad4a87a24bdd1 Reviewed-on: https://go-review.googlesource.c... — committed to golang/go by randall77 5 years ago
- runtime: don't scan go'd function args past length of ptr bitmap Use the length of the bitmap to decide how much to pass to the write barrier, not the total length of the arguments. The test needs e... — committed to golang/go by randall77 5 years ago
@aclements Would you take a look? I’m not sure about Go’s GC implementation.
In the case of Keep.func1, stkmap.nbit is 0, but stkmap.bytedata[0] is 0xf. So I think we should test stkmap.nbit > 0 before calling bulkBarrierBitmap().
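The observation above can be modeled outside the runtime. The sketch below is a simplified stand-in, not the runtime's code: the bitvector type, ptrbit, slotsByArgSize, and slotsByBitmapLen are names invented for illustration, and the 4-word argument size is an assumption. It shows how a bitmap with zero valid bits but leftover data in its first byte yields bogus "pointer" slots when the scan length comes from the argument size, and none when the scan is bounded by the bitmap length.

```go
package main

import "fmt"

// bitvector is a simplified model of a pointer bitmap:
// n is the number of valid bits, bytedata holds the bits.
type bitvector struct {
	n        int32
	bytedata []byte
}

// ptrbit reports whether bit i is set in the underlying bytes.
// It does not consult n, which is the point of the demonstration.
func (bv bitvector) ptrbit(i int) bool {
	return bv.bytedata[i/8]>>(uint(i)%8)&1 == 1
}

// slotsByArgSize scans argWords bits regardless of how many are valid.
func slotsByArgSize(bv bitvector, argWords int) []int {
	var slots []int
	for i := 0; i < argWords; i++ {
		if bv.ptrbit(i) {
			slots = append(slots, i)
		}
	}
	return slots
}

// slotsByBitmapLen scans at most bv.n bits, so stale data is ignored.
func slotsByBitmapLen(bv bitvector, argWords int) []int {
	n := int(bv.n)
	if argWords < n {
		n = argWords
	}
	return slotsByArgSize(bv, n)
}

func main() {
	// nbit == 0 but bytedata[0] == 0xf, as observed for Keep.func1.
	bv := bitvector{n: 0, bytedata: []byte{0xf}}
	fmt.Println("by arg size:  ", slotsByArgSize(bv, 4))
	fmt.Println("by bitmap len:", slotsByBitmapLen(bv, 4))
}
```

In this model the first scan reports slots 0 through 3 as pointers while the bounded scan reports none, which is the behavior a check on nbit is meant to guarantee.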