go: runtime: unexpected return pc crash on linux-amd64-alpine builder
The revived linux-amd64-alpine builder has flaked twice in its short new lifetime with ‘unexpected return pc’ crashes during the cgo tests.
Here is a repro case using a gomote. (Note that if you ssh in, you have to set up your environment manually: in particular, you have to put /workdir/go/bin at the front of PATH and set GOROOT_BOOTSTRAP=/workdir/go1.4. Not sure why the environment is so messed up on Alpine; gomote run does not have these problems, only gomote ssh.)
VM=$(gomote create linux-amd64-alpine)
gomote push $VM
gomote run $VM go/src/make.bash
gomote put -mode 0777 $VM - try.sh <<'EOF'
#!/bin/bash
cd /workdir/go/misc/cgo/test
for i in $(seq 100); do
date
if ! /workdir/go/bin/go test >log 2>&1; then
cat log
fi
done
EOF
gomote run $VM try.sh
You may need to repeat try.sh a few times, depending on how flaky the machine is feeling, but most runs get at least one failure.
Here are some failures from that script:
runtime: g 3: unexpected return pc for runtime.gcenable.func1 called from 0x0
stack: frame={sp:0xc0000557c8, fp:0xc0000557e0} stack=[0xc000055000,0xc000055800)
0x000000c0000556c8: 0x000000c000055750 0x000000000040d21d <runtime.chansend+0x000000000000055d>
0x000000c0000556d8: 0x0000000000581220 0x000000c00007e060
0x000000c0000556e8: 0x00000000005e9f78 0x0000000000000000
0x000000c0000556f8: 0x0000000000000000 0x0000000000000000
0x000000c000055708: 0x0000000000000000 0x0000000000000000
0x000000c000055718: 0x000000c00007e058 0x0000000000000000
0x000000c000055728: 0x0000000000000000 0x0000000000000000
0x000000c000055738: 0x0000000000000000 0x0000000000000000
0x000000c000055748: 0x0000000000000000 0x000000c000055780
0x000000c000055758: 0x000000000040cc9d <runtime.chansend1+0x000000000000001d> 0x000000c00007e000
0x000000c000055768: 0x0000000000440bb6 <runtime.gopark+0x00000000000000d6> 0x0000000000000001
0x000000c000055778: 0x0000000000000000 0x000000c0000557b8
0x000000c000055788: 0x000000000042ba2e <runtime.bgsweep+0x000000000000008e> 0x0000000000000000
0x000000c000055798: 0x0000000000000000 0x0000000000000000
0x000000c0000557a8: 0x0000000000000000 0x000000c00007e000
0x000000c0000557b8: 0x000000c0000557d0 0x0000000000420706 <runtime.gcenable.func1+0x0000000000000026>
0x000000c0000557c8: <0x00007f8a890934b6 0x00007f8a61816b64
0x000000c0000557d8: !0x0000000000000000 >0x0000000000000000
0x000000c0000557e8: 0x0000000000000000 0x00007f8a890d3600
0x000000c0000557f8: 0x00007f8a89092acf
fatal error: unknown caller pc
runtime: g 19: unexpected return pc for runtime.gcenable.func2 called from 0x0
stack: frame={sp:0xc000050fc8, fp:0xc000050fe0} stack=[0xc000050800,0xc000051000)
0x000000c000050ec8: 0x000000000000000e 0x000000c0000061a0
0x000000c000050ed8: 0x000000c000050f60 0x000000000040d265 <runtime.chansend+0x00000000000005a5>
0x000000c000050ee8: 0x0000000000000050 0x000000c00009c000
0x000000c000050ef8: 0x0000000000000000 0x0000010000000000
0x000000c000050f08: 0x0000000000000003 0x0000000000000030
0x000000c000050f18: 0x0000000000000000 0x0000000000000050
0x000000c000050f28: 0x000000c000096058 0x000000c00007e000
0x000000c000050f38: 0x0000000000000000 0x0000000000000000
0x000000c000050f48: 0x0000000000440bb6 <runtime.gopark+0x00000000000000d6> 0x000000000040d320 <runtime.chansend.func1+0x0000000000000000>
0x000000c000050f58: 0x000000c000096000 0x000000c000050f90
0x000000c000050f68: 0x0000000000429ad3 <runtime.(*scavengerState).park+0x0000000000000053> 0x000000c000096000
0x000000c000050f78: 0x00000000005e9f78 0x0000000000000001
0x000000c000050f88: 0x0000000000000000 0x000000c000050fb8
0x000000c000050f98: 0x000000000042a0a5 <runtime.bgscavenge+0x0000000000000045> 0x00000000006f9960
0x000000c000050fa8: 0x0000000000000000 0x000000c000096000
0x000000c000050fb8: 0x000000c000050fd0 0x00000000004206a6 <runtime.gcenable.func2+0x0000000000000026>
0x000000c000050fc8: <0x00007f47256144b6 0x00007f46fdea3b64
0x000000c000050fd8: !0x0000000000000000 >0x0000000000000000
0x000000c000050fe8: 0x0000000000000000 0x00007f4725654600
0x000000c000050ff8: 0x00007f4725613acf
fatal error: unknown caller pc
This one did not happen during garbage collection:
runtime: g 20: unexpected return pc for testing.tRunner called from 0x7feeabb0dacf
stack: frame={sp:0xc000051770, fp:0xc0000517c0} stack=[0xc000051000,0xc000051800)
0x000000c000051670: 0x000000012a05f200 0x000000c0000880a0
0x000000c000051680: 0x000000c000094180 0x000000c0000516f8
0x000000c000051690: 0x000000c000102b80 0x000000c000102b60
0x000000c0000516a0: 0x0000000000000000 0x00000000005890c0
0x000000c0000516b0: 0x00000000006d7d50 0x0000000000000000
0x000000c0000516c0: 0x0000000000000000 0x0000000000000000
0x000000c0000516d0: 0x0000000000000000 0x000000c000051730
0x000000c0000516e0: 0x0000000000454a36 <runtime.sigpanic+0x00000000000002f6> 0x00000000005890c0
0x000000c0000516f0: 0x00000000006d7d50 0x000000c000051748
0x000000c000051700: 0x0000000000561ceb <misc/cgo/test.testSetgid+0x00000000000000ab> 0x000000c0001121e0
0x000000c000051710: 0x000000c000102b60 0x0000000000000001
0x000000c000051720: 0x00000000006ea660 0x00000000005eb418
0x000000c000051730: 0x000000c000051760 0x0000000000478bfe <sync.(*RWMutex).Lock+0x000000000000001e>
0x000000c000051740: 0x0000000000000000 0x000000c000051760
0x000000c000051750: 0x0000000000526bd9 <misc/cgo/test.TestSetgid+0x0000000000000019> 0x000000c0001029c0
0x000000c000051760: 0x000000c0000517b0 0x00000000004d6d15 <testing.tRunner+0x0000000000000115>
0x000000c000051770: <0x0000000000000000 0x0300000000000000
0x000000c000051780: 0x00000000004d6d80 <testing.tRunner.func2+0x0000000000000000> 0x00007feeabb0e4b6
0x000000c000051790: 0x00007feeabb4ed8c 0x0000000000000000
0x000000c0000517a0: 0x0000000000000000 0x0000000000000000
0x000000c0000517b0: 0x00007feeabb4e600 !0x00007feeabb0dacf
0x000000c0000517c0: >0x0000000000000000 0x00000000ffffffff
0x000000c0000517d0: 0x0000000000000000 0x00000000004710a1 <runtime.goexit+0x0000000000000001>
0x000000c0000517e0: 0x0000000000000000 0x0000000000000000
0x000000c0000517f0: 0x0000000000000000 0x00007feeabb0e5d2
fatal error: unknown caller pc
runtime stack:
runtime.throw({0x5ae5a1?, 0x6ea660?})
/workdir/go/src/runtime/panic.go:1047 +0x5d fp=0x7fee843e3648 sp=0x7fee843e3618 pc=0x43de7d
runtime.gentraceback(0x100000000467aba?, 0xc000100000?, 0xc000102b60?, 0x7fee843e3a18?, 0x0, 0x0, 0x7fffffff, 0x7fee843e3a08, 0x0?, 0x0)
/workdir/go/src/runtime/traceback.go:258 +0x1cf7 fp=0x7fee843e39b8 sp=0x7fee843e3648 pc=0x4658b7
runtime.addOneOpenDeferFrame.func1()
/workdir/go/src/runtime/panic.go:645 +0x6b fp=0x7fee843e3a30 sp=0x7fee843e39b8 pc=0x43d00b
runtime.systemstack()
/workdir/go/src/runtime/asm_amd64.s:492 +0x49 fp=0x7fee843e3a38 sp=0x7fee843e3a30 pc=0x46eee9
goroutine 20 [running]:
runtime.systemstack_switch()
/workdir/go/src/runtime/asm_amd64.s:459 fp=0xc0000515e8 sp=0xc0000515e0 pc=0x46ee80
runtime.addOneOpenDeferFrame(0xc0000221e0?, 0xc000094180?, 0xc000112180?)
/workdir/go/src/runtime/panic.go:644 +0x69 fp=0xc000051628 sp=0xc0000515e8 pc=0x43cf49
panic({0x5890c0, 0x6d7d50})
/workdir/go/src/runtime/panic.go:844 +0x112 fp=0xc0000516e8 sp=0xc000051628 pc=0x43d792
runtime.panicmem(...)
/workdir/go/src/runtime/panic.go:260
runtime.sigpanic()
/workdir/go/src/runtime/signal_unix.go:837 +0x2f6 fp=0xc000051740 sp=0xc0000516e8 pc=0x454a36
sync.(*RWMutex).Lock(0x0?)
/workdir/go/src/sync/rwmutex.go:147 +0x1e fp=0xc000051770 sp=0xc000051740 pc=0x478bfe
Here are the two build dashboard failures:
https://build.golang.org/log/658036e08c7a1d218c33808fdd1d6612b40502d8
runtime: g 2: unexpected return pc for runtime.forcegchelper called from 0x0
stack: frame={sp:0xc000056fb0, fp:0xc000056fe0} stack=[0xc000056800,0xc000057000)
0x000000c000056eb0: 0x0000000000000000 0x0000000000000000
0x000000c000056ec0: 0x0000000000000000 0x0000000000000000
0x000000c000056ed0: 0x0000000000000000 0x0000000000000000
0x000000c000056ee0: 0x0000000000000000 0x0000000000000000
0x000000c000056ef0: 0x0000000000000000 0x0000000000000000
0x000000c000056f00: 0x0000000000000000 0x0000000000000000
0x000000c000056f10: 0x0000000000000000 0x0000000000000000
0x000000c000056f20: 0x0000000000000000 0x0000000000000000
0x000000c000056f30: 0x0000000000000000 0x0000000000000000
0x000000c000056f40: 0x0000000000000000 0x0000000000000000
0x000000c000056f50: 0x0000000000000000 0x0000000000000000
0x000000c000056f60: 0x0000000000000000 0x0000000000000000
0x000000c000056f70: 0x0000000000000000 0x0000000000000000
0x000000c000056f80: 0x0000000000000000 0x00005637530dbdb6 <runtime.gopark+0x00000000000000d6>
0x000000c000056f90: 0x0000000000000000 0x0000000000000000
0x000000c000056fa0: 0x000000c000056fd0 0x00005637530dbc4d <runtime.forcegchelper+0x00000000000000ad>
0x000000c000056fb0: <0x0000000000000000 0x0000000000000000
0x000000c000056fc0: 0x0000000000000000 0x00007efee325e4b6
0x000000c000056fd0: 0x00007efebba04b64 !0x0000000000000000
0x000000c000056fe0: >0x0000000000000000 0x0000000000000000
0x000000c000056ff0: 0x00007efee329e600 0x00007efee325dacf
fatal error: unknown caller pc
and
https://build.golang.org/log/94cf14d78b116487dc76a921baf6ba76480a4c7a
runtime: g 5: unexpected return pc for runtime.sigpanic called from 0x7f52c162dd8c
stack: frame={sp:0xc000058700, fp:0xc000058758} stack=[0xc000058000,0xc000058800)
0x000000c000058600: 0x0000564cf403107b <runtime.write+0x000000000000003b> 0x0000000000000002
0x000000c000058610: 0x000000c000058648 0x0000564cf40109ce <runtime.recordForPanic+0x000000000000004e>
0x000000c000058620: 0x0000564cf403107b <runtime.write+0x000000000000003b> 0x0000000000000002
0x000000c000058630: 0x0000564cf4144017 0x0000000000000001
0x000000c000058640: 0x0000000000000001 0x000000c000058680
0x000000c000058650: 0x0000564cf4010cd2 <runtime.gwrite+0x00000000000000f2> 0x0000564cf4144017
0x000000c000058660: 0x0000000000000001 0x0000000000000001
0x000000c000058670: 0x000000c0000586e2 0x000000000000000e
0x000000c000058680: 0x0000564cf4040210 <runtime.systemstack+0x0000000000000030> 0x0000564cf400f3cc <runtime.fatalthrow+0x000000000000006c>
0x000000c000058690: 0x000000c0000586a0 0x000000c000007ba0
0x000000c0000586a0: 0x0000564cf400f400 <runtime.fatalthrow.func1+0x0000000000000000> 0x000000c000007ba0
0x000000c0000586b0: 0x0000564cf400f07f <runtime.throw+0x000000000000005f> 0x000000c0000586d0
0x000000c0000586c0: 0x000000c0000586f0 0x0000564cf400f07f <runtime.throw+0x000000000000005f>
0x000000c0000586d0: 0x000000c0000586d8 0x0000564cf400f0a0 <runtime.throw.func1+0x0000000000000000>
0x000000c0000586e0: 0x0000564cf414445e 0x0000000000000005
0x000000c0000586f0: 0x000000c000058748 0x0000564cf4025ca5 <runtime.sigpanic+0x00000000000002c5>
0x000000c000058700: <0x0000564cf414445e 0x000000c0000161e0
0x000000c000058710: 0x000000c000058728 0x0000000000000001
0x000000c000058720: 0x00007f52c162dd8c 0x000000c000007ba0
0x000000c000058730: 0x0000564cf41800e0 0x0000564cf40a7e14 <testing.tRunner+0x0000000000000034>
0x000000c000058740: 0x0000000000000000 0x00007f52c15ed4b6
0x000000c000058750: !0x00007f52c162dd8c >0x0000000000000000
0x000000c000058760: 0x0000000000000000 0x0000000000000000
0x000000c000058770: 0x00007f52c162d600 0x00007f52c15ecacf
0x000000c000058780: 0x0000000000000000 0x00000000ffffffff
0x000000c000058790: 0x0000564cf40a7fa0 <testing.tRunner.func1+0x0000000000000000> 0x000000c000007a00
0x000000c0000587a0: 0x000000c000058780 0x000000c000058790
0x000000c0000587b0: 0x000000c0000587d0 0x00007f52c15ed5d2
0x000000c0000587c0: 0x00007f52c15f0080 0x00007f52c162d600
0x000000c0000587d0: 0x00000000ffffffff 0x00007f52c15efbbb
0x000000c0000587e0: 0x0000000000000000 0x00007f52c15efb6d
0x000000c0000587f0: 0x00007f52c162d604 0x0000000000000000
Perhaps this is Alpine-specific, or perhaps it is musl-related. The Alpine image may have an old Linux kernel; maybe we should update it.
There are a few other open ‘unexpected return pc’ issues. Maybe they are all stale:
- #47003 is Go 1.16 on Ubuntu.
- #35005 is Go 1.13 on Alpine 3.10 (but disappears on Debian and on Alpine 3.9.4).
- #40401 is Go 1.14.6 on Windows.
- #40469 is Go 1.13.14 on Windows.
- #51707 is Go 1.16.2 on an unspecified system.
- #43496 is Go 1.15.6 on Debian (Docker golang image).
#35005 is the most interesting one, but its repro case is a very large program running under Docker.
About this issue
- State: closed
- Created 2 years ago
- Comments: 17 (16 by maintainers)
To summarize:
- Go runs signal handlers on the alternate signal stack registered with sigaltstack. All signal handlers must set SA_ONSTACK to use the signal stack and avoid smashing the goroutine stack.
- Go adds SA_ONSTACK to existing signal handlers if it is not already set.
- musl uses an internal signal handler for its setxid calls, but does not install the handler at startup. Instead, it is temporarily installed on each call to the setxid functions (in __synccall).
- That handler does not set SA_ONSTACK.
I don’t see how we can work around this in Go given that we can’t adjust the signal handler flags, nor does __synccall respect flags from an existing signal handler. We would have to make goroutine stacks much larger, which would be a significant increase in stack allocations.
There are several changes on the musl side that could address this:
- __synccall could query for an existing signal handler, and if it has SA_ONSTACK, keep that flag for its own handler. In this case, Go would install a dummy signal 34 handler at startup just to expose SA_ONSTACK.
- Per man 2 sigaction’s SA_ONSTACK description: “If an alternate stack is not available, the default stack will be used.” If this is accurate (I haven’t verified), then __synccall could set SA_ONSTACK unconditionally, which would normally make no difference, but would use Go’s sigaltstack when linked with Go.