go: os/exec: failures with "netpollBreak write failed" on linux-amd64 since 2021-11-10
greplogs --dashboard -md -l -e 'fatal error: runtime: netpollBreak write failed'
2021-11-11T04:54:05-4b27d40/linux-amd64-race
2021-11-10T21:32:50-f410786/linux-amd64-fedora
(Note the 2-year gap and difference in platforms here! This looks like a regression.)
2019-11-04T23:41:34-383b447/darwin-386-10_14
2019-11-04T16:32:38-7dcd343/darwin-amd64-nocgo
2019-11-02T21:51:21-177a36a/dragonfly-amd64
2019-11-02T21:51:14-a3ffb0d/darwin-amd64-10_14
2019-11-02T21:51:07-40b7455/darwin-amd64-nocgo
2019-11-02T05:52:33-8de0bb7/netbsd-amd64-8_0
2019-11-01T21:41:41-9bde9b4/darwin-amd64-10_14
2019-11-01T05:38:51-e96fd13/darwin-amd64-10_14
About this issue
- State: closed
- Created 3 years ago
- Comments: 16 (15 by maintainers)
I’ve also seen two failures of the form:
This is a failure attempting to add netpollBreakRd to epoll, which seems to be in the same vein.
I’m going to fix the races in that init function and we’ll see if that gets rid of the crashes. I think it is worthwhile either way.
I did manage to reproduce this on a linux-amd64-fedora gomote after ~30min with:
I think there is a race condition here. The call to f.Stat could occur before the pipe is created, so it fails. Then the pipe could be created, perhaps with the same descriptor as f. Then the init function calls f.Close, closing the pipe descriptor. This doesn’t seem like a likely race, but it doesn’t seem impossible.
errno 32 is EPIPE, meaning the other end of the pipe (netpollBreakRd) is closed. The runtime never closes that FD, so I suspect this is a bug of something closing an FD it doesn’t own.