go: runtime: Windows binaries built with -race occasionally deadlock

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 17 (16 by maintainers)

Most upvoted comments

Alrighty, thanks to @aclements and a Windows laptop, we have a reproducer, a theory, and a partial fix.

The problem is a race between SuspendThread and ExitProcess on Windows. The order of events is as follows:

1. Thread 1: Suspend (asynchronously) Thread 2.
2. Thread 2: Call ExitProcess, which terminates all threads except Thread 2.
3. Thread 2: In ExitProcess, receives asynchronous notification to suspend, and stops.

This race is already handled in the runtime for the usual exits by putting a lock around suspending a thread (and effectively disallowing it in certain cases, like exit), but in race mode __tsan_fini (called by racefini) calls ExitProcess itself, bypassing that lock. The fix is to just grab this lock before calling into __tsan_fini.
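
To make the locking idea concrete, here is a rough user-space model in Go. It is only a sketch, not the runtime's actual code: exitGuard, trySuspend, and beginExit are hypothetical names, the callbacks stand in for the SuspendThread/resume sequence and for ExitProcess (or __tsan_fini), and an "exiting" flag is assumed as the way suspension gets disallowed once exit starts.

```go
package main

import (
	"fmt"
	"sync"
)

// exitGuard is a hypothetical model of the lock described above.
// It serializes thread suspension against process exit.
type exitGuard struct {
	mu      sync.Mutex
	exiting bool // set once exit has begun; suspension is refused afterwards
}

// trySuspend runs the suspend-and-resume sequence while holding the lock,
// so an exit cannot start in the middle of it. It reports false, without
// calling suspend, if exit has already begun.
func (g *exitGuard) trySuspend(suspend func()) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.exiting {
		return false
	}
	suspend()
	return true
}

// beginExit waits for any in-flight suspension to finish, marks the
// process as exiting, and only then calls exit. In this model, exit is a
// stand-in for ExitProcess or for a call into __tsan_fini.
func (g *exitGuard) beginExit(exit func()) {
	g.mu.Lock()
	g.exiting = true
	g.mu.Unlock()
	exit()
}

func main() {
	var g exitGuard
	fmt.Println("suspend before exit:", g.trySuspend(func() {}))
	g.beginExit(func() {}) // placeholder for the real exit call
	fmt.Println("suspend after exit:", g.trySuspend(func() {}))
}
```

In this model the first trySuspend succeeds and the second is refused, because beginExit has already run. The point is that every path that ends in ExitProcess has to go through the same guard; a path that skips it, as __tsan_fini did, reopens the race.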

Unfortunately this raises a bigger issue: what if C code, called from Go, calls ExitProcess on Windows? We have no way to synchronize asynchronous preemption with that like we do with exits we can actually control. One thought is that ExitProcess already calls a bunch of DLL hooks; could we throw in our own to side-step this issue maybe? More thought on this problem is required.