runtime: Performance workitems hang when trying to kill build servers at shutdown
We’ve seen workitems hang while trying to kill compiler servers at shutdown. Because the workitem timeout is 4 hours, the affected PRs sit waiting for hours and clog the queues.
These workitems just sit running the following command:
[2020/06/18 18:41:23][INFO] $ dotnet build-server shutdown
[2020/06/18 18:41:23][INFO] Shutting down MSBuild server...
[2020/06/18 18:41:23][INFO] Shutting down VB/C# compiler server...
[2020/06/18 18:41:23][INFO] VB/C# compiler server shut down successfully.
Maybe it is a dotnet build-server issue.
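One way to keep a hung shutdown from holding a workitem until the 4 hour timeout is to give the command its own, much shorter, timeout. This is a minimal sketch in Python, assuming the harness is able to wrap the call. The `run_with_timeout` and `shutdown_build_servers` helpers and the 120 second limit are hypothetical, not part of the performance scripts.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it exceeded timeout_s seconds."""
    try:
        # On timeout, subprocess.run kills the child, waits for it,
        # and then raises TimeoutExpired.
        completed = subprocess.run(cmd, timeout=timeout_s)
        return completed.returncode
    except subprocess.TimeoutExpired:
        return None


def shutdown_build_servers(timeout_s=120):
    """Hypothetical wrapper: bound `dotnet build-server shutdown` instead of waiting indefinitely."""
    code = run_with_timeout(["dotnet", "build-server", "shutdown"], timeout_s)
    if code is None:
        print("build-server shutdown timed out and was killed", file=sys.stderr)
    return code
```

Only the direct child process is killed on timeout, so any servers that are still alive would need separate cleanup.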
cc: @dotnet/runtime-infrastructure @DrewScoggins @billwert @adamsitnik
About this issue
- State: closed
- Created 4 years ago
- Reactions: 2
- Comments: 50 (49 by maintainers)
At this point we’re going to stop. We enabled this leg optimistically, not because we had an actual problem but because it seemed like a good idea. Since it has proven so problematic (and for little real gain), it’s not worth continuing to try to fix it. Should we wind up with a huge influx of issues this would have caught, we will revisit it.
Thanks. Once we have the logs, if the job is unstable we should disable it until we have a fix.
Just merged the logging fix.
I’m guessing it got as far as calling into MSBuild here https://github.com/dotnet/sdk/blob/b1223209644d900702287faea8e9b71f95ec49f8/src/Cli/dotnet/BuildServer/MSBuildServer.cs#L18 which ultimately tries to connect to all dotnet processes in turn, with a 2x 30 sec timeout on each https://github.com/microsoft/msbuild/blob/93fec27d7168675a369729446ad96aaaaa84137f/src/Build/BackEnd/Components/Communications/NodeProviderOutOfProcBase.cs#L126 but I would expect each connection attempt to fail immediately unless the node was MSBuild. If it does connect, it then tries to read.
But, who knows what is going on – to investigate, you should
This will immediately show what it is doing.
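The 2x 30 sec per-process timeout mentioned above gives a rough upper bound on how long the connection attempts alone could take. The sketch below assumes the attempts run sequentially and that every one of them runs out its full timeout, which the comment itself does not expect. The function name and its parameters are illustrative only.

```python
def worst_case_shutdown_seconds(dotnet_process_count, attempts_per_process=2, timeout_s=30):
    """Upper bound on connect time if every attempt on every process times out."""
    return dotnet_process_count * attempts_per_process * timeout_s


WORKITEM_TIMEOUT_S = 4 * 60 * 60  # the 4 hour workitem timeout from the issue

for count in (1, 10, 240):
    seconds = worst_case_shutdown_seconds(count)
    print(count, "processes:", seconds, "s,", "exceeds" if seconds > WORKITEM_TIMEOUT_S else "within", "workitem timeout")
```

Under these assumptions it would take about 240 dotnet processes to use up the whole 4 hours, so connection timeouts alone seem an unlikely explanation for the hang.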