vstest: CollectDumpOnTestSessionHang doesn't produce a dump file
Description
I’m trying to troubleshoot hanging builds on a CI server. I found this, which seems very promising:
https://github.com/microsoft/vstest-docs/blob/master/RFCs/0028-BlameCollector-Hang-Detection.md
However, when I use the hang detector, I don’t get a dump file.
Steps to reproduce
The test hangs are intermittent, so they are hard to reproduce.
dotnet vstest
is invoked with:
<lots of DLLs> --Parallel --logger:"trx;LogFileName=NUnitTestsCore.trx" --logger:"console;verbosity=minimal" --ResultsDirectory:.../build/test-reports --Settings:...\tmpCF7A.tmp
The settings file is auto generated and contains something like this:
<RunSettings>
<RunConfiguration>
<MaxCpuCount>4</MaxCpuCount>
</RunConfiguration>
<DataCollectionRunSettings>
<DataCollectors>
<DataCollector friendlyName="blame" enabled="True">
<Configuration>
<ResultsDirectory>...\build</ResultsDirectory>
<CollectDumpOnTestSessionHang TestTimeout="120000" DumpType="full"/>
</Configuration>
</DataCollector>
</DataCollectors>
</DataCollectionRunSettings>
</RunSettings>
Expected behavior
I expect the hang detector to detect a hang and produce a dump file.
Actual behavior
The hang detector did detect a hang after ~2 minutes:
The active test run was aborted. Reason: Test host process crashed
...
Test Run Aborted.
Attachments:
...\build\test-reports\4a680b77-23cd-471a-9b82-ead6630865fa\Sequence_af08f6cfd55f4dd5989add68f10ea91f.xml
However, it only produces a sequence file, not a dump.
Note that the sequence file ends up in the results directory given on the command line (--ResultsDirectory), rather than the ResultsDirectory in the settings file.
Diagnostic logs
None produced by the above command.
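For what it’s worth, one way to capture them on a future run (a sketch; the log path is illustrative) is the built-in --Diag switch, which writes vstest, testhost, and datacollector logs:
dotnet vstest <lots of DLLs> --Parallel --Settings:...\tmpCF7A.tmp --Diag:logs\vstest.log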
Environment
Windows Server 2012, .NET Core SDK 3.0.100
About this issue
- State: closed
- Created 4 years ago
- Comments: 22 (11 by maintainers)
Problem solved: the part about the 4.6 targeting pack was a red herring. Installing VS extension development support in VS2017 did it.
Actually, there is a lot. In the latest net5.0 release (I think since preview 6), we are leveraging the Diagnostics NETCore client to create hang dumps. This works on Windows (with any target framework) and on Linux (with netcoreapp3.1 and newer). There is no need for procdump.exe when creating hang dumps, nor for the temporary folder.
To trigger a hang dump you can now simply do:
dotnet test --blame-hang-timeout 2min
or:
vstest.console /Blame:"CollectHangDump;TestTimeout=2min"
For crash dumps the situation is similar to before, but it errors out a bit more clearly. There you still need procdump, because that flow needs to attach to a running process and detect failure, which is no easy task. But luckily crash dumps are usually far less interesting than hang dumps, because when the process crashes it often has an easy-to-see reason.
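For example, collecting a crash dump alongside the run could look like this (a sketch; it assumes procdump.exe is on PATH or pointed to by the PROCDUMP_PATH environment variable):
dotnet test --blame-crash --blame-crash-dump-type full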
From dotnet test help:
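(The help output wasn’t captured above; the following is an approximate sketch of the blame-related switches in the .NET 5 SDK, paraphrased rather than verbatim:)
--blame                          Run the tests in blame mode to isolate problematic tests.
--blame-crash                    Collect a crash dump when the test host crashes.
--blame-crash-dump-type <TYPE>   Crash dump type: full or mini.
--blame-hang                     Collect a hang dump when a test exceeds the timeout.
--blame-hang-dump-type <TYPE>    Hang dump type: full, mini, or none.
--blame-hang-timeout <TIMEOUT>   Per-test timeout (e.g. 2min) after which a hang dump is triggered.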
We are all actually using 2019, sorry. I am sure you need at least these workloads. The Visual Studio Extension Development workload should be optional if you skip the vsix-generating step in the script; see below.
And then, from the individual components, you’d need the Portable Pack and .NET 4.5.1.
I almost never run all acceptance tests locally. You should be good to go with just the unit tests, or at most the smoke tests.
I did see the same issues (and more) when joining this project, and I never got around to updating the installation guide. Sorry about that. I will be changing our release pipeline a lot, and imho you don’t need to build the vsix locally in most cases. You can comment out those steps in build.ps1 and it should still build. If you need more help, ping me on Twitter or here; I can spend 15 minutes showing you stuff. 😃
I think you need VS Enterprise; some of these DLLs ship only with the Enterprise edition, like “Microsoft.VisualStudio.CodeCoverage.Shim”.
Hehe, these are pretty outdated; go ahead with only the unit tests locally. The acceptance and smoke tests will get validated on the CI. Plus, I don’t think the blame data collector has any E2E tests.
I’m not trying to be annoying here, just wondering if you (maintainers/collaborators) are seeing these test errors as well?
@provegard please do. You can tag me to help out with the review. It was a pet project of mine but I never got round to polishing it, will help in any way I can.