diagnostics: dotnet-trace collect command does not respect the "--duration" parameter
Configuration
- .NET Version: 3.1.403
- dotnet-trace version: 3.1.141901+3f460984c62efa118b6bd87fa4c5f07b11074b34
- OS: Linux
- Dockerized? The app is running in a Docker container and the trace is taken within the container.
- Machine Load: High. Traces are taken during high load scenarios.
Description
Commands used:
1: ./dotnet-trace collect --process-id 1 --duration 00:00:00:30 --buffersize 1024 -o /home/application/app-trace/TRACE.nettrace
Size of trace: 101MB.
2: ./dotnet-trace collect --process-id 1 --providers Microsoft-DotNETCore-SampleProfiler --clrevents GC+GCHeapSurvivalAndMovement+Stack+Loader+GCHandle+Type+GCHeapDump+GCHeapCollect+GCHeapAndTypeNames+Exception --clreventlevel 5 --duration 00:00:00:10 --buffersize 1024 -o /home/application/app-trace/TRACE.nettrace
Size of trace: 7GB
While running these commands, the duration parameter was not honored: the collect command kept running until CTRL+C was pressed to cancel the operation. In some instances the first command finished as it should, but the resulting trace log contained no information.
This seems to be a problem with collecting a trace while the machine is under high load. Also, when trying to open the large trace file (7GB), PerfView took a substantially long time to parse it. This could also be a consequence of collecting the trace under high machine load. I have tested these commands under normal machine load and they work fine. However, I did notice that whenever you click on the CMD/Terminal window while the collect command is running, the window seems to “freeze” and the collect command always runs past the specified duration.
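As a possible stopgap for the behavior described above, here is a minimal shell sketch, not a fix for the underlying issue. It assumes GNU coreutils `timeout` is available in the container and that dotnet-trace stops on SIGINT the same way it does on CTRL+C. The helper names `to_duration` and `run_capped` are hypothetical and introduced only for illustration.

```shell
# Format a number of seconds as dd:hh:mm:ss, the format that
# dotnet-trace documents for its --duration option.
to_duration() {
  local total=$1
  printf '%02d:%02d:%02d:%02d\n' \
    $(( total / 86400 )) \
    $(( total % 86400 / 3600 )) \
    $(( total % 3600 / 60 )) \
    $(( total % 60 ))
}

# Hypothetical helper: run a command with a hard cap. After cap_seconds,
# `timeout` sends SIGINT (the signal CTRL+C produces) to the command, so a
# collector that overruns its own duration is still asked to stop.
# Assumption: GNU coreutils `timeout` is installed.
run_capped() {
  local cap_seconds=$1
  shift
  timeout --signal=INT "${cap_seconds}s" "$@"
}
```

A call might look like `run_capped 40 ./dotnet-trace collect --process-id 1 --duration "$(to_duration 30)" -o /home/application/app-trace/TRACE.nettrace`, where the 40-second cap is an arbitrary grace period on top of the requested 30 seconds. The signal only requests a stop, so the command may still take a while to exit afterwards.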
About this issue
- State: open
- Created 4 years ago
- Comments: 17 (9 by maintainers)
It’s not really about the shell you are using (WSL2, Bash, PowerShell, CMD), but rather the console that’s hosting it. The default console host on Windows has that behavior.
Could be either, but I’m guessing it’s the rundown that’s taking the majority of the time. Stopping the trace in dotnet-trace sends a rundown command to the target app, which can take a while: we’ve seen it take minutes on certain apps with a lot of type definitions to resolve. If the app is under CPU pressure, I wouldn’t be surprised if it takes tens of minutes.
This is normal too, although I’m not sure what order of magnitude of time you mean by “substantially long”. I wouldn’t expect PerfView to parse a trace file as big as 7GB in a matter of seconds; my guess is that it’d take at least several minutes. It first needs to write out the ETLX format file, then parse that again to form a TraceLog object before it can display the trace fully.