runtime: GZipStream.ReadByte 10x slower than on Desktop
Description
Reading a binlog file takes 4 seconds on .NET Desktop, but 40 seconds on Core 3.1.
Originally reported by @vatsan-madhavan in https://github.com/KirillOsenkov/MSBuildStructuredLog/issues/376
Repro:
git clone https://github.com/vatsan-madhavan/MSBuildStructuredLogSummary
msbuild /r
- Download the binlog from https://1drv.ms/u/s!Ah63gfGkKca4lsl2sy71gpkpM9TCog?e=ifgsMj and save it anywhere, e.g. C:\temp\vatsan\vatsan.binlog
MSBuildBinLogSummary\bin\Debug\net472\MSBuildBinLogSummary.exe --log-file C:\temp\Vatsan\vatsan.binlog
- takes 4 secondsMSBuildBinLogSummary\bin\Debug\netcoreapp3.1\MSBuildBinLogSummary.exe --log-file C:\temp\Vatsan\vatsan.binlog
- takes 40 seconds
The stack that I saw was taking all the time is ReadByte from a GzipStream, it’s called millions of times (to read the binlog):
| ZLibNative.ZLibStreamHandle.Inflate Line 300
-- | --
| Inflater.Inflate Line 281
| Inflater.ReadInflateOutput Line 266
| Inflater.ReadOutput Line 122
| Inflater.InflateVerified Line 92
| Inflater.Inflate Line 55
| DeflateStream.ReadByte Line 229
| GZipStream.ReadByte Line 72
| BinaryReader.ReadBoolean Line 236
| BuildEventArgsReader.ReadTargetFinishedEventArgs Line 359
| BuildEventArgsReader.Read Line 71
| BinLogReader.Replay Line 78
| BinLogReader.Replay Line 37
| StructuredLoggerUtil..ctor Line 30
| Program.ProcessBinaryLog Line 29
| Program.Main Line 24
Configuration
- 3.1.301
- Windows 10
- x64
Regression?
.NET Framework 4.7.2 is 4 seconds, .NET Core 3.1 is 40 seconds
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 3
- Comments: 38 (32 by maintainers)
Links to this issue
Commits related to this issue
- Tweak the buffer size when reading the GZipStream. See here for details: https://github.com/dotnet/runtime/issues/39233#issuecomment-745598847 — committed to KirillOsenkov/MSBuildStructuredLog by KirillOsenkov 4 years ago
should we add this to the 7.0 milestone? it seems important enough to me so it doesn’t get lost in the Future milestone.
netcore 3.1:
netfx 472:
Focussed on the part that stands out as taking more time between the two. It looks like the netcore version is doing byte reads from the byte stream by reading a byte and then I’d guess that the native compression is pushing the rest of the contents back 1 byte and reading another byte to keep it full, that would explain the memcpy to me. Netfx is doing array based reads. I’d suggest that the msbuild library might want to add in a buffered reader around the binary reader.
On .NET Framework, you use this switch: https://docs.microsoft.com/en-us/dotnet/framework/migration-guide/retargeting/4.5.2-4.8#suggestion zlib is the only mechanism for .NET Core; it’s the primary mechanism for .NET Framework, which just kept the old managed implementations around for compatibility.
Makes sense. Using a BufferedStream for now is a good way to go. Even if/when this issue is addressed, BufferedStream would still be likely to help your use case.
I have a minimal repro of this issue in case it helps. With code like this:
When running this in .NET Framework 4.6.2, I get numbers like this:
When running in .NET 6, I get numbers like this:
So it’s ~10-11x slower here. In some cases in a larger codebase I’ve seen it be 13-14x slower. @stephentoub any thoughts about fixing this in .NET 7?
I had a look into this to see if it was anything I could diagnose and/or fix. The answer seems to be no but what I found may be of use to someone more knowledgeable. Using managed profiling tools it looks like it’s just spending a lot of time reading and spinning, not helpful. Using VTune I think it’s clear that it’s spending a lot of time in window_output_flush which is memcopying. Here’s what I see: