runtime: [Perf] Regressions in System.Memory.Span.BinarySearch
Run Information
Architecture | x64 |
---|---|
OS | ubuntu 18.04 |
Baseline | d148c34bdfbd49e3d820eaa4dcbc47832d52d0d2 |
Compare | bd3fb963e701743eb6847b898eda8571ead97155 |
Diff | Diff |
Regressions in System.Memory.Span<Byte>
Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
---|---|---|---|---|---|---|---|---|---|---|
BinarySearch - Duration of single invocation | 9.28 ns | 11.80 ns | 1.27 | 0.03 | True |
Historical Data in Reporting System
Repro
git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Memory.Span<Byte>*'
Payloads
Histogram
System.Memory.Span<Byte>.BinarySearch(Size: 512)
Docs
- Profiling workflow for dotnet/runtime repository
- Benchmarking workflow for dotnet/runtime repository
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 17 (17 by maintainers)
Yes, we want to add it, and it has come up frequently in discussions with @BruceForstall. I will see if I can add something behind a switch so we can try it and validate it on the benchmarks.
Make sure that you do not have any `COMPlus_` variables like `Tier=0` or `ReadyToRun=0` set. I also use the `-d` switch to have the harness print the disassembly.

I did a little investigation, and it seems the JCC erratum might be affecting the performance. If you look at the full test history on Windows, it is unstable, so the improvement that we saw in https://github.com/dotnet/perf-autofiling-issues/issues/28 could be just another low point of that instability on Windows after your change.
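
For the earlier advice about stray `COMPlus_` variables, a minimal sketch of how one might list them before running the benchmarks is shown here. The helper name and the example variable are hypothetical; only the `COMPlus_` prefix comes from the comment above.

```python
import os


def find_prefixed_vars(environ, prefix="COMPlus_"):
    """Return the variables whose names start with `prefix`.

    The comparison ignores case because Windows treats environment
    variable names case-insensitively.
    """
    wanted = prefix.upper()
    return {k: v for k, v in environ.items() if k.upper().startswith(wanted)}


if __name__ == "__main__":
    leftovers = find_prefixed_vars(os.environ)
    if leftovers:
        for name, value in sorted(leftovers.items()):
            print(f"{name}={value}")
    else:
        print("no COMPlus_ variables set")
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try out, for example `find_prefixed_vars({"COMPlus_Example": "0", "PATH": "/bin"})`.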
Here is the diff on windows/x64:
On Linux, the benchmark seems pretty consistent, with less variation, if you look at the full test history.
Here is the diff on linux/x64:
Notice that both Linux and Windows have a `sub` instruction that crosses the alignment boundary, triggering the JCC erratum, and it could be that the effect is more evident on Linux than on Windows.
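
To make the boundary condition concrete, here is a small sketch that checks whether an instruction (or a macro-fused pair such as `sub` followed by a conditional jump, treated as one unit) crosses or ends on a 32-byte boundary, which is the condition the JCC erratum concerns. The function name and the offsets used are hypothetical and are not taken from the diffs in this issue.

```python
BOUNDARY = 32  # bytes; the JCC erratum concerns 32-byte boundaries


def touches_boundary(start: int, length: int, boundary: int = BOUNDARY) -> bool:
    """Return True if the byte range [start, start + length) crosses a
    boundary or ends exactly on one."""
    if length <= 0:
        raise ValueError("length must be positive")
    end = start + length  # address of the first byte after the range
    crosses = start // boundary != (end - 1) // boundary
    ends_on = end % boundary == 0
    return crosses or ends_on


# Hypothetical offsets within a method body.
print(touches_boundary(30, 5))  # True: bytes 30..34 straddle offset 32
print(touches_boundary(28, 4))  # True: the range ends exactly at offset 32
print(touches_boundary(32, 4))  # False: bytes 32..35 sit inside one block
```

Feeding it the address and encoded length of each jump in the harness disassembly would be one way to see which instructions are candidates for the penalty.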