runtime: Performance regression: SpanCastBenchmark on AMD

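System.Numerics.Tests.Constructor defines two kinds of benchmarks: ConstructorBenchmark and SpanCastBenchmark. All the ConstructorBenchmark benchmarks have improved for 3.0 (from about 7.0-7.2 ns down to about 2.66 ns). However, SpanCastBenchmark has regressed on AMD: every variant is roughly 3.7x to 5x slower (about 0.6-0.8 ns on 2.2 versus about 2.7-2.9 ns on 3.0).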

Repro

git clone https://github.com/dotnet/performance.git
cd performance
# if you don't have the cli installed and want the python script to download the latest cli for you
py .\scripts\benchmarks_ci.py -f netcoreapp2.2 netcoreapp3.0 --filter *SpanCastBenchmark*
# if you already have the cli installed
dotnet run -p .\src\benchmarks\micro\MicroBenchmarks.csproj -c Release -f netcoreapp2.2 --runtimes netcoreapp2.2 netcoreapp3.0 --filter *SpanCastBenchmark*
BenchmarkDotNet=v0.11.3.1003-nightly, OS=Windows 10.0.18362
AMD Ryzen 7 1800X, 1 CPU, 16 logical and 8 physical cores
  Job-ZOELMG : .NET Core 2.2.6 (CoreCLR 4.6.27817.03, CoreFX 4.6.27818.02), 64bit RyuJIT
  Job-PXOTIF : .NET Core 3.0.0-preview8-27919-09 (CoreCLR 4.700.19.36901, CoreFX 4.700.19.36905), 64bit RyuJIT
| Method                      | Mean 2.2  | Mean 3.0 |
|---------------------------- |----------:|---------:|
| SpanCastBenchmark_Byte      | 0.6217 ns | 2.683 ns |
| SpanCastBenchmark_SByte     | 0.7052 ns | 2.659 ns |
| SpanCastBenchmark_UInt16    | 0.5903 ns | 2.929 ns |
| SpanCastBenchmark_Int16     | 0.7791 ns | 2.930 ns |
| SpanCastBenchmark_UInt32    | 0.5929 ns | 2.931 ns |
| SpanCastBenchmark_Int32     | 0.7792 ns | 2.947 ns |
| SpanCastBenchmark_UInt64    | 0.7793 ns | 2.929 ns |
| SpanCastBenchmark_Int64     | 0.7793 ns | 2.930 ns |
| SpanCastBenchmark_Single    | 0.5933 ns | 2.931 ns |
| SpanCastBenchmark_Double    | 0.7880 ns | 2.878 ns |
| ConstructorBenchmark_Byte   | 7.2329 ns | 2.657 ns |
| ConstructorBenchmark_SByte  | 6.9651 ns | 2.658 ns |
| ConstructorBenchmark_UInt16 | 7.2337 ns | 2.659 ns |
| ConstructorBenchmark_Int16  | 7.2346 ns | 2.659 ns |
| ConstructorBenchmark_UInt32 | 6.9655 ns | 2.658 ns |
| ConstructorBenchmark_Int32  | 7.2319 ns | 2.663 ns |
| ConstructorBenchmark_UInt64 | 7.2346 ns | 2.659 ns |
| ConstructorBenchmark_Int64  | 7.2349 ns | 2.657 ns |
| ConstructorBenchmark_Single | 7.2335 ns | 2.660 ns |
| ConstructorBenchmark_Double | 7.2305 ns | 2.671 ns |
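
To put a number on the SpanCastBenchmark regression, the slowdown per type can be computed directly from the means reported above. This is only a rough sketch: the values are copied by hand from the results, and the names span_cast_means and slowdown are made up for illustration (they are not part of the dotnet/performance repo). At sub-nanosecond timings the ratios should be read as approximate.

```python
# Mean times in nanoseconds, copied from the results above: (2.2, 3.0).
# The dictionary and helper names are illustrative only.
span_cast_means = {
    "Byte": (0.6217, 2.683),
    "SByte": (0.7052, 2.659),
    "UInt16": (0.5903, 2.929),
    "Int16": (0.7791, 2.930),
    "UInt32": (0.5929, 2.931),
    "Int32": (0.7792, 2.947),
    "UInt64": (0.7793, 2.929),
    "Int64": (0.7793, 2.930),
    "Single": (0.5933, 2.931),
    "Double": (0.7880, 2.878),
}


def slowdown(before_ns, after_ns):
    """Return how many times slower after_ns is compared to before_ns."""
    return after_ns / before_ns


# Ratio of the 3.0 mean to the 2.2 mean for each element type.
ratios = {name: slowdown(*means) for name, means in span_cast_means.items()}

for name, ratio in ratios.items():
    print(f"SpanCastBenchmark_{name}: {ratio:.2f}x slower on 3.0")

print(f"range: {min(ratios.values()):.2f}x to {max(ratios.values()):.2f}x")
```

The same helper can be pointed at the ConstructorBenchmark rows, where the ratio comes out below 1 because those benchmarks got faster.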

/cc @danmosemsft @tannergooding @AndyAyersMS @billwert @DrewScoggins

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments

So it looks like we are now faster across the board in .NET Core 5.

I currently see:

BenchmarkDotNet=v0.12.0, OS=Windows 10.0.18363
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET Core SDK=5.0.100-preview.2.20156.4
  [Host]     : .NET Core 2.1.11 (CoreCLR 4.6.27617.04, CoreFX 4.6.27617.02), X64 RyuJIT
  Job-QSGPOG : .NET Core 2.1.11 (CoreCLR 4.6.27617.04, CoreFX 4.6.27617.02), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Runtime=.NET Core 2.1  Toolchain=netcoreapp2.1
IterationTime=250.0000 ms  MaxIterationCount=20  MinIterationCount=15
WarmupCount=1
| Method                   |      Mean |     Error |    StdDev |    Median |       Min |       Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------------------------- |----------:|----------:|----------:|----------:|----------:|----------:|------:|------:|------:|----------:|
| SpanCastBenchmark_Byte   | 0.6495 ns | 0.0172 ns | 0.0161 ns | 0.6492 ns | 0.6293 ns | 0.6797 ns |     - |     - |     - |         - |
| SpanCastBenchmark_SByte  | 0.5399 ns | 0.0254 ns | 0.0225 ns | 0.5353 ns | 0.5078 ns | 0.5913 ns |     - |     - |     - |         - |
| SpanCastBenchmark_UInt16 | 0.6667 ns | 0.0090 ns | 0.0084 ns | 0.6648 ns | 0.6538 ns | 0.6835 ns |     - |     - |     - |         - |
| SpanCastBenchmark_Int16  | 0.6714 ns | 0.0120 ns | 0.0112 ns | 0.6687 ns | 0.6555 ns | 0.6987 ns |     - |     - |     - |         - |
| SpanCastBenchmark_UInt32 | 0.5416 ns | 0.0140 ns | 0.0124 ns | 0.5384 ns | 0.5228 ns | 0.5626 ns |     - |     - |     - |         - |
| SpanCastBenchmark_Int32  | 0.5549 ns | 0.0219 ns | 0.0204 ns | 0.5520 ns | 0.5330 ns | 0.6009 ns |     - |     - |     - |         - |
| SpanCastBenchmark_UInt64 | 0.6966 ns | 0.0106 ns | 0.0094 ns | 0.6965 ns | 0.6714 ns | 0.7080 ns |     - |     - |     - |         - |
| SpanCastBenchmark_Int64  | 0.6921 ns | 0.0036 ns | 0.0034 ns | 0.6920 ns | 0.6849 ns | 0.6976 ns |     - |     - |     - |         - |
| SpanCastBenchmark_Single | 0.5451 ns | 0.0080 ns | 0.0067 ns | 0.5449 ns | 0.5378 ns | 0.5624 ns |     - |     - |     - |         - |
| SpanCastBenchmark_Double | 0.6933 ns | 0.0049 ns | 0.0046 ns | 0.6948 ns | 0.6842 ns | 0.7009 ns |     - |     - |     - |         - |

vs

BenchmarkDotNet=v0.12.0, OS=Windows 10.0.18363
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET Core SDK=5.0.100-preview.2.20156.4
  [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.15501, CoreFX 5.0.20.15501), X64 RyuJIT
  Job-YZGPFX : .NET Core 5.0.0 (CoreCLR 5.0.20.15501, CoreFX 5.0.20.15501), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Runtime=.NET Core 5.0  Toolchain=netcoreapp5.0
IterationTime=250.0000 ms  MaxIterationCount=20  MinIterationCount=15
WarmupCount=1
| Method                   |      Mean |     Error |    StdDev |    Median |       Min |       Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------------------------- |----------:|----------:|----------:|----------:|----------:|----------:|------:|------:|------:|----------:|
| SpanCastBenchmark_Byte   | 0.2575 ns | 0.0048 ns | 0.0042 ns | 0.2572 ns | 0.2513 ns | 0.2638 ns |     - |     - |     - |         - |
| SpanCastBenchmark_SByte  | 0.2732 ns | 0.0047 ns | 0.0041 ns | 0.2721 ns | 0.2663 ns | 0.2791 ns |     - |     - |     - |         - |
| SpanCastBenchmark_UInt16 | 0.4818 ns | 0.0150 ns | 0.0140 ns | 0.4869 ns | 0.4564 ns | 0.5052 ns |     - |     - |     - |         - |
| SpanCastBenchmark_Int16  | 0.5095 ns | 0.0134 ns | 0.0126 ns | 0.5060 ns | 0.4954 ns | 0.5348 ns |     - |     - |     - |         - |
| SpanCastBenchmark_UInt32 | 0.5043 ns | 0.0156 ns | 0.0146 ns | 0.4963 ns | 0.4903 ns | 0.5308 ns |     - |     - |     - |         - |
| SpanCastBenchmark_Int32  | 0.5090 ns | 0.0125 ns | 0.0117 ns | 0.5084 ns | 0.4929 ns | 0.5281 ns |     - |     - |     - |         - |
| SpanCastBenchmark_UInt64 | 0.4250 ns | 0.0143 ns | 0.0133 ns | 0.4187 ns | 0.4127 ns | 0.4611 ns |     - |     - |     - |         - |
| SpanCastBenchmark_Int64  | 0.5039 ns | 0.0085 ns | 0.0075 ns | 0.5035 ns | 0.4938 ns | 0.5239 ns |     - |     - |     - |         - |
| SpanCastBenchmark_Single | 0.5095 ns | 0.0118 ns | 0.0111 ns | 0.5092 ns | 0.4927 ns | 0.5266 ns |     - |     - |     - |         - |
| SpanCastBenchmark_Double | 0.5112 ns | 0.0155 ns | 0.0145 ns | 0.5062 ns | 0.4942 ns | 0.5313 ns |     - |     - |     - |         - |