runtime: Performance regressions in Quaternion.Conjugate and Quaternion.Negate

It looks like the Conjugate and Negate methods of the Quaternion type have regressed compared to .NET Core 3.1.

@tannergooding could you please take a look and triage it? I assume this might be acceptable, similar to https://github.com/dotnet/runtime/issues/39035, but I don’t have enough knowledge to make any decisions here.

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp3.1 netcoreapp5.0 --filter 'System.Numerics.Tests.Perf_Quaternion.ConjugateBenchmark' 'System.Numerics.Tests.Perf_Quaternion.Negat*'

Full results for each benchmark:

System.Numerics.Tests.Perf_Quaternion.ConjugateBenchmark

| Result | Base | Diff | Ratio | Alloc Delta | Modality | Operating System | Bit | Processor Name | Base V | Diff V |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Slower | 1.30 | 7.91 | 0.16 | +0 | | Windows 10.0.18363.959 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
| Slower | 6.43 | 18.76 | 0.34 | +0 | | manjaro | X64 | Intel Core i7-4771 CPU 3.50GHz (Haswell) | 3.1.6 | 5.0.20.41714 |
| Slower | 12.90 | 21.39 | 0.60 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
| Slower | 12.90 | 27.30 | 0.47 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.7 | 5.0.20.41714 |
| Slower | 14.44 | 24.09 | 0.60 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
| Slower | 1.00 | 7.67 | 0.13 | +0 | | Windows 10.0.18363.959 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
| Faster | 9.44 | 5.20 | 1.81 | +0 | | Windows 10.0.19041.450 | Arm | Microsoft SQ1 3.0 GHz | 3.1.6 | 5.0.20.41714 |
| Slower | 8.01 | 23.80 | 0.34 | +0 | | macOS Mojave 10.14.5 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) | 3.1.6 | 5.0.20.41714 |

System.Numerics.Tests.Perf_Quaternion.NegateBenchmark

| Result | Base | Diff | Ratio | Alloc Delta | Modality | Operating System | Bit | Processor Name | Base V | Diff V |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Slower | 1.54 | 7.44 | 0.21 | +0 | | Windows 10.0.18363.959 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
| Slower | 6.49 | 17.96 | 0.36 | +0 | | manjaro | X64 | Intel Core i7-4771 CPU 3.50GHz (Haswell) | 3.1.6 | 5.0.20.41714 |
| Slower | 13.67 | 22.12 | 0.62 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
| Slower | 12.52 | 20.97 | 0.60 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.7 | 5.0.20.41714 |
| Slower | 14.06 | 18.74 | 0.75 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
| Slower | 1.26 | 7.64 | 0.16 | +0 | bimodal | Windows 10.0.18363.959 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
| Faster | 9.52 | 5.05 | 1.89 | +0 | | Windows 10.0.19041.450 | Arm | Microsoft SQ1 3.0 GHz | 3.1.6 | 5.0.20.41714 |
| Slower | 7.99 | 23.96 | 0.33 | +0 | | macOS Mojave 10.14.5 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) | 3.1.6 | 5.0.20.41714 |

System.Numerics.Tests.Perf_Quaternion.NegationOperatorBenchmark

| Result | Base | Diff | Ratio | Alloc Delta | Modality | Operating System | Bit | Processor Name | Base V | Diff V |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Slower | 1.56 | 7.47 | 0.21 | +0 | | Windows 10.0.18363.959 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
| Slower | 6.48 | 18.24 | 0.36 | +0 | | manjaro | X64 | Intel Core i7-4771 CPU 3.50GHz (Haswell) | 3.1.6 | 5.0.20.41714 |
| Slower | 12.90 | 22.14 | 0.58 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
| Slower | 12.90 | 22.12 | 0.58 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.7 | 5.0.20.41714 |
| Slower | 13.67 | 24.44 | 0.56 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
| Slower | 1.26 | 7.16 | 0.18 | +0 | | Windows 10.0.18363.959 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
| Faster | 9.60 | 5.15 | 1.86 | +0 | | Windows 10.0.19041.450 | Arm | Microsoft SQ1 3.0 GHz | 3.1.6 | 5.0.20.41714 |
| Slower | 7.98 | 21.66 | 0.37 | +0 | | macOS Mojave 10.14.5 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) | 3.1.6 | 5.0.20.41714 |
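
A note on reading the Ratio column: the tables do not define it, but the numbers are consistent with Ratio = Base / Diff, so a value below 1 means the 5.0 build is slower. The following is a minimal Python sketch under that assumption, using three rows copied from the ConjugateBenchmark table. The `ratio` helper and the short platform labels are illustrative names, not part of the benchmark tooling.

```python
# Rows copied from the ConjugateBenchmark table: (platform, base, diff, reported ratio).
rows = [
    ("Windows x64", 1.30, 7.91, 0.16),
    ("manjaro x64", 6.43, 18.76, 0.34),
    ("Windows Arm", 9.44, 5.20, 1.81),
]


def ratio(base: float, diff: float) -> float:
    """Assumed definition: Ratio = Base / Diff (below 1 means the diff build is slower)."""
    return base / diff


for platform, base, diff, reported in rows:
    computed = ratio(base, diff)
    label = "Slower" if computed < 1 else "Faster"
    # The table values are already rounded, so compare with a small tolerance.
    assert abs(computed - reported) < 0.011
    print(f"{platform}: ratio {computed:.2f} ({label}), diff/base = {diff / base:.1f}x")
```

Because the published Base and Diff values are rounded, a recomputed ratio can differ from the reported one in the last digit, which is why the check uses a tolerance.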

@DrewScoggins this regression did not get detected by the bot as it was added very recently 👍

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 20 (20 by maintainers)

Most upvoted comments

@tannergooding can you work up a fix? I’m OOF the next few days.

Seems like this might clear the RC2 bar.

I would just add the [Intrinsic] attribute for now.

Looks like the Conjugate and Negate methods aren’t being inlined in .NET 5, which is leading to the perf slowdown.

However, as indicated, the managed implementation hasn’t changed and is actually rather simple: https://github.com/dotnet/corefx/blob/release/3.1/src/System.Numerics.Vectors/src/System/Numerics/Quaternion.cs#L124, so this likely has to do with one of the JIT changes.

Here is the netcoreapp3.1 disassembly: System.Numerics.Tests.Perf_Quaternion-asm.md.txt
Here is the net5.0 disassembly: System.Numerics.Tests.Perf_Quaternion-asm.md.txt
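
To give a sense of how little work these two operations do, here is an illustrative sketch in Python. It is not the runtime's C# source: the `Quaternion` class and both functions are hypothetical stand-ins that only mirror the math. The conjugate flips the sign of the vector part (X, Y, Z) and keeps W, while negation flips all four components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quaternion:
    """Hypothetical stand-in for System.Numerics.Quaternion (X, Y, Z, W)."""
    x: float
    y: float
    z: float
    w: float


def conjugate(q: Quaternion) -> Quaternion:
    # Negate the vector part, keep the scalar part.
    return Quaternion(-q.x, -q.y, -q.z, q.w)


def negate(q: Quaternion) -> Quaternion:
    # Negate every component.
    return Quaternion(-q.x, -q.y, -q.z, -q.w)


q = Quaternion(1.0, 2.0, 3.0, 4.0)
print(conjugate(q))
print(negate(q))
```

With bodies this small, the cost of a call that is not inlined can plausibly outweigh the arithmetic itself, which fits the slowdown described above.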