runtime: Performance regressions in Quaternion.Conjugate and Quaternion.Negate
It looks like the Conjugate and Negate methods of the Quaternion type have regressed compared to 3.1.
@tannergooding could you please take a look and triage it? I assume that this might be acceptable similarly to https://github.com/dotnet/runtime/issues/39035 but I don’t have enough knowledge to make any decisions here.
Repro
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp3.1 netcoreapp5.0 --filter 'System.Numerics.Tests.Perf_Quaternion.ConjugateBenchmark' 'System.Numerics.Tests.Perf_Quaternion.Negat*'
The full numbers for each benchmark are listed below:
System.Numerics.Tests.Perf_Quaternion.ConjugateBenchmark
Result | Base | Diff | Ratio | Alloc Delta | Modality | Operating System | Bit | Processor Name | Base V | Diff V |
---|---|---|---|---|---|---|---|---|---|---|
Slower | 1.30 | 7.91 | 0.16 | +0 | | Windows 10.0.18363.959 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
Slower | 6.43 | 18.76 | 0.34 | +0 | | manjaro | X64 | Intel Core i7-4771 CPU 3.50GHz (Haswell) | 3.1.6 | 5.0.20.41714 |
Slower | 12.90 | 21.39 | 0.60 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
Slower | 12.90 | 27.30 | 0.47 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.7 | 5.0.20.41714 |
Slower | 14.44 | 24.09 | 0.60 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
Slower | 1.00 | 7.67 | 0.13 | +0 | | Windows 10.0.18363.959 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
Faster | 9.44 | 5.20 | 1.81 | +0 | | Windows 10.0.19041.450 | Arm | Microsoft SQ1 3.0 GHz | 3.1.6 | 5.0.20.41714 |
Slower | 8.01 | 23.80 | 0.34 | +0 | | macOS Mojave 10.14.5 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) | 3.1.6 | 5.0.20.41714 |
System.Numerics.Tests.Perf_Quaternion.NegateBenchmark
Result | Base | Diff | Ratio | Alloc Delta | Modality | Operating System | Bit | Processor Name | Base V | Diff V |
---|---|---|---|---|---|---|---|---|---|---|
Slower | 1.54 | 7.44 | 0.21 | +0 | | Windows 10.0.18363.959 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
Slower | 6.49 | 17.96 | 0.36 | +0 | | manjaro | X64 | Intel Core i7-4771 CPU 3.50GHz (Haswell) | 3.1.6 | 5.0.20.41714 |
Slower | 13.67 | 22.12 | 0.62 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
Slower | 12.52 | 20.97 | 0.60 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.7 | 5.0.20.41714 |
Slower | 14.06 | 18.74 | 0.75 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
Slower | 1.26 | 7.64 | 0.16 | +0 | bimodal | Windows 10.0.18363.959 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
Faster | 9.52 | 5.05 | 1.89 | +0 | | Windows 10.0.19041.450 | Arm | Microsoft SQ1 3.0 GHz | 3.1.6 | 5.0.20.41714 |
Slower | 7.99 | 23.96 | 0.33 | +0 | | macOS Mojave 10.14.5 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) | 3.1.6 | 5.0.20.41714 |
System.Numerics.Tests.Perf_Quaternion.NegationOperatorBenchmark
Result | Base | Diff | Ratio | Alloc Delta | Modality | Operating System | Bit | Processor Name | Base V | Diff V |
---|---|---|---|---|---|---|---|---|---|---|
Slower | 1.56 | 7.47 | 0.21 | +0 | | Windows 10.0.18363.959 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
Slower | 6.48 | 18.24 | 0.36 | +0 | | manjaro | X64 | Intel Core i7-4771 CPU 3.50GHz (Haswell) | 3.1.6 | 5.0.20.41714 |
Slower | 12.90 | 22.14 | 0.58 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
Slower | 12.90 | 22.12 | 0.58 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.7 | 5.0.20.41714 |
Slower | 13.67 | 24.44 | 0.56 | +0 | | ubuntu 16.04 | Arm64 | Unknown processor | 3.1.6 | 5.0.20.41714 |
Slower | 1.26 | 7.16 | 0.18 | +0 | | Windows 10.0.18363.959 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz | 3.1.6 | 5.0.20.41714 |
Faster | 9.60 | 5.15 | 1.86 | +0 | | Windows 10.0.19041.450 | Arm | Microsoft SQ1 3.0 GHz | 3.1.6 | 5.0.20.41714 |
Slower | 7.98 | 21.66 | 0.37 | +0 | | macOS Mojave 10.14.5 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) | 3.1.6 | 5.0.20.41714 |
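For reading the tables: the Ratio column appears to be Base divided by Diff, so values below 1 mean 5.0 is slower than 3.1. A minimal Python sketch of that arithmetic follows, using three rows copied from the ConjugateBenchmark table. The helper names `ratio` and `classify` are made up for illustration, the time unit is not stated in the tables, and the printed ratios can differ from the table in the last digit because the table was presumably computed from unrounded timings.

```python
# Rows copied from the ConjugateBenchmark table: (configuration, Base, Diff).
# The time unit is not stated in the tables, so values are treated as plain numbers.
rows = [
    ("Windows 10.0.18363.959 X64", 1.30, 7.91),
    ("manjaro X64", 6.43, 18.76),
    ("Windows 10.0.19041.450 Arm", 9.44, 5.20),
]


def ratio(base, diff):
    """Base time divided by Diff time (hypothetical helper)."""
    return base / diff


def classify(base, diff):
    """Label a row the way the Result column does: ratio below 1 means slower."""
    return "Slower" if ratio(base, diff) < 1.0 else "Faster"


for name, base, diff in rows:
    slowdown = diff / base  # how many times longer the Diff run took
    print(f"{name}: ratio={ratio(base, diff):.2f} "
          f"({classify(base, diff)}), diff takes {slowdown:.1f}x the base time")
```

Read this way, the x64 and x86 Windows rows show the largest relative slowdowns, while the Windows Arm row is the only configuration that got faster.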
@DrewScoggins this regression did not get detected by the bot as it was added very recently 👍
About this issue
- State: closed
- Created 4 years ago
- Comments: 20 (20 by maintainers)
@tannergooding can you work up a fix? I’m OOF the next few days.
Seems like this might clear the RC2 bar.
I would just add the [Intrinsic] attribute for now.
Looks like the Conjugate and Negate methods aren’t being inlined in .NET 5, which is leading to the perf slowdown.
However, as indicated, the managed implementation hasn’t changed and is actually rather simple: https://github.com/dotnet/corefx/blob/release/3.1/src/System.Numerics.Vectors/src/System/Numerics/Quaternion.cs#L124, so this is likely to do with one of the JIT changes.
Here is the netcoreapp3.1 disassembly: System.Numerics.Tests.Perf_Quaternion-asm.md.txt
Here is the net5.0 disassembly: System.Numerics.Tests.Perf_Quaternion-asm.md.txt
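To illustrate how little work these two operations involve, here is a rough Python sketch of the underlying math. It is not the .NET source: the `Quaternion` dataclass and the `conjugate` and `negate` functions are stand-ins written for this example, assuming the usual definitions (conjugate flips the sign of the vector part X, Y, Z and keeps W; negate flips all four components).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quaternion:
    """Illustrative stand-in for a quaternion with components X, Y, Z, W."""
    x: float
    y: float
    z: float
    w: float


def conjugate(q: Quaternion) -> Quaternion:
    # Flip the sign of the vector part, keep the scalar part.
    return Quaternion(-q.x, -q.y, -q.z, q.w)


def negate(q: Quaternion) -> Quaternion:
    # Flip the sign of every component.
    return Quaternion(-q.x, -q.y, -q.z, -q.w)


q = Quaternion(1.0, 2.0, 3.0, 4.0)
print(conjugate(q))
print(negate(q))
```

Each operation is a handful of sign flips, so the cost of a non-inlined call (passing and returning a four-component struct) can plausibly outweigh the arithmetic itself, which would be consistent with the slowdown reported above.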