runtime: Regression in AsSpan for uint[] Datatype
Netcoreapp 3.0 Sdk Version 3.0.100-preview-009841
BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17134.471 (1803/April2018Update/Redstone4)
Intel Xeon CPU E5-1650 v4 3.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100-alpha1-009630
[Host] : .NET Core 3.0.0-preview1-26928-03 (CoreCLR 4.6.26927.03, CoreFX 4.6.26927.03), 64bit RyuJIT
Job-TQETEJ : .NET Core 3.0.0-preview1-26928-03 (CoreCLR 4.6.26927.03, CoreFX 4.6.26927.03), 64bit RyuJIT
BuildConfiguration=Release-Intrinsics Toolchain=netcoreapp3.0 InvocationCount=1
MaxIterationCount=20 UnrollFactor=1 WarmupCount=1
Method | Mean | Error | StdDev | Median | Extra Metric |
---|---|---|---|---|---|
AsSpanForUnitBenchmark | 1.445 us | 0.8379 us | 0.9649 us | 0.8400 us | - |
Sdk version 2.1.401 Netcoreapp2.1
BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17134.471 (1803/April2018Update/Redstone4)
Intel Xeon CPU E5-1650 v4 3.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100-alpha1-009630
[Host] : .NET Core 2.1.6 (CoreCLR 4.6.27019.06, CoreFX 4.6.27019.05), 64bit RyuJIT
Job-JARWGC : .NET Core 2.1.6 (CoreCLR 4.6.27019.06, CoreFX 4.6.27019.05), 64bit RyuJIT
BuildConfiguration=Release Toolchain=netcoreapp2.1 InvocationCount=1
MaxIterationCount=20 UnrollFactor=1 WarmupCount=1
Method | Mean | Error | StdDev | Median | Extra Metric |
---|---|---|---|---|---|
AsSpanForUnitBenchmark | 48.75 ns | 29.60 ns | 29.07 ns | 65.00 ns | - |
The code for this benchmark is:

    public uint[] input;
    public int end;

    // Runs before every iteration: picks a random slice length and refills the array.
    [IterationSetup(Target = nameof(AsSpanForUnitBenchmark))]
    public void setup()
    {
        int min = 0;
        int max = 100000;
        Random randNum = new Random();
        end = randNum.Next(min, max);
        input = new uint[max];
        for (int i = 0; i < input.Length; i++)
        {
            input[i] = (uint)randNum.Next(min, max);
        }
    }

    [Benchmark]
    public void AsSpanForUnitBenchmark()
    {
        // Measured operation: create a span over the first `end` elements.
        input.AsSpan(0, end);
    }
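For readers unfamiliar with the call being measured, here is a minimal sketch of what `AsSpan(0, end)` does. It is not part of the original benchmark: the class name `AsSpanDemo` and the array contents are made up for illustration. It shows that the call creates a view over the first `end` elements without copying, so the work being timed is only the bounds check and span construction.

```csharp
using System;

// Hypothetical demo class, not taken from the issue.
class AsSpanDemo
{
    static void Main()
    {
        uint[] input = { 10, 20, 30, 40, 50 };
        int end = 3;

        // Same call shape as the benchmark: start at 0, take `end` elements.
        Span<uint> span = input.AsSpan(0, end);

        // The span is a window over the array, so a write through the span
        // is visible in the original array.
        span[0] = 99;

        Console.WriteLine(span.Length); // prints 3
        Console.WriteLine(input[0]);    // prints 99
    }
}
```

Because no elements are copied, the cost should not depend on `end`, which is why a change in the generated machine code shows up so clearly in the timings.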
About this issue
- State: closed
- Created 6 years ago
- Reactions: 1
- Comments: 28 (28 by maintainers)
I’ll take a look, recently I stumbled upon a JIT change that gets rid of some of these funny moves. Maybe we’re lucky and it’s the same case.
Also, the second mov might be eliminated by another change I have.
Am going to close since I think this is fixed – please reopen if that’s not the case.
I don’t see much perf difference between 3.0 Preview 2 and 2.1 on the test case that was added in dotnet/performance#260 to try and capture this issue.
@Anipik can you remeasure your test with 3.0 preview 2?
This should have been improved by dotnet/coreclr#19429 and dotnet/coreclr#22454 …
and master with the above PRs merged does show promise…
Not entirely. But given phase ordering and all, I think it’s generally accepted that a register allocator should incorporate copy elimination, either as part of its function, or done prior to.
Good find @Anipik 🥂
I do not think we have any active perf issue on this one.
I believe this regression was introduced by https://github.com/dotnet/coreclr/pull/20771. Here is the disassembly difference:
Before:
After:
The JIT-generated code is bigger and has more data dependencies, and thus runs slower. I have verified that reverting the commit restores the code to what it used to be.
cc @GrabYourPitchforks @ahsonkhan