runtime: String.StartsWith slower on Linux with some characters
string.StartsWith on Linux becomes 2 orders of magnitude slower when the string contains a dash (-).
On Linux:
BenchmarkDotNet=v0.11.3, OS=centos 7
Intel Xeon CPU E5-2630L v3 1.80GHz, 2 CPU, 32 logical and 16 physical cores
[Host] : .NET Core 3.0.0-preview8-28405-07 (CoreCLR 4.700.19.37902, CoreFX 4.700.19.40503), 64bit RyuJIT
Job-UBBGCZ : .NET Core 3.0.0-preview8-28405-07 (CoreCLR 4.700.19.37902, CoreFX 4.700.19.40503), 64bit RyuJIT
Runtime=Core Toolchain=netcoreapp3.0
|         Method |        Mean |      Error |     StdDev |
|--------------- |------------:|-----------:|-----------:|
|     StartsWith |    35.79 ns |  0.1069 ns |  0.0948 ns |
| StartsWithDash | 4,411.13 ns | 35.0054 ns | 29.2311 ns |
On Windows (only for reference, the hardware is not the same):
BenchmarkDotNet=v0.11.3, OS=Windows 10.0.18362
Intel Xeon CPU E3-1271 v3 3.60GHz, 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-preview8-013656
[Host] : .NET Core 3.0.0-preview8-28405-07 (CoreCLR 4.700.19.37902, CoreFX 4.700.19.40503), 64bit RyuJIT
DefaultJob : .NET Core 3.0.0-preview8-28405-07 (CoreCLR 4.700.19.37902, CoreFX 4.700.19.40503), 64bit RyuJIT
|         Method |     Mean |     Error |    StdDev |
|--------------- |---------:|----------:|----------:|
|     StartsWith | 69.42 ns | 0.2523 ns | 0.2236 ns |
| StartsWithDash | 69.47 ns | 1.4200 ns | 1.6904 ns |
Benchmark code:
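using BenchmarkDotNet.Attributes;

public class StartsWithBenchmark
{
    // Identical strings except for the last character: 'z' versus '-'.
    private string _str1 = "aaaaaaaaaz";
    private string _str2 = "aaaaaaaaa-";

    // StartsWith(string) with no StringComparison argument is culture-sensitive.
    [Benchmark]
    public bool StartsWith()
    {
        return _str1.StartsWith("i");
    }

    [Benchmark]
    public bool StartsWithDash()
    {
        return _str2.StartsWith("i");
    }
}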
The performance issue does not occur when using an ordinal comparison (StringComparison.Ordinal).
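To illustrate the workaround, here is a minimal sketch that contrasts the default culture-sensitive overload with an explicit ordinal comparison. The class and method names (PrefixCheck, StartsWithCulture, StartsWithOrdinal) are hypothetical and not part of the original benchmark. The only framework calls used are string.StartsWith(string) and string.StartsWith(string, StringComparison).

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class PrefixCheck
{
    // Culture-sensitive: StartsWith(string) compares using the current culture.
    // This is the overload used by the benchmark above.
    public static bool StartsWithCulture(string value, string prefix)
    {
        return value.StartsWith(prefix);
    }

    // Ordinal: compares the UTF-16 code units directly, with no culture rules.
    // According to the report above, this path does not show the slowdown.
    public static bool StartsWithOrdinal(string value, string prefix)
    {
        return value.StartsWith(prefix, StringComparison.Ordinal);
    }
}
```

Switching to ordinal is only a drop-in replacement when linguistic matching is not needed, since the two comparisons can disagree for some inputs (for example, strings containing characters that culture-sensitive comparison treats as ignorable).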
About this issue
- State: closed
- Created 5 years ago
- Reactions: 4
- Comments: 20 (19 by maintainers)
@tarekgh It’s not truly helpful to simply label this as a corner case. People expect consistent performance, and if you have a method that is suddenly orders of magnitude slower because certain characters are used, it’s like dropping time bombs into people’s code.
https://github.com/dotnet/coreclr/pull/26759 and https://github.com/dotnet/coreclr/pull/26621 combined together have fixed this problem.
Fun fact: while working on improving the performance of StartsWith on Linux, we found and fixed an 18-year-old bug in ICU: https://github.com/unicode-org/icu/pull/840 😉
@kevingosse thanks for your measurements. I believe @adamsitnik's PR is going to help some with the StartsWith scenario.
@tarekgh is there any reason why we should not implement StartsWith in the following way:
Edit: never mind, I’ve got an answer from @kevingosse in https://github.com/dotnet/coreclr/pull/26481 😉