runtime: [PERF] Regression on AMD

Between 3.1 and master (5.0) … 33% degradation

These numbers are for Plaintext, but the same regression appears on all scenarios.

-n Plaintext --webHost KestrelSockets -j /var/jenkins_home/workspace/baseline_citrine_amd/build/../src/Benchmarks/benchmarks.plaintext.json --description Baseline31 --aspnetCoreVersion 3.1 --runtimeVersion 3.1 --sdk 3.1.102

RequestsPerSecond:           3,980,990
Max CPU (%):                 89
WorkingSet (MB):             96
Avg. Latency (ms):           3.25
Startup (ms):                275
First Request (ms):          30.53
Latency (ms):                0.1
Total Requests:              80,017,797
Duration: (ms)               20,100
Socket Errors:               0
Bad Responses:               0
Build Time (ms):             8,502
Published Size (KB):         115,446
SDK:                         3.1.102
Runtime:                     3.1.2
ASP.NET Core:                3.1.2

-n Plaintext --webHost KestrelSockets -j /var/jenkins_home/workspace/baseline_citrine_amd/build/../src/Benchmarks/benchmarks.plaintext.json --description Baseline

RequestsPerSecond:           2,988,782
Max CPU (%):                 77
WorkingSet (MB):             100
Avg. Latency (ms):           3.42
Startup (ms):                253
First Request (ms):          28.98
Latency (ms):                0.16
Total Requests:              60,073,627
Duration: (ms)               20,100
Socket Errors:               0
Bad Responses:               0
Build Time (ms):             3,501
Published Size (KB):         119,145
SDK:                         5.0.100-preview.2.20120.3
Runtime:                     5.0.0-preview.2.20125.16
ASP.NET Core:                5.0.0-preview.2.20126.7
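For anyone checking the headline figure against the two runs above, here is a minimal sketch of the arithmetic. It only uses the two RequestsPerSecond values reported above; the helper name `relative_change` is made up for illustration. The percentage depends on which run is taken as the baseline, which is presumably where the 33% comes from.

```python
def relative_change(baseline: float, current: float) -> float:
    """Fractional change of `current` relative to `baseline`."""
    return (current - baseline) / baseline


# RequestsPerSecond values copied from the two runs above.
rps_31 = 3_980_990  # Baseline31 (runtime 3.1.2)
rps_50 = 2_988_782  # Baseline (runtime 5.0.0-preview.2)

# Throughput of 5.0 measured against 3.1 (the drop as seen from 3.1).
drop_vs_31 = relative_change(rps_31, rps_50)

# Throughput of 3.1 measured against 5.0 (how much faster 3.1 is).
gain_vs_50 = relative_change(rps_50, rps_31)

print(f"5.0 relative to 3.1: {drop_vs_31:+.1%}")  # -24.9%
print(f"3.1 relative to 5.0: {gain_vs_50:+.1%}")  # +33.2%
```

So 3.1 is about 33% faster than 5.0 on this run, which is the same gap as 5.0 being about 25% slower than 3.1.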

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 20 (20 by maintainers)

Most upvoted comments

Could someone list where to get the benchmark from and what command line to test with (I didn’t see it in dotnet/performance)?

It’s one of the ASP.NET TechEmpower benchmarks. As soon as I get the trace, I will share it with you.

I am almost sure it’s more related to thread scheduling than to instructions. If you check the link you will see that we can get 712K RPS at 20 cores, but it goes down to 187K RPS when using the full 48 cores. This machine being an AMD might not be the cause, but the fact that it has more cores than the other machines we usually test on adds some new information.
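To put the core-count observation in numbers, here is a rough sketch that works only from the two rounded figures quoted above (712K RPS at 20 cores, 187K RPS at 48 cores). The scenario behind those figures is not stated here, so treat this as back-of-the-envelope arithmetic rather than a measurement. The helper `per_core_rps` is hypothetical.

```python
def per_core_rps(total_rps: float, cores: int) -> float:
    """Average requests per second handled per core."""
    return total_rps / cores


# Rounded figures quoted in the comment above.
runs = {
    20: 712_000,  # RPS with 20 cores
    48: 187_000,  # RPS with all 48 cores
}

per_core = {cores: per_core_rps(rps, cores) for cores, rps in runs.items()}
for cores, value in per_core.items():
    print(f"{cores} cores: {runs[cores]:,} RPS total, {value:,.0f} RPS per core")

# Ratio of total throughput when going from 20 to 48 cores.
total_ratio = runs[48] / runs[20]
print(f"48-core throughput is {total_ratio:.0%} of the 20-core throughput")
```

Adding cores lowers total throughput here instead of raising it, and per-core throughput falls by roughly an order of magnitude. That pattern is consistent with contention or scheduling overhead, but it does not prove it.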