runtime: Huge slowdowns on threaded operations when debugger attached (macOS)
When debugging our application (debugger attached, no breakpoints set), performance can drop to the point where it becomes frustrating to do anything. On occasion we also see OS-level hard locks lasting seconds to minutes, which may be related.
Reproducible in both VS Code and JetBrains Rider. This is exclusive to netcore (2.0 and 2.1); it does not occur under the mono or net471 runtime environments. It also seems limited to macOS, as I have not been able to reproduce it on Windows.
This can easily be reproduced on our game framework project: https://github.com/ppy/osu-framework (building should require no extra steps beyond checking it out).
- Start in VisualTests configuration
- Switch to DelayedLoad in the left menu
- Observe severe frame drops when threaded load events occur
With a debugger attached, the frame rate should drop to less than 1 fps, while it is easy to maintain hundreds of fps without one.
It seems to be directly related to the creation of threads, specifically with the TaskCreationOptions.LongRunning flag. After removing this flag from hot paths (#1 #2), performance returns to normal.
I’ve been trying to reproduce this with a more isolated test case but have not succeeded yet. Some pointers on moving forward in diagnosing this issue would be appreciated!
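A starting point for such an isolated test could be to time task creation with and without TaskCreationOptions.LongRunning, once with a debugger attached and once without. The sketch below is only an assumption about what that might look like: the class name LongRunningTimer, the method TimeTasks and the iteration count are hypothetical, and it has not been confirmed to reproduce the slowdown described here.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class LongRunningTimer
{
    // Starts `count` trivial tasks one after another with the given options,
    // waits for each to finish, and returns the total elapsed milliseconds.
    // With LongRunning, the default scheduler typically uses a dedicated
    // thread per task instead of a thread pool thread.
    public static double TimeTasks(int count, TaskCreationOptions options)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            Task.Factory.StartNew(() => { }, options).Wait();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        const int count = 1000; // hypothetical iteration count

        double pooled = TimeTasks(count, TaskCreationOptions.None);
        double dedicated = TimeTasks(count, TaskCreationOptions.LongRunning);

        Console.WriteLine($"Debugger attached: {Debugger.IsAttached}");
        Console.WriteLine($"None:        {pooled:F0} ms ({pooled / count:F4} ms per task)");
        Console.WriteLine($"LongRunning: {dedicated:F0} ms ({dedicated / count:F4} ms per task)");
    }
}
```

Running this twice on the same machine, once under the debugger and once without, and comparing the per-task figures would show whether the LongRunning path alone accounts for the difference.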
About this issue
- State: closed
- Created 6 years ago
- Reactions: 2
- Comments: 32 (17 by maintainers)
Commits related to this issue
- Fix xplat debugging perf problem. Issue #18705 Add threadId to DebuggerIPCEvent so we don't need to use the slow DAC functions (because of extra memory reads) to get it. — committed to mikem8361/coreclr by mikem8361 6 years ago
- Fix xplat debugging perf problem. Issue #18705 Add threadId to DebuggerIPCEvent so we don't need to use the slow DAC functions (because of extra memory reads) to get it. Fixed CorDBIPC_BUFFER_SIZE ... — committed to mikem8361/coreclr by mikem8361 6 years ago
- Fix xplat debugging perf problem. (#19911) Issue #18705 Add threadId to DebuggerIPCEvent so we don't need to use the slow DAC functions (because of extra memory reads) to get it. Fixed CorDBIP... — committed to dotnet/coreclr by mikem8361 6 years ago
- Fix xplat debugging perf problem. (#19911) Issue #18705 Add threadId to DebuggerIPCEvent so we don't need to use the slow DAC functions (because of extra memory reads) to get it. Fixed CorDBIPC_BUF... — committed to mikem8361/coreclr by mikem8361 6 years ago
- Fix xplat debugging perf problem. (#19911) (#20054) Issue #18705 Add threadId to DebuggerIPCEvent so we don't need to use the slow DAC functions (because of extra memory reads) to get it. Fixe... — committed to dotnet/coreclr by mikem8361 6 years ago
I would like to add that we’ve seen the same behaviour on Linux. Debugging our application is extremely slow on Linux and fast on Windows.
Using your small repo, I got the following results:
- On Windows: 465 ms (0.0465 ms per call)
- On Linux: 14730 ms (1.473 ms per call)
Tried on Ubuntu 18.04 and 16.04 with .NET SDK 2.1.401 (runtime version 2.1.3, commit 124038c13e).
The application makes extensive use of async/await and tasks. This is often on the call stack if you randomly break:
This has been fixed in master.