runtime: [arm64] Perf_FileStream.FlushAsync benchmark hangs on Debian 11
Reported by @carlossanlop offline. Carlos has hit this issue on Debian 11 arm64 with WSL2.
Repro:
git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py --architecture arm64 -f net7.0 --filter '*Perf_FileStream.FlushAsync*'
using (FileStream fileStream = new FileStream("repro.txt", FileMode.Create, FileAccess.Write, FileShare.Read, 4096, FileOptions.None))
{
    for (int i = 0; i < 1024; i++)
    {
        fileStream.WriteByte(default);
        await fileStream.FlushAsync();
    }
}
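To run the loop above outside the benchmark harness, a minimal console sketch could look like the following. The `Program` class, the `WriteAndFlushAsync` helper, and the 30-second `HangTimeout` are assumptions made for illustration and are not part of the benchmark. The main thread blocks on `Task.Wait` rather than awaiting, so the timeout does not itself depend on the thread pool. The hang may be intermittent, so a single pass is not guaranteed to reproduce it.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

internal static class Program
{
    // Assumed value: anything far above the normal run time of the loop will do.
    private static readonly TimeSpan HangTimeout = TimeSpan.FromSeconds(30);

    private static int Main()
    {
        // Start the repro; the method returns its Task at the first incomplete await.
        Task repro = WriteAndFlushAsync();

        // Block the main thread instead of awaiting, so that the timeout
        // still fires when thread pool work items are not being processed.
        if (!repro.Wait(HangTimeout))
        {
            Console.Error.WriteLine(
                $"No completion after {HangTimeout}; process ID is {Environment.ProcessId}.");
            Console.Error.WriteLine("Press Enter to exit once a dump has been collected.");
            Console.ReadLine();
            return 1;
        }

        Console.WriteLine("Completed without hanging.");
        return 0;
    }

    // Same loop as the benchmark body shown above.
    private static async Task WriteAndFlushAsync()
    {
        using (FileStream fileStream = new FileStream(
            "repro.txt", FileMode.Create, FileAccess.Write, FileShare.Read, 4096, FileOptions.None))
        {
            for (int i = 0; i < 1024; i++)
            {
                fileStream.WriteByte(default);
                await fileStream.FlushAsync();
            }
        }
    }
}
```

While the process is parked at the prompt, `dotnet-dump collect -p <pid>` can capture a dump for inspection with SOS.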
In theory it could be an IO issue, but we have not touched FileStream for a few months and I suspect that it’s a runtime bug similar to #64980.
@janvorli @jkotas what would be the best way to determine the cause of the hang on Linux?
About this issue
- State: closed
- Created 2 years ago
- Comments: 21 (21 by maintainers)
Commits related to this issue
- Disable PerfFileStream while we wait for dotnet/runtime#67545 to be fixed. — committed to LoopedBard3/performance by LoopedBard3 2 years ago
- Disable PerfFileStream while we wait for dotnet/runtime#67545 to be fixed. Should mitigate issue #2366 — committed to LoopedBard3/performance by LoopedBard3 2 years ago
- Disable PerfFileStream while we wait for dotnet/runtime#67545 to be fixed. (#2371) Should mitigate issue #2366 — committed to dotnet/performance by LoopedBard3 2 years ago
- Fix a race condition in the thread pool There is a case where on a work-stealing queue, both `LocalPop()` and `TrySteal()` may fail when running concurrently, and lead to a case where there is a work... — committed to kouvel/runtime by kouvel 2 years ago
- Revert "Disable PerfFileStream while we wait for dotnet/runtime#67545 to be fixed. (#2371)" - Depends on https://github.com/dotnet/runtime/pull/68171 - This reverts commit d8f1c477ec00c47a5989f4e7b42... — committed to kouvel/performance by kouvel 2 years ago
- Fix a race condition in the thread pool (#68171) * Fix a race condition in the thread pool There is a case where on a work-stealing queue, both `LocalPop()` and `TrySteal()` may fail when running ... — committed to dotnet/runtime by kouvel 2 years ago
- Revert "Disable PerfFileStream while we wait for dotnet/runtime#67545 to be fixed. (#2371)" (#2381) - Depends on https://github.com/dotnet/runtime/pull/68171 - This reverts commit d8f1c477ec00c47a59... — committed to dotnet/performance by kouvel 2 years ago
- Fix a race condition in the thread pool (#68171) * Fix a race condition in the thread pool There is a case where on a work-stealing queue, both `LocalPop()` and `TrySteal()` may fail when running ... — committed to directhex/runtime by kouvel 2 years ago
- Fix a race condition in the thread pool (#68171) * Fix a race condition in the thread pool There is a case where on a work-stealing queue, both `LocalPop()` and `TrySteal()` may fail when running ... — committed to kouvel/runtime by kouvel 2 years ago
- Fix a race condition in the thread pool (#68171) * Fix a race condition in the thread pool There is a case where on a work-stealing queue, both `LocalPop()` and `TrySteal()` may fail when running ... — committed to rzikm/dotnet-runtime by kouvel 2 years ago
I have investigated the issue, and the problem is that the thread pool is not processing a work item. As you can see from the SOS command below, there are no worker threads and the work request queue has 0 items, yet there is a queued work item. At this point the process is waiting for that work item to complete, and nothing happens.
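Separately from the SOS output, the same symptom can be watched from inside the process. The sketch below is an assumption-laden illustration, not something taken from the investigation: the `ThreadPoolMonitor` class and its `Start` method are hypothetical names, and the counters it prints (`ThreadPool.ThreadCount`, `ThreadPool.PendingWorkItemCount`, `ThreadPool.CompletedWorkItemCount`) are approximate snapshots. It runs on a dedicated thread so that it keeps reporting even if the pool stops making progress.

```csharp
using System;
using System.Threading;

internal static class ThreadPoolMonitor
{
    // Hypothetical helper: starts a dedicated background thread that
    // periodically prints thread pool counters to standard error.
    public static Thread Start(TimeSpan interval)
    {
        var thread = new Thread(() =>
        {
            while (true)
            {
                Console.Error.WriteLine(
                    $"threads={ThreadPool.ThreadCount} " +
                    $"pending={ThreadPool.PendingWorkItemCount} " +
                    $"completed={ThreadPool.CompletedWorkItemCount}");
                Thread.Sleep(interval);
            }
        })
        {
            // Background thread: it must not keep the process alive on exit.
            IsBackground = true,
            Name = "threadpool-monitor",
        };

        thread.Start();
        return thread;
    }
}
```

Calling `ThreadPoolMonitor.Start(TimeSpan.FromSeconds(1))` before the repro loop would print one line per second. A hang like the one described would presumably show the completed count no longer increasing while a work item remains outstanding, though the exact numbers reported in that state are not confirmed here.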
@adamsitnik sure, I’ll take a look.
FYI - I hit this hang in:
I uploaded the dump file to the location where we’re posting our manual run results. And I observed a peculiar behavior: