runtime: Strange behavior between Release and Debug builds [Random access violations and null exceptions]
Description
First, the disclaimer: our code is no saint and it could totally be our fault. I just don't know where to look anymore.
This piece of code works properly when the generated code is a debug build, but not when it is a release build. However, there is another interesting behavior. The error started appearing after we implemented a new high-performance SIMD sorting routine based on the same algorithm used by the Garbage Collector, so the biggest difference is that we are now calling that routine. When either the calling assembly (Corax) or the assembly that hosts that code (Sparrow.Server) is compiled in debug mode, the error does not show up.
As you can see from the image, I test for null before the call to .Fill(), and when I try to do the check again it triggers a null exception. Furthermore, match is a struct, so there is no way for it to become null unless somehow the return pointer is wrong or something overwrites the stack.
That counter measures how reliably the issue is triggered. The failure is totally unreliable: sometimes it takes 1150 rounds, other times 1270, and so on.
Reproduction Steps
- Clone the repo: https://github.com/redknightlois/ravendb/tree/repro-release-mode-memoryissue
- Execute in release mode: Voron.Benchmark
- After 1200+ rounds the error happens at: https://github.com/redknightlois/ravendb/blob/repro-release-mode-memoryissue/src/Corax/Queries/SortingMatch/SortingMatch.cs#L244
Expected behavior
No null exception.
Actual behavior
Null exception
Regression?
No response
Known Workarounds
No
Configuration
C# 11, .NET 7.0, SDK 7.0.100, Windows 10, AMD x64
Other information
No response
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 28 (18 by maintainers)
The code here computes
var tmpStartRight = tmpStartLeft + PARTITION_TMP_SIZE_IN_ELEMENTS;
where tmpStartLeft is a long*. So tmpStartRight ends up pointing outside _temp. If I change _temp to be an array of longs then the problem disappears.
The problem seems to disappear if I change the initialization to a more well-defined (from the C# side):
This is not a duplicate of #78206. If you set DOTNET_TieredCompilation=0, it will crash immediately in the first iteration, and the GC has not run at all at that point.
@dotnet/jit-contrib This looks like a codegen optimization bug. Could you please take a look?
It looks like it's VXSort related. By the way, we recently had to patch our version due to potential buffer overruns: https://github.com/dotnet/runtime/pull/75364/files
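The out-of-bounds computation mentioned in the comments above (tmpStartLeft + PARTITION_TMP_SIZE_IN_ELEMENTS with tmpStartLeft being a long*) comes down to pointer arithmetic scaling by the pointee size: adding N to a long* advances N * 8 bytes. The following is only a rough sketch of that arithmetic in safe C#. The constant's value, the capacity of _temp, the idea that the buffer holds two halves, and the use of int as the narrower element type are all assumptions made for illustration and are not taken from the repro code.

```csharp
using System;

public static class PointerOffsetSketch
{
    // Hypothetical value; the real constant in the repro may differ.
    public const int PARTITION_TMP_SIZE_IN_ELEMENTS = 512;

    // Byte offset reached by (T*)basePtr + elements when sizeof(T) == elementSize.
    public static long ByteOffset(long elements, int elementSize)
        => elements * elementSize;

    // Assumed layout: the buffer is split into a left and a right half, each
    // addressed through a long*. Returns true when the end of the right half
    // still lies inside a buffer of bufferElements items of bufferElementSize bytes.
    public static bool RightHalfFits(int bufferElements, int bufferElementSize)
    {
        long rightStart = ByteOffset(PARTITION_TMP_SIZE_IN_ELEMENTS, sizeof(long));
        long rightEnd = rightStart + ByteOffset(PARTITION_TMP_SIZE_IN_ELEMENTS, sizeof(long));
        long bufferBytes = (long)bufferElements * bufferElementSize;
        return rightEnd <= bufferBytes;
    }

    public static void Main()
    {
        int capacity = 2 * PARTITION_TMP_SIZE_IN_ELEMENTS; // assumed capacity of _temp

        // Buffer of a narrower element type (int assumed here): the right half overruns it.
        Console.WriteLine(RightHalfFits(capacity, sizeof(int)));  // prints "False"

        // Buffer of longs with the same element count: the right half fits exactly.
        Console.WriteLine(RightHalfFits(capacity, sizeof(long))); // prints "True"
    }
}
```

Under these assumptions, the same element count is safe for a buffer of longs but overruns a buffer whose elements are narrower, which would be consistent with the observation that the problem disappears when _temp is an array of longs.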