runtime: Exception caught in FormClosing / FormClosed leads to crash in coreclr.dll
—UPD— Small repro and further investigations from this comment. —UPD—
- .NET Core Version: 3.1.7
- Have you experienced this same bug with .NET Framework?: No
Problem description:
Hi guys, need some help here. I just found some very strange behavior in our app. Prerequisites:
- MDI WinForms app.
- Child form controls bind to a BindingSource whose data source is some class, e.g. bindingSource1.DataSource = typeof(app.Params);
- .NET Core only (on .NET Framework all is OK).
- x86 only.
- With debugger attached only.
With all of the above, when we enter invalid data in a textbox (for example, non-numeric text in an int field) and then attempt to close the form, the app crashes. Event log:
Faulting application name: app.exe, version: 2.88.7534.41390, time stamp: 0x5f17576d
Faulting module name: coreclr.dll, version: 4.700.20.36602, time stamp: 0x5f1096e7
Exception code: 0xc0000409
Fault offset: 0x0010a890
Faulting application path: D:\save\work\main\app\bin\Debug\netcoreapp3.1\app.exe
Faulting module path: C:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App\3.1.7\coreclr.dll
From crash dumps: the error is always the same, but it is raised at a random place in our program:
Unhandled exception 0x617CA890 (coreclr.dll) in app.exe.12400.dmp: Stack cookie instrumentation code detected a stack-based buffer overrun.
It happens with all our forms where a BindingSource is present, no matter what code (if any) is in OnFormClosing. It happens without an OnFormClosing override, or even with
protected override void OnFormClosing(FormClosingEventArgs e)
{
e.Cancel = false;
base.OnFormClosing(e);
}
Expected behavior:
No crash.
Minimal repro: I spent about 2 hours trying to create a simple repro, but failed 😦 bindingSource.zip does not reproduce the problem, but it shows the structure.
I don’t know whether this is a bug in our app, in WinForms, in Core, or in VS. Windows 10, 2004 x64. VS 16.7.1
Any advice?
P.S.
While trying to create a repro, I found 2 more bugs in VS 16.7.1 with multi-targeting and the designer 😦 and now one with data sources. I will update my issue tomorrow…
About this issue
- State: closed
- Created 4 years ago
- Comments: 33 (24 by maintainers)
@kirsan31 @weltkante would one of you be able to verify the fix for #2240 locally? I verified that the sample app provided no longer crashes, but it would be nice to make sure that it solves the original issue for you before going through the servicing process.
I created a release on my fork of coreclr with binaries with the fix: https://github.com/davmason/coreclr/releases/tag/0.0.1
To test it out you would have to:
Please let me know if you run into any issues.
Great work getting a repro, I’ve reduced it further. Any caught exception in a FormClosing handler together with a nontrivial workload will trigger this. Additional requirements from your previous post for completeness:
Sounds like a runtime issue to me.
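For readers who want a picture of the shape of that reduced scenario, here is a rough sketch of an exception that is thrown and caught inside the FormClosing path. It is not the repro attached to the issue: the class name ChildForm and the exception type are made up for illustration, the "nontrivial workload" mentioned above is not specified in the thread and is left out, and nothing here is claimed to crash on its own. The other conditions from the thread (x86, .NET Core 3.1, debugger attached, tiered compilation disabled) would still apply.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form name, used only for this illustration.
internal sealed class ChildForm : Form
{
    protected override void OnFormClosing(FormClosingEventArgs e)
    {
        try
        {
            // Any exception that is thrown and then caught while the form is closing.
            throw new InvalidOperationException("thrown and caught inside FormClosing");
        }
        catch (InvalidOperationException)
        {
            // Swallowed on purpose: the point is only that a catch happens here.
        }

        base.OnFormClosing(e);
    }
}

internal static class Program
{
    [STAThread]
    private static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new ChildForm());
    }
}
```

Closing the window runs the try/catch once and then closes normally; whether the stack check fires depends on the remaining conditions discussed in the thread.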
Depends on the nature of the stack corruption. Considering that it’s only noticeable when the debugger/runtime are specifically looking for stack corruptions, this might not be dangerous. Generally stack corruptions can be security issues, so this should definitely be looked at to make sure. (After all, we don’t know if this bug can also be triggered in an ASP.NET Core server application.)
I’ve tested with the original and the reduced repro scenario, and both are fixed, but let’s wait for @kirsan31 in case he wants to test in his actual environment. As far as I’m concerned the fix looks good, great work.
This is caused by #2240. I was able to debug to the point where I saw that the frame chain was corrupted, and then by searching through issues that were fixed between 3.1 and 5.0 found this suspiciously similar issue.
I don’t know exactly why this only repros under the debugger, but I validated that applying the fix for #2240 makes the provided app no longer crash so I am pretty confident this is the same issue.
@janvorli is there anything preventing us from porting the fix back to 3.1?
Even with all modules loaded I can’t trap the exception.
It is interesting to note, the first time I try to close the child form I get this:
The second time I try to close the form I get this:
After which point the app crashes (with some delay as mentioned above):
@danmosemsft it looks like this bug is upstream from Windows Forms.
Fixed in dotnet/coreclr#28090
Thank you both for validating the fix. I opened a PR to port it back to 3.1 in https://github.com/dotnet/coreclr/pull/28090.
@davmason Tested with our original project and provided repro - all works good. Thank you!
I don’t think it is, only 3.1 appears to be affected.
No, the diagnostic instrumentation detects a stack overrun. There is never any AV reported, not even with the debugger attached.
So something writes to the stack in a place where it’s not supposed to write, a typical stack corruption. Not all kinds of stack corruption lead to an AV.
It’s unclear what the consequences are, i.e. whether this can have any negative effect or just happens to always overwrite unused stack memory.
IMHO the first step is to figure out what this “stack instrumentation” is - some CLR runtime feature enabled when the debugger is attached? something VS instruments by itself? Can it be enabled with windbg attached instead of VS?
Once it’s clear which diagnostic feature is detecting the overrun, it may be easier to enable it in the context of a TTD time travel session and look at who does the bad write.
Erm, maybe I’m misunderstanding, but just because a debugger notifies you of a bug doesn’t mean it’s causing the bug, or that the bug isn’t present when nobody is looking.
This shouldn’t be about fixing the diagnostic/debugger experience, this should be about investigating whether this is a security-relevant stack corruption present in a .NET Core 3 LTS release.
Must have:
- private static Logger _logger = LogManager.GetCurrentClassLogger();
- private static Font _BoldDefFont; and in constructor: _BoldDefFont = new Font(DefaultFont, FontStyle.Bold);
- <TieredCompilation>false</TieredCompilation> in csproj.

Repro proj: bindingSource.zip
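As a rough illustration of how the project-level conditions from this thread (x86, .NET Core 3.1, tiered compilation disabled) might be combined in one csproj, here is a minimal sketch. It is not the project file from bindingSource.zip, which is not shown in the thread. The NLog package reference is deliberately left out because the version used is not stated.

```xml
<Project Sdk="Microsoft.NET.Sdk.WindowsDesktop">
  <PropertyGroup>
    <OutputType>WinExe</OutputType>
    <TargetFramework>netcoreapp3.1</TargetFramework>
    <UseWindowsForms>true</UseWindowsForms>
    <!-- Condition from the thread: the crash was only seen in x86 processes. -->
    <PlatformTarget>x86</PlatformTarget>
    <!-- Condition from the thread: tiered compilation disabled. -->
    <TieredCompilation>false</TieredCompilation>
  </PropertyGroup>
</Project>
```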
So, for now candidates:
What do you think: how dangerous is it to run such an app in production?
Got it! It’s related to NLog (a simple logger instance, no logging at all) and this line of code in the constructor of the main form:
and one more condition:
<TieredCompilation>false</TieredCompilation>
😲 I will provide a repro a bit later…
I assume the debugger just adds instrumentation which reports the issue, but it probably also exists without the debugger, just nobody notices it.
Is it possible to reproduce with WinDbg instead of VS? (Available as app in the store, no need to install any sdk.) Then it might be possible to record a TTD trace which was immensely helpful when looking at memory corruptions.