runtime: .NET 6: When I enable server GC mode, my application starts crashing frequently.

When server GC is enabled, my application starts crashing frequently. Below is the WinDbg analysis:

Microsoft ® Windows Debugger Version 10.0.20348.1 AMD64
Copyright © Microsoft Corporation. All rights reserved.

Loading Dump File [D:\workspace\Investigation\20211220_coreclr\program.exe.110804\Bifrost.exe.110804.dmp]
User Mini Dump File with Full Memory: Only application data is available

Symbol search path is: srv*
Executable search path is:
Windows 10 Version 17763 MP (36 procs) Free x64
Product: Server, suite: TerminalServer DataCenter SingleUserTS
Edition build lab: 17763.1.amd64fre.rs5_release.180914-1434
Machine Name:
Debug session time: Wed Dec 15 21:53:19.000 2021 (UTC + 8:00)
System Uptime: 331 days 9:44:52.929
Process Uptime: 0 days 1:25:50.000
… … … …
Loading unloaded module list
.
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(1b0d4.14480): Access violation - code c0000005 (first/second chance not available)
For analysis of this file, run !analyze -v
ntdll!NtWaitForMultipleObjects+0x14:
00007fff`ed12fcd4 c3 ret
0:053> !analyze -v


*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *

*** WARNING: Unable to verify checksum for program.exe
*** WARNING: Unable to verify checksum for libzmq.dll
Failed to request MethodData, not in JIT code range

KEY_VALUES_STRING: 1

Key  : AV.Dereference
Value: NullClassPtr

Key  : AV.Fault
Value: Read

Key  : Analysis.CPU.mSec
Value: 11577

Key  : Analysis.DebugAnalysisProvider.CPP
Value: Create: 8007007e on PT-LUFI

Key  : Analysis.DebugData
Value: CreateObject

Key  : Analysis.DebugModel
Value: CreateObject

Key  : Analysis.Elapsed.mSec
Value: 12898

Key  : Analysis.Init.CPU.mSec
Value: 562

Key  : Analysis.Init.Elapsed.mSec
Value: 21200

Key  : Analysis.Memory.CommitPeak.Mb
Value: 264

Key  : Analysis.System
Value: CreateObject

Key  : CLR.Engine
Value: CORECLR

Key  : CLR.Version
Value: 6.0.21.52210

Key  : Timeline.OS.Boot.DeltaSec
Value: 28633492

Key  : Timeline.Process.Start.DeltaSec
Value: 5150

Key  : WER.OS.Branch
Value: rs5_release

Key  : WER.OS.Timestamp
Value: 2018-09-14T14:34:00Z

Key  : WER.OS.Version
Value: 10.0.17763.1

Key  : WER.Process.Version
Value: 3.4.0.118

ADDITIONAL_XML: 1

OS_BUILD_LAYERS: 1

NTGLOBALFLAG: 0

PROCESS_BAM_CURRENT_THROTTLED: 0

PROCESS_BAM_PREVIOUS_THROTTLED: 0

APPLICATION_VERIFIER_FLAGS: 0

CONTEXT: (.ecxr)
rax=0000026f39fc3000 rbx=0000026f39fc2300 rcx=0000000000000028
rdx=0000000000000000 rsi=0000000001440003 rdi=0000026f39fc2300
rip=00007fffb2d85ccb rsp=000000ea4ce7c4c0 rbp=0000000026f39fc1
r8=0000026f39fc3000 r9=0000000026f39fc2 r10=000000026f39fc3f
r11=0000000000000000 r12=0000026f39fc3e01 r13=0000026f39fc3e00
r14=0000026f39fc1d28 r15=00000275fde07fc0
iopl=0 nv up ei pl nz na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206
coreclr!SVR::my_get_size+0xa [inlined in coreclr!SVR::gc_heap::find_first_object+0xd7]:
00007fffb2d85ccb 833900 cmp dword ptr [rcx],0 ds:0000000000000028=???
Resetting default scope

EXCEPTION_RECORD: (.exr -1)
ExceptionAddress: 00007fffb2d85ccb (coreclr!SVR::my_get_size+0x000000000000000a)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 0000000000000000
Parameter[1]: 0000000000000028
Attempt to read from address 0000000000000028

PROCESS_NAME: Bifrost.exe

READ_ADDRESS: 0000000000000028

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

EXCEPTION_CODE_STR: c0000005

EXCEPTION_PARAMETER1: 0000000000000000

EXCEPTION_PARAMETER2: 0000000000000028

STACK_TEXT:
000000ea4ce7c4c0 00007fffb2d80d42 : 0000026f39fe12b8 000000ea4ce7c610 000000026f39fc3e 00000275fde07fc0 : coreclr!SVR::gc_heap::find_first_object+0xd7
000000ea4ce7c510 00007fffb2d661bc : 0000027500000001 00007fffb2d7e470 0000000000000000 00000275fde00ba0 : coreclr!SVR::gc_heap::mark_through_cards_for_segments+0x1e6
000000ea4ce7c6f0 00007fffb2d6184a : 0000000000000000 0000000000000000 00000275fde00ba0 00007fffb2d69211 : coreclr!SVR::gc_heap::mark_phase+0x61c
000000ea4ce7c7b0 00007fffb2d62fac : 0000000000000000 0000000000000000 00000275fde00ba0 00000000000008e0 : coreclr!SVR::gc_heap::gc1+0xb6
000000ea4ce7c820 00007fffb2d62a96 : 00000000000008e0 0000000000000000 0000000000000000 00000275fde00ba0 : coreclr!SVR::gc_heap::garbage_collect+0xec
000000ea4ce7c880 00007fffb2d62a30 : 00000275fde00ba0 000000ea4ce7fad0 000000ea4ad7ead0 0000000000000000 : coreclr!SVR::gc_heap::gc_thread_function+0x62
000000ea4ce7c8b0 00007fffb2e25fe4 : 0000000000000000 00000275fde00ba0 00007fffb2d62980 0000000000000000 : coreclr!SVR::gc_heap::gc_thread_stub+0xb0
000000ea4ce7fb00 00007fffeacf7974 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : coreclr!<lambda_cb183d4f30cacb23ee8f0a2094f74691>::<lambda_invoker_cdecl>+0x74
000000ea4ce7fb30 00007fffed0ea0b1 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : kernel32!BaseThreadInitThunk+0x14
000000ea4ce7fb60 0000000000000000 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : ntdll!RtlUserThreadStart+0x21

FAULTING_SOURCE_LINE: D:\a_work\1\s\src\coreclr\gc\gc.cpp

FAULTING_SOURCE_FILE: D:\a_work\1\s\src\coreclr\gc\gc.cpp

FAULTING_SOURCE_LINE_NUMBER: 35423

SYMBOL_NAME: coreclr!SVR::gc_heap::find_first_object+d7

MODULE_NAME: coreclr

IMAGE_NAME: coreclr.dll

STACK_COMMAND: ~53s ; .ecxr ; kb

FAILURE_BUCKET_ID: NULL_CLASS_PTR_READ_c0000005_coreclr.dll!SVR::gc_heap::find_first_object

OS_VERSION: 10.0.17763.1

BUILDLAB_STR: rs5_release

OSPLATFORM_TYPE: x64

OSNAME: Windows 10

IMAGE_VERSION: 6.0.21.52210

FAILURE_ID_HASH: {336b3048-6228-039c-f897-490031da60b3}

Followup: MachineOwner

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 25 (12 by maintainers)

Most upvoted comments

The fix has been backported to release/6.0 and will be available in our next servicing release. Closing for now.

It appears that the fix does help with the situation. I am going to work with our management to get this into the next eligible servicing release. A PR (https://github.com/dotnet/runtime/pull/63351) has been created for the backport.

@mangod9

Here is a build of 4822e3c3aa77eb82b2fb33c9321f923cf11ddde6 with the fix for #60966 applied on top.

To check the integrity of the files, use the Get-FileHash cmdlet; the hashes should be identical to the ones listed here.

PS > Get-FileHash *

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          8EEBB8A5633B8BECF788A9DFA4F744E58F2F9DAD64A052525F1D296A6E98072A       C:\backport-pin-fix\coreclr.dll
SHA256          7251FE18C2D455BDF5FB2F67AAB82528ADFA163D07F9D5CE60B3A5AFBB894CE1       C:\backport-pin-fix\coreclr.pdb
SHA256          C08EBDC6AF61B2CC78994B12C4206117008022E738030CEDE488CBEAFB4F139D       C:\backport-pin-fix\mscordaccore.dll
SHA256          837ED5BA486E70FCFCB1F94278EDB0D113FF834C9A71AC3FAE6CDD719205C131       C:\backport-pin-fix\mscordbi.dll
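
For anyone who wants to script the comparison instead of checking the hashes by eye, the following is a minimal Python sketch of the same check. It is not part of the original instructions: the `EXPECTED` table is copied from the Get-FileHash output above, while the helper names `sha256_of` and `verify` and the default directory are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Expected SHA256 values, copied from the Get-FileHash output above.
EXPECTED = {
    "coreclr.dll": "8EEBB8A5633B8BECF788A9DFA4F744E58F2F9DAD64A052525F1D296A6E98072A",
    "coreclr.pdb": "7251FE18C2D455BDF5FB2F67AAB82528ADFA163D07F9D5CE60B3A5AFBB894CE1",
    "mscordaccore.dll": "C08EBDC6AF61B2CC78994B12C4206117008022E738030CEDE488CBEAFB4F139D",
    "mscordbi.dll": "837ED5BA486E70FCFCB1F94278EDB0D113FF834C9A71AC3FAE6CDD719205C131",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the upper-case hex SHA256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def verify(directory, expected=EXPECTED):
    """Map each expected file name to True (hash matches) or False.

    A file that is missing from the directory counts as a mismatch.
    """
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and sha256_of(path) == want.upper()
    return results


if __name__ == "__main__":
    import sys

    # Hypothetical default: the directory shown in the listing above.
    directory = sys.argv[1] if len(sys.argv) > 1 else r"C:\backport-pin-fix"
    for name, ok in verify(directory).items():
        print(f"{name}: {'OK' if ok else 'MISMATCH'}")
```

Run it with the directory containing the downloaded files as the only argument. Any line reporting MISMATCH means that file is missing or differs from the build described here.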

The source branch used to create the build is shared here.