diagnostics: dotnet-dump is incredibly slow in dumpheap scenario
I’m trying to detect a memory leak in the application. Here is what the tool says:
> dumpheap -stat -min 5000
Statistics:
MT Count TotalSize Class Name
00007fa6a019a818 1 22080 System.Collections.Generic.Dictionary`2+Entry[[System.Type, System.Private.CoreLib],[System.Collections.Generic.List`1[[System.Type, System.Private.CoreLib]], System.Private.CoreLib]][]
00007fa6985150a0 1 23020 System.UInt16[]
00007fa6a01a39e0 1 32792 MS.Internal.Xml.Cache.XPathNode[]
00007fa69db666e0 1 32792 System.Collections.Concurrent.ConcurrentQueue`1+Segment+Slot[[System.Buffers.MemoryPoolBlock, Microsoft.AspNetCore.Server.Kestrel.Transport.Abstractions]][]
00007fa69ccd1778 1 35840 System.Collections.Concurrent.ConcurrentDictionary`2+Node[[System.Int64, System.Private.CoreLib],[Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Infrastructure.HttpConnectionReference, Microsoft.AspNetCore.Server.Kestrel.Core]][]
00007fa69cbb5ad0 1 35840 UNKNOWN
00007fa6986f9328 1 194312 System.Collections.Generic.HashSet`1+Slot[[System.String, System.Private.CoreLib]][]
00007fa69cbb4ff0 1 196632 UNKNOWN
00007fa696067b40 4 261344 System.Object[]
00007fa69611d910 8 353816 System.Int32[]
00007fa69cbd6be8 1 420480 UNKNOWN
00007fa69611f500 1 420480 System.Collections.Generic.Dictionary`2+Entry[[System.String, System.Private.CoreLib],[System.Object, System.Private.CoreLib]][]
00007fa69810a440 1 524312 System.Collections.Concurrent.ConcurrentQueue`1+Segment+Slot[[System.Threading.IThreadPoolWorkItem, System.Private.CoreLib]][]
00007fa696104d68 1 524312 System.String[]
00007fa6a1b19580 21 688296 UNKNOWN
00007fa6a2244840 36 999264 UNKNOWN
00007fa69fff73d8 14 1182224 UNKNOWN
00007fa69fc2f4e0 25 1940280 UNKNOWN
00007fa69dda3ab0 54 2950416 System.Int16[]
00007fa69632cb10 7 4161704 UNKNOWN
00007fa6961050a0 335 29103136 System.Char[]
00007fa6961014b8 1064 91757068 System.Byte[]
00007fa696100fa0 15422 1358062600 System.String
0000000000c59fa0 8267 1559225242 Free
Total 25269 objects
Not great, but at least I can see that there are a lot of strings. Okay, now I’m running
dumpheap -type System.String -min 100
I waited 11 (!) hours for a response and it didn’t print a single line. I had to “cancel command at the user’s request”. So I actually have at least two problems:
- The dump created with createdump has lots of UNKNOWN types. I’ve managed to investigate some of them with dumpobj and understood a few, but the others are just “value type array” and I don’t know what to do with them: there is nothing like WinDbg’s da command. Maybe that’s expected because of symbols, since I didn’t use the --full flag, so I’m not including it in the subject and merely pointing out that the problem exists. Creating a full dump solves the problem, but the dump is very large (21gb vs 1.7gb).
- I just cannot find what eats the memory. Requests never finish. I’m patient, so when I didn’t get a result immediately I just left it running for the whole night. It still hasn’t printed a single line, on modern hardware with 32gb of RAM.
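As a rough aid for reading the dumpheap -stat output above, the rows can be parsed and ranked outside the debugger. The sketch below is an assumption-laden helper, not part of dotnet-dump or SOS: the names HeapStat, parse_stat_line and summarize are hypothetical, and it assumes each row is laid out as "MT Count TotalSize ClassName", as in the output pasted above.

```python
from typing import Iterable, List, NamedTuple, Optional


class HeapStat(NamedTuple):
    """One row of `dumpheap -stat` output (hypothetical helper type)."""
    method_table: str
    count: int
    total_size: int
    class_name: str


def parse_stat_line(line: str) -> Optional[HeapStat]:
    """Parse a single statistics row; return None for headers and footers."""
    # The class name may contain spaces, so split into at most 4 fields.
    parts = line.split(None, 3)
    if len(parts) != 4:
        return None
    mt, count, size, name = parts
    try:
        int(mt, 16)  # the first column must be a hex method table address
        return HeapStat(mt, int(count), int(size), name)
    except ValueError:
        return None


def summarize(lines: Iterable[str]) -> List[HeapStat]:
    """Return the parsed rows sorted by total size, largest first."""
    rows = [row for row in map(parse_stat_line, lines) if row is not None]
    rows.sort(key=lambda row: row.total_size, reverse=True)
    return rows


# A few rows copied from the output above.
SAMPLE = """\
MT Count TotalSize Class Name
00007fa6961050a0 335 29103136 System.Char[]
00007fa6961014b8 1064 91757068 System.Byte[]
00007fa696100fa0 15422 1358062600 System.String
0000000000c59fa0 8267 1559225242 Free
Total 25269 objects
"""

if __name__ == "__main__":
    for row in summarize(SAMPLE.splitlines()):
        average = row.total_size // row.count
        print(f"{row.class_name:<16} count={row.count:>6} "
              f"total={row.total_size:>11} avg={average}")
```

On the sample rows this shows that the System.String instances of at least 5000 bytes average roughly 88 KB each, which may help choose a larger -min value to narrow the next dumpheap query.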
Configuration
- Is this related to a specific tool? I’m not sure; the container is running 2.1 and I cannot change it due to incompatibilities between aspnetcore 2.1 and 3.1
- What OS and version, and what distro if applicable? mcr.microsoft.com/dotnet/core/aspnet:2.1-stretch-slim
- Are you running in any particular type of environment? (e.g. Containers, a cloud scenario, app you are trying to target is a different user) Docker
- Is it a self-contained published application? no
P.S. I wonder if there is a chat where this kind of matter can be discussed instead of creating issues, since I got absolutely no reaction in https://gitter.im/dotnet/csharplang and I don’t know of any other places.
About this issue
- State: closed
- Created 4 years ago
- Comments: 23 (13 by maintainers)
@Pzixel, I know it has been a long time since we have posted anything about this issue but the GC related commands like dumpheap have been re-written in the latest published version of SOS (v7.0.447801) and may be faster.
dumpheap -stat gives you the method table too (first column). This tends to be better for name collisions (including the same assembly loaded into different contexts). But I’ll definitely consider the ‘exact type’ command.
As for the cgroups, it does respect limits to some extent. A lot of those settings were tuned in netcoreapp3.1 as part of the effort to improve the overall experience in containers (cc: @Maoni0)
I’ve tried this. It was extremely slow in both LLDB and dotnet-dump. I believe the problem is in the algorithm itself. A quarter of the time was spent getting a metadata provider and building type strings. I will also check whether we are properly marking visited objects.
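To illustrate why marking visited objects matters for this kind of heap walk, here is a minimal sketch of a reference-graph traversal with a visited set. It is not how SOS implements its commands; the function reachable_objects and the TOY_HEAP graph are hypothetical and exist only to show the idea.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def reachable_objects(roots: Iterable[int],
                      references: Dict[int, List[int]]) -> Set[int]:
    """Return every address reachable from the roots, visiting each once."""
    visited: Set[int] = set()
    queue = deque(roots)
    while queue:
        address = queue.popleft()
        if address in visited:
            # Already processed: skipping here is what keeps shared
            # objects and cycles from being walked again and again.
            continue
        visited.add(address)
        queue.extend(references.get(address, ()))
    return visited


# Hypothetical toy heap: address -> addresses it references.
TOY_HEAP = {
    0x10: [0x20, 0x30],
    0x20: [0x30],
    0x30: [0x10],  # cycle back to the root
}

if __name__ == "__main__":
    print(sorted(hex(a) for a in reachable_objects([0x10], TOY_HEAP)))
```

Without the visited check, the cycle from 0x30 back to 0x10 would keep the loop running forever, and even in an acyclic graph shared objects would be reprocessed once per incoming reference.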
Also see: https://github.com/dotnet/diagnostics/blob/master/documentation/debugging-coredump.md
I’ve been on vacation and will get back to you on Monday.