runtime: OutOfMemoryException in dotnet application running inside a Linux container

Description

I’ve found that certain memory usage patterns can lead to an OutOfMemoryException (OOM) when a dotnet application runs inside a container. The problem occurs in both workstation GC (COMPlus_gcServer=0) and server GC (COMPlus_gcServer=1) modes. I believe this is a bug because the same application, run outside the container with the same memory limit and the same arguments, works fine. I also think there is an expectation that an application shouldn’t die from an OOM exception when the GC could free and compact memory to satisfy new allocations.

How to reproduce

  • Check out and build the GcTesting app: docker build -t gctesting .
  • Execute the GcTesting app: docker run -i -t --env COMPlus_gcServer=0 --memory=4gb gctesting --memorypressurerate=100mb --allocationunitsize=80Kb --minimummemoryusage=1700mb
  • Every second the app prints memory and GC stats. After about 20 seconds it crashes with an OOM exception:
[09:01:17 INF] OSDescription:Linux 4.19.121-linuxkit #1 SMP Thu Jan 21 15:36:34 UTC 2021, OSArchitecture:X64, RuntimeIdentifier:ubuntu.20.04-x64, ProcessArchitecture:X64, FrameworkDescription:.NET 5.0.4, 
[09:01:17 INF] Starting GcStatsTask, UtcNow:03/30/2021 09:01:17, IsServerGC:False, LatencyMode:Interactive, LOHCompactionMode:Default, 
[09:01:17 INF] Starting FullGCLoggerTask
[09:01:17 INF] Starting MemoryPressureTask(allocationUnitSize=80.00 KB, memoryPressureRate=100.00 MB, minimumMemoryUsage=1.66 GB, leakMemory=False
[09:01:18 INF] /sys/fs/cgroup/memory/memory.oom_control:
[09:01:18 INF] oom_kill_disable 0
under_oom 0
oom_kill 0

/sys/fs/cgroup/memory/memory.limit_in_bytes:4.00 GB
Elapsed:  0s, GC-Rate:N/A, Gen012:0,0,0, Total:421.71 KB, GcAllocated:420.54 KB, HeapSize:0 Bytes, MemoryLoad:0 Bytes, Committed:0 Bytes, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:15.70 MB, ManagedBlocks:0, UnmanagedBlocks:0, FullGc:0, 
Elapsed:  1s, GC-Rate:N/A, Gen012:305,155,7, Total:1.76 GB, GcAllocated:1.76 GB, HeapSize:1.76 GB, MemoryLoad:1.77 GB, Committed:1.76 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:1.79 GB, ManagedBlocks:23040, UnmanagedBlocks:0, FullGc:1, 
Elapsed:  2s, GC-Rate:N/A, Gen012:317,161,7, Total:1.83 GB, GcAllocated:1.83 GB, HeapSize:1.83 GB, MemoryLoad:1.85 GB, Committed:1.83 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:1.86 GB, ManagedBlocks:23945, UnmanagedBlocks:0, FullGc:1, 
Elapsed:  3s, GC-Rate:N/A, Gen012:335,170,7, Total:1.93 GB, GcAllocated:1.93 GB, HeapSize:1.93 GB, MemoryLoad:1.96 GB, Committed:1.93 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:1.97 GB, ManagedBlocks:25319, UnmanagedBlocks:0, FullGc:1, 
Elapsed:  4s, GC-Rate:N/A, Gen012:349,177,7, Total:2.01 GB, GcAllocated:2.02 GB, HeapSize:2.01 GB, MemoryLoad:2.04 GB, Committed:2.01 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.05 GB, ManagedBlocks:26396, UnmanagedBlocks:0, FullGc:1, 
Elapsed:  5s, GC-Rate:N/A, Gen012:367,186,7, Total:2.12 GB, GcAllocated:2.12 GB, HeapSize:2.12 GB, MemoryLoad:2.12 GB, Committed:2.12 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.16 GB, ManagedBlocks:27799, UnmanagedBlocks:0, FullGc:1, 
Elapsed:  6s, GC-Rate:N/A, Gen012:383,194,7, Total:2.22 GB, GcAllocated:2.22 GB, HeapSize:2.21 GB, MemoryLoad:2.23 GB, Committed:2.21 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.25 GB, ManagedBlocks:29058, UnmanagedBlocks:0, FullGc:1, 
Elapsed:  7s, GC-Rate:N/A, Gen012:398,202,7, Total:2.30 GB, GcAllocated:2.30 GB, HeapSize:2.30 GB, MemoryLoad:2.31 GB, Committed:2.30 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.34 GB, ManagedBlocks:30163, UnmanagedBlocks:0, FullGc:1, 
Elapsed:  8s, GC-Rate:N/A, Gen012:416,211,7, Total:2.41 GB, GcAllocated:2.41 GB, HeapSize:2.41 GB, MemoryLoad:2.42 GB, Committed:2.41 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.44 GB, ManagedBlocks:31558, UnmanagedBlocks:0, FullGc:1, 
Elapsed:  9s, GC-Rate:N/A, Gen012:433,220,8, Total:1.69 GB, GcAllocated:2.50 GB, HeapSize:2.48 GB, MemoryLoad:2.50 GB, Committed:2.49 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.53 GB, ManagedBlocks:32777, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 10s, GC-Rate:N/A, Gen012:451,229,8, Total:1.79 GB, GcAllocated:2.61 GB, HeapSize:2.48 GB, MemoryLoad:2.50 GB, Committed:2.49 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.53 GB, ManagedBlocks:34163, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 11s, GC-Rate:N/A, Gen012:472,239,8, Total:1.92 GB, GcAllocated:2.74 GB, HeapSize:2.48 GB, MemoryLoad:2.50 GB, Committed:2.49 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.53 GB, ManagedBlocks:35840, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 12s, GC-Rate:N/A, Gen012:489,248,8, Total:2.02 GB, GcAllocated:2.83 GB, HeapSize:2.48 GB, MemoryLoad:2.50 GB, Committed:2.49 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.53 GB, ManagedBlocks:37120, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 13s, GC-Rate:N/A, Gen012:506,256,8, Total:2.12 GB, GcAllocated:2.93 GB, HeapSize:2.48 GB, MemoryLoad:2.50 GB, Committed:2.49 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.53 GB, ManagedBlocks:38400, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 14s, GC-Rate:N/A, Gen012:522,264,8, Total:2.21 GB, GcAllocated:3.03 GB, HeapSize:2.48 GB, MemoryLoad:2.50 GB, Committed:2.49 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.53 GB, ManagedBlocks:39680, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 15s, GC-Rate:N/A, Gen012:539,273,8, Total:2.31 GB, GcAllocated:3.13 GB, HeapSize:2.48 GB, MemoryLoad:2.50 GB, Committed:2.49 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.53 GB, ManagedBlocks:40956, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 16s, GC-Rate:N/A, Gen012:555,281,8, Total:2.41 GB, GcAllocated:3.22 GB, HeapSize:2.48 GB, MemoryLoad:2.50 GB, Committed:2.49 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.53 GB, ManagedBlocks:42240, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 17s, GC-Rate:N/A, Gen012:572,289,8, Total:2.51 GB, GcAllocated:3.32 GB, HeapSize:2.51 GB, MemoryLoad:2.54 GB, Committed:2.51 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.55 GB, ManagedBlocks:43520, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 18s, GC-Rate:N/A, Gen012:589,298,8, Total:2.60 GB, GcAllocated:3.42 GB, HeapSize:2.61 GB, MemoryLoad:2.62 GB, Committed:2.61 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.65 GB, ManagedBlocks:44800, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 19s, GC-Rate:N/A, Gen012:601,304,8, Total:2.68 GB, GcAllocated:3.49 GB, HeapSize:2.68 GB, MemoryLoad:2.69 GB, Committed:2.68 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.73 GB, ManagedBlocks:45778, UnmanagedBlocks:0, FullGc:1, 
Elapsed: 20s, GC-Rate:N/A, Gen012:608,310,9, Total:1.66 GB, GcAllocated:3.52 GB, HeapSize:1.66 GB, MemoryLoad:2.73 GB, Committed:2.70 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.75 GB, ManagedBlocks:46095, UnmanagedBlocks:0, FullGc:2, 
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at GcTesting.Program.<>c__DisplayClass10_0.<MemoryPressureTask>b__0() in /app/GcTesting/Program.cs:line 235
   at GcTesting.Program.MemoryPressureTask(Int64 allocationUnitSize, Int64 memoryPressureRate, Int64 minimumMemoryUsage, Boolean leakMemory) in /app/GcTesting/Program.cs:line 275
   at GcTesting.Program.<>c.<<Main>b__8_0>d.MoveNext() in /app/GcTesting/Program.cs:line 121
--- End of stack trace from previous location ---
   at CommandLine.ParserResultExtensions.WithParsedAsync[T](ParserResult`1 result, Func`2 action)
   at GcTesting.Program.Main(String[] args) in /app/GcTesting/Program.cs:line 88

[09:01:39 INF] Gen012:608,310,9, Total:1.66 GB, Allocated:3.52 GB, HeapSize:1.66 GB, MemoryLoad:2.73 GB, Committed:2.70 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB, CGroupUsageInBytes:2.75 GB, ManagedBlocks:46095, UnmanagedBlocks:0, FullGcCompleted:2, 
Out of memory.
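
To make the last stats line easier to read, here is a small sketch that pulls three fields out of it and does the arithmetic. The line is copied from the log above and trimmed to the relevant fields; the helper name `field` is made up for this example. My reading that Available reflects the heap hard limit, and that the container defaults are 75% for the hard limit and 90% for the high memory load threshold, is an assumption based on the documented GC defaults, not something the log states.

```shell
# Last stats line before the exception, copied from the log above (trimmed to the relevant fields).
line='HeapSize:1.66 GB, MemoryLoad:2.73 GB, Committed:2.70 GB, Available:2.88 GB, HighMemoryLoadThreshold:3.46 GB'

# field NAME: print the numeric GB value that follows "NAME:" in $line.
# (Hypothetical helper, written only for this example.)
field() {
  printf '%s\n' "$line" | sed -n "s/.*$1:\([0-9.]*\) GB.*/\1/p"
}

committed=$(field Committed)
available=$(field Available)
threshold=$(field HighMemoryLoadThreshold)

# Headroom between what the GC has committed and the reported available total.
headroom=$(awk -v a="$available" -v c="$committed" 'BEGIN { printf "%.2f", a - c }')

# Ratio of the two limits. Assumption: 75% / 90% defaults would give about 0.83.
ratio=$(awk -v a="$available" -v t="$threshold" 'BEGIN { printf "%.2f", a / t }')

echo "Committed-to-Available headroom: ${headroom} GB"
echo "Available / HighMemoryLoadThreshold: ${ratio}"
```

If that reading is right, the process was within roughly 0.18 GB of its limit when the exception was thrown, even though HeapSize had just dropped to 1.66 GB.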

Configuration

  • Which version of .NET is the code running on? - .NET 5.0.4
  • What OS and version, and what distro if applicable? - Docker on Windows, mcr.microsoft.com/dotnet/sdk:5.0.201-focal-amd64 base image
  • What is the architecture (x64, x86, ARM, ARM64)? - x64

Regression?

I don’t think that this bug is a regression.

Other information

  • When the dotnet app runs inside a container, the default value of GCMemoryInfo.TotalAvailableMemoryBytes is lower than the value of GCMemoryInfo.HighMemoryLoadThresholdBytes. I think this is wrong and should be the other way around.
  • The log line just before the OutOfMemoryException shows that a full GC happened and memory was reclaimed (HeapSize dropped to 1.66 GB), but the exception was still thrown.
  • Setting custom values for COMPlus_GCHeapHardLimit and COMPlus_GCHighMemPercent is a possible mitigation.
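
As a rough illustration of the mitigation, the sketch below builds the two settings and prints the docker command from the reproduction steps with them added. The sizes (a 3 GiB heap hard limit and a 70% high memory threshold) are hypothetical and were not tested in this report; they would need tuning per workload. As far as I know, COMPlus_ GC values passed as environment variables are parsed as hexadecimal, so the script converts them first.

```shell
# Sketch only: both values below are hypothetical, not settings this report verified.
limit_bytes=$((3 * 1024 * 1024 * 1024))   # hypothetical 3 GiB heap hard limit
high_mem_percent=70                        # hypothetical high memory load percentage

# Convert to hexadecimal digits, since the COMPlus_ GC settings are read as hex.
heap_hard_limit_hex=$(printf '%X' "$limit_bytes")
high_mem_percent_hex=$(printf '%X' "$high_mem_percent")

# Print (not run) the reproduction command with the two extra settings.
echo docker run -i -t --memory=4gb \
  --env COMPlus_gcServer=0 \
  --env "COMPlus_GCHeapHardLimit=$heap_hard_limit_hex" \
  --env "COMPlus_GCHighMemPercent=$high_mem_percent_hex" \
  gctesting --memorypressurerate=100mb --allocationunitsize=80Kb --minimummemoryusage=1700mb
```

The script only echoes the command so the hex values can be checked before anything is started; drop the leading echo to actually run the container.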

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 16 (15 by maintainers)

Most upvoted comments

@tmds I am watching this thread, I am planning to investigate it right after finishing stuff I am currently working on.