runtime: [Question] GC pause time is too large for 60 fps
We are investigating GC pause time in order to potentially achieve 60 fps, but we observe long GC EE suspend pauses.
For example, on next benchmark max GC pause on arm64(rpi4) can get as high as 55 ms with default GC setup (i.e. default latency mode, default max gen0/gen1 size). This is a huge pause, because 1 frame in 60 fps mode should fit in ~16 ms.
Some details:
- GC pause time is computed as diff between
GCRestartEEEnd
andGCSuspendEEBegin
events SustainedLowLatency
mode doesn’t improve max GC pause timeLowLatency
mode reduces max GC pause time, but max RSS increases signigicantly (on benchmark above ~10x increase in max RSS)- different values of
COMPlus_GCGen0MaxBudget
andCOMPlus_GCGen1MaxBudget
do not improve max GC pause time
Additionally, we’ve found the next measurements on coreclr/mono and results are pretty much the same: https://gist.github.com/jechter/2730225240163a806fcc15c44c5ac2d6.
Are there any plans to support GC mode with limited pause time?
using System;
using System.Threading;
using System.Collections.Generic;
using System.Diagnostics.Tracing;
namespace gctest2
{
class Test
{
static void Main()
{
GCEventListener.Start();
Test t = new Test();
t.mGCTest();
}
struct Log
{
public String log;
}
private static char[] base_string = "GC TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST COUNT".ToCharArray();
public void mGCTest()
{
int count = 0;
int max = 102400;
List<Log> list = new List<Log>();
for (int index1 = 0; index1 < 2000; ++index1)
{
for (int index2 = 0; index2 < 1000; ++index2)
{
String sd = $"GC TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST COUNT {count} / {max}";
Log log = new Log();
log.log = sd;
list.Add(log);
if (list.Count == max)
{
while (list.Count > 0)
{
list.RemoveAt(0);
}
}
count++;
}
}
}
}
public sealed class GCEventListener : EventListener
{
private static GCEventListener instance = null;
private long timeGCStart = 0;
private bool verbose = false;
private GCEventListener()
{
Console.WriteLine("GCEventListener Created");
}
public static void Start(bool verbose = false)
{
if(instance == null)
instance = new GCEventListener();
}
// Called whenever an EventSource is created.
protected override void OnEventSourceCreated(EventSource eventSource)
{
// Watch for the .NET runtime EventSource and enable all of its events.
if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
{
EnableEvents(eventSource, EventLevel.Informational, (EventKeywords)0x1);
}
}
private void PrintEventData(EventWrittenEventArgs eventData)
{
Console.WriteLine($"ThreadID = {eventData.OSThreadId} ID = {eventData.EventId} Name = {eventData.EventName} Tick = {eventData.TimeStamp.Ticks/10.0/1000.0}");
for (var i = 0; i < eventData.Payload.Count; i++)
{
string payloadString = eventData.Payload[i] != null ? eventData.Payload[i].ToString() : string.Empty;
Console.WriteLine($"\tName = \"{eventData.PayloadNames[i]}\" Value = \"{payloadString}\"");
}
Console.WriteLine("\n");
}
// Called whenever an event is written.
protected override void OnEventWritten(EventWrittenEventArgs eventData)
{
PrintEventData(eventData);
if (eventData.EventName.Contains("Concurrent"))
{
PrintEventData(eventData);
}
else if (eventData.EventName.Contains("GCStart"))
{
timeGCStart = eventData.TimeStamp.Ticks;
//PrintEventData(eventData);
}
else if (eventData.EventName.Contains("GCEnd"))
{
long timeGCEnd = eventData.TimeStamp.Ticks;
long gcIndex = long.Parse(eventData.Payload[0].ToString());
Console.WriteLine("GC#{0} took {1:f3}ms for generation {2}", gcIndex, (double) (timeGCEnd - timeGCStart)/10.0/1000.0, eventData.Payload[1]);
//PrintEventData(eventData);
}
}
}
}
About this issue
- Original URL
- State: open
- Created 2 years ago
- Reactions: 3
- Comments: 19 (15 by maintainers)
before we provide soft real time guarantee (which is a lot of work as you can imagine), I was actually thinking of provide a mode for our GC with no generations just for workloads with no generational behavior at all. so every GC is a full GC. I hacked up a version to do this tonight 😄 with @jkotas’s code I get this
the pause time in single gen is actually very tiny, but there’s one BGC that has that max pause which made the avg pause 8.1ms instead of < 1ms -
what I have right now is very hacky. but if anyone is interested, I can see if I could get some time in the not too distant future make it reasonable quality to become an experimental feature (ie, can be turned on with a config).
That’s true only for programs that behave according to the simplifying assumptions that the generational GC designs are based on. The example above violates these simplifying assumptions. It ends up with a large Gen2 array that has to be scanned by Gen0 GC every time since it may contain Gen0 back references.
Here is a simpler example that amplifies the problem. It has Gen0 GC pause times around 0.3 seconds on my desktop machine. Changing the Gen0 budged does not have affect on the GC pause times.
BTW: It is not unusual to see these kind of anti-patterns in “high-performance” code that tries to cache and reuse data structures to “save” the GC work.
Adding another voice, I am currently developing a latency sensitive application and I’m also interested in the above discussion of a generationless GC as a potential option.