runtime: .NET console logging uses a blocking collection which is not optimal for cloud-native applications

With the current design of console logging, a dedicated thread writes output to the console, and once the message queue is full (1024 messages) the console logger starts blocking the application.

This gist summarizes some trial-and-error experiments done to evaluate the console logger against Console.WriteLine in different scenarios. We could not show that console logging is slower than Console.WriteLine. Instead, what we want to focus on in this issue is making sure we don’t get a sudden drop in RPS as soon as we hit the maximum queue size threshold.

We could add two configuration improvements to the console logger: (1) allow opting in to dropping logs and (2) make the buffer size configurable.

Configuring buffer size

Today the queue size is limited by the bound placed on the blocking collection below: https://source.dot.net/#Microsoft.Extensions.Logging.Console/ConsoleLoggerProcessor.cs,15

The easiest way is to just expose the queue length to make it configurable.

Instead, the main proposal here is to keep track of how many bytes the items in the queue are collectively carrying, and to use that to enforce the desired maximum buffer size for the blocking collection.
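
As a rough sketch of what that byte accounting could look like (the type and member names are illustrative, not the actual ConsoleLoggerProcessor; a plain Queue with Monitor is used because BlockingCollection can only bound on item count, not bytes):

using System.Collections.Generic;
using System.Text;
using System.Threading;

internal sealed class ByteBoundedLogQueueSketch
{
    private readonly Queue<string> _queue = new Queue<string>();
    private readonly int _maxBufferSizeInBytes; // e.g. the proposed MaxBufferSizeInBytes option
    private int _bufferedBytes;                 // bytes collectively carried by the queued items

    public ByteBoundedLogQueueSketch(int maxBufferSizeInBytes) => _maxBufferSizeInBytes = maxBufferSizeInBytes;

    public void Enqueue(string message)
    {
        int size = Encoding.UTF8.GetByteCount(message);
        lock (_queue)
        {
            // Block the producer while adding this message would exceed the byte budget
            // (unless the queue is empty, so a single oversized message cannot deadlock).
            while (_bufferedBytes + size > _maxBufferSizeInBytes && _queue.Count > 0)
                Monitor.Wait(_queue);

            _queue.Enqueue(message);
            _bufferedBytes += size;
            Monitor.PulseAll(_queue);
        }
    }

    // Called by the dedicated console-writing thread.
    public string Dequeue()
    {
        lock (_queue)
        {
            while (_queue.Count == 0)
                Monitor.Wait(_queue);

            string message = _queue.Dequeue();
            _bufferedBytes -= Encoding.UTF8.GetByteCount(message);
            Monitor.PulseAll(_queue);
            return message;
        }
    }
}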

Dropping logs

Dropping logs would not be enabled by default. If enabled, then when the blocking collection is full the console logger would simply throw the incoming message away, as sketched below.
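
A minimal sketch of that opt-in behavior, assuming the ConsoleLoggerBufferFullMode enum proposed below (the remaining names are illustrative only):

using System.Collections.Concurrent;

internal sealed class DroppingProcessorSketch
{
    private readonly BlockingCollection<string> _messageQueue = new BlockingCollection<string>(boundedCapacity: 1024);
    private readonly ConsoleLoggerBufferFullMode _fullMode;

    public DroppingProcessorSketch(ConsoleLoggerBufferFullMode fullMode) => _fullMode = fullMode;

    public void EnqueueMessage(string message)
    {
        if (_fullMode == ConsoleLoggerBufferFullMode.DropNewest)
        {
            // TryAdd returns false immediately when the bounded collection is full,
            // so the incoming message is thrown away instead of blocking the caller.
            _messageQueue.TryAdd(message);
        }
        else
        {
            // Default (Wait): block the producing thread until the consumer frees up space.
            _messageQueue.Add(message);
        }
    }
}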

API Proposal

Since ConsoleLoggerProvider instantiates ConsoleLoggerProcessor, it makes sense to add these configurable options to ConsoleLoggerOptions, which is injected into ConsoleLoggerProvider:

https://github.com/dotnet/runtime/blob/42d1644a1635c334cabe6b216f91e98cd7606248/src/libraries/Microsoft.Extensions.Logging.Console/src/ConsoleLoggerProvider.cs#L41-L62

namespace Microsoft.Extensions.Logging.Console
{
    /// <summary>
    /// Options for a <see cref="ConsoleLogger"/>.
    /// </summary>
    public class ConsoleLoggerOptions
    {
        public bool DisableColors { get; set; }
        public ConsoleLoggerFormat Format { get; set; }
        public string? FormatterName { get; set; }
        public bool IncludeScopes { get; set; }
        public LogLevel LogToStandardErrorThreshold { get; set; } = LogLevel.None;
        public string? TimestampFormat { get; set; }
        public bool UseUtcTimestamp { get; set; }

+        /// <summary>
+        /// Gets or sets the desired console logger behavior when the queue becomes full. Defaults to <c>Wait</c>.
+        /// </summary>
+        public ConsoleLoggerBufferFullMode BufferFullMode { get; set; }

+        /// <summary>
+        /// Gets or sets the maximum buffer size before the queue becomes full. Defaults to <c>1MB</c>.
+        /// </summary>
+        public int MaxBufferSizeInBytes { get; set; } = 1048576;
    }

+    /// <summary>
+    /// Determines the console logger behavior when the queue becomes full.
+    /// </summary>
+    public enum ConsoleLoggerBufferFullMode
+    {
+        /// <summary>
+        /// Blocks the logging thread once the queue limit is reached.
+        /// </summary>
+        Wait,
+        /// <summary>
+        /// Drops new log messages when the queue is full.
+        /// </summary>
+        DropNewest,
+    }
}

Alternative Design

Alternatively, MaxBufferSizeInBytes could be nullable, and when it is not specified an unbounded queue would be used:

-        public int MaxBufferSizeInBytes { get; set; } = 1048576;
+        public int? MaxBufferSizeInBytes { get; set; }
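
For illustration, under this alternative the option could simply be left unset (or set explicitly) when configuring the logger; the values below are hypothetical:

using var loggerFactory = LoggerFactory.Create(builder =>
{
    builder.AddConsole(o => o.MaxBufferSizeInBytes = null);               // unbounded queue: never blocks, but memory can grow
    // builder.AddConsole(o => o.MaxBufferSizeInBytes = 4 * 1024 * 1024); // or cap buffered log data at roughly 4 MB
});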

Questions with design

  • Do we want to allow the MaxQueueBufferSizeInBytes to be reconfigurable on reload?

Usage samples

using var loggerFactory = LoggerFactory.Create(builder =>
{
    builder
        .AddConsole(o => o.QueueFullMode = ConsoleLoggerQueueFullMode.DropNewest)
        .AddSimpleConsole(o => o.ColorBehavior = LoggerColorBehavior.Disabled);
});

Original Issue

The console logger is not quick enough at per-request logging for real-world production applications. Refer to https://github.com/aspnet/Logging/issues/515#issuecomment-435220599

With the current design, console logging happens on a separate thread, and once the message queue is full (1024 messages) the console logger can start blocking the application.

https://github.com/dotnet/runtime/blob/78c6505cffe2558b036fbe44cd27038affbb6cce/src/libraries/Microsoft.Extensions.Logging.Console/src/ConsoleLoggerProcessor.cs#L14-L16

To improve this design, we could consider using a Channel rather than a blocking collection (refer to https://devblogs.microsoft.com/dotnet/an-introduction-to-system-threading-channels/). The goal would be to avoid both blocking the app (added latency) and causing an out-of-memory exception.
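
For concreteness, here is a rough sketch of a Channel-based handoff. The channel APIs are real (System.Threading.Channels), but the capacity, full mode, and how this would be wired into the console logger are assumptions for illustration:

using System;
using System.Threading.Channels;
using System.Threading.Tasks;

var channel = Channel.CreateBounded<string>(new BoundedChannelOptions(capacity: 1024)
{
    SingleReader = true,                    // one dedicated console-writing loop
    FullMode = BoundedChannelFullMode.Wait  // or DropWrite to discard the incoming message when full
});

// Producer side (the logger): TryWrite never blocks. Under Wait it returns false when the
// channel is full; under DropWrite it returns true and silently discards the message.
if (!channel.Writer.TryWrite("log line"))
{
    // Full while in Wait mode: the caller decides whether to await WriteAsync or to drop.
}

// Consumer side (the background writer):
_ = Task.Run(async () =>
{
    await foreach (string message in channel.Reader.ReadAllAsync())
    {
        Console.WriteLine(message);
    }
});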

The upside of the current blocking collection approach is that the order of logs is preserved, but since log messages can carry a timestamp, the channel approach may still be an overall improvement toward the goal of a console logger that is performant enough for real-world applications.

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 2
  • Comments: 46 (40 by maintainers)

Most upvoted comments

DropNewest

Should this be renamed to DropWrite (or similar) to not confuse it with the same named option in Channels?

https://github.com/dotnet/runtime/blob/6ca8c9bc0c4a5fc1082c690b6768ab3be8761b11/src/libraries/System.Threading.Channels/src/System/Threading/Channels/BoundedChannelFullMode.cs#L11-L16

I think it would be confusing for Channels’ DropNewest to mean “drop the most recent item in the channel”, and ConsoleLoggerQueueFullMode.DropNewest to mean “drop the item being written”.
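
A small illustration of the difference in Channels terms (a capacity of 2 chosen arbitrarily):

using System.Threading.Channels;

var dropNewest = Channel.CreateBounded<int>(new BoundedChannelOptions(2) { FullMode = BoundedChannelFullMode.DropNewest });
var dropWrite = Channel.CreateBounded<int>(new BoundedChannelOptions(2) { FullMode = BoundedChannelFullMode.DropWrite });

dropNewest.Writer.TryWrite(1); dropNewest.Writer.TryWrite(2); dropNewest.Writer.TryWrite(3);
// Channels' DropNewest evicted 2 (the newest item already buffered); the channel now holds 1 and 3.

dropWrite.Writer.TryWrite(1); dropWrite.Writer.TryWrite(2); dropWrite.Writer.TryWrite(3);
// DropWrite discarded 3 (the item being written); the channel now holds 1 and 2.

The proposed DropNewest behaves like the second case, which is the source of the naming concern.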

@timmydo this went in for the next preview. Hopefully you’ll get to test it out and help with feedback.

After going back and forth on this, I think we should do something here, and a couple of things come to mind:

  • OOM should be preferred over blocking threads. We should make the default queue size something massive that most services will never hit. Assuming most services have high-memory detection of some sort, this will show up as high memory usage and can be investigated.
  • If the queue size being a number (1M logs) doesn’t feel like the right thing to configure, then maybe we can buffer the log data instead and treat it like a memory limit (the number of bytes the console logger will buffer in memory).
  • We can make a “drop logs over the limit” behavior configurable, but dropping cannot be the default.

TL;DR we turn this into a memory problem instead of what it is today.

cc @noahfalk

@maryamariyan do we know if the reported latency from the original issue is happening on Windows or Unix, or both? I peeked around the Unix implementation and I can definitely see a few places where there are, as @davidfowl put it, “low hanging fruits” for optimization.

Would it be beneficial if we try and see if performance of the consumer part can be significantly sped up before introducing any configuration options or even changing the implementation?

I’m not clear on the fundamental problem / goal here.

In any producer/consumer system, if the consumer can’t keep up with the producer, you have a few escape valves:

  • make the consumer faster
  • buffer more and more until hopefully the consumer can catch up
  • add more consumers
  • throttle the producer, aka backpressure. That back pressure can be applied via multiple means, e.g. blocking threads, triggering continuation via a callback, etc.
  • skip some data, aka lossy

If the current system is applying backpressure via BlockingCollection and the goal is just to enable that backpressure to be applied asynchronously, yeah, channels is likely a good answer. If the problem is that BlockingCollection has too much overhead and we still want to block synchronously, channels might help, or we could fix BC, or we could devise a lighter-weight handoff mechanism. If the problem is that the current bound is too low, raise the bound.

But if the problem is that we don’t want to slow down the producer and we don’t want to buffer as much as we currently are, well, the remaining options are make the consumer faster or drop data (or similarly make the data smaller if the overall size is the concern). Adding more consumers in the case of writing to the console isn’t really an option, since there’s only one console, unless we fake a console and redirect some output to another storage mechanism.

So… what’s the actual problem and goal here?

Right, it’s the size of the queue that ends up being the problem for systems that are logging ALOT

I don’t think the memory is the problem here. The blocking collection is throttling the application because the consumer isn’t able to keep up with the producer when too much is being logged at the same time, and thus it blocks the logging itself.

BlockingCollection on my machine can accept 12 million items a second.

The collection can accept as many items as your machine is able to consume at the same time. Here the Console related part is slow and thus causing the collection to fill and thus block.

How does switching to a Channel help? Is it simply because it can be invoked async? Or because you’re imagining it could provide back pressure and the system can drop lower priority events for a while?

No back pressure here, anything that would cause blocking would have the same problem it has right now.

Should this be renamed to DropWrite (or similar) to not confuse it with the same named option in Channels?

+1 for using DropWrite 👍

Do we want to allow the MaxQueueBufferSizeInBytes to be reconfigurable on reload?

This seems reasonable in principle - does it make the implementation substantially harder?

Small suggestion on the naming: we might want to use ‘Buffer’ instead of ‘Queue’ or ‘QueueBuffer’, i.e.:

  • MaxQueueBufferSizeInBytes => MaxBufferSizeInBytes
  • ConsoleLoggerQueueFullMode => ConsoleLoggerBufferFullMode
  • QueueFullMode => BufferFullMode

However, if the review crew decides they like ‘Queue’ better, go for it; it’s just a small preference on my part.

MaxQueueBufferSizeInBytes should default to something based on the maximum memory the GC has available to it like Math.Max(memory / 4, 50MB) or something like that (those are made up numbers).
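
As a rough sketch of that heuristic (keeping the comment’s admittedly made-up numbers; GC.GetGCMemoryInfo() is available on .NET Core 3.0 and later):

using System;

GCMemoryInfo info = GC.GetGCMemoryInfo();
long fiftyMegabytes = 50L * 1024 * 1024;

// Math.Max(memory / 4, 50MB) from the comment above: never go below 50 MB,
// otherwise scale with a quarter of the memory the GC believes is available.
long defaultMaxBufferSizeInBytes = Math.Max(info.TotalAvailableMemoryBytes / 4, fiftyMegabytes);

Console.WriteLine($"Default buffer budget: {defaultMaxBufferSizeInBytes / (1024 * 1024)} MB");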

@davidfowl and I did a little profiling this evening. My earlier benchmarks were bottlenecked on how fast PowerShell could redirect the output to a file, and they were very sensitive to newlines in the output, which the ILogger format unintentionally had more of. Switching to cmd.exe piped to nul shows:

  • Console.WriteLine(text): 684MB/s
  • ILogger.LogInformation(text) with LoggerFactory.AddSimpleConsole(): 381MB/s
  • ILogger.LogInformation(text) with LoggerFactory.AddConsole(): 6.7MB/s

The high overhead for AddConsole comes from the coloring behavior, specifically the calls to kernelbase!GetConsoleScreenBufferInfo() within ConsolePal.Set_ForegroundColor and ConsolePal.Set_BackgroundColor. Setting DisableColors=true gets the same perf as AddSimpleConsole().
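
For example, the coloring overhead can be avoided with the existing DisableColors option (or by switching to the simple console formatter), in the same shape as the usage sample above:

using var loggerFactory = LoggerFactory.Create(builder =>
    builder.AddConsole(o => o.DisableColors = true)); // skips the Set_ForegroundColor/Set_BackgroundColor calls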

If we want to optimize in this area we’ll need to be careful defining the setup (OS, container, shell, redirection mechanism, content being logged, etc.) to make sure we are identifying the same bottlenecks. The problem in my PowerShell repro was a completely different issue than the coloring issue we found later.

I think the cloud native scenario is mostly a linux one. We should measure both windows and linux performance of course but I’m not sure they are comparable at the moment.

~170MB/s for raw Console.Write(), and ~700MB/s for StreamWriter.Write to disk

If you aren’t already, consider batching writes to the console Eg., on a 5 second timer tick, write to console everything from the last 5 seconds as a single call to the console API. In past projects I have seen improvements with this, at least on Windows where the console is pretty slow.
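
A minimal sketch of that batching idea (all names here are illustrative, not an existing API). The tradeoff is up to one interval of extra latency before a message appears, in exchange for far fewer console calls:

using System;
using System.Collections.Concurrent;
using System.Text;
using System.Threading;

internal sealed class BatchingConsoleWriterSketch : IDisposable
{
    private readonly ConcurrentQueue<string> _pending = new ConcurrentQueue<string>();
    private readonly Timer _timer;

    public BatchingConsoleWriterSketch(TimeSpan interval) =>
        _timer = new Timer(_ => Flush(), null, interval, interval);

    public void Enqueue(string message) => _pending.Enqueue(message);

    private void Flush()
    {
        var batch = new StringBuilder();
        while (_pending.TryDequeue(out string? message))
            batch.AppendLine(message);

        if (batch.Length > 0)
            Console.Out.Write(batch.ToString()); // one console call for the whole interval
    }

    public void Dispose()
    {
        _timer.Dispose();
        Flush(); // write whatever is still queued on shutdown
    }
}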

I’d suggest making no major changes to the default behavior (which I think is where you ended up as well?). Instead I’d propose:

a) Allow opt-in dropping behavior and an app-configurable buffer size (measured in bytes rather than entries, ideally)
b) Investigate throughput perf improvements:

  • Tuning default buffer size might help some bursty logging scenarios avoid blocking without substantially changing overall app memory footprint
  • Profile the code path and make changes where we see inefficiency. On my machine I am seeing ~10MB/s for ILogger to console, ~170MB/s for raw Console.Write(), and ~700MB/s for StreamWriter.Write to disk. I am guessing we have a lot of room for improvement here before we get into any of the tougher tradeoffs.

Currently we just disable console everywhere except on the dev machine due to performance issues, and it would be nice if that wasn’t the case.

We disable the console logger as well in our performance runs 😄.

What I’m trying to say is that for something as foundational as logging, it’s nice to have it built-in…

I hear you but we can’t build everything into the framework, that’s no way to build an ecosystem (we’ve already seen the effects of this).

However, this is a reasonable request and should be built into the console logger (and potentially any other logger that does this async work).

Maybe we should also log when the logs are being dropped.


to allow the option to drop logs rather than block. Please correct me if I’m wrong but I assumed using Channel<T> could help with that.

The problem is random log messages will be dropped. There’s no way to opt in at the callsite. The other way I could maybe see log dropping working is if there was a filter or callback to determine if a log should be dropped or not. Other than that, I don’t see how that’s workable, but I’m open to hearing if that’s acceptable.

The logging API is synchronous, so we will need to block when the queue limit is reached, that’s the constraint. The problem is that systems that log ALOT (probably too much) hit this global 1024 queue limit, it starts blocking threads (probably thread pool threads), and that limit isn’t configurable (people seem to have lots of memory, and 1024 in hindsight was maybe too low for systems that log a lot). I think the simple fix is to allow tweaking the limit. Whether we change to use a channel or something more manual I think is another issue.

I don’t think we want to drop log messages. There’s no API to help producers determine which logs are droppable and which ones aren’t so blocking I think is the ideal solution.

We should also speed up the consumer (if there’s any room to do so). I’m not sure if there’s low hanging fruit.