logging: App deadlock on FileLoggerProvider.WriteEntry

We use the file logger in ASP.NET Core .net7 app the simple way by adding in Startup loggingBuilder.AddFile(loggingSection);

We have now face random (hours…week) situation when app gets “stuck”. By taking a dump we found that all threads wait in this place - entryQueue.Add(message); in FileLoggerProvider

		internal void WriteEntry(string message)
		{
			if (!entryQueue.IsAddingCompleted)
			{
				try
				{
					entryQueue.Add(message);
				}
				catch (InvalidOperationException)
				{
				}
			}
		}

image (the 2and 5 threads wait there as well…) But there is no clear indication “why” ? Of course the only way to get over this is to kill app and restart.

We noticed that the FileLoggerProvider is using a blocking collection

private readonly BlockingCollection<string> entryQueue = new BlockingCollection<string>(1024); 

And noticed that there is no “defensive measure” so once anything goes south here (or the loop simply terminates…

		private void ProcessQueue()
		{
			foreach (string item in entryQueue.GetConsumingEnumerable())
			{
				fWriter.WriteMessage(item, entryQueue.Count == 0);
			}
		}

We very likely end up with filled blocking queue, nobody taking items, app stuck…

We noticed that in methods WriteMessage(string message, bool flush) CheckForNewLogFile()

There are no try-catch so it is a matter of time, when something bad happens…

Do you consider it relevant ? Would you mind to add some corrections by some PR/Fork from us? Or you find it so easy to make it by yourself ?

We’d: a) Allow to configure an optional timeout, and changeentryQueue.Add(message);to entryQueue.TryAdd(message,timeout); b) Add some try-catch around manipulation with LogFileWriter

For us losing some part of log is much acceptable then having app deadlocked.

BR Jiri

About this issue

  • Original URL
  • State: closed
  • Created 9 months ago
  • Comments: 17 (8 by maintainers)

Commits related to this issue

Most upvoted comments

@AoooR42

I imagine there must be a real solution that lets you write to a file that’s already open elsewhere, since Visual Studio Code, for example, is capable of opening and modifying a .log file that’s already open in another application.

I’m afraid you have wrong expectations here, because what you mentioned (opening the same file for write from different processes) is about OS file locks and sharing. This is conceptual: you simply cannot write consistently into the same file in parallel (and this is not related to NReco.File.Logging specific at all).

As an author of NReco.Logging.File I don’t see any conceptual issues with the existing implementation, all file-related errors may be handled (via “HandleFileError”) according to the logic of your application.

@JiriZidek it is still unclear what is wrong with the current code. I wasn’t able to reproduce any file I/O-related issues that may affect the app when HandleFileError is specified.

My policy in accepting PRs is “Occam’s razor”: this should be minimal changes to fix something or add a new capability.

Is it supposed to be full path or just a file name and the path is reused from configuration ?

This should be a new filename. In the case of relative path (or just filename) app’s working directory is used to resolve a full path.

@JiriZidek I was able to reproduce a deadlock in the unit test, and provided a bugfix - now ProcessQueue behaves in this way:

  1. if WriteMessage fails and app’s code doesn’t propose a new (alternative) file log name via HandleFileError - file logger simply stops processing of log entries, app continue working
  2. if HandleFileError proposes a new file name, it tries to use it, and if it also fails file logger also stops processing of log entries (HandleFileError will not be called recursively)

With these changes an unit test cannot reproduce a deadlock anymore.

This is shipped in 1.1.7, you may already try it.

We very likely end up with filled blocking queue, nobody taking items, app stuck…

I agree that most likely the reason is full queue because of exception that occurs in ProcessQueue method. It is important to keep app working even if something goes wrong with file logging, so it is definitely should be try-catch here – if there are no error handler, at least FileLoggerProvider should not block WriteEntry and simply de-queue items (without any processing since write to file fails). Also, catch block can try to use HandleFileError handler if present - in that case app’s code can at least handle file write error somehow (and maybe propose a new log file name).

Would you mind to add some corrections by some PR/Fork from us? Or you find it so easy to make it by yourself ?

The root of the problem seems clear, I’ll try to propose a bugfix quickly by myself.

We’d: a) Allow to configure an optional timeout, and changeentryQueue.Add(message);to entryQueue.TryAdd(message,timeout); b) Add some try-catch around manipulation with LogFileWriter

I think (a) is not needed if ProcessQueue will guarantee dequeue even if exception occurs in fWriter.WriteMessage.