loguru: Rotation error in multi-process (happens randomly)

Hi again 🤣, I got a new error from my multi-process project, which was designed with:


The problem is that when I tried to restart my project on 2023/07/07 with existing log files:

record.2023-07-06.log
record.log # contains logs of 2023/07/07

loguru renamed record.log to record.2023-07-07.log, putting all of its logs there, and then generated a new record.log:

record.2023-07-06.log
record.2023-07-07.log # contains logs from `record.log`
record.log 

I am not sure how to reproduce this error: I tried 3 more times after it occurred, but loguru worked fine each time.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 25 (8 by maintainers)

Most upvoted comments

Hi @changchiyou.

The root cause of your problem (#894) has been fixed on the master branch. Starting with the next Loguru release, you should no longer face unexpected behavior when using rotation.

I’m therefore closing this issue, as it should be solved. If you need any further help, I remain available.

I’m sorry for bothering you for days 😭

No worries. Thank you for your patience. I appreciate that you haven’t condemned Loguru and are trying to find a solution. I welcome your work and detailed reports about what you observe during the tests. It’s just that there’s a lot of information out there and it’s sometimes difficult for me to understand and reproduce the problem you’re facing.

But I hope we’ll find a solution. I don’t like the idea of you deprecating your own project on a platform because of Loguru… 😃


Regarding your comment here: https://github.com/Delgan/loguru/issues/916#issuecomment-1628365845

With enqueue=True, the unit tests failed:

tests/utils/test_logging_enhance.py::test_daily_rotation_with_different_rotation[daily-True] FAILED
tests/utils/test_logging_enhance.py::test_daily_rotation_with_different_rotation[daily-False] PASSED
tests/utils/test_logging_enhance.py::test_daily_rotation_with_different_rotation[00:00-True] FAILED
tests/utils/test_logging_enhance.py::test_daily_rotation_with_different_rotation[00:00-False] PASSED

Those tests are failing because enqueue=True implies the logged messages are not processed synchronously. They are handled in a background thread, which means the main thread will likely execute the next instruction before the message is actually logged. This is incompatible with the current implementation of test_daily_rotation_with_different_rotation(). In such a situation, it’s crucial to call logger.complete() to make sure all messages have been processed; otherwise the test expectations can’t be guaranteed.

logger.debug("First")

# Wait for the first message to be fully processed by the background thread.
logger.complete()  

 # If complete() were not called, this line could be executed while first message is still pending.
frozen.tick(delta=datetime.timedelta(hours=23, minutes=30))

# If complete() were not called, it's possible that both first and second message would be
# processed after the actual time change, leading to test failure.
logger.debug("Second")

Regarding your comment here: https://github.com/Delgan/loguru/issues/916#issuecomment-1629930638

BTW, aren’t the first 2 of my 4 unit tests in https://github.com/Delgan/loguru/issues/916#issuecomment-1628066817 clear enough? Let me know if you want more information/details; I am willing to add more.

Thanks for providing the tests. I could run them on my computer, but the failure reason was different than the one you reported.

Looking at test_single_process_pure_renaming(), the expectations you listed do not seem appropriate to me.

        check_dir(
            tmp_path,
            files=[
                ("record.2020-01-01.log", f"{MSG}\n" * 2),
                ("record.log", f"{MSG}\n" * 4),
            ],
        )

The test is started at the fake date of "2020-01-01 12:00:00". When the first message is logged, the "record.log" file should be created, which is fine. However, you are then moving the clock forward by 24 hours, so the new date is "2020-01-02 12:00:00". Remember that the compression (i.e. rename) function is executed on rotation, which is triggered by the first message logged after the rotation="00:00" threshold. That is, when frozen.tick(delta=datetime.timedelta(hours=24)) is called, nothing happens within Loguru. After that, however, the first logger.debug() call will start the rotation process, as Loguru detects that the "00:00" limit has been exceeded. This causes your rename() function to be called, and that function is implemented using today = datetime.datetime.now().strftime(r"%Y-%m-%d"). Therefore, the file is naturally renamed to "record.2020-01-02.log" and not "record.2020-01-01.log" as you seem to expect.
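
To make the timing concrete, here is a minimal sketch of that sequence (the dates and the rotation="00:00" threshold come from your test; everything else is illustrative only):

# Sequence inside test_single_process_pure_renaming():
#   2020-01-01 12:00  logger.add(..., rotation="00:00", compression=rename)
#   2020-01-01 12:00  logger.debug(...)   -> "record.log" is created
#   frozen.tick(hours=24)                 -> fake clock is now 2020-01-02 12:00, Loguru does nothing yet
#   2020-01-02 12:00  logger.debug(...)   -> "00:00" threshold exceeded, rotation runs, rename() is called
# Inside rename(), datetime.now() therefore already returns the new date:
import datetime

rotation_clock = datetime.datetime(2020, 1, 2, 12, 0)  # fake clock at the moment rename() runs
print(rotation_clock.strftime(r"%Y-%m-%d"))  # "2020-01-02", not "2020-01-01"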

Maybe you actually want to subtract datetime.timedelta(days=1) as we previously discussed in #899?
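
If so, a hypothetical variant of your rename() callback (just a sketch of that idea, not the exact change from #899) could look like this:

import datetime
import os

def rename_previous_day(filepath) -> None:
    # Hypothetical variant: stamp the rotated file with the previous day,
    # since the "00:00" rotation runs after the date has already changed.
    day = (datetime.datetime.now() - datetime.timedelta(days=1)).strftime(r"%Y-%m-%d")
    dirname = os.path.dirname(filepath)
    os.rename(filepath, os.path.join(dirname, f"record.{day}.log"))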

In addition, there are missing calls to logger.complete(), as I explained above. It must be called after the for loop containing the logger.debug() calls; otherwise, the results will be indeterminate. This is very important.
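
Concretely, the pattern is the same one used in the adjusted tests below:

for _ in range(4):
    logger.debug(f"{MSG}")

# Flush all enqueued messages before asserting on the log files.
logger.complete()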

The case of test_multi_process_pure_renaming() is even more tricky. Basically, you’re right that freeze_time() does not work well with the "spawn" method. This is because Loguru’s rotation mechanism relies on the file creation timestamp, which means the filesystem needs to be faked during tests. The freeze_time() helper maintains a structure mapping file paths to fake timestamps. However, when "spawn" is used, the content of this structure is lost in the child process, and the timestamp can’t be faked. The "fork" method copies the structure, so it’s fine. This explains why you face unexpected results during tests, but in production it shouldn’t be a problem.

Edit: On further reflection, I’m no longer sure of this explanation. The messages are handled in the main process, so freeze_time() should work correctly. I can’t explain the difference between "fork" and "spawn" for now.


Regarding your comment here: https://github.com/Delgan/loguru/issues/916#issuecomment-1630075894

Another small question: I referred to your unit tests above (#916 (comment)), but I can’t pass context = multiprocessing.get_context("fork") as an argument into set_loguru(). Does it really matter for loguru?

Sorry, this is a new argument which exists on the master branch but is not available in the most recent release (0.7.0). I’d forgotten about that.

It’s not a requirement. It’s mainly useful to isolate each test case instead of calling multiprocessing.set_start_method() globally.
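
For reference, this is roughly how it is meant to be used on the master branch (a sketch only; the context parameter is not available in the 0.7.0 release):

import multiprocessing
from loguru import logger

# Each test can pass its own multiprocessing context to the enqueued sink
# instead of calling multiprocessing.set_start_method() globally.
context = multiprocessing.get_context("fork")
logger.add("record.log", enqueue=True, context=context)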


Regarding your comment here: https://github.com/Delgan/loguru/issues/916#issuecomment-1630435064

With the explanations I’ve given above, I don’t think we can conclude that there’s a bug in Loguru for now.

I arranged the two unit tests you shared as follows:

  • Make sure logger.complete() is called adequately
  • Change expectations regarding the file name in test_single_process_pure_renaming()
  • Replace usage of freeze_time() with actual time.sleep() in test_multi_process_pure_renaming()

The test cases work fine now, regardless of whether "spawn" or "fork" is used:

import datetime
import logging
import multiprocessing
import os
import re
import sys
import time

from loguru import logger

from ..conftest import check_dir

multiprocessing.set_start_method("spawn", force=True)


MSG = "test"

def message(logger_):
    logger_.debug(f"{MSG}")


# Compression callback: rename the rotated file to "record.<timestamp>.log".
def rename(filepath) -> None:
    today = datetime.datetime.now().strftime(r"%Y-%m-%d_%H-%M-%S")
    dirname = os.path.dirname(filepath)
    basename = "record.log"
    new_filepath = os.path.normpath(os.path.join(dirname, basename))

    os.rename(
        filepath,
        dynamic_insert_string(str(new_filepath), today),
    )


# Insert a string before the file extension, e.g. "record.log" -> "record.<insert>.log".
def dynamic_insert_string(base: str, insert: str) -> str:
    parts = base.split(".")

    if len(parts) > 1:
        base_filename = ".".join(parts[:-1])
        extension = parts[-1]
        result = f"{base_filename}.{insert}.{extension}"
    else:
        result = f"{base}.{insert}"

    return result


# Filter that stores a color-free copy of the message in record["extra"].
def remove_color_syntax(record) -> bool:
    ansi_escape = re.compile(r"\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])")
    record["extra"]["message_no_colors"] = ansi_escape.sub("", record["message"])

    return True

# Standard-library logging handler that forwards records to Loguru.
class InterceptHandler(logging.Handler):
    def emit(
        self,
        record: logging.LogRecord,
        _logger=logger,
    ) -> None:
        try:
            level = _logger.level(record.levelname).name
        except ValueError:
            level = record.levelno

        # Walk up the call stack to find the frame that issued the logging call.
        frame, depth = sys._getframe(6), 6
        while frame is not None and frame.f_code.co_filename == logging.__file__:
            frame = frame.f_back
            depth += 1

        _logger.opt(depth=depth, exception=record.exc_info).log(
            level, record.getMessage()
        )

def set_loguru_pure(tmp_path, rotation=None):
    import logging
    from loguru import logger
    import sys
    import datetime

    logger.remove()

    # File sink: rotate daily at midnight (UTC+8) by default, rename on rotation.
    logger.add(
        f"{str(tmp_path)}/record.log",
        level=logging.DEBUG,
        rotation=datetime.time(
            hour=0,
            minute=0,
            tzinfo=datetime.timezone(datetime.timedelta(hours=8)),
        ) if rotation is None else rotation,
        compression=rename,
        format="{message}",
        enqueue=True,
        colorize=False,
        filter=remove_color_syntax,
    )
    logger.add(
        sys.stderr,
        level=logging.DEBUG,
        colorize=True,
        format="{message}",
        enqueue=True,
    )

    logger.configure(extra={"mark": "origin"})

    logging.basicConfig(handlers=[InterceptHandler()], level=0, force=True)

def test_single_process_pure_renaming(freeze_time, tmp_path):
    with freeze_time("2020-01-01 12:00:00") as frozen:
        set_loguru_pure(tmp_path)

        for _ in range(2):
            logger.debug(f"{MSG}")

        logger.complete()

        check_dir(
            tmp_path,
            files=[
                ("record.log", f"{MSG}\n" * 2),
            ],
        )

        # Advance the fake clock past the "00:00" rotation threshold.
        frozen.tick(delta=datetime.timedelta(hours=24))

        for _ in range(4):
            logger.debug(f"{MSG}")

        logger.complete()

        check_dir(
            tmp_path,
            files=[
                ("record.2020-01-02_12-00-00.log", f"{MSG}\n" * 2),
                ("record.log", f"{MSG}\n" * 4),
            ],
        )

def test_multi_process_pure_renaming(tmp_path):
    in_2_seconds = (datetime.datetime.now() + datetime.timedelta(seconds=2)).time()

    set_loguru_pure(tmp_path, rotation=in_2_seconds)

    workers = []
    for _ in range(2):
        worker = multiprocessing.Process(target=message, args=(logger,))
        worker.start()
        workers.append(worker)

    for worker in workers:
        worker.join()

    logger.complete()

    check_dir(
        tmp_path,
        files=[
            ("record.log", f"{MSG}\n" * 2),
        ],
    )

    time.sleep(5)

    workers = []
    for _ in range(4):
        worker = multiprocessing.Process(target=message, args=(logger,))
        worker.start()
        workers.append(worker)

    for worker in workers:
        worker.join()

    logger.complete()

    files = sorted(tmp_path.iterdir())

    assert len(files) == 2

    first_file, second_file = files

    assert second_file.name == "record.log"
    assert second_file.read_text() == f"{MSG}\n" * 4

    assert first_file.name.startswith("record.2023")  # The exact name is not deterministic
    assert first_file.read_text() == f"{MSG}\n" * 2