symfony: [Messenger] Message is lost if it cannot be processed by both handler and failed transport

Symfony version(s) affected: From 4.4 to master

Description
When a message cannot be processed and cannot be written to the failed transport, it is lost.

How to reproduce

  1. Have a handler that processes a message and writes something to the database. The database is down for some reason. Retries are disabled (or the database stays down for the whole retry period).
  2. The handler receives a message and throws an exception (because the database is down).
  3. The Messenger component tries to send the message to the failed transport, but that fails as well (because the database is down).
  4. At this point the message remains in RabbitMQ (as expected).
  5. After a while the consumer starts processing the same message. It fails again (because the database is still down).
  6. Now the message is rejected from RabbitMQ and lost forever. This happens because of RejectRedeliveredMessageMiddleware, which exists to prevent messages from getting stuck in the queue.
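The steps above can be sketched as a small simulation. This is plain Python, not Symfony code: the queue is a list, the `redelivered` flag stands in for the broker's redelivery marker, and all function names are hypothetical. The redelivery check is a simplified stand-in for what RejectRedeliveredMessageMiddleware and the worker do together.

```python
class DatabaseDown(Exception):
    """Raised by anything that needs the (unavailable) database."""


def handle(message):
    # Step 2: the handler writes to the database, which is down.
    raise DatabaseDown("handler could not write to the database")


def send_to_failed_transport(message):
    # Step 3: the failed transport is backed by the same database.
    raise DatabaseDown("failed transport could not store the message")


def consume(queue, reject_redelivered=True):
    """Process the delivery at the head of the queue and describe the outcome."""
    message = queue[0]
    if reject_redelivered and message["redelivered"]:
        # Step 6: a redelivered message is rejected without being requeued.
        queue.pop(0)
        return "rejected"
    try:
        handle(message)
        queue.pop(0)  # ack after successful handling
        return "handled"
    except DatabaseDown:
        try:
            send_to_failed_transport(message)
            queue.pop(0)  # ack: the message now lives in the failed transport
            return "moved to failed transport"
        except DatabaseDown:
            # Step 4: nothing was acked, so the broker keeps the message
            # and marks it as redelivered for the next attempt.
            message["redelivered"] = True
            return "left in queue"


queue = [{"body": "order-1", "redelivered": False}]
first = consume(queue)
second = consume(queue)
print(first, "->", second, "| messages remaining:", len(queue))
```

With `reject_redelivered=False` the second call would also leave the message in the queue, which is the behavior requested in this issue.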

In my opinion this is dangerous behavior that makes it impossible to build properly resilient applications with the Messenger component. The component must guarantee that, with proper configuration, messages are never lost implicitly.

So the behavior I expect is:

  • A message should remain in the queue until it is either processed or put into the failed transport.
  • It is totally acceptable to have messages stuck in the queue, because that must be caught by application monitoring.
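The rule in the two bullets above fits in a few lines. This is an illustrative Python sketch with hypothetical names, not the component's API: a delivery is acknowledged only once the message is safe somewhere, and is otherwise left in the queue.

```python
def settle(handled, stored_in_failed_transport):
    """Return the broker action for one delivery.

    "ack" removes the message from the queue; "requeue" leaves it there
    so that it is never dropped implicitly.
    """
    if handled or stored_in_failed_transport:
        return "ack"
    return "requeue"


# Handler failed and the failed transport was unreachable: keep the message.
print(settle(handled=False, stored_in_failed_transport=False))
```

A message that keeps coming back with "requeue" is exactly the stuck message that monitoring is expected to surface.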

Possible Solution
Remove this class and its definition from DI: https://github.com/symfony/symfony/blob/master/src/Symfony/Component/Messenger/Middleware/RejectRedeliveredMessageMiddleware.php

Also remove this code: https://github.com/symfony/symfony/blob/master/src/Symfony/Component/Messenger/Worker.php#L116-L120

To prevent messages from being stuck in the queue, other solutions should be applied:

  • Proper failed transport configuration.
  • If the failed transport is not healthy, application monitoring should detect the stuck messages.

Additional context
I can implement the change myself, but I would like to raise it first and discuss it. Someone implemented RejectRedeliveredMessageMiddleware, so there could be another use case for it.

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 7
  • Comments: 17 (11 by maintainers)

Most upvoted comments

What I can see as a solution for your case is to remove the RejectRedeliveredMessageMiddleware when retries are disabled.

Unfortunately, it should be removed in any case. If retries are enabled but all attempts are exhausted (and the DB is still down), the same issue will happen: the message will be lost.

To prevent messages being stuck in queue other solutions should be applied:

Please elaborate.

This is my point of view:

  • If the failed transport is not available (not healthy), the message should sit in the queue forever if needed. I would prefer to resolve this situation manually rather than lose messages. If the failed transport is not healthy, the situation will require intervention and investigation anyway (the DB should either come back automatically or be restored manually).
  • I don’t really see why RejectRedeliveredMessageMiddleware is useful. If a message is redelivered but still cannot be processed, it should follow the retry -> failed flow. If retry/failed is misconfigured or not available, I would again prefer to solve the problem manually rather than lose data.

Yes, this is still relevant. We cannot use the native Messenger component because of this issue: it does not guarantee that messages will not be dropped randomly. My MR solves the problem, but it needs to be reviewed.

Hello,

I worked on https://github.com/symfony/symfony/pull/39622 and I faced something similar.

Indeed, there is a flaw in the design of the component. There are many cases where a message can be lost.

Usually, components that deal with messaging proceed like this:

  1. GET a message from a queue
  2. process it
  3. (N)ACK it

But there are many places in symfony/messenger that do the following:

  1. GET a message from the queue
  2. processing cannot be done
  3. ACK it
  4. some code
  5. requeue it

And, obviously, something can go wrong between steps 3 and 5.
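To make the window between ACK and requeue concrete, here is a minimal Python simulation. It is not Symfony code: the queue is a list, `Crash` stands in for the worker dying or a network error, and the function names are hypothetical. It compares ACK-then-requeue with publishing the retry copy before acknowledging the original.

```python
class Crash(Exception):
    """Stands in for a worker dying or a network error mid-operation."""


def ack_then_requeue(queue, crash_between=False):
    message = queue.pop(0)  # ACK: the broker forgets the message
    if crash_between:
        raise Crash("worker died after ACK, before requeue")
    queue.append(message)  # requeue


def requeue_then_ack(queue, crash_between=False):
    message = queue[0]
    queue.append(message)  # publish the retry copy first
    if crash_between:
        raise Crash("worker died after requeue, before ACK")
    queue.pop(0)  # ACK the original only once the copy is safe


def run(strategy):
    """Run one strategy with a crash injected between its two steps."""
    queue = ["order-1"]
    try:
        strategy(queue, crash_between=True)
    except Crash:
        pass
    return queue


lost = run(ack_then_requeue)
duplicated = run(requeue_then_ack)
print("ack first:", lost)
print("requeue first:", duplicated)
```

In this sketch the first ordering ends with an empty queue (the message is gone), while the second ends with two copies. Requeueing first trades message loss for possible duplicates, so handlers would need to tolerate being called twice for the same message.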

Some examples:

I did not search for all issues, but IIRC there are other places.

Hi bot, I will just repeat my previous comment: I believe the issue should stay open for transparency of Symfony users, one should understand that it is not 100% reliable and resilient.

I stopped using Symfony Messenger because of this issue, so I am not sure whether the bug is still relevant. But I can see that the fix for the bug is still not merged and the issue has been ghosted. I believe the issue should stay open for transparency toward Symfony users: one should understand that the component is not 100% reliable and resilient.