MassTransit: Consumer doesn't consume messages after RabbitMQ restart
Is this a bug report?
Yes
Can you also reproduce the problem with the latest version?
Yes
Environment
- Operating system: Windows 10 Pro
- Visual Studio version: Community 2019
- Dotnet version: .NET Core 3.1
Steps to Reproduce
Steps:
- Set ConcurrentMessageLimit=1 (the bug is easier to reproduce by hand with a concurrent message limit of 1).
- Set up 3 RabbitMQ nodes (version 3.8) as a cluster (mirroring policy: ha-mode=all, ha-sync-mode=automatic). Running the nodes in Docker makes the bug easier to trigger.
- Set up a proxy service in front of the RabbitMQ cluster (e.g. HAProxy).
- Connect the consumer to the RabbitMQ cluster through that proxy.
- Send more than 60000 messages (the problem occurs when RabbitMQ takes a long time to sync).
- Add a delay to the consume method (5 seconds is enough; the delay is needed to get the "channel already closed" or “NACK failed” error).
- Randomly kill and restart RabbitMQ nodes while messages are being consumed (after several rounds of restarting and syncing, the consumer stops consuming messages).
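The consumer side of the steps above can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual project: `ReproMessage`, `SlowConsumer`, the queue name `repro-queue`, the host name `haproxy-host`, and the credentials are all hypothetical placeholders, and it assumes the MassTransit.RabbitMQ 7.x NuGet package.

```csharp
using System;
using System.Threading.Tasks;
using MassTransit;

// Hypothetical message contract, used only for this reproduction sketch.
public class ReproMessage
{
    public int Sequence { get; set; }
}

// Hypothetical consumer: the 5-second delay keeps a message in flight long
// enough for a node restart to close the channel before the ACK/NACK is sent.
public class SlowConsumer : IConsumer<ReproMessage>
{
    public async Task Consume(ConsumeContext<ReproMessage> context)
    {
        await Task.Delay(TimeSpan.FromSeconds(5));
        Console.WriteLine($"Consumed message {context.Message.Sequence}");
    }
}

public static class Program
{
    public static async Task Main()
    {
        var bus = Bus.Factory.CreateUsingRabbitMq(cfg =>
        {
            // Assumed address of the HAProxy front end, not of a single RabbitMQ node.
            cfg.Host("haproxy-host", "/", h =>
            {
                h.Username("user");     // placeholder credentials
                h.Password("password");
            });

            cfg.ReceiveEndpoint("repro-queue", e =>
            {
                // Process one message at a time, as in the first step.
                e.ConcurrentMessageLimit = 1;
                e.Consumer<SlowConsumer>();
            });
        });

        await bus.StartAsync();
        Console.WriteLine("Consuming; press Enter to stop.");
        Console.ReadLine();
        await bus.StopAsync();
    }
}
```

With this running and the queue pre-filled, killing and restarting cluster nodes should eventually produce the exceptions listed under Actual Behavior, assuming the environment matches the one described.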
Expected Behavior
The consumer always reconnects to the queue.
Actual Behavior
- The consumer doesn’t reconnect to the queue.
- The Channels section of the RabbitMQ management UI shows the message “… no channels …”.
- Last exceptions:
Message NACK failed: 6, Original Exception: System.Threading.Channels.ChannelClosedException: The channel has been closed.
at MassTransit.Util.ChannelExecutor.Run[T](Func`1 method, CancellationToken cancellationToken)
at MassTransit.Transports.ReceivePipeDispatcher.Dispatch(ReceiveContext context, ReceiveLockContext receiveLock)
System.Threading.Channels.ChannelClosedException: The channel has been closed.
at MassTransit.Util.ChannelExecutor.Run[T](Func`1 method, CancellationToken cancellationToken)
at MassTransit.RabbitMqTransport.Pipeline.RabbitMqReceiveLockContext.Faulted(Exception exception)
...
System.Threading.Channels.ChannelClosedException: The channel has been closed.
at MassTransit.Util.ChannelExecutor.Run[T](Func`1 method, CancellationToken cancellationToken)
at MassTransit.Transports.ReceivePipeDispatcher.Dispatch(ReceiveContext context, ReceiveLockContext receiveLock)
at MassTransit.Transports.ReceivePipeDispatcher.Dispatch(ReceiveContext context, ReceiveLockContext receiveLock)
at MassTransit.Transports.ReceivePipeDispatcher.Dispatch(ReceiveContext context, ReceiveLockContext receiveLock)
at MassTransit.RabbitMqTransport.Pipeline.RabbitMqBasicConsumer.<>c__DisplayClass23_0.<<HandleBasicDeliver>b__0>d.MoveNext()
Reproducible Demo
(Paste the link to an example project and exact instructions to reproduce the issue.)
About this issue
- State: closed
- Created 4 years ago
- Comments: 60 (32 by maintainers)
Commits related to this issue
- Related to #2072 - Channel closed exception during NACK should be ignored — committed to phatboyg/MassTransit by phatboyg 4 years ago
- Capture the exception from an ACK, for more detail. Also, within the receiveLock complete of the dispatcher. #2072 — committed to phatboyg/MassTransit by phatboyg 4 years ago
- Related #2072 - Broker disconnect should cancel pending consumers and set complete, avoid trying to ACK/NACK when closed. — committed to MassTransit/MassTransit by phatboyg 3 years ago
- Related to #2072 (possible fix, no longer able to reproduce) - Loop was exiting in case of OperationInterruptedException from RabbitMQ.Client — committed to MassTransit/MassTransit by phatboyg 3 years ago
- Related to #2072 (possible fix, no longer able to reproduce) - Retry() extension was throwing after a successful subsequent attempt instead of returning — committed to MassTransit/MassTransit by phatboyg 3 years ago
- Related #2072 - The Retry loop should never end, and should probably be moved to the receive transport and unrolled to make it easier to understand. — committed to phatboyg/MassTransit by phatboyg 3 years ago
- Related #2072 - Broker disconnect should cancel pending consumers and set complete, avoid trying to ACK/NACK when closed. — committed to phatboyg/MassTransit by phatboyg 3 years ago
- Related #2072 - The Retry loop should never end, and should probably be moved to the receive transport and unrolled to make it easier to understand. — committed to phatboyg/MassTransit by phatboyg 3 years ago
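Two of the commit messages above describe retry-loop bugs: the loop exited on an exception from the client library, and the `Retry()` extension threw after a later attempt had already succeeded. The following is a minimal sketch of a loop that avoids both, assuming a hypothetical `RetryUntilCancelled` helper; it is not MassTransit's actual implementation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RetrySketch
{
    // Hypothetical helper, for illustration only. It keeps retrying until the
    // operation succeeds or the token is cancelled.
    public static async Task<T> RetryUntilCancelled<T>(
        Func<Task<T>> operation, TimeSpan delay, CancellationToken cancellationToken)
    {
        while (true)
        {
            cancellationToken.ThrowIfCancellationRequested();
            try
            {
                // Return as soon as an attempt succeeds; an exception from an
                // earlier attempt is never rethrown after a success.
                return await operation().ConfigureAwait(false);
            }
            catch (Exception) when (!cancellationToken.IsCancellationRequested)
            {
                // Every failure is retried, whatever its type, so a
                // broker-initiated close does not end the loop.
                await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
            }
        }
    }
}
```

Calling it with an operation that fails twice and then returns a value yields that value on the third attempt. Only cancellation ends the loop, which matches the intent stated in the "Retry loop should never end" commit.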
So, I’m calling this done: I could consistently reproduce it, and with the latest develop build it is no longer an issue.
Total time spent resolving it? Easily 12-16 hours in the past 1.5 days.
So, I tried to reproduce the failure to reconnect, but couldn’t. I did consistently reproduce the channel closed exception.
Source Code
Using MassTransit 7.0.6 directly from NuGet.
I’ll look at it.