MassTransit: Losing messages on failover
Contact Details
No response
Version
8.x
On which operating system(s) are you experiencing the issue?
Windows, Linux
Using which broker(s) did you encounter the issue?
ActiveMQ
What are the steps required to reproduce the issue?
1. Clone https://github.com/JamesDaltonGEDigital/MasstransitFailoverProblem
1. The file [./MassTransitCommon/appsettings.Common.json](./MassTransitCommon/appsettings.Common.json) has the ActiveMQ connection settings. It assumes localhost:61616 admin/admin
1. Start ActiveMQ
1. Open a shell and run the MassTransitReceive app in the MassTransitReceive folder. It has a simple consumer that just counts messages. You can press `s` to show the total received, `c` to clear, and `q` to quit
1. Open a second shell and start the MassTransitSend app in the MassTransitSend folder with the command `dotnet run wait=true`
1. Stop and restart ActiveMQ.
1. Return to the MassTransitSend window and press any key. By default it generates 40 messages: 4 threads sending 10 each.
1. Go to the ActiveMQ admin page and check the message count. It should be 40, but will be less than that.
1. Or go to the MassTransitReceive window and press `s`. That should return 40, but won't.
The steps are repeated in the README
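For readers who want to see the shape of the send test without opening the repository, here is a rough, language-neutral sketch in Python. It is not the code from the repro project and does not use MassTransit. `run_senders` and the `send` callable are hypothetical names introduced only for illustration. It starts 4 workers that each send 10 messages and reports how many sends succeeded or raised, which is the number to compare against the broker's message count.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_senders(send, threads=4, per_thread=10):
    """Call send(thread_id, seq) from several threads.

    `send` is a hypothetical stand-in for whatever publishes one message.
    Returns a tuple (ok, failed) with the number of calls that returned
    normally and the number that raised.
    """
    lock = threading.Lock()
    counts = {"ok": 0, "failed": 0}

    def worker(thread_id):
        for seq in range(per_thread):
            try:
                send(thread_id, seq)
                key = "ok"
            except Exception:
                # A raised exception is at least a visible failure.
                key = "failed"
            with lock:
                counts[key] += 1

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Consume the iterator so every worker has finished before returning.
        list(pool.map(worker, range(threads)))

    return counts["ok"], counts["failed"]
```

If `ok` comes back as 40 while the broker or the consumer reports fewer than 40, the messages were lost silently rather than rejected with an error, which matches what is described in this issue.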
What is the expected behavior?
All messages should be sent
What actually happened?
Some messages don’t get sent.
Observations while testing:
- This only happens if failover hosts are set. Without failover hosts, it works fine.
- There must be some degree of concurrency. Sending one message at a time after the restart works fine.
- I’m assuming this is a timing/locking issue in MassTransit in the failover recovery code, but I can’t rule out a bug in the ActiveMQ library MassTransit uses.
- Possibly related, we have noticed a deadlock can sometimes occur when a failover happens while sending messages. We can’t reproduce this one on demand and it doesn’t happen every time.
Related log output, including any exceptions
No response
Link to repository that demonstrates/reproduces the issue
https://github.com/JamesDaltonGEDigital/MasstransitFailoverProblem
About this issue
- State: closed
- Created a year ago
- Comments: 27 (18 by maintainers)
Commits related to this issue
- Related to #4356 - catch/retry exceptions when producing messages on connection failure for ActiveMQ transport — committed to phatboyg/MassTransit by phatboyg a year ago
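The commit message only says that exceptions are caught and retried when producing messages on connection failure; the actual change is not shown here. As a generic illustration of that idea (not the MassTransit implementation), a minimal Python sketch might look like the following. `send_with_retry`, its parameters, and the use of `ConnectionError` as the retryable exception are all assumptions made for the example.

```python
import time


def send_with_retry(send, message, attempts=3, delay=0.1,
                    retry_on=(ConnectionError,)):
    """Try send(message) up to `attempts` times.

    `send` is a hypothetical callable that publishes one message and raises
    one of the `retry_on` exception types when the connection is down.
    The last failure is re-raised so the caller never loses a message
    silently.
    """
    for attempt in range(1, attempts + 1):
        try:
            return send(message)
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(delay)  # give the connection time to recover
```

The point of the design is that a send either succeeds or fails loudly. Exceptions outside `retry_on` are not retried and propagate immediately.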
When will it be ready? 😃
I’d say no later than the end of July for an 8.1 release. There are a bunch of other minor changes going into that release that have yet to be completed/tested.