pika: Inconsistent Blocking Connection hanging when consumer is idle
Pika: 0.12.0b2 OS: Ubuntu
When the connection is left idle for a long time (e.g. 10 hours), a consumer hangs and does not consume any more messages. This used to happen on pika 0.10 and appeared to be solved in v0.11. The issue can be reproduced by following the example in Pika Examples.
Moreover:
- the problem is much harder to reproduce when pushing to localhost than when RabbitMQ runs on another host/network
- no specific output from pika (at debug level)
- RabbitMQ does not seem to complain about heartbeats
- when pushing new messages, a “strange log” appears that looks like it comes from the pika application (the one that should be consuming):
rabbit | =WARNING REPORT==== 15-May-2018::08:59:41 ===
rabbit | closing AMQP connection <0.1090.0> (192.168.1.1:44386 -> 172.18.0.2:5672):
rabbit | client unexpectedly closed TCP connection
Apart from the big headaches, I would like to point out that pika 0.11.2 should not have this problem, but another one takes its place: when running very long tasks, the connection is dropped as well.
About this issue
- State: closed
- Created 6 years ago
- Comments: 31 (19 by maintainers)
@lukebakken, I think the suggestion to use `SelectConnection` wouldn’t improve the situation, just make his code more complicated due to “callback hell” 😃. Also, the suggestion to use `add_timeout` to delegate processing to the correct thread looks like a typo. If someone is using `SelectConnection` directly and needs to schedule a callback inside `SelectConnection`’s I/O loop safely, they should call the `add_callback_threadsafe()` method of that `SelectConnection`’s I/O loop directly, i.e. `connection.ioloop.add_callback_threadsafe(...)`, not `add_timeout`, which isn’t thread-safe.

@lukebakken
I am doing my tests with 0.12.0b2 and heartbeats at 600s.
While testing I am capturing packets (Wireshark). Everything has been working smoothly since this morning in this scenario.
I will see tomorrow whether the connection is still available after leaving it idle all night. If this works I would repeat the tests with:
@lukebakken, would you mind including the pika version number in the log content when opening a connection? This would help get our bearings when examining logs. Thx!
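
For anyone landing here because of the long-task case, this is a minimal sketch of the thread-safe hand-off discussed earlier in the thread: the slow work runs on a worker thread, and the ack is handed back to the connection's I/O loop through `connection.ioloop.add_callback_threadsafe(...)`. The helper names `ack_from_worker` and `process_in_thread` are made up for illustration and are not part of pika. The code only assumes a connection object exposing `ioloop.add_callback_threadsafe` and a channel exposing `basic_ack`, so treat it as a sketch under those assumptions, not a drop-in fix.

```python
import functools
import threading


def ack_from_worker(connection, channel, delivery_tag):
    """Hypothetical helper: request an ack from a non-I/O thread."""
    ack = functools.partial(channel.basic_ack, delivery_tag=delivery_tag)
    # Only the scheduling call is made from this thread; the ack itself
    # runs later, when the I/O loop invokes the callback on its own thread.
    connection.ioloop.add_callback_threadsafe(ack)


def process_in_thread(connection, channel, delivery_tag, body, work):
    """Hypothetical helper: run work(body) off the I/O thread, then schedule the ack."""

    def run():
        work(body)  # the long-running task; the I/O loop stays free meanwhile
        ack_from_worker(connection, channel, delivery_tag)

    worker = threading.Thread(target=run)
    worker.start()
    return worker
```

With a `BlockingConnection` on 0.12, the same idea should apply using the connection's own `add_callback_threadsafe(...)` method instead of going through `ioloop`; please check this against the version you run.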