pika: Inconsistent Blocking Connection hanging when consumer is idle

Pika: 0.12.0b2 OS: Ubuntu

When the connection is left idle for a long time (e.g. 10 hours), a consumer hangs and does not consume any more messages. This used to happen on pika 0.10 and appeared to be solved in v0.11. The issue can be reproduced by following the example in Pika Examples.

Moreover:

  • the same problem is difficult to reproduce when pushing to localhost (as opposed to RabbitMQ on another host/network)
  • no specific output from pika (at debug level)
  • RabbitMQ does not seem to complain about heartbeats
  • when pushing new messages, a “strange log” appears that looks like it comes from the pika application (the one that should consume):
rabbit                   | =WARNING REPORT==== 15-May-2018::08:59:41 ===
rabbit                   | closing AMQP connection <0.1090.0> (192.168.1.1:44386 -> 172.18.0.2:5672):
rabbit                   | client unexpectedly closed TCP connection

Apart from the big headaches, I would like to point out that in pika 0.11.2 this problem should not be present, but another one takes place: when running very long tasks, the connection is dropped as well.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 31 (19 by maintainers)

Most upvoted comments

@lukebakken, I think the suggestion to use SelectConnection wouldn’t improve the situation; it would just make his code more complicated due to “callback hell” 😃. Also, the suggestion to use add_timeout to delegate processing to the correct thread looks like a typo. If someone is using SelectConnection directly and needs to schedule a callback inside SelectConnection’s I/O loop safely, they should call the add_callback_threadsafe() method of that SelectConnection’s I/O loop directly, i.e. connection.ioloop.add_callback_threadsafe(...), not add_timeout, which isn’t thread-safe.
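
To make the thread-safety point concrete, here is a minimal sketch of handing an ack back to the I/O loop from a worker thread. The helper names (ack_message, process_in_thread) and the work argument are made up for illustration; ioloop is assumed to be connection.ioloop of a SelectConnection, and channel the channel the message arrived on. It only relies on add_callback_threadsafe(), channel.is_open and channel.basic_ack().

```python
import functools
import threading


def ack_message(channel, delivery_tag):
    """Runs on the I/O loop thread, the only place the channel is touched."""
    if channel.is_open:
        channel.basic_ack(delivery_tag=delivery_tag)


def process_in_thread(ioloop, channel, delivery_tag, body, work):
    """Hypothetical helper: run a slow `work(body)` off the I/O loop thread."""

    def run():
        # Long-running task; the I/O loop stays free to service heartbeats.
        work(body)
        # Do not call channel.basic_ack() from this worker thread. Schedule
        # it on the I/O loop thread instead.
        ioloop.add_callback_threadsafe(
            functools.partial(ack_message, channel, delivery_tag))

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker
```

In a consumer callback this would be called roughly as process_in_thread(connection.ioloop, channel, method.delivery_tag, body, handle_message), where handle_message is whatever slow function the application needs to run.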

@lukebakken

I am doing my tests with 0.12.0b2 and heartbeats at 600s.

While testing I am capturing packets (Wireshark). Everything has been working smoothly since this morning in this scenario.

I will see whether the connection is still available tomorrow after leaving it idle all night. If this works, I will repeat the tests with:

  • pika 0.12.0b3
  • dockerized version of RabbitMQ

@lukebakken, would you mind including the pika version number in the log content when opening a connection? This would help get our bearings when examining logs. Thx!
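
On the application side, the version can also be logged by hand right before opening a connection. This is a rough sketch, assuming Python 3.8+ (for importlib.metadata); the helper names library_version and log_client_version are hypothetical and not part of pika.

```python
import logging
from importlib import metadata

LOGGER = logging.getLogger(__name__)


def library_version(package_name):
    """Return the installed version of a package, or "unknown" if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return "unknown"


def log_client_version(package_name="pika"):
    """Log the client library version; call this just before connecting."""
    version = library_version(package_name)
    LOGGER.info("Opening connection with %s %s", package_name, version)
    return version
```

Calling log_client_version() once at startup puts the version next to the connection logs, which is usually enough to tell which release produced a given trace.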