ruby-kafka: Weird behaviour of async producer: BufferOverflow woes
Thank you for your work. I’m using an async producer that produces a lot of messages fast (millions per minute), and I’m currently stuck in the following loop:
...
E, [2017-04-24T18:34:19.070589 #31] ERROR -- : Cannot produce message to RawTelemetry, max queue size (1000) reached
D, [2017-04-24T18:34:19.070807 #31] DEBUG -- : Current leader for RawTelemetry/0 is node cb71f3fb750b:9092 (node_id=0)
I, [2017-04-24T18:34:19.070991 #31] INFO -- : Sending 603 messages to cb71f3fb750b:9092 (node_id=0)
D, [2017-04-24T18:34:19.071105 #31] DEBUG -- : Opening connection to cb71f3fb750b:9092 with client id bm-ingester...
D, [2017-04-24T18:34:19.071868 #31] DEBUG -- : Sending request 1 to cb71f3fb750b:9092
W, [2017-04-24T18:34:19.072141 #31] WARN -- : Kafka buffer overflown... Retrying in 2...
D, [2017-04-24T18:34:19.080216 #31] DEBUG -- : Waiting for response 1 from cb71f3fb750b:9092
D, [2017-04-24T18:34:19.081006 #31] DEBUG -- : Received response 1 from cb71f3fb750b:9092
I, [2017-04-24T18:34:19.081847 #31] INFO -- : Disconnecting broker 0
D, [2017-04-24T18:34:19.081908 #31] DEBUG -- : Closing socket to cb71f3fb750b:9092
E, [2017-04-24T18:34:21.074674 #31] ERROR -- : Cannot produce message to RawTelemetry, max queue size (1000) reached
D, [2017-04-24T18:34:21.074955 #31] DEBUG -- : Current leader for RawTelemetry/0 is node cb71f3fb750b:9092 (node_id=0)
I, [2017-04-24T18:34:21.075383 #31] INFO -- : Sending 604 messages to cb71f3fb750b:9092 (node_id=0)
D, [2017-04-24T18:34:21.075620 #31] DEBUG -- : Opening connection to cb71f3fb750b:9092 with client id bm-ingester...
W, [2017-04-24T18:34:21.075850 #31] WARN -- : Kafka buffer overflown... Retrying in 2...
D, [2017-04-24T18:34:21.076596 #31] DEBUG -- : Sending request 1 to cb71f3fb750b:9092
D, [2017-04-24T18:34:21.089756 #31] DEBUG -- : Waiting for response 1 from cb71f3fb750b:9092
D, [2017-04-24T18:34:21.090935 #31] DEBUG -- : Received response 1 from cb71f3fb750b:9092
I, [2017-04-24T18:34:21.091617 #31] INFO -- : Disconnecting broker 0
D, [2017-04-24T18:34:21.091668 #31] DEBUG -- : Closing socket to cb71f3fb750b:9092
...
Notice my “buffer overflown” log lines, which are added when a Kafka::BufferOverflow error is rescued and production is delayed.
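For readers following along, here is a minimal sketch of what such a rescue-and-delay wrapper might look like. The actual code behind the log lines is not shown in this issue, so everything here is an assumption: `produce_with_backoff`, `overflow_error`, `max_attempts` and `sleeper` are hypothetical names, and the producer is only assumed to respond to `produce(value, topic:)`. With ruby-kafka you would presumably pass `Kafka::BufferOverflow` as `overflow_error`.

```ruby
# Hypothetical helper, not part of ruby-kafka: retries a produce call
# whenever the given overflow error is raised, pausing between attempts.
# Returns the number of attempts that were needed.
def produce_with_backoff(producer, value, topic:, overflow_error:,
                         delay: 2, max_attempts: 5,
                         sleeper: Kernel.method(:sleep))
  attempts = 0
  begin
    attempts += 1
    producer.produce(value, topic: topic)
    attempts
  rescue overflow_error
    # Give up and re-raise once the attempt budget is spent, so a
    # permanently full buffer does not turn into an endless loop.
    raise if attempts >= max_attempts
    sleeper.call(delay) # e.g. "Retrying in 2..."
    retry
  end
end
```

The attempt cap is a design choice of this sketch: without it, a buffer that never drains produces exactly the kind of endless retry loop seen in the log above.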
Now there might be something wrong with my Kafka setup, but I see no indication of that in this log (or any other log). Nothing seems to happen in this loop except that the “Sending N messages” count goes up, which is pretty weird too.
Thanks in advance for sharing any thoughts on this.
About this issue
- State: closed
- Created 7 years ago
- Comments: 27 (2 by maintainers)
Seeing as this is a recurring problem, it might make sense to allow configuring ruby-kafka with a max message size, having it reject messages that are larger before even sending them to Kafka. Furthermore, that would allow us to avoid creating batches that are too large, if that is truly a problem.
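Until something like that exists in the library, a client-side guard along these lines could approximate the idea. This is only a sketch under stated assumptions: `MessageTooLargeError` and `ensure_within_size!` are hypothetical names, not ruby-kafka API, and the check looks only at the value's byte size, ignoring the key and any protocol overhead that the broker's own limit would include.

```ruby
# Hypothetical error class; not part of ruby-kafka.
class MessageTooLargeError < StandardError; end

# Hypothetical guard: raises if the value exceeds max_bytes, otherwise
# returns the value unchanged so it can be passed on to the producer.
def ensure_within_size!(value, max_bytes:)
  size = value.to_s.bytesize # bytes, not characters
  if size > max_bytes
    raise MessageTooLargeError,
          "message is #{size} bytes, limit is #{max_bytes}"
  end
  value
end
```

A caller would run the guard just before handing the message to the producer, for example `producer.produce(ensure_within_size!(payload, max_bytes: limit), topic: "RawTelemetry")`, where `payload` and `limit` are placeholders chosen by the application.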