kafka-backup: "consumer poll timeout has expired" loop (was "kafka-backup may fail due to slow disk IO")

Just found this in the logs today. kafka-backup seems to be stuck in an offset-resetting loop after a restart because of this.

[2020-05-30 08:48:37,022] INFO [Consumer clientId=connector-consumer-chrono_prod-backup-sink-0, groupId=connect-chrono_prod-backup-sink] Member connector-consumer-chrono_prod-backup-sink-0-4bcd6afc-19a0-4458-b708-ded84fc567a3 sending LeaveGroup request to coordinator kafka1.node:9093 (id: 2147483646 rack: null) due to consumer poll timeout has expired. This means the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time processing messages. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:916)

This is kafka-backup running on an eCryptfs volume in an Azure VM. The disk IOPS limit is 500 and the bandwidth limit is 60 MB/s. Consumer settings: max.poll.interval.ms = 300000, max.poll.records = 500.
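For a rough sense of scale, the numbers above can be turned into a per-batch time and IO budget. This is only a back-of-the-envelope sketch: the function name `poll_budget` is made up for illustration, and it assumes every poll() returns a full batch of max.poll.records and that the VM sustains its full IOPS limit. It does not show what kafka-backup actually does per record, so it cannot confirm or rule out disk IO as the cause.

```python
def poll_budget(max_poll_interval_ms, max_poll_records, disk_iops_limit):
    """Return (ms available per record, disk IO operations available per batch).

    Hypothetical helper for illustration only. Assumes a full batch on every
    poll() and a disk that sustains its full IOPS limit for the whole interval.
    """
    # Time the sink may spend on each record before the poll timeout expires.
    per_record_ms = max_poll_interval_ms / max_poll_records
    # Disk operations that fit into one poll interval at the IOPS limit.
    io_ops_per_batch = disk_iops_limit * max_poll_interval_ms / 1000
    return per_record_ms, io_ops_per_batch


# Values reported in this issue.
per_record_ms, io_ops_per_batch = poll_budget(300_000, 500, 500)
print(f"time budget per record: {per_record_ms} ms")
print(f"IO operations available per batch: {io_ops_per_batch}")
print(f"IO operations available per record: {io_ops_per_batch / 500}")
```

With these settings the consumer is only evicted if a batch of 500 records takes more than 5 minutes to process, so whatever is slow here is slow by a wide margin.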

Not sure what to do in this situation…

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 18 (18 by maintainers)

Most upvoted comments

Thank you for your research! Sadly, I am currently not able to maintain this project on my own. If you are interested in supporting me, please let me know, for example by opening an issue.

I think we may have a memory leak somewhere, or something like that 😦

I am currently on holiday and will try to dig into it in September.