ClickHouse: Insertion into table with Kafka engine creates duplicate records

Describe the bug Currently I have a setup like this: 2 shards with 2 replicas each.

I have a table with the Kafka engine on each instance (for example, kafka_table). The settings are correct, with the same consumer group on every instance to avoid duplicates, as described in the docs. Then I have a materialized view that reads records from kafka_table and writes them into a table named my_super_table. A sketch of this setup is shown below.
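For reference, a minimal sketch of the setup described above; the broker address, topic name, column list, and ZooKeeper path are assumptions for illustration, not the actual configuration:

```sql
-- Hypothetical schema illustrating the described setup.
-- Broker, topic, and columns are assumed; only the table roles match the report.
CREATE TABLE kafka_table (
    id UInt64,
    payload String
) ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list = 'my_topic',
         kafka_group_name = 'my_consumer_group',  -- same group on all 4 instances
         kafka_format = 'JSONEachRow';

CREATE TABLE my_super_table (
    id UInt64,
    payload String
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/my_super_table', '{replica}')
ORDER BY id;

-- The materialized view moves rows from the Kafka consumer table
-- into the storage table as they are consumed.
CREATE MATERIALIZED VIEW kafka_mv TO my_super_table AS
SELECT id, payload FROM kafka_table;
```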

So in total we have 4 consumers.

And when I push something to Kafka, I randomly get duplicate records. They end up on a different shard or on the same one. It looks like magic.

I fully checked what Kafka actually contains, and there are no duplicates in it.
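A sketch of how the duplicates can be confirmed on the ClickHouse side; it assumes a Distributed table named my_super_table_dist over my_super_table and that id uniquely identifies a message, neither of which is stated in the report:

```sql
-- Count rows that appear more than once across the whole cluster.
-- my_super_table_dist is an assumed Distributed table over my_super_table.
SELECT id, count() AS copies
FROM my_super_table_dist
GROUP BY id
HAVING copies > 1
ORDER BY copies DESC
LIMIT 10;
```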

  • Which ClickHouse server version to use: 19.13, 19.11, 19.8, 19.6

Expected behavior There are no duplicates.

Error message and/or stacktrace There are no errors in the log either.


Is there a stable version with the Kafka engine?

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 29 (11 by maintainers)

Most upvoted comments

@filimonov thanks for the answer 😃

This issue also reproduces on a cluster in Docker with the same versions as above.
I will try to reproduce it and send logs, thanks 😃 since it's very flaky

@josepowera Yes, I saw those fixes in the changelog, but there are many more duplicates on the newer versions.

On 10 million messages there were roughly 10% duplicates on 19.13, versus only about 100 on 19.6.