uReplicator: Urgent - worker restart always produces duplicate messages in the destination topic
Hi everyone, I'm running uReplicator built from the latest master branch (commit 70ac3404bb4517756c6813d3962b77edc7700019). As with other commits on master, after a worker restart there are always lots of duplicate messages in the destination topic.
Here is an overview of my steps:
Step 1: Build the latest code.
Step 2: Build a Docker image with this Dockerfile:
FROM openjdk:8-jre
COPY confd-0.15.0-linux-amd64 /usr/local/bin/confd
COPY uReplicator/uReplicator-Distribution/target/uReplicator-Distribution-pkg /uReplicator
COPY uReplicator/config uReplicator/config
COPY confd-new /etc/confd
# confd is used to generate the configuration files inside the container
COPY entrypoint-new.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh && \
chmod +x /usr/local/bin/confd && \
chmod +x /uReplicator/bin/*.sh
ENTRYPOINT [ "/entrypoint.sh" ]
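For reference, here is a minimal entrypoint-new.sh sketch. The confd invocation and the worker start script name are assumptions, not taken from the issue; the key point is using exec so that the SIGTERM Kubernetes sends on pod deletion reaches the worker JVM and triggers a clean shutdown:

#!/bin/sh
# Render the configuration files once from the environment (assumed confd usage).
confd -onetime -backend env

# exec replaces the shell with the Java process so SIGTERM from Kubernetes
# reaches it directly, letting uReplicator commit offsets and shut down cleanly.
# The script name below is a placeholder for the actual worker start script.
exec /uReplicator/bin/start-worker.sh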
Step 3: Prepare two topics (a creation sketch follows this list):
- Source topic: starts at 500 messages and keeps rising toward a target of 1000 messages.
- Destination topic: empty, 0 messages.
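For example, the topics can be created with the standard Kafka CLI (a sketch; the topic names, partition count, and replication factor are placeholders, not from the issue):

# Create the source topic on the source cluster.
kafka-topics.sh --create --zookeeper src-zk:2181 --topic src-topic --partitions 4 --replication-factor 1
# Create the (initially empty) destination topic on the destination cluster.
kafka-topics.sh --create --zookeeper dst-zk:2181 --topic dst-topic --partitions 4 --replication-factor 1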
Step 4: Apply the deployment on Kubernetes with this configuration:
- Controller: 1 pod; resource requests: 100m CPU, 1500Mi memory
- Worker: 2 pods; resource requests: 100m CPU, 1500Mi memory
clusters.properties
kafka.cluster.zkStr.cluster1=src-zk:2181
kafka.cluster.servers.cluster1=src-kafka:9092
kafka.cluster.zkStr.cluster2=dst-zk:2181
kafka.cluster.servers.cluster2=dst-kafka:9092
consumer.properties
zookeeper.connect=src-zk:2181
bootstrap.servers=src-kafka:9092
zookeeper.connection.timeout.ms=30000
zookeeper.session.timeout.ms=30000
group.id=kloak-mirrormaker-test
consumer.id=kloakmms01-sjc1
socket.receive.buffer.bytes=1048576
fetch.message.max.bytes=8388608
queued.max.message.chunks=5
key.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer
value.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer
auto.offset.reset=smallest
dstzk.properties
enable=true
zkServer=src-zk:2181
helix.properties (instanceId is commented out because I'm running more than one worker):
zkServer=src-zk:2181
#instanceId=testHelixMirrorMaker01
helixClusterName=testMirrorMaker
federated.deployment.name=uReplicator-Tests
producer.properties
bootstrap.servers=dst-kafka:9092
client.id=kloak-mirrormaker-test
Step 5: Check the number of messages on both clusters.
Source & destination topics: 600 messages each.
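For example, the per-partition end offsets (and hence the message counts) can be checked with Kafka's GetOffsetShell (a sketch; the topic names are placeholders):

# Latest offsets (--time -1) per partition on the source cluster; sum them for the total count.
kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list src-kafka:9092 --topic src-topic --time -1
# Same check against the destination cluster.
kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list dst-kafka:9092 --topic dst-topic --time -1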
Step 6: Delete one worker pod with kubectl delete pod … (no --force flag).
Result:
1. The old pod is Terminating. Source and destination stay in sync: 750 messages on both topics.
2. A few seconds later, the new pod is up. The source topic keeps rising: 900 -> 950 -> 1000 messages. The destination topic stays at 750 messages.
3. A minute later: the source topic has 1000 messages; the destination topic has more than 1000: sometimes 1200, sometimes 1100-1300, etc. (the result differs in every test).
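For context, kubectl delete pod first sends SIGTERM to the container's main process and only sends SIGKILL after the grace period expires (30 seconds by default), so a clean shutdown depends on that SIGTERM actually reaching the worker JVM. A sketch of the relevant variants (the pod name is a placeholder):

# Graceful delete: SIGTERM, then SIGKILL after the default 30s grace period.
kubectl delete pod ureplicator-worker-0

# Extend the grace period to give the worker more time to flush and commit offsets.
kubectl delete pod ureplicator-worker-0 --grace-period=120

# Forced delete: skips the graceful path, so uncommitted offsets are lost.
kubectl delete pod ureplicator-worker-0 --force --grace-period=0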
Note: if no worker goes down, the replication result is perfect. But in this scenario the result differs every time I test, and uReplicator is not stable. @yangy0000 @Pawka @DAddYE @hparra @chirayuk @dentafrice (CC to anyone who has contributed to this application): please let me know / correct me if something went wrong.
About this issue
- Original URL
- State: open
- Created 5 years ago
- Comments: 21 (10 by maintainers)
Yes, perfect.
@Technoboy- hey bro, the problem is kill process vs. kill -9 process. I had been doing it with kill -9. I've now tried kill processId instead, and here is the result: the commit-offset log shows up, and it works!
Search the worker log for key phrases like "Start clean shutdown" or "Flushing last batches and commit offsets", which indicate a graceful shutdown. During a graceful shutdown the consumer offsets are committed, so the replacement worker resumes exactly where the old one stopped; otherwise the new worker falls back to the last committed offset, re-reads those messages, and produces duplicates. If you see neither phrase, you may be killing the pod in an incorrect manner.
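A quick way to verify which shutdown path a restarted worker took (the pod name is a placeholder):

# Check the previous container's log for the graceful-shutdown markers.
kubectl logs ureplicator-worker-0 --previous | grep -E "Start clean shutdown|Flushing last batches and commit offsets"

If neither phrase appears, the process was almost certainly killed with SIGKILL (kill -9 or a forced pod delete) before it could commit offsets.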