kafka-docker: Cluster ID Mismatch

Hi - I’m new to Kafka, so perhaps I’m not understanding something. First, thanks for providing this dockerized Kafka image; it works great. I have the Kafka logs volume mounted, and docker-compose stop followed by docker-compose up -d --no-recreate seems to do the trick. I am curious how one should handle data persistence when a container is torn down (e.g. with docker-compose down). It seems that all of the necessary data is still there on the Docker host, but on startup the Kafka container runs into an issue where the Cluster ID does not match the id specified in meta.properties, and the container exits. Is there a way around this? Perhaps more environment variables need to be specified in the compose file? Thanks in advance for any guidance and/or suggestions you may have.
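
For context, the sequence that gets me into this state looks roughly like this (single broker; the comments summarize what I’m seeing):

# this works: containers are stopped and restarted without being recreated
docker-compose stop
docker-compose up -d --no-recreate

# this fails: the containers are removed and recreated, the mounted log dir
# survives on the host, but the broker exits on startup because the Cluster ID
# no longer matches the id recorded in meta.properties
docker-compose down
docker-compose up -d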

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 25

Most upvoted comments

Hi, I had the same problem and I solved it; you can check my docker-compose.yml below. First, delete your current files before applying the following config. After that, you can docker-compose up, docker-compose down, and then docker-compose up again.

Then you will see that your topics remain and Kafka keeps working. The problem I had was a wrong ZooKeeper path; after I changed the path, Kafka started working. I hope this helps anyone who has this problem.

---

version: '3.7'

x-zooServers: &zooServers zoo-1:12888:13888;zoo-2:22888:23888;zoo-3:32888:33888
x-kafkaZookeeperConnect: &kafkaZookeeperConnect zoo-1:12181,zoo-2:22181,zoo-3:32181


services:

  # Zookeeper servers and Kafka servers
  zoo-1:
    image: confluentinc/cp-zookeeper:latest
    hostname: zoo-1
    ports:
      - "12181:12181"
    volumes:
      - ./data/kafkazoo/zoo1/data:/var/lib/zookeeper/data
      - ./data/kafkazoo/zoo1/log:/var/lib/zookeeper/log
    environment:
      ZOOKEEPER_SERVER_ID: 1
      ZOOKEEPER_CLIENT_PORT: 12181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
      ZOOKEEPER_SERVERS: *zooServers

  zoo-2:
    image: confluentinc/cp-zookeeper:latest
    hostname: zoo-2
    ports:
      - "22181:22181"
    volumes:
      - ./data/kafkazoo/zoo2/data:/var/lib/zookeeper/data
      - ./data/kafkazoo/zoo2/log:/var/lib/zookeeper/log
    environment:
      ZOOKEEPER_SERVER_ID: 2
      ZOOKEEPER_CLIENT_PORT: 22181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
      ZOOKEEPER_SERVERS: *zooServers

  zoo-3:
    image: confluentinc/cp-zookeeper:latest
    hostname: zoo-3
    ports:
      - "32181:32181"
    volumes:
      - ./data/kafkazoo/zoo3/data:/var/lib/zookeeper/data
      - ./data/kafkazoo/zoo3/log:/var/lib/zookeeper/log
    environment:
      ZOOKEEPER_SERVER_ID: 3
      ZOOKEEPER_CLIENT_PORT: 32181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
      ZOOKEEPER_SERVERS: *zooServers

  kafka-1:
    image: confluentinc/cp-kafka:latest
    hostname: kafka-1
    ports:
      - "19092:19092"
    volumes:
      - ./data/kafkazoo/kafka1:/var/lib/kafka/data
    depends_on:
      - zoo-1
      - zoo-2
      - zoo-3
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: *kafkaZookeeperConnect
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 2
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://${DOCKER_HOST_IP:-kafka-1}:19092

  kafka-2:
    image: confluentinc/cp-kafka:latest
    hostname: kafka-2
    ports:
      - "29092:29092"
    volumes:
      - ./data/kafkazoo/kafka2:/var/lib/kafka/data
    depends_on:
      - zoo-1
      - zoo-2
      - zoo-3
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: *kafkaZookeeperConnect
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 2
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://${DOCKER_HOST_IP:-kafka-2}:29092

  kafka-3:
    image: confluentinc/cp-kafka:latest
    hostname: kafka-3
    ports:
      - "39092:39092"
    volumes:
      - ./data/kafkazoo/kafka3:/var/lib/kafka/data
    depends_on:
      - zoo-1
      - zoo-2
      - zoo-3
    environment:
      KAFKA_BROKER_ID: 3
      KAFKA_ZOOKEEPER_CONNECT: *kafkaZookeeperConnect
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 2
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://${DOCKER_HOST_IP:-kafka-3}:39092

...
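
With this setup you can sanity-check the persistence like so (the topic name is just an example; kafka-topics with --bootstrap-server assumes a recent cp-kafka image, and the service names come from the compose file above):

# bring the stack up and create a test topic
docker-compose up -d
docker-compose exec kafka-1 kafka-topics --create --topic persistence-test \
  --partitions 3 --replication-factor 2 --bootstrap-server kafka-1:19092

# tear everything down and bring it back; the data dirs under ./data/kafkazoo
# (ZooKeeper data/log dirs and Kafka log dirs) survive on the host
docker-compose down
docker-compose up -d

# the topic should still be listed, and the brokers start cleanly because the
# cluster id in meta.properties still matches the one ZooKeeper reloads from disk
docker-compose exec kafka-1 kafka-topics --list --bootstrap-server kafka-1:19092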

I found my mistake…

volumes:
      - /opt/data/kafka:/kafka/kafka-logs

meta.properties is stored in the container’s /kafka/kafka-logs path. With the Docker configuration above, the meta.properties written there ends up in /opt/data/kafka outside the container, and those files remain whether or not the container or stack is torn down. The problem is that if you don’t delete these files, the newly created container will pick up the meta.properties of the previous container. If you have the same settings as me, try deleting all of the files except the log folder for each topic; see the commands below.
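
To make that concrete, with my mount the file can be inspected and removed directly on the host (adjust the path to whatever you mounted):

# the broker's persisted metadata lives on the host thanks to the volume mount;
# it holds broker.id and the cluster.id that gets compared against ZooKeeper
cat /opt/data/kafka/meta.properties

# the cleanup described above: removing the stale meta.properties (while keeping
# the per-topic log folders) lets the recreated broker register under the new cluster id
rm /opt/data/kafka/meta.properties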

@cricket007 - does this help?

version: '2.4'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    container_name: my-zookeeper
    ports:
      - "2181:2181"
    networks:
      - kafka-network
  kafka:
    image: wurstmeister/kafka
    container_name: my-kafka
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ADVERTISED_HOST_NAME: my-kafka.domain.org
      KAFKA_ADVERTISED_PORT: 9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LOG_RETENTION_HOURS: 3
      KAFKA_LOG_DIRS: /kafka/kafka-logs
      KAFKA_LOG_ROLL_HOURS: 1
    volumes:
      - ./kafka-data:/kafka/kafka-logs
    networks:
      - kafka-network
networks:
  kafka-network:

@cricket007 - Sorry. I’m still a little confused here. If I need to recreate the Kafka container, but want to reuse the data from the last container (e.g. use a volume mount), what is the best way to do this? With a single kafka broker, I’m seeing cluster id mismatch errors as indicated above. Any guidance would be greatly appreciated.

@klongmitre this is indeed the issue. For anyone interested: Kafka 2.4 introduced a check that the cluster.id matches the one stored in ZooKeeper. Kafka itself stores this id in meta.properties. The meta.properties shouldn’t be thrown away; instead, you should keep both ZooKeeper’s logs and data so that the cluster id stays persistent.
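
If you want to verify that the two sides agree, both ids are easy to read. For example, with the Confluent compose file earlier in the thread (service names and the log dir path come from that example; zookeeper-shell ships with the cp-kafka image):

# cluster id as the broker sees it (persisted under the mounted log dir)
docker-compose exec kafka-1 cat /var/lib/kafka/data/meta.properties

# cluster id as ZooKeeper sees it (Kafka registers it in the /cluster/id znode)
docker-compose exec kafka-1 zookeeper-shell zoo-1:12181 get /cluster/id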

Agreed. I’m not trying to scale the service; it’s a single-broker setup. My interest is in enabling a persistent data store (via volume mounts), but if the container is destroyed, the persistent store is no longer usable due to the Cluster ID mismatch.