strimzi-kafka-operator: ScaleUp and ScaleDown are not working with KRaft mode

Describe the bug Scaling a KRaft-mode Kafka cluster up from 3 pods to 4 fails with the following error:

STRIMZI_BROKER_ID=0
Preparing truststore for replication listener
Adding /opt/kafka/cluster-ca-certs/ca.crt to truststore /tmp/kafka/cluster.truststore.p12 with alias ca
Certificate was added to keystore
Preparing truststore for replication listener is complete
Looking for the right CA
Found the right CA: /opt/kafka/cluster-ca-certs/ca.crt
Preparing keystore for replication and clienttls listener
Preparing keystore for replication and clienttls listener is complete
Preparing truststore for client authentication
Adding /opt/kafka/client-ca-certs/ca.crt to truststore /tmp/kafka/clients.truststore.p12 with alias ca
Certificate was added to keystore
Preparing truststore for client authentication is complete
Starting Kafka with configuration:
##############################
##############################
# This file is automatically generated by the Strimzi Cluster Operator
# Any changes to this file will be ignored and overwritten!
##############################
##############################

##########
# Broker ID
##########
broker.id=0
node.id=0

##########
# KRaft configuration
##########
process.roles=broker,controller
controller.listener.names=CONTROLPLANE-9090
controller.quorum.voters=0@my-cluster-5261ed90-kafka-0.my-cluster-5261ed90-kafka-brokers.namespace-0.svc.cluster.local:9090,1@my-cluster-5261ed90-kafka-1.my-cluster-5261ed90-kafka-brokers.namespace-0.svc.cluster.local:9090,2@my-cluster-5261ed90-kafka-2.my-cluster-5261ed90-kafka-brokers.namespace-0.svc.cluster.local:9090,3@my-cluster-5261ed90-kafka-3.my-cluster-5261ed90-kafka-brokers.namespace-0.svc.cluster.local:9090

##########
# Kafka message logs configuration
##########
log.dirs=/var/lib/kafka/data/kafka-log0

##########
# Control Plane listener
##########
listener.name.controlplane-9090.ssl.keystore.location=/tmp/kafka/cluster.keystore.p12
listener.name.controlplane-9090.ssl.keystore.password=[hidden]
listener.name.controlplane-9090.ssl.keystore.type=PKCS12
listener.name.controlplane-9090.ssl.truststore.location=/tmp/kafka/cluster.truststore.p12
listener.name.controlplane-9090.ssl.truststore.password=[hidden]
listener.name.controlplane-9090.ssl.truststore.type=PKCS12
listener.name.controlplane-9090.ssl.client.auth=required

##########
# Replication listener
##########
listener.name.replication-9091.ssl.keystore.location=/tmp/kafka/cluster.keystore.p12
listener.name.replication-9091.ssl.keystore.password=[hidden]
listener.name.replication-9091.ssl.keystore.type=PKCS12
listener.name.replication-9091.ssl.truststore.location=/tmp/kafka/cluster.truststore.p12
listener.name.replication-9091.ssl.truststore.password=[hidden]
listener.name.replication-9091.ssl.truststore.type=PKCS12
listener.name.replication-9091.ssl.client.auth=required

##########
# Listener configuration: PLAIN-9092
##########

##########
# Listener configuration: TLS-9093
##########
listener.name.tls-9093.ssl.keystore.location=/tmp/kafka/cluster.keystore.p12
listener.name.tls-9093.ssl.keystore.password=[hidden]
listener.name.tls-9093.ssl.keystore.type=PKCS12


##########
# Common listener configuration
##########
listeners=CONTROLPLANE-9090://0.0.0.0:9090,REPLICATION-9091://0.0.0.0:9091,PLAIN-9092://0.0.0.0:9092,TLS-9093://0.0.0.0:9093
advertised.listeners=REPLICATION-9091://my-cluster-5261ed90-kafka-0.my-cluster-5261ed90-kafka-brokers.namespace-0.svc:9091,PLAIN-9092://my-cluster-5261ed90-kafka-0.my-cluster-5261ed90-kafka-brokers.namespace-0.svc:9092,TLS-9093://my-cluster-5261ed90-kafka-0.my-cluster-5261ed90-kafka-brokers.namespace-0.svc:9093
listener.security.protocol.map=CONTROLPLANE-9090:SSL,REPLICATION-9091:SSL,PLAIN-9092:PLAINTEXT,TLS-9093:SSL
inter.broker.listener.name=REPLICATION-9091
sasl.enabled.mechanisms=
ssl.secure.random.implementation=SHA1PRNG
ssl.endpoint.identification.algorithm=HTTPS

##########
# User provided configuration
##########
default.replication.factor=3
inter.broker.protocol.version=3.2
log.message.format.version=3.2
min.insync.replicas=2
offsets.topic.replication.factor=3
transaction.state.log.min.isr=2
transaction.state.log.replication.factor=3
Kraft storage is already formatted
+ exec /usr/bin/tini -w -e 143 -- /opt/kafka/bin/kafka-server-start.sh /tmp/strimzi.properties
2022-05-25 08:24:32,210 INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$) [main]
2022-05-25 08:24:32,623 INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util) [main]
2022-05-25 08:24:32,845 INFO [LogLoader partition=__cluster_metadata-0, dir=/var/lib/kafka/data/kafka-log0] Recovering unflushed segment 0 (kafka.log.LogLoader) [main]
2022-05-25 08:24:32,847 INFO [LogLoader partition=__cluster_metadata-0, dir=/var/lib/kafka/data/kafka-log0] Loading producer state till offset 0 with message format version 2 (kafka.log.UnifiedLog$) [main]
2022-05-25 08:24:32,847 INFO [LogLoader partition=__cluster_metadata-0, dir=/var/lib/kafka/data/kafka-log0] Reloading from producer snapshot and rebuilding producer state from offset 0 (kafka.log.UnifiedLog$) [main]
2022-05-25 08:24:32,849 INFO Deleted producer state snapshot /var/lib/kafka/data/kafka-log0/__cluster_metadata-0/00000000000000000009.snapshot (kafka.log.SnapshotFile) [main]
2022-05-25 08:24:32,851 INFO [LogLoader partition=__cluster_metadata-0, dir=/var/lib/kafka/data/kafka-log0] Producer state recovery took 3ms for snapshot load and 0ms for segment recovery from offset 0 (kafka.log.UnifiedLog$) [main]
2022-05-25 08:24:32,882 INFO [ProducerStateManager partition=__cluster_metadata-0] Wrote producer snapshot at offset 9 with 0 producer ids in 11 ms. (kafka.log.ProducerStateManager) [main]
2022-05-25 08:24:32,916 INFO [LogLoader partition=__cluster_metadata-0, dir=/var/lib/kafka/data/kafka-log0] Loading producer state till offset 9 with message format version 2 (kafka.log.UnifiedLog$) [main]
2022-05-25 08:24:32,916 INFO [LogLoader partition=__cluster_metadata-0, dir=/var/lib/kafka/data/kafka-log0] Reloading from producer snapshot and rebuilding producer state from offset 9 (kafka.log.UnifiedLog$) [main]
2022-05-25 08:24:32,917 INFO [ProducerStateManager partition=__cluster_metadata-0] Loading producer state from snapshot file 'SnapshotFile(/var/lib/kafka/data/kafka-log0/__cluster_metadata-0/00000000000000000009.snapshot,9)' (kafka.log.ProducerStateManager) [main]
2022-05-25 08:24:32,919 INFO [LogLoader partition=__cluster_metadata-0, dir=/var/lib/kafka/data/kafka-log0] Producer state recovery took 3ms for snapshot load and 0ms for segment recovery from offset 9 (kafka.log.UnifiedLog$) [main]
2022-05-25 08:24:33,319 INFO [raft-expiration-reaper]: Starting (kafka.raft.TimingWheelExpirationService$ExpiredOperationReaper) [raft-expiration-reaper]
2022-05-25 08:24:33,519 ERROR Exiting Kafka due to fatal exception (kafka.Kafka$) [main]
java.lang.IllegalStateException: Configured voter set: [0, 1, 2, 3] is different from the voter set read from the state file: [0, 1, 2]. Check if the quorum configuration is up to date, or wipe out the local state file if necessary
	at org.apache.kafka.raft.QuorumState.initialize(QuorumState.java:132)
	at org.apache.kafka.raft.KafkaRaftClient.initialize(KafkaRaftClient.java:364)
	at kafka.raft.KafkaRaftManager.buildRaftClient(RaftManager.scala:203)
	at kafka.raft.KafkaRaftManager.<init>(RaftManager.scala:125)
	at kafka.server.KafkaRaftServer.<init>(KafkaRaftServer.scala:76)
	at kafka.Kafka$.buildServer(Kafka.scala:79)
	at kafka.Kafka$.main(Kafka.scala:87)
	at kafka.Kafka.main(Kafka.scala)
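The exception comes from KRaft's quorum bootstrap: each node persists the known voter set in a `quorum-state` file under the metadata log dir (here `/var/lib/kafka/data/kafka-log0/__cluster_metadata-0/`) and refuses to start when that set disagrees with `controller.quorum.voters`. A minimal sketch of the consistency check in Python (the `currentVoters`/`voterId` JSON layout of the state file is an assumption based on the Kafka 3.2 on-disk format, not taken from this log):

```python
import json

def parse_configured_voters(quorum_voters: str) -> set:
    """Parse a controller.quorum.voters string such as
    '0@host0:9090,1@host1:9090' into the set of voter node ids."""
    return {int(entry.split("@", 1)[0]) for entry in quorum_voters.split(",")}

def check_voter_set(quorum_voters: str, state_file_json: str) -> None:
    """Raise if the configured voter set differs from the one persisted in
    the quorum-state file, mirroring the check in QuorumState.initialize."""
    configured = parse_configured_voters(quorum_voters)
    state = json.loads(state_file_json)
    # Assumed state-file shape: {"currentVoters": [{"voterId": 0}, ...], ...}
    persisted = {v["voterId"] for v in state.get("currentVoters", [])}
    if configured != persisted:
        raise ValueError(
            f"Configured voter set: {sorted(configured)} is different from "
            f"the voter set read from the state file: {sorted(persisted)}"
        )
```

With the values from this report, `configured` is `{0, 1, 2, 3}` (the new 4-node quorum the operator renders into `controller.quorum.voters`) while the restarted pod's state file still holds `{0, 1, 2}`, which is exactly the mismatch the broker dies on.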

To Reproduce Steps to reproduce the behavior:

  1. Setup CO with KRaft enabled
  2. Create Kafka CR with 3 replicas
  3. Scale to 4 replicas
  4. See error in Kafka pod
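Step 3 amounts to a merge patch of `spec.kafka.replicas` on the Kafka CR; a hedged sketch of the patch body (the resource name and namespace are taken from the YAML below, and the `kubectl` invocation in the comment is illustrative only):

```python
import json

def scale_patch(replicas: int) -> str:
    """Build the JSON merge patch body that changes spec.kafka.replicas,
    usable e.g. as:
      kubectl patch kafka my-cluster-5261ed90 -n namespace-0 \
        --type=merge -p '<body>'"""
    return json.dumps({"spec": {"kafka": {"replicas": replicas}}})

print(scale_patch(4))  # {"spec": {"kafka": {"replicas": 4}}}
```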

Expected behavior The cluster scales up cleanly from 3 to 4 broker pods and all pods become ready.

Environment (please complete the following information):

  • Strimzi version: main
  • Installation method: YAML
  • Kubernetes cluster: OpenShift 4.10
  • Infrastructure: Openstack

YAML files and logs Kafka with 3 replicas

apiVersion: v1
items:
- apiVersion: kafka.strimzi.io/v1beta2
  kind: Kafka
  metadata:
    annotations:
      strimzi.io/pause-reconciliation: "false"
    labels:
      test.case: testPauseReconciliationInKafkaAndKafkaConnectWithConnector
    name: my-cluster-5261ed90
    namespace: namespace-0
  spec:
    kafka:
      config:
        default.replication.factor: 3
        inter.broker.protocol.version: "3.2"
        log.message.format.version: "3.2"
        min.insync.replicas: 2
        offsets.topic.replication.factor: 3
        transaction.state.log.min.isr: 2
        transaction.state.log.replication.factor: 3
      listeners:
      - name: plain
        port: 9092
        tls: false
        type: internal
      - name: tls
        port: 9093
        tls: true
        type: internal
      logging:
        loggers:
          kafka.root.logger.level: DEBUG
        type: inline
      replicas: 3
      storage:
        deleteClaim: true
        size: 1Gi
        type: persistent-claim
      version: 3.2.0
    zookeeper:
      logging:
        loggers:
          zookeeper.root.logger: DEBUG
        type: inline
      replicas: 3
      storage:
        deleteClaim: true
        size: 1Gi
        type: persistent-claim

Kafka with 4 replicas

apiVersion: v1
items:
- apiVersion: kafka.strimzi.io/v1beta2
  kind: Kafka
  metadata:
    annotations:
      strimzi.io/pause-reconciliation: "false"
    creationTimestamp: "2022-05-25T08:16:08Z"
    generation: 2
    labels:
      test.case: testPauseReconciliationInKafkaAndKafkaConnectWithConnector
    name: my-cluster-5261ed90
    namespace: namespace-0
    resourceVersion: "14487706"
    uid: d7843a81-0409-4769-8858-7ad8d6943a2a
  spec:
    kafka:
      config:
        default.replication.factor: 3
        inter.broker.protocol.version: "3.2"
        log.message.format.version: "3.2"
        min.insync.replicas: 2
        offsets.topic.replication.factor: 3
        transaction.state.log.min.isr: 2
        transaction.state.log.replication.factor: 3
      listeners:
      - name: plain
        port: 9092
        tls: false
        type: internal
      - name: tls
        port: 9093
        tls: true
        type: internal
      logging:
        loggers:
          kafka.root.logger.level: DEBUG
        type: inline
      replicas: 4
      storage:
        deleteClaim: true
        size: 1Gi
        type: persistent-claim
      version: 3.2.0
    zookeeper:
      logging:
        loggers:
          zookeeper.root.logger: DEBUG
        type: inline
      replicas: 3
      storage:
        deleteClaim: true
        size: 1Gi
        type: persistent-claim
  status:
    conditions:
    - lastTransitionTime: "2022-05-25T08:20:30.678Z"
      message: Error while waiting for restarted pod my-cluster-5261ed90-kafka-0 to
        become ready
      reason: FatalProblem
      status: "True"
      type: NotReady
    observedGeneration: 2

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 21 (18 by maintainers)

Most upvoted comments

Running Strimzi 0.33.0, I deployed a Kafka cluster with KRaft mode enabled and Kafka version 3.3.2. I was able to scale up from 3 to 5 with no errors and then scale back down.

On the community call on December 15th, we agreed that, given the amount of work we need to do on KRaft, we should wait for the proper fix to land in the next Kafka version rather than apply a "workaround" now. Accordingly, we are not raising this issue's priority. If the Kafka fix does not happen soon, we can revisit that decision.