mariadb-operator: [Bug] Cannot add a semi-sync replica to existing cluster

Describe the bug

I have a running semi-sync replication cluster with 1 master and 1 replica. Trying to add a new replica fails because the master no longer has the binlogs the new replica requests:

2023-06-19  9:47:12 12 [Note] Master connection name: 'mariadb-operator'  Master_info_file: 'master-mariadb@002doperator.info'  Relay_info_file: 'relay-log-mariadb@002doperator.info'
2023-06-19  9:47:12 12 [Note] 'CHANGE MASTER TO executed'. Previous state master_host='', master_port='3306', master_log_file='', master_log_pos='4'. New state master_host='mariadb-myns-0.mariadb-myns.myns.svc.cluster.local', master_port='3306', master_log_file='', master_log_pos='4'.
2023-06-19  9:47:12 12 [Note] Previous Using_Gtid=No. New Using_Gtid=Current_Pos
2023-06-19  9:47:12 13 [Note] Master 'mariadb-operator': Slave I/O thread: Start semi-sync replication to master 'repl@mariadb-myns-0.mariadb-myns.myns.svc.cluster.local:3306' in log '' at position 4
2023-06-19  9:47:12 14 [Note] Master 'mariadb-operator': Slave SQL thread initialized, starting replication in log 'FIRST' at position 4, relay log './mariadb-myns-relay-bin-mariadb@002doperator.000001' position: 4; GTID position ''
2023-06-19  9:47:12 13 [Note] Master 'mariadb-operator': Slave I/O thread: connected to master 'repl@mariadb-myns-0.mariadb-myns.myns.svc.cluster.local:3306',replication starts at GTID position ''
2023-06-19  9:47:12 13 [ERROR] Master 'mariadb-operator': Error reading packet from server: Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged. (server_errno=1236)
2023-06-19  9:47:12 13 [ERROR] Master 'mariadb-operator': Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.', Internal MariaDB error code: 1236
2023-06-19  9:47:12 13 [Note] Master 'mariadb-operator': Slave I/O thread exiting, read up to log 'FIRST', position 4; GTID position , master mariadb-myns-0.mariadb-myns.myns.svc.cluster.local:3306
2023-06-19  9:47:12 13 [Note] Master 'mariadb-operator': Failed to gracefully kill our active semi-sync connection with primary. Silently closing the connection.

Expected behaviour

The new replica initializes properly.

Steps to reproduce the bug

  1. Create cluster with semi-sync replication and 2 replicas
  2. Run the cluster until older binlog files are purged
  3. Try to scale replicas to 3
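For reference, step 3 amounts to bumping .spec.replicas on the MariaDB resource, for example (a sketch using the resource name from the logs above):

kubectl patch mariadb mariadb-myns -n myns --type merge -p '{"spec":{"replicas":3}}'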

Environment details:

  • Kubernetes version: 1.26.5
  • mariadb-operator version: v0.0.15
  • Install method: helm


Most upvoted comments

Okay, my turn:

Disable mariadb-operator:

kubectl scale deploy mariadb-operator --replicas=0

Double-check that .spec.replication.primary.podIndex is correct and points to the same pod as .status.currentPrimaryPodIndex. I hit an issue where the operator set an empty gtid_slave_pos when promoting a master, so don't repeat my mistake and check that:

kubectl get mariadb mydb -o yaml
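To compare the two fields directly, a jsonpath query like this can help (a sketch, assuming the resource is named mydb as above):

kubectl get mariadb mydb -o jsonpath='{.spec.replication.primary.podIndex} {.status.currentPrimaryPodIndex}{"\n"}'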

Let's define variables to reuse later:

MASTER=mydb-2
REPLICA=mydb-0

Copy a backup from $MASTER to $REPLICA (the replica listens on TCP port 4444 and unpacks the stream with mbstream, while the master streams a mariabackup to it via socat):

kubectl exec "$REPLICA" -ti -- bash -c '
  mkdir /var/lib/mysql/.mariabackup
  chown -R mysql: /var/lib/mysql
  socat -u TCP-LISTEN:4444,reuseaddr stdio | gosu mysql mbstream -x -C /var/lib/mysql/.mariabackup
'
kubectl exec "$MASTER" -ti -- bash -c 'mariabackup --backup -u root -p"${MARIADB_ROOT_PASSWORD}" --stream=xbstream | socat -u stdio TCP:'"$REPLICA"'.$(hostname -d):4444'

If mariabackup gets stuck on Waiting for log copy thread to read lsn 568204542376, restart the source pod and try again.
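For example, assuming the operator is still scaled down, deleting the master pod lets the StatefulSet recreate it:

kubectl delete pod "$MASTER"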

Once the backup is created, start the restore:

kubectl exec "$REPLICA" -ti -- bash

And run:

gosu mysql mariabackup --target-dir=/var/lib/mysql/.mariabackup --prepare
# Note the binlog file, position and GTID recorded by the backup;
# the GTID (last column) is needed below for gtid_slave_pos:
cat /var/lib/mysql/.mariabackup/xtrabackup_binlog_info # or
cat /var/lib/mysql/.mariabackup/mariadb_backup_binlog_info
#mydb-bin.000001    332     0-12-13773682

gosu mysql mariabackup --target-dir=/var/lib/mysql/.mariabackup --datadir=/var/lib/mysql/.restore --move-back
mariadb -u root -p"${MARIADB_ROOT_PASSWORD}" -e "STOP SLAVE 'mariadb-operator'"
# The * glob does not match the hidden .restore and .mariabackup directories,
# so only the old datadir contents are removed here:
rm -rf /var/lib/mysql/*
mv /var/lib/mysql/.restore/* /var/lib/mysql
rm -rf /var/lib/mysql/.mariabackup /var/lib/mysql/.restore

Now restart the pod:

kubectl delete pod "$REPLICA"

Exec into it:

kubectl exec "$REPLICA" -ti -- bash

Set gtid_slave_pos from the output above:

mariadb -u root -p"${MARIADB_ROOT_PASSWORD}" -e 'SET GLOBAL gtid_slave_pos = "0-12-13773682";'
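Optionally verify that the value was applied:

mariadb -u root -p"${MARIADB_ROOT_PASSWORD}" -e 'SELECT @@global.gtid_slave_pos;'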

Re-enable the operator:

kubectl scale deploy mariadb-operator --replicas=1

Wait until it configures replication and watch the status:

mariadb -u root -p"${MARIADB_ROOT_PASSWORD}" -e "SHOW ALL REPLICAS STATUS \G;"
# Seconds_Behind_Master: 34
# Gtid_Slave_Pos: 0-12-15548118

Seconds_Behind_Master should slowly go down.
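To keep an eye on the lag, a simple loop works (a sketch, run inside the replica pod):

while true; do
  mariadb -u root -p"${MARIADB_ROOT_PASSWORD}" -e "SHOW ALL REPLICAS STATUS \G" \
    | grep -E 'Seconds_Behind_Master|Gtid_Slave_Pos'
  sleep 5
done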

Then to remediate the previous situation what I did was querying the gtid_current_pos in the master:

I think this is not the correct way to get it, since gtid_current_pos on the master increases with every operation, while the replica needs to resume replication from where it left off.

Just to make my point clearer, I did the following:

  1. Create a cluster with replicas: 3
  2. Do some writes on the master
  3. Scale the cluster down to replicas: 2; the node shutting down logs its current GTID:
2023-06-22 16:03:50 0 [Note] mariadbd (initiated by: unknown): Normal shutdown
2023-06-22 16:03:50 10 [Note] Master 'mariadb-operator': Error reading relay log event: slave SQL thread was killed
2023-06-22 16:03:50 10 [Note] Master 'mariadb-operator': Slave SQL thread exiting, replication stopped in log 'mdb-replica-test-bin.000002' at position 7116; GTID position '0-10-41', master: mdb-replica-test-0.mdb-replica-test.mdb-replica-test.svc.cluster.local:3306
2023-06-22 16:03:50 9 [Note] Master 'mariadb-operator': Slave I/O thread exiting, read up to log 'mdb-replica-test-bin.000002', position 7116; GTID position 0-10-41, master mdb-replica-test-0.mdb-replica-test.mdb-replica-test.svc.cluster.local:3306
2023-06-22 16:03:50 0 [Note] InnoDB: FTS optimize thread exiting.
2023-06-22 16:03:50 0 [Note] InnoDB: Starting shutdown...
2023-06-22 16:03:50 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2023-06-22 16:03:50 0 [Note] InnoDB: Buffer pool(s) dump completed at 230622 16:03:50
2023-06-22 16:03:50 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2023-06-22 16:03:50 0 [Note] InnoDB: Shutdown completed; log sequence number 148714; transaction id 372
2023-06-22 16:03:50 0 [Note] mariadbd: Shutdown complete
  4. Do some other writes on the master; gtid_current_pos increases:
MariaDB [(none)]> SELECT @@global.gtid_current_pos;
+---------------------------+
| @@global.gtid_current_pos |
+---------------------------+
| 0-10-41                   |
+---------------------------+
1 row in set (0.000 sec)

MariaDB [(none)]> create database increase_gtid_for_me;
Query OK, 1 row affected (0.007 sec)

MariaDB [(none)]> SELECT @@global.gtid_current_pos;
+---------------------------+
| @@global.gtid_current_pos |
+---------------------------+
| 0-10-42                   |
+---------------------------+
1 row in set (0.001 sec)
  5. Scale the cluster back to replicas: 3; the replica restarts replication from GTID position 0-10-41:
2023-06-22 16:05:06 0 [Note] Server socket created on IP: '0.0.0.0'.
2023-06-22 16:05:06 3 [Note] Reading Master_info: 'master-mariadb@002doperator.info'  Relay_info:'relay-log-mariadb@002doperator.info'
2023-06-22 16:05:06 3 [Note] Initialized Master_info from 'master-mariadb@002doperator.info'
2023-06-22 16:05:06 4 [Note] Master 'mariadb-operator': Slave I/O thread: Start asynchronous replication to master 'repl@mdb-replica-test-0.mdb-replica-test.mdb-replica-test.svc.cluster.local:3306' in log 'mdb-replica-test-bin.000002' at position 7116
2023-06-22 16:05:06 3 [Note] Started replication for 'mariadb-operator'
2023-06-22 16:05:06 5 [Note] Master 'mariadb-operator': Slave SQL thread initialized, starting replication in log 'mdb-replica-test-bin.000002' at position 7116, relay log './mdb-replica-test-relay-bin-mariadb@002doperator.000001' position: 4; GTID position '0-10-41'
2023-06-22 16:05:06 0 [Note] mariadbd: ready for connections.
Version: '10.6.14-MariaDB-1:10.6.14+maria~ubu2004-log'  socket: '/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
2023-06-22 16:05:06 4 [Note] Master 'mariadb-operator': Slave I/O thread: connected to master 'repl@mdb-replica-test-0.mdb-replica-test.mdb-replica-test.svc.cluster.local:3306',replication starts at GTID position '0-10-41'

So reading the value from the master does not work; the replica already knows its last GTID position on restart.
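For illustration, the position a replica resumes from can be read on the replica itself instead of querying the master (a sketch):

mariadb -u root -p"${MARIADB_ROOT_PASSWORD}" -e 'SELECT @@global.gtid_slave_pos, @@global.gtid_current_pos;'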

@fchiacchiaretta awesome investigation, thanks for the level of detail 💯 🥇

I’ve managed to reproduce the issue by:

  • Creating your MariaDB instance
  • Setting max-binlog-size=4096 and binlog_expire_logs_seconds=10 (see the snippet after this list)
  • Ingesting some data
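For reference, a minimal my.cnf fragment with those settings (an assumption of how they were applied, e.g. via the resource's myCnf as in the manifest below):

[mysqld]
max-binlog-size=4096
binlog_expire_logs_seconds=10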

As a result, the older binary logs got purged:

MariaDB [(none)]> show binary logs;
+-------------------------+-----------+
| Log_name                | File_size |
+-------------------------+-----------+
| mariadb-repl-bin.000013 |   1992007 |
| mariadb-repl-bin.000014 |   1998579 |
| mariadb-repl-bin.000015 |   1989927 |
| mariadb-repl-bin.000016 |   1978851 |
| mariadb-repl-bin.000017 |   1933420 |
| mariadb-repl-bin.000018 |   1937694 |
| mariadb-repl-bin.000019 |    151378 |
| mariadb-repl-bin.000020 |       590 |
+-------------------------+-----------+

At this point, upscaling to 3 results in the following logs in the new replica:

mariadb 2023-06-21 16:45:16 7 [Note] Master 'mariadb-operator': Slave I/O thread: Start semi-sync replication to master 'repl@mariadb-repl-0.mariadb-repl.default.svc.cluster.local:3306' in log '' at position 4
mariadb 2023-06-21 16:45:16 8 [Note] Master 'mariadb-operator': Slave SQL thread initialized, starting replication in log 'FIRST' at position 4, relay log './mariadb-repl-relay-bin-mariadb@002doperator.000001' position: 4; GTID position ''
mariadb 2023-06-21 16:45:16 7 [Note] Master 'mariadb-operator': Slave I/O thread: connected to master 'repl@mariadb-repl-0.mariadb-repl.default.svc.cluster.local:3306',replication starts at GTID position ''
mariadb 2023-06-21 16:45:16 7 [ERROR] Master 'mariadb-operator': Error reading packet from server: Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged. (server_errno=1236)
mariadb 2023-06-21 16:45:16 7 [ERROR] Master 'mariadb-operator': Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.', Internal MariaDB error code: 1236
mariadb 2023-06-21 16:45:16 7 [Note] Master 'mariadb-operator': Slave I/O thread exiting, read up to log 'FIRST', position 4; GTID position , master mariadb-repl-0.mariadb-repl.default.svc.cluster.local:3306
mariadb 2023-06-21 16:45:16 7 [Note] Master 'mariadb-operator': Failed to gracefully kill our active semi-sync connection with primary. Silently closing the connection.

This happens with both MariaDB 10.11.3 and 10.6.13. Having a look now 👀

Hello @mmontes11, thank you for your answer. Here is my MariaDB resource:

apiVersion: mariadb.mmontes.io/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-myns
  namespace: myns
spec:
  rootPasswordSecretKeyRef:
    name: mariadb-credentials
    key: root-password
  database: mydb
  username: myuser
  passwordSecretKeyRef:
    name: mariadb-credentials
    key: password
  image:
    repository: mariadb
    tag: "10.6.13"
    pullPolicy: IfNotPresent
  port: 3306
  podSecurityContext:
    runAsUser: 0
  securityContext:
    allowPrivilegeEscalation: false
  livenessProbe:
    exec:
      command:
        - bash
        - -c
        - mysql -u root -p${MARIADB_ROOT_PASSWORD} -e "SELECT 1;"
    initialDelaySeconds: 20
    periodSeconds: 10
    timeoutSeconds: 5
  readinessProbe:
    exec:
      command:
        - bash
        - -c
        - mysql -u root -p${MARIADB_ROOT_PASSWORD} -e "SELECT 1;"
    initialDelaySeconds: 20
    periodSeconds: 10
    timeoutSeconds: 5
  replicas: 2
  replication:
    primary:
      podIndex: 0
      automaticFailover: true
      service:
        type: ClusterIP
    replica:
      waitPoint: AfterCommit
      gtid: CurrentPos
      connectionTimeout: 10s
      connectionRetries: 10
      syncTimeout: 10s
    syncBinlog: true
  resources:
    limits:
      memory: "4096Mi"
      cpu: "2000m"    
    requests:
      memory: "1024Mi"
      cpu: "500m"
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - topologyKey: "kubernetes.io/hostname"
  tolerations:
    - key: "mariadb.mmontes.io/ha"
      operator: "Exists"
      effect: "NoSchedule"
  podDisruptionBudget:
    minAvailable: "66%"
  metrics:
    exporter:
      image:
        repository: prom/mysqld-exporter
        tag: "v0.14.0"
        pullPolicy: IfNotPresent
      resources:
        requests:
          cpu: 50m
          memory: 64Mi
        limits:
          cpu: 300m
          memory: 512Mi
    serviceMonitor:
      prometheusRelease: prometheus
      interval: 10s
      scrapeTimeout: 10s
  myCnf: |
    [mysqld]
    bind-address=0.0.0.0
    default_storage_engine=InnoDB
    binlog_format=row
    innodb_autoinc_lock_mode=2
    max_allowed_packet=256M
    max_user_connections = 1000
    innodb_log_file_size=3G
    innodb_buffer_pool_size=3G
    collation_server=utf8mb4_general_ci
    init_connect='SET NAMES utf8mb4'
    character_set_server=utf8mb4
    ## Caches and Limits
    ##
    tmp_table_size=32M
    max_heap_table_size=32M
    # Re-enabling as now works with Maria 10.1.2
    query_cache_type=1
    query_cache_limit=4M
    query_cache_size=256M
    thread_cache_size=50
    open_files_limit=65535
    table_definition_cache=4096
    table_open_cache=4096
  volumeClaimTemplate:
    resources:
      requests:
        storage: 50Gi
    storageClassName: default
    accessModes:
      - ReadWriteOnce
  service:
    type: ClusterIP

I did no special operation to trigger binlog purging, but on the master pod I can see that some files have been purged:

MariaDB [(none)]> show global status like '%binlog%';
+---------------------------------------+--------------------------+
| Variable_name                         | Value                    |
+---------------------------------------+--------------------------+
| Binlog_commits                        | 20759                    |
| Binlog_group_commits                  | 20758                    |
| Binlog_group_commit_trigger_count     | 0                        |
| Binlog_group_commit_trigger_lock_wait | 0                        |
| Binlog_group_commit_trigger_timeout   | 0                        |
| Binlog_snapshot_file                  | mariadb-myns-bin.000010  |
| Binlog_snapshot_position              | 50327739                 |
| Binlog_bytes_written                  | 50337747                 |
| Binlog_cache_disk_use                 | 0                        |
| Binlog_cache_use                      | 20759                    |
| Binlog_stmt_cache_disk_use            | 0                        |
| Binlog_stmt_cache_use                 | 0                        |
| Com_binlog                            | 0                        |
| Com_show_binlog_events                | 0                        |
| Com_show_binlogs                      | 0                        |
| Com_show_binlog_status                | 2                        |
+---------------------------------------+--------------------------+
16 rows in set (0.001 sec)

MariaDB [(none)]> SHOW BINARY LOGS;
+-------------------------+-----------+
| Log_name                | File_size |
+-------------------------+-----------+
| mariadb-myns-bin.000007 | 286514232 |
| mariadb-myns-bin.000008 |  61618825 |
| mariadb-myns-bin.000009 | 144421966 |
| mariadb-myns-bin.000010 |  50329734 |
+-------------------------+-----------+
4 rows in set (0.000 sec)

MariaDB [(none)]> Bye
root@mariadb-myns-0:/# ls -l /var/lib/mysql/
total 3766964
-rw-rw---- 1 mysql mysql     475136 Jun 20 04:00 aria_log.00000001
-rw-rw---- 1 mysql mysql         52 Jun  7 10:26 aria_log_control
drwx------ 2 mysql mysql      12288 Jun 15 14:41 mydb
-rw-rw---- 1 mysql mysql          9 Jun  7 10:26 ddl_recovery-backup.log
-rw-rw---- 1 mysql mysql      16384 Jun 15 14:41 ddl_recovery.log
-rw-rw---- 1 mysql mysql     396372 May 30 07:32 ib_buffer_pool
-rw-rw---- 1 mysql mysql 3221225472 Jun 20 07:12 ib_logfile0
-rw-rw---- 1 mysql mysql   79691776 May 30 07:32 ibdata1
-rw-rw---- 1 mysql mysql   12582912 Jun  7 10:26 ibtmp1
drwx------ 2 mysql mysql      16384 May 24 10:48 lost+found
-rw-rw---- 1 mysql mysql  286514232 May 29 15:14 mariadb-myns-bin.000007
-rw-rw---- 1 mysql mysql   61618825 May 30 07:32 mariadb-myns-bin.000008
-rw-rw---- 1 mysql mysql  144421966 Jun  7 10:25 mariadb-myns-bin.000009
-rw-rw---- 1 mysql mysql   50327739 Jun 20 07:10 mariadb-myns-bin.000010
-rw-rw---- 1 mysql mysql        108 Jun  7 10:26 mariadb-myns-bin.index
-rw-rw---- 1 mysql mysql          0 May 24 10:48 multi-master.info
drwx------ 2 mysql mysql       4096 Jun  7 13:53 mysql
-rw-r--r-- 1 mysql mysql         15 May 24 10:48 mysql_upgrade_info
drwx------ 2 mysql mysql       4096 May 24 10:48 performance_schema
drwx------ 2 mysql mysql      12288 May 24 10:48 sys

I checked expire_logs_days and binlog_expire_logs_seconds, but it seems that neither of them has been set.
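For reference, a quick way to check both on the master (a sketch):

SELECT @@global.expire_logs_days, @@global.binlog_expire_logs_seconds;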

I'm trying to reproduce the issue on a test cluster too; I'll report back if I have additional results.