mysql-operator: Mysql dont start completely and readiness fail
Notice sometimes when the pod restart for any reason it never comes up again for example
NAME READY STATUS RESTARTS AGE
db-mysql-0 3/4 Running 0 5m29s
events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m2s default-scheduler Successfully assigned 6c7951pozpb68x3w3es6pbodqe/db-mysql-0 to ipxxxxxxxx
Normal Pulled 103s kubelet, ipxxxxxxxx Container image "quay.io/presslabs/mysql-operator-sidecar:0.3.2" already present on machine
Normal Created 103s kubelet, ipxxxxxxxx Created container
Normal Started 103s kubelet, ipxxxxxxxx Started container
Normal Pulled 102s kubelet, ipxxxxxxxx Container image "percona@sha256:713c1817615b333b17d0fbd252b0ccc53c48a665d4cfcb42178167435a957322" already present on machine
Normal Created 102s kubelet, ipxxxxxxxx Created container
Normal Started 102s kubelet, ipxxxxxxxx Started container
Normal Pulled 101s kubelet, ipxxxxxxxx Container image "prom/mysqld-exporter:v0.11.0" already present on machine
Normal Created 101s kubelet, ipxxxxxxxx Created container
Normal Started 101s kubelet, ipxxxxxxxx Started container
Normal Pulled 101s kubelet, ipxxxxxxxx Container image "percona@sha256:713c1817615b333b17d0fbd252b0ccc53c48a665d4cfcb42178167435a957322" already present on machine
Normal Created 101s kubelet, ipxxxxxxxx Created container
Normal Started 101s kubelet, ipxxxxxxxx Started container
Normal Pulled 101s kubelet, ipxxxxxxxx Container image "quay.io/presslabs/mysql-operator-sidecar:0.3.2" already present on machine
Normal Created 100s kubelet, ipxxxxxxxx Created container
Normal Started 100s kubelet, ipxxxxxxxx Started container
Normal Pulled 100s kubelet, ipxxxxxxxx Container image "quay.io/presslabs/mysql-operator-sidecar:0.3.2" already present on machine
Normal Created 100s kubelet, ipxxxxxxxx Created container
Normal Started 100s kubelet, ipxxxxxxxx Started container
Warning Unhealthy 84s (x7 over 96s) kubelet, ipxxxxxxxx Readiness probe failed:
logs from the pods
Create rclone.conf file.
2019-10-24T15:47:32.823Z INFO sidecar environment is not set {"key": "INIT_BUCKET_URI"}
2019-10-24T15:47:32.823Z INFO sidecar cloning command {"host": "db-mysql-0"}
2019-10-24T15:47:32.823Z INFO sidecar data already exists! Remove manually PVC to cleanup or to reinitialize.
2019-10-24T15:47:32.824Z INFO sidecar configuring server {"host": "db-mysql-0"}
2019-10-24T15:47:32.824Z INFO sidecar error while reading PURGE GTID from xtrabackup info file {"error": "open /var/lib/mysql/xtrabackup_binlog_info: no such file or directory"}
Initialization complete, now exiting!
2019-10-24T15:47:34.526691Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2019-10-24T15:47:34.528701Z 0 [Note] mysqld (mysqld 5.7.26-29-log) starting as process 1 ...
2019-10-24T15:47:34.538056Z 0 [Note] InnoDB: PUNCH HOLE support available
2019-10-24T15:47:34.538155Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2019-10-24T15:47:34.538178Z 0 [Note] InnoDB: Uses event mutexes
2019-10-24T15:47:34.538215Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2019-10-24T15:47:34.538233Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
2019-10-24T15:47:34.538251Z 0 [Note] InnoDB: Using Linux native AIO
2019-10-24T15:47:34.538754Z 0 [Note] InnoDB: Number of pools: 1
2019-10-24T15:47:34.538969Z 0 [Note] InnoDB: Using CPU crc32 instructions
2019-10-24T15:47:34.545153Z 0 [Note] InnoDB: Initializing buffer pool, total size = 512M, instances = 1, chunk size = 128M
2019-10-24T15:47:34.569645Z 0 [Note] InnoDB: Completed initialization of buffer pool
2019-10-24T15:47:34.578085Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2019-10-24T15:47:34.595271Z 0 [Note] InnoDB: Crash recovery did not find the parallel doublewrite buffer at /var/lib/mysql/xb_doublewrite
2019-10-24T15:47:34.597393Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2019-10-24T15:47:34.633116Z 0 [Note] InnoDB: Created parallel doublewrite buffer at /var/lib/mysql/xb_doublewrite, size 3932160 bytes
2019-10-24T15:47:34.867432Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2019-10-24T15:47:34.867512Z 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2019-10-24T15:47:35.011558Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2019-10-24T15:47:35.012721Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2019-10-24T15:47:35.012737Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2019-10-24T15:47:35.013453Z 0 [Note] InnoDB: Waiting for purge to start
2019-10-24T15:47:35.064136Z 0 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.7.26-29 started; log sequence number 203893653
2019-10-24T15:47:35.064470Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2019-10-24T15:47:35.064657Z 0 [Note] Plugin 'FEDERATED' is disabled.
2019-10-24T15:47:35.095022Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2019-10-24T15:47:35.095166Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
2019-10-24T15:47:35.099564Z 0 [Warning] CA certificate ca.pem is self signed.
2019-10-24T15:47:35.099626Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
2019-10-24T15:47:35.100921Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2019-10-24T15:47:35.100962Z 0 [Note] IPv6 is available.
2019-10-24T15:47:35.100976Z 0 [Note] - '::' resolves to '::';
2019-10-24T15:47:35.100998Z 0 [Note] Server socket created on IP: '::'.
2019-10-24T15:47:35.174622Z 0 [Note] InnoDB: Buffer pool(s) load completed at 191024 15:47:35
2019-10-24T15:47:35.214181Z 0 [Note] Event Scheduler: Loaded 0 events
2019-10-24T15:47:35.214376Z 0 [Note] Execution of init_file '/etc/mysql/conf.d/operator-init.sql' started.
2019-10-24T15:47:35.224302Z 0 [Note] Execution of init_file '/etc/mysql/conf.d/operator-init.sql' ended.
2019-10-24T15:47:35.224498Z 0 [Note] mysqld: ready for connections.
Version: '5.7.26-29-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 Percona Server (GPL), Release 29, Revision 11ad961
Create rclone.conf file.
2019-10-24T15:47:34.787Z INFO sidecar environment is not set {"key": "INIT_BUCKET_URI"}
2019-10-24T15:47:34.788Z INFO sidecar start http server for backups
time="2019-10-24T15:47:35Z" level=info msg="Starting mysqld_exporter (version=0.11.0, branch=HEAD, revision=5d7179615695a61ecc3b5bf90a2a7c76a9592cdd)" source="mysqld_exporter.go:206"
time="2019-10-24T15:47:35Z" level=info msg="Build context (go=go1.10.3, user=root@3d3ff666b0e4, date=20180629-15:00:35)" source="mysqld_exporter.go:207"
time="2019-10-24T15:47:35Z" level=info msg="Enabled scrapers:" source="mysqld_exporter.go:218"
time="2019-10-24T15:47:35Z" level=info msg=" --collect.global_status" source="mysqld_exporter.go:222"
time="2019-10-24T15:47:35Z" level=info msg=" --collect.global_variables" source="mysqld_exporter.go:222"
time="2019-10-24T15:47:35Z" level=info msg=" --collect.slave_status" source="mysqld_exporter.go:222"
time="2019-10-24T15:47:35Z" level=info msg=" --collect.info_schema.tables" source="mysqld_exporter.go:222"
time="2019-10-24T15:47:35Z" level=info msg=" --collect.heartbeat" source="mysqld_exporter.go:222"
time="2019-10-24T15:47:35Z" level=info msg="Listening on 0.0.0.0:9125" source="mysqld_exporter.go:232"
Create rclone.conf file.
Usage: /usr/local/bin/sidecar-entrypoint.sh {files-config|clone|config-and-serve}
Now runs your command.
pt-heartbeat --update --replace --check-read-only --create-table --database sys_operator --table heartbeat --defaults-file /etc/mysql/heartbeat.conf --fail-successive-errors=20
any advice in how to recover from that?
/cc @AMecea @jwilander
About this issue
- Original URL
- State: open
- Created 5 years ago
- Reactions: 2
- Comments: 23 (3 by maintainers)
Because we are unable to configure the readinessProbe we cannot provide cluster stability in the case of a mysql pod that requires crash recovery. As a result we are unable to use this solution in a production environment. We really like the operator model and have been eager to use it in production. If we could give readinessProbe configurations in the mysql podSpec it would get us there, but as-is we can only use this for dev/test environments.