clickhouse-operator: Recover tables in read-only mode after zookeeper failure / restart

Is there a way to recover replicated tables after they enter read-only mode following a ZooKeeper failure or restart?

We are hitting scenarios where ZooKeeper pods fail and restart, after which all existing tables fall into read-only mode. We tried DETACH / ATTACH as well, but it didn’t help. (New databases and tables continue to work.)

OPTIMIZE TABLE <table_name>

Received exception from server (version 19.16.17):
Code: 242. DB::Exception: Received from localhost:9000. DB::Exception: Table is in readonly mode.

Details:

Clickhouse Version: 19.16.17.1
Operator Version: 0.9.8
ZooKeeper: set up using the advanced setup with a persistent volume
- Curious whether there are other recommendations for a more stable ZK?
Running on AWS EKS v1.14

Thanks!

About this issue

State: closed
Created 4 years ago
Comments: 32 (6 by maintainers)

Most upvoted comments

Thank you for the confirmation, @boraozkan. I was worried about what was happening with your ZooKeeper and could not sleep well 😃

alex-zaitsev on May 8, 2020

Hi @kushagra391, we discovered that the ZK manifests were not correct: ZK was actually storing its data on the pod’s local storage rather than on the mounted PV. The problem has been fixed in master and in the 0.9.9 release of the operator.

We have also tested that there is no need to restart ClickHouse if ZooKeeper has been restarted. ClickHouse will automatically reconnect to the new ZooKeeper pods after a 30-second timeout.

alex-zaitsev on May 6, 2020
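
For anyone checking whether tables are still read-only after ZooKeeper comes back, here is a minimal sketch (not something posted in the thread). It assumes you run the query in `READONLY_REPLICAS_QUERY` yourself, for example with clickhouse-client, and pass the resulting (database, table) pairs to the hypothetical helper `restart_statements`. The `system.replicas` table with its `is_readonly` and `is_session_expired` columns and the `SYSTEM RESTART REPLICA` statement are ClickHouse features; the helper names are made up for illustration. Restarting a replica only re-initializes its ZooKeeper session, so it may not help if the ZooKeeper data itself was lost, as in the manifest problem described above.

```python
# Query listing replicated tables that are currently read-only.
# Run it with any ClickHouse client; this sketch does not connect anywhere.
READONLY_REPLICAS_QUERY = (
    "SELECT database, table, is_session_expired "
    "FROM system.replicas WHERE is_readonly = 1"
)


def quote_identifier(name):
    """Backtick-quote an identifier; reject names that would need escaping."""
    if "`" in name or "\\" in name:
        raise ValueError(f"unsupported character in identifier: {name!r}")
    return f"`{name}`"


def restart_statements(rows):
    """Build one SYSTEM RESTART REPLICA statement per (database, table) pair.

    Hypothetical helper: `rows` is whatever you collected from
    READONLY_REPLICAS_QUERY, reduced to (database, table) tuples.
    """
    return [
        f"SYSTEM RESTART REPLICA {quote_identifier(db)}.{quote_identifier(tbl)}"
        for db, tbl in rows
    ]


if __name__ == "__main__":
    print(READONLY_REPLICAS_QUERY)
    # Example input; replace with the rows returned by the query above.
    for stmt in restart_statements([("default", "events")]):
        print(stmt)
```

The statements are only generated, not executed, so they can be reviewed before being run against a replica.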