tendermint: rpc-server module doesn't start after a container restart
Tendermint version (use tendermint version or git rev-parse --verify HEAD if installed from source): 0.31.5-d2eab536
ABCI app (name for built-in, URL for self-written if it’s publicly available): BigchainDB 2.0.0
Environment:
- OS (e.g. from /etc/os-release): Ubuntu 18.04.4 LTS
- Install tools: docker
- Others: all-in-one docker build except for mongodb, which is used from outside of the container running bigchaindb and tendermint.
What happened: The EC2 instance ran out of disk space. After I freed up some space and restarted the Docker containers, BigchainDB was unable to commit any transactions. The BigchainDB debug log reported that the maximum number of retries was exceeded while connecting to http://localhost:26657 (the default Tendermint RPC port). I also built and ran a fresh container, but that did not work either.
What you expected to happen: rpc-server module of tendermint to be up and running.
Have you tried the latest version: no
How to reproduce it (as minimally and precisely as possible): 1) let the disk fill up, 2) stop and start the container running bigchaindb/tendermint.
Logs (paste a small part showing an error (< 10 lines) or link a pastebin, gist, etc. containing more of the log file):
bash-5.0# tendermint node --rpc.laddr "tcp://0.0.0.0:26657" --log_level="*:debug"
I[2020-05-06|18:40:05.136] Starting multiAppConn module=proxy impl=multiAppConn
I[2020-05-06|18:40:05.137] Starting socketClient module=abci-client connection=query impl=socketClient
I[2020-05-06|18:40:05.138] Starting socketClient module=abci-client connection=mempool impl=socketClient
I[2020-05-06|18:40:05.139] Starting socketClient module=abci-client connection=consensus impl=socketClient
I[2020-05-06|18:40:05.139] Starting EventBus module=events impl=EventBus
I[2020-05-06|18:40:05.140] Starting PubSub module=pubsub impl=PubSub
I[2020-05-06|18:40:05.151] Starting IndexerService module=txindex impl=IndexerService
I[2020-05-06|18:40:05.231] ABCI Handshake App Info module=consensus height=355 hash= software-version= protocol-version=0
I[2020-05-06|18:40:05.233] ABCI Replay Blocks module=consensus appHeight=355 storeHeight=2760414 stateHeight=2760413
I[2020-05-06|18:40:05.233] Applying block module=consensus height=356
I[2020-05-06|18:40:05.315] Executed block module=consensus height=356 validTxs=0 invalidTxs=0
I[2020-05-06|18:40:05.395] Applying block module=consensus height=357
Config (you can paste only the changes you’ve made):
node command runtime flags:
/dump_consensus_state output for consensus bugs
Anything else we need to know: I see the following in the tendermint log (with the default log level): E[2020-05-06|17:18:40.274] abci.socketClient failed to connect to tcp://127.0.0.1:26658. Retrying... module=abci-client connection=query err="dial tcp 127.0.0.1:26658: connect: connection refused"
I also asked on Stack Overflow: https://stackoverflow.com/questions/61580896/getting-error-while-sending-transaction-over-bigchaindb-node
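A quick way to narrow down a "connection refused" error like the one above is to check which of the two ports accepts TCP connections at all. The following is only a minimal sketch: the helper name `port_is_open` is made up for illustration, and it assumes the probe runs inside the same container, so that 127.0.0.1:26657 (Tendermint RPC) and 127.0.0.1:26658 (ABCI app) are the addresses from the logs.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the issue: 26657 = Tendermint RPC, 26658 = ABCI app.
    for name, port in (("tendermint rpc", 26657), ("abci app", 26658)):
        state = "open" if port_is_open("127.0.0.1", port) else "closed"
        print(f"{name} (127.0.0.1:{port}): {state}")
```

If 26658 is closed, the ABCI app itself is not listening. If 26658 is open but 26657 stays closed, Tendermint is running but has not yet opened its RPC listener.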
About this issue
- State: closed
- Created 4 years ago
- Comments: 25 (10 by maintainers)
It seems like the node already has all blocks in local storage: storeHeight=2760414. So fast_sync is not involved; this is pure block replay.
When Tendermint starts up, it asks the application which height it processed last, and then starts replaying blocks from there. Why does the application report it’s at height 0? Is the application not persisting its state to disk?
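To get a feel for how much replay is pending, the heights can be read straight from the "ABCI Replay Blocks" log line. This is a rough sketch, not Tendermint's own logic: the function name `replay_range` is made up, and it simplifies the handshake to "replay every block after appHeight up to storeHeight" (Tendermint treats the final block specially).

```python
import re

# Matches key=value pairs whose value is an integer, e.g. "appHeight=355".
FIELD = re.compile(r"(\w+)=(\d+)")


def replay_range(log_line: str):
    """Return (first, last) block heights to replay, or None if the app is caught up."""
    fields = {key: int(value) for key, value in FIELD.findall(log_line)}
    app_height = fields["appHeight"]
    store_height = fields["storeHeight"]
    if app_height >= store_height:
        return None
    return app_height + 1, store_height


# Log line copied from the issue.
line = (
    "I[2020-05-06|18:40:05.233] ABCI Replay Blocks module=consensus "
    "appHeight=355 storeHeight=2760414 stateHeight=2760413"
)
first, last = replay_range(line)
print(f"replay blocks {first}..{last} ({last - first + 1} blocks)")
# prints "replay blocks 356..2760414 (2760059 blocks)"
```

With the log from this issue, replay starts at block 356, which matches the "Applying block ... height=356" line, and roughly 2.76 million blocks remain.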