solana: solana-ledger-tool errors with `Unable to load bank forks at slot 0 due to disconnected blocks.`
Problem
It seems that `net.sh` is broken when spinning up GCE clusters:
./gce.sh create -n 3 -c 0 -p testnet-dev-kin-haoran -P --dedicated --validator-boot-disk-size-gb 3600 --self-destruct-hours 0 -z us-east1-b --custom-machine-type "--custom-cpu 64 --min-cpu-platform Intel%20Skylake --custom-vm-type n1 --custom-memory 256GB"
./net.sh start --internal-nodes-stake-lamports 1000000000000 --extra-primordial-stakes 3 --faucet-lamports 500000000000000000 --slots-per-epoch 432000
After creating the ledger on the bootstrap node, `solana-ledger-tool` fails to extract the bank hash from the snapshot at slot 1, so the cluster can't be brought up.
++ solana-ledger-tool -l config/bootstrap-validator bank-hash
[2023-01-20T02:34:52.423038671Z INFO solana_ledger_tool] solana-ledger-tool 1.15.0 (src:devbuild; feat:2221197578)
[2023-01-20T02:34:52.423884939Z INFO solana_ledger::blockstore] Maximum open file descriptors: 1000000
[2023-01-20T02:34:52.423901857Z INFO solana_ledger::blockstore] Opening database at "/home/solana/solana/config/bootstrap-validator/rocksdb"
[2023-01-20T02:34:52.423913781Z INFO solana_ledger::blockstore_db] Disabling rocksdb's automatic compactions...
[2023-01-20T02:34:52.429892189Z INFO solana_ledger::blockstore_db] Opening Rocks with secondary (read only) access at: "/home/solana/solana/config/bootstrap-validator/rocksdb/solana-secondary"
[2023-01-20T02:34:52.429906849Z INFO solana_ledger::blockstore_db] This secondary access could temporarily degrade other accesses, such as by solana-validator
[2023-01-20T02:34:52.446842048Z INFO solana_ledger::blockstore] "/home/solana/solana/config/bootstrap-validator/rocksdb" open took 22ms
Unable to load bank forks at slot 0 due to disconnected blocks.
+ bankHash=
haoran_yi_solana_com@testnet-dev-kin-haoran-bootstrap-validator:~$ ls /home/solana/solana/config/bootstrap-validator/
accounts.ledger-tool genesis.bin genesis.tar.bz2 identity.json rocksdb snapshot-1-WiowSLurQoZrBXfmwyBb84k1mbFh3sHBLq41Anrx5t4.tar.zst snapshot.ledger-tool stake-account.json vote-account.json
Proposed Solution
Debug and fix the issue with ledger-tool for loading slot 0.
About this issue
- State: closed
- Created a year ago
- Comments: 15 (15 by maintainers)
@apfitzge @HaoranYi - Since all of us have PRs in flight, here is the rundown for the sake of coordination. For reference, this is the relevant piece of code that the PRs touch: https://github.com/solana-labs/solana/blob/a3c763c2a0ee430feaa5b04a5a02b8100487802e/ledger-tool/src/main.rs#L1068-L1075
1. Adds a `halt_slot > starting_slot` check prior to the Blockstore function, and gives a more detailed error message in this scenario
2. Adds `--halt-at-slot` to the `shred-version` subcommand
3. Adds `--halt-at-slot` to the `bank-hash` subcommand
4. Special-cases `halt_slot == 0`

The order of operations on these:
- If `halt_at_slot` was specified, we want to skip the existing check and the check that 1 adds if `halt_slot == 0`
- If `halt_slot != 0`, then perform the `halt_slot > starting_slot` check added by 1, followed by the existing `blockstore.slot_range_connected(starting_slot, halt_slot)` check

It looks like this assert is introduced in #26506.
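The order of operations described above can be sketched roughly as follows. This is a minimal illustration, not the actual ledger-tool code: `validate_halt_slot` is a hypothetical name, the error strings are made up, and the `range_connected` closure merely stands in for `blockstore.slot_range_connected(starting_slot, halt_slot)`. It also assumes that `halt_slot == 0` means "skip both checks" and that a halt slot equal to the starting slot is rejected.

```rust
/// Hypothetical stand-in for the pre-load validation discussed above.
/// `range_connected` plays the role of `blockstore.slot_range_connected`.
fn validate_halt_slot(
    starting_slot: u64,
    halt_slot: u64,
    range_connected: impl Fn(u64, u64) -> bool,
) -> Result<(), String> {
    // Assumption: halt_slot == 0 means both checks are skipped.
    if halt_slot == 0 {
        return Ok(());
    }
    // Check added by item 1: fail early, before querying the blockstore,
    // with a message that names both slots.
    if halt_slot <= starting_slot {
        return Err(format!(
            "halt slot {} must be greater than starting slot {}",
            halt_slot, starting_slot
        ));
    }
    // Existing check: every block between the two slots must be linked.
    if !range_connected(starting_slot, halt_slot) {
        return Err(format!(
            "blocks between slot {} and slot {} are disconnected",
            starting_slot, halt_slot
        ));
    }
    Ok(())
}
```

With this ordering, a call such as `validate_halt_slot(0, 0, |_, _| false)` succeeds even when the connectivity check would fail, which is the behavior the slot 0 special case is meant to give.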
Should we special-case the connected check for slot 0, so that we can run a GCE cluster, which starts from a snapshot at slot 1 immediately after genesis? https://github.com/solana-labs/solana/pull/29860
@apfitzge and @steviez
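To make the question about special-casing slot 0 concrete, here is a toy model of a range-connectivity check. It is only a sketch under stated assumptions: `ToyBlockstore` and `toy_range_connected` are hypothetical names, the map of slot to child slots is a simplification, and this is not how the real `Blockstore` stores or walks its metadata. It shows one way such a check could report "disconnected" when no linked block data exists for the starting slot; whether that is the actual cause in this issue is not confirmed here.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Toy model (assumption): each slot that has block data maps to its child slots.
type ToyBlockstore = HashMap<u64, Vec<u64>>;

/// Returns true if `end` can be reached from `start` by following child links.
fn toy_range_connected(store: &ToyBlockstore, start: u64, end: u64) -> bool {
    if start == end {
        return true;
    }
    let mut queue = VecDeque::from([start]);
    let mut seen = HashSet::new();
    while let Some(slot) = queue.pop_front() {
        // Skip slots that were already visited.
        if !seen.insert(slot) {
            continue;
        }
        // A slot with no entry in the store has no links to follow.
        if let Some(children) = store.get(&slot) {
            for &child in children {
                if child == end {
                    return true;
                }
                // Only keep walking through slots below the target.
                if child < end {
                    queue.push_back(child);
                }
            }
        }
    }
    false
}
```

In this model, an empty store makes `toy_range_connected(&store, 0, 1)` return false, so a tool that insists on a connected range starting at slot 0 would refuse to proceed unless that case is handled separately.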