solana: Rocksdb corruption issues on compaction
Problem
We have seen error messages like this from RocksDB on many validators:
[2020-03-22T10:22:48.766837593Z ERROR solana_core::window_service] thread
Some("solana-window-insert") error BlockstoreError(RocksDb(Error {
message: "Corruption: block checksum mismatch: expected 3583270445, got 3398136873
in /mnt/vol1/ledger/rocksdb/165855.sst offset 25107936 size 3758" }))
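When triaging reports like this across many validators, it can help to pull the sst path, offset, and size out of the message so the damaged block can be inspected directly. Below is a minimal sketch, assuming the message text keeps the shape shown in the log above. The function name `parse_checksum_error` and the regular expression are illustrative and not part of Solana or RocksDB.

```python
import re

# Assumption: the corruption message keeps the wording seen in the log above.
# \s+ tolerates the line wrap between the checksum values and the file path.
CHECKSUM_RE = re.compile(
    r"block checksum mismatch: expected (?P<expected>\d+), got (?P<got>\d+)\s+"
    r"in (?P<path>\S+) offset (?P<offset>\d+) size (?P<size>\d+)"
)


def parse_checksum_error(message):
    """Return the fields of a checksum-mismatch message, or None if it does not match."""
    match = CHECKSUM_RE.search(message)
    if match is None:
        return None
    return {
        "path": match.group("path"),
        "expected": int(match.group("expected")),
        "got": int(match.group("got")),
        "offset": int(match.group("offset")),
        "size": int(match.group("size")),
    }
```

The returned offset and size give the byte range of the block whose checksum failed, which is the natural place to start a hex dump of the file.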
Proposed Solution
Debug and fix.
About this issue
- State: open
- Created 4 years ago
- Comments: 21 (18 by maintainers)
@yhchiang-sol @sakridge
So, this time the root cause wasn't a single bit flip, but there was an oddly zeroed range in the reported sst file:
As far as I understand, sst file serialization isn't aligned that way. Also, sst files don't waste file space on zeros like those in normal operation.
So I strongly suspect that the underlying filesystem or hardware wiped this block range, which sits on suspiciously round hex boundaries, due to some failure.
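To check other sst files for the same pattern, one option is to scan for long runs of zero bytes and see whether they start and end on round boundaries. This is a rough sketch, not an existing Solana or RocksDB tool: the names `find_zero_runs` and `is_aligned`, the 4096-byte boundary, and the minimum run length are all assumptions chosen for illustration.

```python
import re

NONZERO = re.compile(rb"[^\x00]")


def find_zero_runs(path, min_len=4096, chunk_size=1 << 20):
    """Return (start_offset, length) for each run of 0x00 bytes at least min_len long."""
    runs = []
    run_start = None  # file offset where the current zero run began, if any
    pos = 0           # file offset of the first byte of the current chunk
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            i = 0
            while i < len(chunk):
                if run_start is None:
                    j = chunk.find(b"\x00", i)
                    if j == -1:
                        break  # no more zeros in this chunk
                    run_start = pos + j
                    i = j
                m = NONZERO.search(chunk, i)
                if m is None:
                    break  # the run continues into the next chunk
                end = pos + m.start()
                if end - run_start >= min_len:
                    runs.append((run_start, end - run_start))
                run_start = None
                i = m.start()
            pos += len(chunk)
    # A run that reaches the end of the file is closed here.
    if run_start is not None and pos - run_start >= min_len:
        runs.append((run_start, pos - run_start))
    return runs


def is_aligned(start, length, boundary=4096):
    """True if the run starts on and spans whole multiples of boundary (assumed 4 KiB)."""
    return start % boundary == 0 and length % boundary == 0
```

A run that is both long and aligned to a block boundary fits the filesystem or hardware explanation better than a bug in sst serialization, though it does not prove it.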