solana: Rocksdb corruption issues on compaction

Problem

RocksDB error messages like the following have been seen on many validators:

[2020-03-22T10:22:48.766837593Z ERROR solana_core::window_service] thread 
Some("solana-window-insert") error BlockstoreError(RocksDb(Error { 
message: "Corruption: block checksum mismatch: expected 3583270445, got 3398136873  
in /mnt/vol1/ledger/rocksdb/165855.sst offset 25107936 size 3758" }))
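For triage, it can help to pull the file path, offset and size out of such a message so the affected block can be inspected directly. The following is a minimal sketch; the regular expression and the helper name `parse_checksum_mismatch` are illustrative assumptions based only on the message shown above, not part of RocksDB or Solana.

```ruby
# Hypothetical helper: extract the fields of a RocksDB
# "block checksum mismatch" message as it appears in the log above.
CHECKSUM_MISMATCH =
  /block checksum mismatch: expected (\d+), got (\d+)\s+in (\S+) offset (\d+) size (\d+)/

def parse_checksum_mismatch(line)
  m = CHECKSUM_MISMATCH.match(line)
  return nil unless m

  {
    expected: m[1].to_i, # checksum stored in the block trailer
    got: m[2].to_i,      # checksum computed over the bytes actually read
    path: m[3],          # sst file containing the block
    offset: m[4].to_i,   # byte offset of the block within the file
    size: m[5].to_i      # block size in bytes
  }
end

line = 'message: "Corruption: block checksum mismatch: expected 3583270445, got 3398136873 ' \
       'in /mnt/vol1/ledger/rocksdb/165855.sst offset 25107936 size 3758" }))'

info = parse_checksum_mismatch(line)
puts format("%s: block at 0x%x, %d bytes", info[:path], info[:offset], info[:size])
```

The hex offset printed here can be compared against an `xxd` dump of the same file.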

Proposed Solution

Debug and fix.

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 21 (18 by maintainers)

Most upvoted comments

@yhchiang-sol @sakridge

So, this time the root cause wasn’t a single bit flip, but there was an oddly zeroed range in the reported sst file:

$ xxd /home/ryoqun/Downloads/011080.sst
...
0233be70: 0000 0000 0000 d953 4108 0000 0000 1f00  .......SA.......
0233be80: 0000 da53 4108 0000 0000 1e00 0000 db53  ...SA..........S
0233be90: 4108 0000 0000 1d00 0000 dc53 4108 0000  A..........SA...
0233bea0: 0000 1c00 0000 ec53 4108 0000 0000 1b00  .......SA.......
0233beb0: 0000 ed53 4108 0000 0000 1a00 0000 ee53  ...SA..........S
0233bec0: 4108 0000 0000 1900 0000 ef53 4108 0000  A..........SA...
0233bed0: 0000 1800 0000 f053 4108 0000 0000 1700  .......SA.......
0233bee0: 0000 f153 4108 0000 0000 1600 0000 f853  ...SA..........S
0233bef0: 4108 0000 0000 1500 0000 0854 4108 0000  A..........TA...
0233bf00: 0000 1400 0000 0954 4108 0000 0000 1300  .......TA.......
0233bf10: 0000 1054 4108 0000 0000 1200 0000 1154  ...TA..........T
0233bf20: 4108 0000 0000 1100 0000 1254 4108 0000  A..........TA...
0233bf30: 0000 1000 0000 1354 4108 0000 0000 0f00  .......TA.......
0233bf40: 0000 1454 4108 0000 0000 0e00 0000 1554  ...TA..........T
0233bf50: 4108 0000 0000 0d00 0000 1c54 4108 0000  A..........TA...
0233bf60: 0000 0c00 0000 1d54 4108 0000 0000 0b00  .......TA.......
0233bf70: 0000 2454 4108 0000 0000 0a00 0000 2554  ..$TA.........%T
0233bf80: 4108 0000 0000 0900 0000 2654 4108 0000  A.........&TA...
0233bf90: 0000 0800 0000 2754 4108 0000 0000 0700  ......'TA.......
0233bfa0: 0000 3d54 4108 0000 0000 0600 0000 4454  ..=TA.........DT
0233bfb0: 4108 0000 0000 0500 0000 5054 4108 0000  A.........PTA...
0233bfc0: 0000 0400 0000 5154 4108 0000 0000 0300  ......QTA.......
0233bfd0: 0000 5254 4108 0000 0000 0200 0000 5354  ..RTA.........ST
0233bfe0: 4108 0000 0000 010f 098b 06e3 019d 93db  A...............
0233bff0: 1700 0000 ff5f 9d41 8d71 076a 2217 cb9a  ....._.A.q.j"...
0233c000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0233c010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0233c020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0233c030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0233c040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0233c050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0233c060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0233c070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
...
0235bfc0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0235bfd0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0235bfe0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0235bff0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0235c000: 0104 27b7 6200 0000 0001 1686 15da 7295  ..'.b.........r.
[1] pry(main)> "%0x" % (0x0235c000 - 0x0233c000)
=> "20000"
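The arithmetic above can be spelled out a little further. This is only a sketch: the 4096-byte page size is an assumption (the real block size of the filesystem or device is not stated in the report), and the dump elides the middle of the range, so "all zero" is inferred from the lines that are shown.

```ruby
zero_start = 0x0233c000 # first all-zero line in the dump
zero_end   = 0x0235c000 # first non-zero line after the gap
page       = 4096       # assumed page/block size, not taken from the report

length = zero_end - zero_start
puts format("length: 0x%x = %d bytes = %d KiB", length, length, length / 1024)
# prints "length: 0x20000 = 131072 bytes = 128 KiB"
puts "start page-aligned: #{(zero_start % page).zero?}" # true
puts "end page-aligned:   #{(zero_end % page).zero?}"   # true
puts "pages covered:      #{length / page}"             # 32
```

Under that assumption, the gap starts and ends on a page boundary and spans exactly 32 pages.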

As far as I understand, sst file serialization isn’t aligned that way. Also, sst files won’t waste file space with zeros like these under usual operation.

So, I highly suspect the underlying filesystem/hardware wiped this nicely hex-round block range due to some failure.
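To check other sst files for the same pattern without reading hex dumps by eye, a scan for block-aligned runs of zero bytes could look roughly like this. It is a sketch under assumptions: `zero_runs` is a hypothetical helper, the 4096-byte block size is a guess, and a zero run on its own does not prove corruption. The demo uses synthetic data rather than a real ledger file.

```ruby
require "stringio"

BLOCK = 4096 # assumed block size; adjust for the actual filesystem/device

# Returns [offset, length] pairs for runs of at least min_blocks
# consecutive all-zero blocks, reading the IO block by block.
def zero_runs(io, block: BLOCK, min_blocks: 2)
  runs = []
  run_start = nil
  offset = 0
  while (chunk = io.read(block))
    if chunk.each_byte.all?(&:zero?)
      run_start ||= offset # start a run, or stay in the current one
    elsif run_start
      len = offset - run_start
      runs << [run_start, len] if len >= min_blocks * block
      run_start = nil
    end
    offset += chunk.bytesize
  end
  # A run may extend to the end of the file.
  runs << [run_start, offset - run_start] if run_start && offset - run_start >= min_blocks * block
  runs
end

# Synthetic example: 8 KiB of data, a 0x20000-byte zeroed gap, 4 KiB of data.
data = ("\x01" * 8192 + "\x00" * 0x20000 + "\x01" * 4096).b
zero_runs(StringIO.new(data)).each do |start, len|
  puts format("zeroed range at 0x%08x, length 0x%x", start, len)
end
```

For a real file, the same function can be called as `File.open(path, "rb") { |f| zero_runs(f) }`, where `path` is the sst file named in the error message.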