pebble: db: pebble internal error
Hi @petermattis
I got the following error when trying to open the attached DB using the latest version of Pebble. The attached DB was generated using RocksDB.
pebble: internal error: L0 flushed file 000019 overlaps with the largest seqnum of a preceding flushed file: 8212-11042 vs 8416
Both this issue and #566 were observed while testing whether Pebble can operate correctly on a RocksDB-generated DB, and the other way around. The RocksDB/Pebble features used are pretty basic:
- Write to DB via WriteBatch with sync set to true
- Read from DB via Get and forward iterator
- Delete a single record using the regular delete operation
- Range delete
- Manual compaction
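For readers unfamiliar with the error above, here is a minimal sketch of the kind of ordering check it describes. It assumes the message reports the offending file's smallest and largest seqnums (8212-11042) against the largest seqnum of an earlier flushed file (8416). The `flushedFile` type, the `checkFlushedOrdering` function, and the preceding file's number and smallest seqnum are hypothetical, chosen only for illustration; this is not Pebble's actual implementation.

```go
package main

import "fmt"

// flushedFile is a hypothetical stand-in for the metadata of an L0 sstable
// produced by a flush. It is not Pebble's real file metadata type.
type flushedFile struct {
	Num            uint64
	SmallestSeqNum uint64
	LargestSeqNum  uint64
}

// checkFlushedOrdering walks files in flush order and reports an error if a
// file's seqnum range starts at or below the largest seqnum already flushed.
func checkFlushedOrdering(files []flushedFile) error {
	var largestFlushed uint64
	for i, f := range files {
		if i > 0 && f.SmallestSeqNum <= largestFlushed {
			return fmt.Errorf(
				"L0 flushed file %06d overlaps with the largest seqnum of a preceding flushed file: %d-%d vs %d",
				f.Num, f.SmallestSeqNum, f.LargestSeqNum, largestFlushed)
		}
		if f.LargestSeqNum > largestFlushed {
			largestFlushed = f.LargestSeqNum
		}
	}
	return nil
}

func main() {
	files := []flushedFile{
		// Hypothetical preceding file; only its largest seqnum (8416)
		// appears in the reported error.
		{Num: 17, SmallestSeqNum: 8001, LargestSeqNum: 8416},
		// The file named in the error message.
		{Num: 19, SmallestSeqNum: 8212, LargestSeqNum: 11042},
	}
	fmt.Println(checkFlushedOrdering(files))
}
```

Two flushed files with disjoint, increasing seqnum ranges pass the check. The ranges here intersect (8212 is below 8416), which is what the error reports.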
About this issue
- State: closed
- Created 4 years ago
- Comments: 33 (16 by maintainers)
Commits related to this issue
- db: do not recycle WAL files used for recovery WAL recycling depends on the recycled WAL only containing log records written with the recyclable record format. Pebble always writes WAL files with the... — committed to cockroachdb/pebble by petermattis 4 years ago
- db_impl_open: Error out if WAL replay seqnums aren't increasing This should help trip early on issues such as the one found in: https://github.com/cockroachdb/pebble/issues/567 — committed to cockroachdb/rocksdb by itsbilal 4 years ago
FYI: The person you’re trying to reply to is “@lni” (L-N-I).
Got through 9 iterations of @lni’s tests with 64x parallelism while running on #580. Zero errors.
I managed to reproduce @lni’s problem with some additional instrumentation: before every restart cycle I create a checkpointXXX directory and hard link all of the DB files into that directory. The problem we’re seeing is that Pebble is running, then the test restarts into RocksDB and RocksDB creates sstables with overlapping seqnums. What the instrumentation shows is that the problem is somewhere in the WAL file itself. Here is a portion of the WAL file that RocksDB is replaying:

Notice the sequence numbers here. That should never happen in a WAL file. I need to do some more analysis of this WAL to determine how this could be happening.