bbolt: DB cannot be opened after node hard reset

Hello! I am not sure whether this is a bug in bolt, but I think you might find this interesting:

  1. Create ext4 with fast_commit feature enabled (we use 5.10 kernel)
  2. Create DB with default settings (i.e. no unsafe settings, (my only custom parameters are batch size and batch delay, they seem unrelated).
  3. Perform server hard-reset under write load.
  4. DB cannot be opened, bbolt check reports:
panic: assertion failed: Page expected to be: 23092, but self identifies as 0

goroutine 6 [running]:
go.etcd.io/bbolt._assert(...)
        /repo/bbolt/db.go:1359
go.etcd.io/bbolt.(*page).fastCheck(0x7f0b95834000, 0x5a34)
        /repo/bbolt/page.go:57 +0x1d9
go.etcd.io/bbolt.(*Tx).page(0x7f0b909be000?, 0x4f9cc0?)
        /repo/bbolt/tx.go:534 +0x79
go.etcd.io/bbolt.(*Tx).forEachPageInternal(0x7f0b8fe11000?, {0xc000024140?, 0x3, 0xa}, 0xc00008db68)
        /repo/bbolt/tx.go:546 +0x5d
go.etcd.io/bbolt.(*Tx).forEachPageInternal(0x7f0b8fe19000?, {0xc000024140?, 0x2, 0xa}, 0xc00008db68)
        /repo/bbolt/tx.go:555 +0xc8
go.etcd.io/bbolt.(*Tx).forEachPageInternal(0x0?, {0xc000024140?, 0x1, 0xa}, 0xc00008db68)
        /repo/bbolt/tx.go:555 +0xc8
go.etcd.io/bbolt.(*Tx).forEachPage(...)
        /repo/bbolt/tx.go:542
go.etcd.io/bbolt.(*Tx).checkBucket(0xc00013e000, 0xc00002a280, 0xc00008deb0, 0xc00008dee0, {0x54b920?, 0x62ec60}, 0xc00007c0c0)
        /repo/bbolt/tx_check.go:83 +0x111
go.etcd.io/bbolt.(*Tx).checkBucket.func2({0x7f0b8fe3e0a6?, 0xc000104d28?, 0xc00013e000?})
        /repo/bbolt/tx_check.go:110 +0x90
go.etcd.io/bbolt.(*Bucket).ForEachBucket(0x0?, 0xc00008dd70)
        /repo/bbolt/bucket.go:403 +0x96
go.etcd.io/bbolt.(*Tx).checkBucket(0xc00013e000, 0xc00013e018, 0xc000104eb0, 0xc000104ee0, {0x54b920?, 0x62ec60}, 0xc00007c0c0)
        /repo/bbolt/tx_check.go:108 +0x252
go.etcd.io/bbolt.(*Tx).check(0xc00013e000, {0x54b920, 0x62ec60}, 0x0?)
        /repo/bbolt/tx_check.go:61 +0x365
created by go.etcd.io/bbolt.(*Tx).CheckWithOptions in goroutine 1
        /repo/bbolt/tx_check.go:31 +0x118

The last 64 pages of the file seem to be filled with zeroes.

When the ext4 fast_commit is disabled, our tests pass, and db can be opened. I have reproduced this on both 1.3.6 and 1.3.7.

Here is the output for bbolt pages https://gist.github.com/fyfyrchik/4aafec23d9dfc487fb4a4cd7f5560730

Meta pages

$ ./bbolt page ./1 0
Page ID:    0
Page Type:  meta
Total Size: 4096 bytes
Overflow pages: 0
Version:    2
Page Size:  4096 bytes
Flags:      00000000
Root:       <pgid=26>
Freelist:   <pgid=97>
HWM:        <pgid=23157>
Txn ID:     1198
Checksum:   6397aaef7230fab5

$ ./bbolt page ./1 1
Page ID:    1
Page Type:  meta
Total Size: 4096 bytes
Overflow pages: 0
Version:    2
Page Size:  4096 bytes
Flags:      00000000
Root:       <pgid=62>
Freelist:   <pgid=102>
HWM:        <pgid=23157>
Txn ID:     1199
Checksum:   8ec692c6ba15e06b

About this issue

  • Original URL
  • State: open
  • Created 10 months ago
  • Comments: 17 (17 by maintainers)

Most upvoted comments

Found a relative patch about Fast Commit: https://lore.kernel.org/linux-ext4/20211223032337.5198-3-yinxin.x@bytedance.com/. And confirmed with that patch’s author and fast commit in v5.10 has some corner case which can loss data even if fsync/fdatasync returns successful…

@fyfyrchik I think I would like to suggest to upgrade kernel or disable fc option 😂

NOTE:

Other patch is here https://lore.kernel.org/lkml/202201091544.W5HHEXAp-lkp@intel.com/T/ which is covered by https://kernel.googlesource.com/pub/scm/fs/xfs/xfstests-dev/+/refs/heads/master/tests/generic/455

Updated.

@fyfyrchik since my local env is using v6.4 kernel, I build v5.10 kernel and run it with qemu.

#!/bin/bash

set -exuo pipefail

readonly db_file=/var/lib/etcd/xxx.db
mkdir -p /var/lib/etcd

if [[ -f "${db_file}" ]]; then
  ls -alh $db_file
  bbolt check $db_file
fi


bbolt bench -work -path $db_file -count=1000000000 -batch-size=100 -value-size=4096 2> /dev/null &
sleep $(( ( RANDOM % 10 )  + 1 ))
echo b > /proc/sysrq-trigger

I can reproduce this issue with main branch.

panic: assertion failed: Page expected to be: 56763, but self identifies as 0

goroutine 6 [running]:
go.etcd.io/bbolt/internal/common.Assert(...)
        /home/fuwei/workspace/bbolt/internal/common/verify.go:65
go.etcd.io/bbolt/internal/common.(*Page).FastCheck(0x7f59cd5bb000, 0xddbb)
        /home/fuwei/workspace/bbolt/internal/common/page.go:83 +0x1d9
go.etcd.io/bbolt.(*Tx).page(0x7f59cd51e000?, 0x5dc120?)
        /home/fuwei/workspace/bbolt/tx.go:545 +0x79
go.etcd.io/bbolt.(*Tx).forEachPageInternal(0x7f59c61b6000?, {0xc0000200f0?, 0x4, 0xa}, 0xc000131b60)
        /home/fuwei/workspace/bbolt/tx.go:557 +0x5d
go.etcd.io/bbolt.(*Tx).forEachPageInternal(0x7f59c6cfe000?, {0xc0000200f0?, 0x3, 0xa}, 0xc000131b60)
        /home/fuwei/workspace/bbolt/tx.go:566 +0xc5
go.etcd.io/bbolt.(*Tx).forEachPageInternal(0x7f59c6d95000?, {0xc0000200f0?, 0x2, 0xa}, 0xc000131b60)
        /home/fuwei/workspace/bbolt/tx.go:566 +0xc5
go.etcd.io/bbolt.(*Tx).forEachPageInternal(0xc000104b38?, {0xc0000200f0?, 0x1, 0xa}, 0xc000131b60)
        /home/fuwei/workspace/bbolt/tx.go:566 +0xc5
go.etcd.io/bbolt.(*Tx).forEachPage(...)
        /home/fuwei/workspace/bbolt/tx.go:553
go.etcd.io/bbolt.(*Tx).checkBucket(0xc000146000, 0xc000024140, 0xc000131eb0, 0xc000131ee0, {0x67bd40?, 0x80b420}, 0xc00007e0c0)
        /home/fuwei/workspace/bbolt/tx_check.go:82 +0x111
go.etcd.io/bbolt.(*Tx).checkBucket.func2({0x7f59c6d96020?, 0xc000146000?, 0xc000104d20?})
        /home/fuwei/workspace/bbolt/tx_check.go:109 +0x90
go.etcd.io/bbolt.(*Bucket).ForEachBucket(0x0?, 0xc000131d68)
        /home/fuwei/workspace/bbolt/bucket.go:451 +0x96
go.etcd.io/bbolt.(*Tx).checkBucket(0xc000146000, 0xc000146018, 0xc000104eb0, 0xc000104ee0, {0x67bd40?, 0x80b420}, 0xc00007e0c0)
        /home/fuwei/workspace/bbolt/tx_check.go:107 +0x252
go.etcd.io/bbolt.(*Tx).check(0xc000146000, {0x67bd40, 0x80b420}, 0x0?)
        /home/fuwei/workspace/bbolt/tx_check.go:60 +0x365
created by go.etcd.io/bbolt.(*Tx).Check in goroutine 1
        /home/fuwei/workspace/bbolt/tx_check.go:30 +0x118

56886    unknown<00> 0
56887    unknown<00> 0
56888    unknown<00> 0
56889    unknown<00> 0
56890    unknown<00> 0
56891    unknown<00> 0
56892    unknown<00> 0
56893    unknown<00> 0
56894    unknown<00> 0
56895    unknown<00> 0
56896    unknown<00> 0
56897    unknown<00> 0
56898    unknown<00> 0
56899    unknown<00> 0
56900    unknown<00> 0
56901    unknown<00> 0
56902    unknown<00> 0
56903    unknown<00> 0
56904    unknown<00> 0
56905    unknown<00> 0
56906    unknown<00> 0
56907    unknown<00> 0
56908    unknown<00> 0
56909    unknown<00> 0
56910    unknown<00> 0
56911    unknown<00> 0
56912    unknown<00> 0
56913    free
56914    free
56915    leaf       2      2
56918    leaf       2      2
56921    leaf       2      2
56924    leaf       2      2
56927    leaf       2      2
56930    leaf       2      2
56933    leaf       2      2
56936    leaf       2      2
56939    leaf       2      2
56942    leaf       2      2
56945    leaf       2      2
56948    leaf       2      2
56951    leaf       2      2
56954    leaf       2      2
56957    leaf       2      2

I am not sure it’s related to fast-commit because I didn’t find relative patch for this. Trying to test it without fast-commit.

cc @ahrtr