etcd: Etcd panic in robustness tests blackhole failpoint
What happened?
Robustness Nightly are failing due to etcd panic. https://github.com/etcd-io/etcd/actions/runs/4562802681
What did you expect to happen?
Etcd should not panic due to network issues
How can we reproduce it (as minimally and precisely as possible)?
Run robustness tests blackhole tests enough times. See https://github.com/etcd-io/etcd/actions/workflows/robustness-nightly.yaml
Anything else we need to know?
No response
Etcd version (please run commands below)
N/A
Etcd configuration (command line flags or environment variables)
N/A
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
N/A
Relevant log output
2023-03-30T10:25:27.3144193Z logger.go:130: 2023-03-30T10:25:26.312Z INFO Triggering failpoint
2023-03-30T10:25:27.3144599Z {"failpoint": "blackhole"}
2023-03-30T10:25:27.3145371Z failpoints.go:328: Blackholing traffic from and to member "TestRobustnessClusterOfSize3HighTraffic-test-1"
2023-03-30T10:25:27.3146769Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): {"level":"warn","ts":"2023-03-30T10:25:26.811397Z","caller":"etcdserver/v3_server.go:831","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":11854185818228622438,"retry-timeout":"500ms"}
2023-03-30T10:25:27.3148279Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): {"level":"warn","ts":"2023-03-30T10:25:26.812944Z","caller":"etcdserver/v3_server.go:831","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":14186768950229826371,"retry-timeout":"500ms"}
2023-03-30T10:25:27.3149756Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-1) (28035): {"level":"warn","ts":"2023-03-30T10:25:26.813508Z","caller":"etcdserver/v3_server.go:831","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6108155643657295548,"retry-timeout":"500ms"}
2023-03-30T10:25:27.3265010Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): {"level":"warn","ts":"2023-03-30T10:25:27.312399Z","caller":"etcdserver/v3_server.go:831","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":11854185818228622438,"retry-timeout":"500ms"}
2023-03-30T10:25:27.3266115Z failpoints.go:332: Traffic restored from and to member "TestRobustnessClusterOfSize3HighTraffic-test-1"
2023-03-30T10:25:27.3267790Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): {"level":"info","ts":"2023-03-30T10:25:27.314167Z","caller":"traceutil/trace.go:171","msg":"trace[1342822156] linearizableReadLoop","detail":"{readStateIndex:601; appliedIndex:602; }","duration":"1.003816961s","start":"2023-03-30T10:25:26.310341Z","end":"2023-03-30T10:25:27.314158Z","steps":["trace[1342822156] 'read index received' (duration: 1.003814761s)","trace[1342822156] 'applied index is now lower than readState.Index' (duration: 1.5µs)"],"step_count":2}
2023-03-30T10:25:27.3269807Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-1) (28035): {"level":"info","ts":"2023-03-30T10:25:27.313778Z","caller":"traceutil/trace.go:171","msg":"trace[1664249991] linearizableReadLoop","detail":"{readStateIndex:601; appliedIndex:602; }","duration":"1.000933919s","start":"2023-03-30T10:25:26.31283Z","end":"2023-03-30T10:25:27.313764Z","steps":["trace[1664249991] 'read index received' (duration: 1.000931819s)","trace[1664249991] 'applied index is now lower than readState.Index' (duration: 1.5µs)"],"step_count":2}
2023-03-30T10:25:27.3271798Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): {"level":"info","ts":"2023-03-30T10:25:27.315185Z","caller":"traceutil/trace.go:171","msg":"trace[2074038161] linearizableReadLoop","detail":"{readStateIndex:601; appliedIndex:601; }","duration":"1.002760046s","start":"2023-03-30T10:25:26.312412Z","end":"2023-03-30T10:25:27.315172Z","steps":["trace[2074038161] 'read index received' (duration: 1.002757745s)","trace[2074038161] 'applied index is now lower than readState.Index' (duration: 1.601µs)"],"step_count":2}
2023-03-30T10:25:27.3274893Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): {"level":"panic","ts":"2023-03-30T10:25:27.31731Z","logger":"raft","caller":"v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99","msg":"index, 7, is out of range [1]","stacktrace":"go.etcd.io/raft/v3.(*raftLog).maybeAppend\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99\ngo.etcd.io/raft/v3.(*raft).handleAppendEntries\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1517\ngo.etcd.io/raft/v3.stepFollower\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1472\ngo.etcd.io/raft/v3.(*raft).Step\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1021\ngo.etcd.io/raft/v3.(*node).run\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:357"}
2023-03-30T10:25:27.3277939Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): {"level":"panic","ts":"2023-03-30T10:25:27.317178Z","logger":"raft","caller":"v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99","msg":"index, 6, is out of range [1]","stacktrace":"go.etcd.io/raft/v3.(*raftLog).maybeAppend\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99\ngo.etcd.io/raft/v3.(*raft).handleAppendEntries\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1517\ngo.etcd.io/raft/v3.stepFollower\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1472\ngo.etcd.io/raft/v3.(*raft).Step\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1021\ngo.etcd.io/raft/v3.(*node).run\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:357"}
2023-03-30T10:25:27.3279523Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): panic: index, 7, is out of range [1]
2023-03-30T10:25:27.3280186Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026):
2023-03-30T10:25:27.3280872Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): goroutine 138 [running]:
2023-03-30T10:25:27.3281737Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap/zapcore.CheckWriteAction.OnWrite(0x0?, 0x0?, {0x0?, 0x0?, 0xc0010faee0?})
2023-03-30T10:25:27.3282611Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap@v1.24.0/zapcore/entry.go:198 +0x65
2023-03-30T10:25:27.3283467Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0004840d0, {0x0, 0x0, 0x0})
2023-03-30T10:25:27.3284295Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap@v1.24.0/zapcore/entry.go:264 +0x3ec
2023-03-30T10:25:27.3285253Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap.(*SugaredLogger).log(0xc0000123a0, 0x4, {0x108f80c?, 0x40dc27?}, {0xc0010fafc0?, 0xeced00?, 0x1?}, {0x0, 0x0, 0x0})
2023-03-30T10:25:27.3286094Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap@v1.24.0/sugar.go:295 +0xee
2023-03-30T10:25:27.3286869Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap.(*SugaredLogger).Panicf(...)
2023-03-30T10:25:27.3287622Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap@v1.24.0/sugar.go:189
2023-03-30T10:25:27.3288638Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/etcd/server/v3/etcdserver.(*zapRaftLogger).Panicf(0xc000460230?, {0x108f80c?, 0x1?}, {0xc0010fafc0?, 0x0?, 0x203000?})
2023-03-30T10:25:27.3289528Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/etcd/server/v3/etcdserver/zap_raft.go:102 +0x49
2023-03-30T10:25:27.3290438Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3.(*raftLog).maybeAppend(0xc000460230, 0x259, 0x2, 0x25a, {0xc001ae8c60?, 0x1, 0x1})
2023-03-30T10:25:27.3291677Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99 +0x190
2023-03-30T10:25:27.3292770Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3.(*raft).handleAppendEntries(0xc0004c9b80, {0x3, 0xf2d60a7c86d2c4e1, 0xb078ae59d09754c4, 0x2, 0x2, 0x259, {0xc001ae8c60, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3293723Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1517 +0x77
2023-03-30T10:25:27.3294782Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3.stepFollower(0xc0004c9b80, {0x3, 0xf2d60a7c86d2c4e1, 0xb078ae59d09754c4, 0x2, 0x2, 0x259, {0xc001ae8c60, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3295693Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1472 +0x2f8
2023-03-30T10:25:27.3296667Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3.(*raft).Step(0xc0004c9b80, {0x3, 0xf2d60a7c86d2c4e1, 0xb078ae59d09754c4, 0x2, 0x2, 0x259, {0xc001ae8c60, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3297577Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1021 +0x1375
2023-03-30T10:25:27.3298364Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3.(*node).run(0xc000492de0)
2023-03-30T10:25:27.3299191Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:357 +0x98a
2023-03-30T10:25:27.3299952Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): created by go.etcd.io/raft/v3.StartNode
2023-03-30T10:25:27.3300787Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:231 +0x65
2023-03-30T10:25:27.3301657Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): panic: index, 6, is out of range [1]
2023-03-30T10:25:27.3302324Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025):
2023-03-30T10:25:27.3303003Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): goroutine 152 [running]:
2023-03-30T10:25:27.3303876Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap/zapcore.CheckWriteAction.OnWrite(0x0?, 0x203000?, {0x0?, 0x0?, 0xc00007c080?})
2023-03-30T10:25:27.3304736Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap@v1.24.0/zapcore/entry.go:198 +0x65
2023-03-30T10:25:27.3305589Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc00048d380, {0x0, 0x0, 0x0})
2023-03-30T10:25:27.3306489Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap@v1.24.0/zapcore/entry.go:264 +0x3ec
2023-03-30T10:25:27.3307446Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap.(*SugaredLogger).log(0xc000486508, 0x4, {0x108f80c?, 0x40dc27?}, {0xc00198f280?, 0xeced00?, 0x1?}, {0x0, 0x0, 0x0})
2023-03-30T10:25:27.3308268Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap@v1.24.0/sugar.go:295 +0xee
2023-03-30T10:25:27.3309043Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap.(*SugaredLogger).Panicf(...)
2023-03-30T10:25:27.3309869Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap@v1.24.0/sugar.go:189
2023-03-30T10:25:27.3310857Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/etcd/server/v3/etcdserver.(*zapRaftLogger).Panicf(0xc00017e4d0?, {0x108f80c?, 0x1?}, {0xc00198f280?, 0x0?, 0x203000?})
2023-03-30T10:25:27.3311754Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/etcd/server/v3/etcdserver/zap_raft.go:102 +0x49
2023-03-30T10:25:27.3312735Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3.(*raftLog).maybeAppend(0xc00017e4d0, 0x25a, 0x2, 0x25a, {0xc0019a90e0?, 0x1, 0x1})
2023-03-30T10:25:27.3313619Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99 +0x190
2023-03-30T10:25:27.3314643Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3.(*raft).handleAppendEntries(0xc00056f1e0, {0x3, 0x9b0ed86c9976a482, 0xb078ae59d09754c4, 0x2, 0x2, 0x25a, {0xc0019a90e0, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3315569Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1517 +0x77
2023-03-30T10:25:27.3316553Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3.stepFollower(0xc00056f1e0, {0x3, 0x9b0ed86c9976a482, 0xb078ae59d09754c4, 0x2, 0x2, 0x25a, {0xc0019a90e0, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3317464Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1472 +0x2f8
2023-03-30T10:25:27.3318537Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3.(*raft).Step(0xc00056f1e0, {0x3, 0x9b0ed86c9976a482, 0xb078ae59d09754c4, 0x2, 0x2, 0x25a, {0xc0019a90e0, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3319435Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1021 +0x1375
2023-03-30T10:25:27.3320224Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3.(*node).run(0xc00055b380)
2023-03-30T10:25:27.3321052Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:357 +0x98a
2023-03-30T10:25:27.3321837Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): created by go.etcd.io/raft/v3.StartNode
2023-03-30T10:25:27.3322660Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:231 +0x65
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 19 (19 by maintainers)
Commits related to this issue
- tests/robustness: Disable testing network blackhole until #15595 is fixed — committed to serathius/etcd by serathius a year ago
- tests/robustness: Disable testing network blackhole until #15595 is fixed Signed-off-by: Marek Siarkowicz <siarkowicz@google.com> — committed to serathius/etcd by serathius a year ago
- tests/robustness: Disable testing network blackhole until #15595 is fixed Signed-off-by: Marek Siarkowicz <siarkowicz@google.com> — committed to serathius/etcd by serathius a year ago
- tests/robustness: Disable testing network blackhole until #15595 is fixed Signed-off-by: Marek Siarkowicz <siarkowicz@google.com> — committed to serathius/etcd by serathius a year ago
- tests/robustness: Disable testing network blackhole until #15595 is fixed Signed-off-by: Marek Siarkowicz <siarkowicz@google.com> — committed to serathius/etcd by serathius a year ago
- Merge pull request #15601 from serathius/robustness-disable-blackhole tests/robustness: Disable testing network blackhole until #15595 is fixed — committed to etcd-io/etcd by serathius a year ago
- Revert "tests/robustness: Disable testing network blackhole until #15595 is fixed" This reverts commit 013e25fab9f76f0c1a00459555fe42b33f379eb9. Signed-off-by: Marek Siarkowicz <siarkowicz@google.co... — committed to serathius/etcd by serathius a year ago
- Revert "tests/robustness: Disable testing network blackhole until #15595 is fixed" This reverts commit 013e25fab9f76f0c1a00459555fe42b33f379eb9. Signed-off-by: Marek Siarkowicz <siarkowicz@google.co... — committed to serathius/etcd by serathius a year ago
- Revert "tests/robustness: Disable testing network blackhole until #15595 is fixed" This reverts commit 013e25fab9f76f0c1a00459555fe42b33f379eb9. Signed-off-by: Marek Siarkowicz <siarkowicz@google.co... — committed to serathius/etcd by serathius a year ago
cc @tbg @pavelkalinnikov