etcd: Etcd panic in robustness tests blackhole failpoint

What happened?

Robustness Nightly are failing due to etcd panic. https://github.com/etcd-io/etcd/actions/runs/4562802681

What did you expect to happen?

Etcd should not panic due to network issues

How can we reproduce it (as minimally and precisely as possible)?

Run robustness tests blackhole tests enough times. See https://github.com/etcd-io/etcd/actions/workflows/robustness-nightly.yaml

Anything else we need to know?

No response

Etcd version (please run commands below)

N/A

Etcd configuration (command line flags or environment variables)

N/A

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

N/A

Relevant log output

2023-03-30T10:25:27.3144193Z     logger.go:130: 2023-03-30T10:25:26.312Z	INFO	Triggering failpoint
2023-03-30T10:25:27.3144599Z         	{"failpoint": "blackhole"}
2023-03-30T10:25:27.3145371Z     failpoints.go:328: Blackholing traffic from and to member "TestRobustnessClusterOfSize3HighTraffic-test-1"
2023-03-30T10:25:27.3146769Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): {"level":"warn","ts":"2023-03-30T10:25:26.811397Z","caller":"etcdserver/v3_server.go:831","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":11854185818228622438,"retry-timeout":"500ms"}
2023-03-30T10:25:27.3148279Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): {"level":"warn","ts":"2023-03-30T10:25:26.812944Z","caller":"etcdserver/v3_server.go:831","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":14186768950229826371,"retry-timeout":"500ms"}
2023-03-30T10:25:27.3149756Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-1) (28035): {"level":"warn","ts":"2023-03-30T10:25:26.813508Z","caller":"etcdserver/v3_server.go:831","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6108155643657295548,"retry-timeout":"500ms"}
2023-03-30T10:25:27.3265010Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): {"level":"warn","ts":"2023-03-30T10:25:27.312399Z","caller":"etcdserver/v3_server.go:831","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":11854185818228622438,"retry-timeout":"500ms"}
2023-03-30T10:25:27.3266115Z     failpoints.go:332: Traffic restored from and to member "TestRobustnessClusterOfSize3HighTraffic-test-1"
2023-03-30T10:25:27.3267790Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): {"level":"info","ts":"2023-03-30T10:25:27.314167Z","caller":"traceutil/trace.go:171","msg":"trace[1342822156] linearizableReadLoop","detail":"{readStateIndex:601; appliedIndex:602; }","duration":"1.003816961s","start":"2023-03-30T10:25:26.310341Z","end":"2023-03-30T10:25:27.314158Z","steps":["trace[1342822156] 'read index received'  (duration: 1.003814761s)","trace[1342822156] 'applied index is now lower than readState.Index'  (duration: 1.5µs)"],"step_count":2}
2023-03-30T10:25:27.3269807Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-1) (28035): {"level":"info","ts":"2023-03-30T10:25:27.313778Z","caller":"traceutil/trace.go:171","msg":"trace[1664249991] linearizableReadLoop","detail":"{readStateIndex:601; appliedIndex:602; }","duration":"1.000933919s","start":"2023-03-30T10:25:26.31283Z","end":"2023-03-30T10:25:27.313764Z","steps":["trace[1664249991] 'read index received'  (duration: 1.000931819s)","trace[1664249991] 'applied index is now lower than readState.Index'  (duration: 1.5µs)"],"step_count":2}
2023-03-30T10:25:27.3271798Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): {"level":"info","ts":"2023-03-30T10:25:27.315185Z","caller":"traceutil/trace.go:171","msg":"trace[2074038161] linearizableReadLoop","detail":"{readStateIndex:601; appliedIndex:601; }","duration":"1.002760046s","start":"2023-03-30T10:25:26.312412Z","end":"2023-03-30T10:25:27.315172Z","steps":["trace[2074038161] 'read index received'  (duration: 1.002757745s)","trace[2074038161] 'applied index is now lower than readState.Index'  (duration: 1.601µs)"],"step_count":2}
2023-03-30T10:25:27.3274893Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): {"level":"panic","ts":"2023-03-30T10:25:27.31731Z","logger":"raft","caller":"v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99","msg":"index, 7, is out of range [1]","stacktrace":"go.etcd.io/raft/v3.(*raftLog).maybeAppend\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99\ngo.etcd.io/raft/v3.(*raft).handleAppendEntries\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1517\ngo.etcd.io/raft/v3.stepFollower\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1472\ngo.etcd.io/raft/v3.(*raft).Step\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1021\ngo.etcd.io/raft/v3.(*node).run\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:357"}
2023-03-30T10:25:27.3277939Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): {"level":"panic","ts":"2023-03-30T10:25:27.317178Z","logger":"raft","caller":"v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99","msg":"index, 6, is out of range [1]","stacktrace":"go.etcd.io/raft/v3.(*raftLog).maybeAppend\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99\ngo.etcd.io/raft/v3.(*raft).handleAppendEntries\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1517\ngo.etcd.io/raft/v3.stepFollower\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1472\ngo.etcd.io/raft/v3.(*raft).Step\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1021\ngo.etcd.io/raft/v3.(*node).run\n\tgo.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:357"}
2023-03-30T10:25:27.3279523Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): panic: index, 7, is out of range [1]
2023-03-30T10:25:27.3280186Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 
2023-03-30T10:25:27.3280872Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): goroutine 138 [running]:
2023-03-30T10:25:27.3281737Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap/zapcore.CheckWriteAction.OnWrite(0x0?, 0x0?, {0x0?, 0x0?, 0xc0010faee0?})
2023-03-30T10:25:27.3282611Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.uber.org/zap@v1.24.0/zapcore/entry.go:198 +0x65
2023-03-30T10:25:27.3283467Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0004840d0, {0x0, 0x0, 0x0})
2023-03-30T10:25:27.3284295Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.uber.org/zap@v1.24.0/zapcore/entry.go:264 +0x3ec
2023-03-30T10:25:27.3285253Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap.(*SugaredLogger).log(0xc0000123a0, 0x4, {0x108f80c?, 0x40dc27?}, {0xc0010fafc0?, 0xeced00?, 0x1?}, {0x0, 0x0, 0x0})
2023-03-30T10:25:27.3286094Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.uber.org/zap@v1.24.0/sugar.go:295 +0xee
2023-03-30T10:25:27.3286869Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.uber.org/zap.(*SugaredLogger).Panicf(...)
2023-03-30T10:25:27.3287622Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.uber.org/zap@v1.24.0/sugar.go:189
2023-03-30T10:25:27.3288638Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/etcd/server/v3/etcdserver.(*zapRaftLogger).Panicf(0xc000460230?, {0x108f80c?, 0x1?}, {0xc0010fafc0?, 0x0?, 0x203000?})
2023-03-30T10:25:27.3289528Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.etcd.io/etcd/server/v3/etcdserver/zap_raft.go:102 +0x49
2023-03-30T10:25:27.3290438Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3.(*raftLog).maybeAppend(0xc000460230, 0x259, 0x2, 0x25a, {0xc001ae8c60?, 0x1, 0x1})
2023-03-30T10:25:27.3291677Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99 +0x190
2023-03-30T10:25:27.3292770Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3.(*raft).handleAppendEntries(0xc0004c9b80, {0x3, 0xf2d60a7c86d2c4e1, 0xb078ae59d09754c4, 0x2, 0x2, 0x259, {0xc001ae8c60, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3293723Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1517 +0x77
2023-03-30T10:25:27.3294782Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3.stepFollower(0xc0004c9b80, {0x3, 0xf2d60a7c86d2c4e1, 0xb078ae59d09754c4, 0x2, 0x2, 0x259, {0xc001ae8c60, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3295693Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1472 +0x2f8
2023-03-30T10:25:27.3296667Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3.(*raft).Step(0xc0004c9b80, {0x3, 0xf2d60a7c86d2c4e1, 0xb078ae59d09754c4, 0x2, 0x2, 0x259, {0xc001ae8c60, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3297577Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1021 +0x1375
2023-03-30T10:25:27.3298364Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): go.etcd.io/raft/v3.(*node).run(0xc000492de0)
2023-03-30T10:25:27.3299191Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:357 +0x98a
2023-03-30T10:25:27.3299952Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): created by go.etcd.io/raft/v3.StartNode
2023-03-30T10:25:27.3300787Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-0) (28026): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:231 +0x65
2023-03-30T10:25:27.3301657Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): panic: index, 6, is out of range [1]
2023-03-30T10:25:27.3302324Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 
2023-03-30T10:25:27.3303003Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): goroutine 152 [running]:
2023-03-30T10:25:27.3303876Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap/zapcore.CheckWriteAction.OnWrite(0x0?, 0x203000?, {0x0?, 0x0?, 0xc00007c080?})
2023-03-30T10:25:27.3304736Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.uber.org/zap@v1.24.0/zapcore/entry.go:198 +0x65
2023-03-30T10:25:27.3305589Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc00048d380, {0x0, 0x0, 0x0})
2023-03-30T10:25:27.3306489Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.uber.org/zap@v1.24.0/zapcore/entry.go:264 +0x3ec
2023-03-30T10:25:27.3307446Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap.(*SugaredLogger).log(0xc000486508, 0x4, {0x108f80c?, 0x40dc27?}, {0xc00198f280?, 0xeced00?, 0x1?}, {0x0, 0x0, 0x0})
2023-03-30T10:25:27.3308268Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.uber.org/zap@v1.24.0/sugar.go:295 +0xee
2023-03-30T10:25:27.3309043Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.uber.org/zap.(*SugaredLogger).Panicf(...)
2023-03-30T10:25:27.3309869Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.uber.org/zap@v1.24.0/sugar.go:189
2023-03-30T10:25:27.3310857Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/etcd/server/v3/etcdserver.(*zapRaftLogger).Panicf(0xc00017e4d0?, {0x108f80c?, 0x1?}, {0xc00198f280?, 0x0?, 0x203000?})
2023-03-30T10:25:27.3311754Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.etcd.io/etcd/server/v3/etcdserver/zap_raft.go:102 +0x49
2023-03-30T10:25:27.3312735Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3.(*raftLog).maybeAppend(0xc00017e4d0, 0x25a, 0x2, 0x25a, {0xc0019a90e0?, 0x1, 0x1})
2023-03-30T10:25:27.3313619Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/log.go:99 +0x190
2023-03-30T10:25:27.3314643Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3.(*raft).handleAppendEntries(0xc00056f1e0, {0x3, 0x9b0ed86c9976a482, 0xb078ae59d09754c4, 0x2, 0x2, 0x25a, {0xc0019a90e0, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3315569Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1517 +0x77
2023-03-30T10:25:27.3316553Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3.stepFollower(0xc00056f1e0, {0x3, 0x9b0ed86c9976a482, 0xb078ae59d09754c4, 0x2, 0x2, 0x25a, {0xc0019a90e0, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3317464Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1472 +0x2f8
2023-03-30T10:25:27.3318537Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3.(*raft).Step(0xc00056f1e0, {0x3, 0x9b0ed86c9976a482, 0xb078ae59d09754c4, 0x2, 0x2, 0x25a, {0xc0019a90e0, 0x1, 0x1}, ...})
2023-03-30T10:25:27.3319435Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/raft.go:1021 +0x1375
2023-03-30T10:25:27.3320224Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): go.etcd.io/raft/v3.(*node).run(0xc00055b380)
2023-03-30T10:25:27.3321052Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:357 +0x98a
2023-03-30T10:25:27.3321837Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): created by go.etcd.io/raft/v3.StartNode
2023-03-30T10:25:27.3322660Z /home/runner/work/etcd/etcd/bin/etcd (TestRobustnessClusterOfSize3HighTraffic-test-2) (28025): 	go.etcd.io/raft/v3@v3.0.0-20221201111702-eaa6808e1f7a/node.go:231 +0x65

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 19 (19 by maintainers)

Commits related to this issue

Most upvoted comments