seaweedfs: [bug:filer] Continues to stick not to the leader raft.Server: Not current leader (critical)
Describe the bug
After shutting down one volume server, only one filer lost the leader.
Jan 13, 2022 @ 09:10:35.681 | I0113 04:10:35 1 common.go:69] response method:PUT URL:/buckets/reports/report_631214642010_e674c402-c87e-4443-b942-d39d6417225c.pdf with httpStatus:500 and JSON:{"error":"rpc error: code = Unknown desc = raft.Server: Not current leader"}
-- | --
| Jan 13, 2022 @ 09:10:35.681 | E0113 04:10:35 1 s3api_object_handlers.go:421] upload to filer error: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 09:10:35.681 | E0113 04:10:34 1 filer_server_handlers_write.go:43] failing to assign a file id: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 09:10:35.681 | E0113 04:10:35 1 filer_server_handlers_write.go:43] failing to assign a file id: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 09:10:35.681 | E0113 04:10:35 1 filer_server_handlers_write_upload.go:172] upload error: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 09:10:34.680 | E0113 04:10:34 1 filer_server_handlers_write_upload.go:172] upload error: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 09:10:34.680 | I0113 04:10:34 1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
System Setup weed version
version 30GB 2.85 ea8e4ec2 linux amd64
Additional context
logs
Jan 13, 2022 @ 05:02:29.653 | E0113 00:01:24 1 filer_grpc_server_sub_meta.go:133] processed to 2022-01-13 00:01:23.092106505 +0000 UTC: rpc error: code = Unavailable desc = transport is closing
-- | --
| Jan 13, 2022 @ 05:02:29.653 | I0113 00:01:23 1 filer_grpc_server_sub_meta.go:226] => client filer:10.106.65.20:9090@10.106.65.20:47818: rpc error: code = Unavailable desc = transport is closing
| Jan 13, 2022 @ 05:02:29.653 | I0113 00:02:18 1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 05:02:29.653 | I0113 00:01:23 1 filer_grpc_server_sub_meta.go:226] => client filer:10.106.65.121:9090@10.106.65.121:53630: rpc error: code = Unavailable desc = transport is closing
| Jan 13, 2022 @ 05:02:29.653 | E0113 00:01:24 1 filer_grpc_server_sub_meta.go:133] processed to 2022-01-13 00:01:23.092106505 +0000 UTC: rpc error: code = Unavailable desc = transport is closing
| Jan 13, 2022 @ 05:02:29.653 | I0113 00:01:24 1 filer_grpc_server_sub_meta.go:255] - listener filer:10.106.65.121:9090@10.106.65.121:53630
| Jan 13, 2022 @ 05:02:29.653 | I0113 00:01:24 1 filer_grpc_server_sub_meta.go:255] - listener filer:10.106.65.20:9090@10.106.65.20:47818
| Jan 13, 2022 @ 05:02:30.493 | I0113 00:02:19 1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 05:02:30.493 | I0113 00:02:19 1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:22 1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:21 1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:23 1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:20 1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:22 1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
| Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:24 1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
About this issue
- State: closed
- Created 2 years ago
- Comments: 32 (21 by maintainers)
Commits related to this issue
- avoid set currentMaster k8s svc.local discoveruy service domains https://github.com/chrislusf/seaweedfs/issues/2589 — committed to kmlebedev/seaweedfs by kmlebedev 2 years ago
From the logs, I found a lot of `transport is closing` and `goaway` messages. I guess they are triggered when the grpc connection expires. Whenever the connection expires, `mc.currentMaster` may be updated to a wrong value (not the real leader, because there is no check when updating `mc.currentMaster`; this was added in #3228), which results in a lot of incomprehensible logs.

Now, if the connection is dropped because the grpc connection expired, the master is reset, and every assignRequest blocks and sleeps until the client has connected to the new master. Depending on how long it takes to connect to the leader, this may be a problem. I thought of an optimized way
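
To make the failure mode described above concrete, here is a minimal Go sketch of guarding the update with a leader check. It is not the SeaweedFS code and not the optimization the commenter goes on to propose. The names `masterClient`, `leaderProbe`, `setCurrentMasterIfLeader` and the `master-N:9333` addresses are all hypothetical, and the probe is a stub standing in for whatever call would ask a master who the raft leader is.

```go
package main

import (
	"fmt"
	"sync"
)

// masterClient is a hypothetical stand-in for a client that remembers
// which master it should send assign requests to.
type masterClient struct {
	mu            sync.RWMutex
	currentMaster string
}

// leaderProbe is a hypothetical callback: given a master address, it
// returns the address that master believes is the raft leader.
type leaderProbe func(addr string) (leader string, err error)

// setCurrentMasterIfLeader updates currentMaster only when the candidate
// reports itself as the leader. It returns true if the update happened.
func (mc *masterClient) setCurrentMasterIfLeader(candidate string, probe leaderProbe) bool {
	leader, err := probe(candidate)
	if err != nil || leader != candidate {
		// Unreachable or a follower: keep the previous value.
		return false
	}
	mc.mu.Lock()
	mc.currentMaster = candidate
	mc.mu.Unlock()
	return true
}

// getCurrentMaster returns the currently remembered master address.
func (mc *masterClient) getCurrentMaster() string {
	mc.mu.RLock()
	defer mc.mu.RUnlock()
	return mc.currentMaster
}

func main() {
	// Simulated cluster in which master-1 is the leader (assumption).
	probe := func(addr string) (string, error) {
		return "master-1:9333", nil
	}

	mc := &masterClient{currentMaster: "master-1:9333"}

	// A reconnect that happens to land on a follower must not overwrite
	// the remembered leader; a reconnect to the leader is accepted.
	for _, candidate := range []string{"master-2:9333", "master-1:9333"} {
		ok := mc.setCurrentMasterIfLeader(candidate, probe)
		fmt.Printf("candidate=%s accepted=%v current=%s\n",
			candidate, ok, mc.getCurrentMaster())
	}
}
```

With a check like this, a reconnect triggered by an expired connection would leave the previously known leader in place instead of pointing assign requests at a follower, which is the situation that yields `raft.Server: Not current leader`.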