seaweedfs: [bug:filer] Filer continues to stick to a non-leader master: raft.Server: Not current leader (critical)

Describe the bug

After shutting down one volume server, one filer (and only one) lost the leader.

Timestamp | Log line
-- | --
Jan 13, 2022 @ 09:10:35.681 | I0113 04:10:35     1 common.go:69] response method:PUT URL:/buckets/reports/report_631214642010_e674c402-c87e-4443-b942-d39d6417225c.pdf with httpStatus:500 and JSON:{"error":"rpc error: code = Unknown desc = raft.Server: Not current leader"}
Jan 13, 2022 @ 09:10:35.681 | E0113 04:10:35     1 s3api_object_handlers.go:421] upload to filer error: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 09:10:35.681 | E0113 04:10:34     1 filer_server_handlers_write.go:43] failing to assign a file id: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 09:10:35.681 | E0113 04:10:35     1 filer_server_handlers_write.go:43] failing to assign a file id: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 09:10:35.681 | E0113 04:10:35     1 filer_server_handlers_write_upload.go:172] upload error: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 09:10:34.680 | E0113 04:10:34     1 filer_server_handlers_write_upload.go:172] upload error: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 09:10:34.680 | I0113 04:10:34     1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader

System Setup

weed version:

version 30GB 2.85 ea8e4ec2 linux amd64

Additional context

logs


Timestamp | Log line
-- | --
Jan 13, 2022 @ 05:02:29.653 | E0113 00:01:24     1 filer_grpc_server_sub_meta.go:133] processed to 2022-01-13 00:01:23.092106505 +0000 UTC: rpc error: code = Unavailable desc = transport is closing
Jan 13, 2022 @ 05:02:29.653 | I0113 00:01:23     1 filer_grpc_server_sub_meta.go:226] => client filer:10.106.65.20:9090@10.106.65.20:47818: rpc error: code = Unavailable desc = transport is closing
Jan 13, 2022 @ 05:02:29.653 | I0113 00:02:18     1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 05:02:29.653 | I0113 00:01:23     1 filer_grpc_server_sub_meta.go:226] => client filer:10.106.65.121:9090@10.106.65.121:53630: rpc error: code = Unavailable desc = transport is closing
Jan 13, 2022 @ 05:02:29.653 | E0113 00:01:24     1 filer_grpc_server_sub_meta.go:133] processed to 2022-01-13 00:01:23.092106505 +0000 UTC: rpc error: code = Unavailable desc = transport is closing
Jan 13, 2022 @ 05:02:29.653 | I0113 00:01:24     1 filer_grpc_server_sub_meta.go:255] - listener filer:10.106.65.121:9090@10.106.65.121:53630
Jan 13, 2022 @ 05:02:29.653 | I0113 00:01:24     1 filer_grpc_server_sub_meta.go:255] - listener filer:10.106.65.20:9090@10.106.65.20:47818
Jan 13, 2022 @ 05:02:30.493 | I0113 00:02:19     1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 05:02:30.493 | I0113 00:02:19     1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:22     1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:21     1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:23     1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:20     1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:22     1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader
Jan 13, 2022 @ 05:02:30.494 | I0113 00:02:24     1 filer_notify.go:103] log write failed /topics/.system/log/2022-01-13/00-01.cec7f54d: AssignVolume: rpc error: code = Unknown desc = raft.Server: Not current leader

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 32 (21 by maintainers)

Most upvoted comments

Looking at the logs, I found many "transport is closing" and goaway messages. I guess they are triggered when the grpc connection expires. Whenever the connection expires, mc.currentMaster may be updated to a wrong value (not the real leader, because there is no check when updating mc.currentMaster; this was added in #3228), which results in a lot of confusing log output.
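The failure mode above can be pictured with a small Go sketch. This is an illustration under stated assumptions, not SeaweedFS's actual code: the type `masterClient`, the methods `setCurrentMaster` and `getCurrentMaster`, the `isLeader` callback, and the addresses are all hypothetical; only the field name `currentMaster` comes from the comment. It shows one way to guard the update so that a non-leader never replaces the cached master.

```go
package main

import (
	"fmt"
	"sync"
)

// masterClient is a hypothetical stand-in for a client that caches the
// address of the master it believes is the Raft leader.
type masterClient struct {
	mu            sync.RWMutex
	currentMaster string
}

// setCurrentMaster stores candidate only when isLeader confirms it.
// It reports whether the cached value was updated.
func (mc *masterClient) setCurrentMaster(candidate string, isLeader func(string) bool) bool {
	if !isLeader(candidate) {
		return false // keep the previous value instead of caching a follower
	}
	mc.mu.Lock()
	defer mc.mu.Unlock()
	mc.currentMaster = candidate
	return true
}

// getCurrentMaster returns the cached leader address ("" if unknown).
func (mc *masterClient) getCurrentMaster() string {
	mc.mu.RLock()
	defer mc.mu.RUnlock()
	return mc.currentMaster
}

func main() {
	// Assumption: some leadership probe exists; here it is a fixed lookup.
	isLeader := func(addr string) bool { return addr == "master-1:9333" }

	mc := &masterClient{}
	mc.setCurrentMaster("master-1:9333", isLeader)
	mc.setCurrentMaster("master-0:9333", isLeader) // follower, ignored
	fmt.Println("current master:", mc.getCurrentMaster())
}
```

Without the `isLeader` check, the second call would overwrite the cache with a follower, and later assign requests sent to it would presumably fail with "Not current leader" as in the logs.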

Now, if the connection is dropped because the grpc connection expired, the master is reset, and every assignRequest blocks (sleeping) until a connection to the new master is established. Depending on how long it takes to connect to the leader, this may be a problem. I thought of an optimized way
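The blocking behaviour described in this comment (not the optimization it goes on to mention) can be sketched as follows. Everything here is hypothetical and only meant to make the cost visible: `masterClient`, `newMasterClient`, `resetMaster`, `setMaster`, `waitForMaster`, `defaultPoll`, and the 50 ms reconnect delay are assumptions, not SeaweedFS APIs or measured values.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultPoll is an assumed polling interval for this sketch.
const defaultPoll = 10 * time.Millisecond

// masterClient is a hypothetical client that caches the leader address.
type masterClient struct {
	mu            sync.RWMutex
	currentMaster string
	poll          time.Duration
}

func newMasterClient(poll time.Duration) *masterClient {
	return &masterClient{poll: poll}
}

// resetMaster clears the cached leader, as happens when the connection drops.
func (mc *masterClient) resetMaster() { mc.setMaster("") }

// setMaster records a newly connected leader.
func (mc *masterClient) setMaster(addr string) {
	mc.mu.Lock()
	defer mc.mu.Unlock()
	mc.currentMaster = addr
}

// waitForMaster sleeps in a loop until a leader address is cached.
// Every caller that needs to assign a file id is stalled here.
func (mc *masterClient) waitForMaster() string {
	for {
		mc.mu.RLock()
		addr := mc.currentMaster
		mc.mu.RUnlock()
		if addr != "" {
			return addr
		}
		time.Sleep(mc.poll)
	}
}

func main() {
	mc := newMasterClient(defaultPoll)
	mc.setMaster("master-1:9333")
	mc.resetMaster() // simulated expired connection

	go func() {
		time.Sleep(50 * time.Millisecond) // simulated reconnect delay
		mc.setMaster("master-1:9333")
	}()

	start := time.Now()
	addr := mc.waitForMaster()
	fmt.Printf("assign can proceed against %s after waiting %v\n", addr, time.Since(start))
}
```

In this sketch the wait is bounded only by how quickly the reconnect goroutine finishes, which mirrors the concern that request latency depends on how long it takes to reach the leader.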