raft: [v2] Rejecting vote request... since we have a leader

I am using the v2-stage-one branch and, while everything seems to work fine for the most part, I do have one issue:

I have a cluster of 3 nodes. I take one node down gracefully (I used Consul's leave/shutdown logic as an example, waiting for the membership change to propagate) and the cluster maintains itself at 2 nodes. If I then try to restart the same node, however (with any combination of ServerID and addr:port), the new node sits there requesting a leader vote forever, while the other two nodes log [WARN] raft: Rejecting vote request from ... since we already have a leader

I used Consul as an example of the implementation, fwiw.
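If I understand the v2 configuration changes correctly, a server that has been removed from the configuration stays removed until the leader adds it back, so the rejected votes may simply mean the restarted node was never re-added. Here is roughly the flow I am attempting, as a sketch against the v2 membership APIs (RemoveServer/AddVoter); the helper names gracefulLeave and rejoinNode are mine, not the library's:

```go
package membership

import (
	"time"

	"github.com/hashicorp/raft"
)

// gracefulLeave is roughly what I do on shutdown: drop this server from the
// cluster configuration, then shut the local raft instance down.
func gracefulLeave(r *raft.Raft, id raft.ServerID) error {
	// Membership changes only succeed on the leader; in practice the
	// leaving node forwards this request to the leader, Consul-style.
	if err := r.RemoveServer(id, 0, 30*time.Second).Error(); err != nil {
		return err
	}
	return r.Shutdown().Error()
}

// rejoinNode is the step that seems to be missing when the node comes back:
// the restarted server is no longer in the configuration, so the current
// leader has to add it again before its vote requests are meaningful.
func rejoinNode(leader *raft.Raft, id raft.ServerID, addr raft.ServerAddress) error {
	return leader.AddVoter(id, addr, 0, 30*time.Second).Error()
}
```

The rejoin step would have to run on (or be forwarded to) the current leader, since configuration changes issued on a follower just return ErrNotLeader.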

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 20 (6 by maintainers)

Most upvoted comments

@superfell So the situation I am imagining is, say, a cluster of three containerized raft nodes. These containers can be stopped (or die) and a new container is started automatically by the orchestration framework. A couple of questions in this regard:

  1. Should the raft node’s ID be ephemeral like a container ID or “sticky” like a specific name?
  2. Should the raft state (i.e. the raft.db and snapshots) be maintained across restarts? And if the ID needs to be “sticky”, does this raft state always need to be associated with the node of the same ID? (See the sketch after this list for what I have in mind.)
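To make the second question concrete, here is roughly what I mean by keeping the state across restarts: the replacement container mounts the same data volume, reuses a stored “sticky” ServerID, and reopens the existing BoltDB store and snapshot directory. This is only a sketch; dataDir, openNode, and how the sticky id gets supplied are assumptions of mine, while the stores are the usual raft-boltdb log/stable store and file snapshot store:

```go
package node

import (
	"os"
	"path/filepath"

	"github.com/hashicorp/raft"
	raftboltdb "github.com/hashicorp/raft-boltdb"
)

// openNode reopens (or creates) the raft state under dataDir, so that a
// restarted container with the same volume comes back with the same ID,
// log, and snapshots. dataDir and the sticky id are supplied by the
// orchestration layer in this sketch.
func openNode(dataDir string, id raft.ServerID, fsm raft.FSM,
	trans raft.Transport) (*raft.Raft, error) {

	conf := raft.DefaultConfig()
	conf.LocalID = id // sticky ID: the same value on every restart

	// Log and stable store persisted in raft.db on the mounted volume;
	// BoltStore serves as both.
	store, err := raftboltdb.NewBoltStore(filepath.Join(dataDir, "raft.db"))
	if err != nil {
		return nil, err
	}

	// Snapshots persisted alongside it.
	snaps, err := raft.NewFileSnapshotStore(dataDir, 2, os.Stderr)
	if err != nil {
		return nil, err
	}

	return raft.NewRaft(conf, fsm, store, store, snaps, trans)
}
```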