moby: The swarm does not have a leader after demoting the Leader

Description

I have a swarm cluster made up of three nodes: two managers and one worker. The node master is currently the leader. After running “docker node demote master” on the slave node, I get an error on the master node when running “docker node ls”:

root@master:~# docker node ls
Error response from daemon: rpc error: code = 2 desc = The swarm does not have a leader. It’s possible that too few managers are online. Make sure more than half of the managers are online.

But when I try again, the error no longer appears.

Steps to reproduce the issue:

  1. Run “docker node demote master” on the slave node.
  2. Run “docker node ls” on the master node.

Describe the results you received: the cluster has no leader.

Describe the results you expected: the slave should become the leader.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

root@ubuntu:~# docker version
Client:
 Version:      17.06.0-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:23:31 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.0-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:19:04 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

root@ubuntu:~# docker info
Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 2
Server Version: 17.06.0-ce
Storage Driver: devicemapper
 Pool Name: docker-thinpool
 Pool Blocksize: 524.3kB
 Base Device Size: 10.74GB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 300.9MB
 Data Space Total: 10.2GB
 Data Space Available: 9.895GB
 Metadata Space Used: 200.7kB
 Metadata Space Total: 104.9MB
 Metadata Space Available: 104.7MB
 Thin Pool Minimum Free Space: 1.019GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.110 (2015-10-30)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: i0i4rmmpk3d2xzf570m793zz6
 Is Manager: false
 Node Address: 192.168.200.254
 Manager Addresses:
  192.168.200.85:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-80-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.796GiB
Name: ubuntu
ID: 4K7T:EBWI:6YB3:K7AP:C63Q:YTCU:TVNL:HJ56:3PPM:NK4F:GTDX:J2DR
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 docker.hikvision.com.cn
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

About this issue

  • Original URL
  • State: open
  • Created 7 years ago
  • Reactions: 6
  • Comments: 22 (5 by maintainers)

Most upvoted comments

docker swarm init --force-new-cluster will create a new swarm with the same dataset, so you don’t lose your services and data.
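Roughly, on a surviving manager node (a sketch only; the IP address and node name below are placeholders):

# Run this on a node that still has manager state; 192.0.2.10 stands in for that node's own IP.
docker swarm init --force-new-cluster --advertise-addr 192.0.2.10
# The node becomes the sole manager of a one-manager cluster that keeps the old swarm's
# services, networks, and secrets.
docker node ls
docker service ls
# Promote additional nodes afterwards to get back to fault tolerance.
docker node promote <node-name>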

@xpepermint Don’t run a swarm with 2 managers (or any even number, but 2 is really bad). The same advice holds for any quorum-based system (etcd, consul, etc).
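To put numbers on that: Raft needs a strict majority, i.e. floor(N/2) + 1 managers reachable, so 2 managers tolerate zero failures while 3 tolerate one. A small sketch (the node name is a placeholder):

# N=2 -> majority is 2, so losing either manager leaves the swarm without a leader.
# N=3 -> majority is 2, so one manager can fail and a leader can still be elected.
docker node promote worker-1   # placeholder name: promote a third node rather than running two managers
docker node ls                 # MANAGER STATUS should show one Leader and two Reachable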

My case:

  • Create 1 manager, 2 workers.
  • Add 1 manager to the cluster (status Reachable).
  • Remove the new manager (docker-machine rm {name}).
  • Log in to the first manager and run docker node ls => same error (rough commands sketched below).
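Roughly what that looks like as commands (a sketch only; machine and node names are placeholders, and the setup steps are simplified):

# Assumed machines: manager1 (first manager), manager2 (the manager that gets removed).
docker-machine ssh manager1 "docker node promote manager2"   # manager2 shows up as Reachable
docker-machine rm -y manager2                                # removed without demoting it first
docker-machine ssh manager1 "docker node ls"
# => Error response from daemon: rpc error: ... The swarm does not have a leader.
# Demoting the node (docker node demote manager2) before removing its machine avoids this.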

@cpuguy83

I lost two out of three managers and cannot do anything; even your suggested command fails:

# docker swarm init --force-new-cluster --advertise-addr 192.168.4.102
Error response from daemon: This node is not a swarm manager. Worker nodes can't be used to view or modify cluster state. Please run this command on a manager node or promote the current node to a manager.

But this node is a swarm manager; there’s just no quorum or leader. Why does it not force the new cluster?

@cpuguy83 Can you please have a look at my stackoverflow question? I have explained the whole scenario. Thanks!

To be CLEAR! If you do what Riza-Aslan suggests, you WILL lose control of your entire running cluster! Sadly, this does appear to be the only way to “recover” your management cluster once it’s split-brain. Sigh.

Here’s our experience tonight, and hopefully it will help you. Our Docker Swarm management cluster (three nodes) just inexplicably went split-brain. (We’ll try to forensically discover what happened later, but, as with us, numerous people are experiencing Docker managers apparently just shooting themselves in the head.)

After much searching for potential fixes without the force-recreate approach, we opted for the nuclear option, and, yes, the swarm config is retained. However, and this is CRITICAL to understand, the new management cluster that is thus created CANNOT communicate with the old workers and vice-versa. This fact instantly brings “down” all of your running worker instances, as they no longer have networking. The new management cluster’s certificates and networking are different from what the old cluster’s workers are using. So, yes, we have just verified that the entire cluster’s running containers are effectively disconnected when you create a new management cluster. All of your workers are running, but they are disconnected and cannot be reconnected while keeping the containers running.

The only way to recover from this MESS is to restart docker on each of your worker servers (like: systemctl restart docker), then forcibly make each of those nodes leave the old cluster (like: docker swarm leave --force), then rejoin each node to the new cluster (like: docker swarm join --token [your worker token] [your leader IP address]:2377), then, (gasp!) finally, restart each of your containers.
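Condensed into commands, that per-worker pass looks roughly like this (the join token and leader IP below are placeholders; get the real token with docker swarm join-token worker on the new manager):

# On each old worker, one at a time:
sudo systemctl restart docker
docker swarm leave --force
docker swarm join --token SWMTKN-1-<worker-token> 192.0.2.10:2377   # placeholder token and new leader IP
# Then restart or redeploy whatever containers/services were running on this node.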

Now, you’d think that you are done. You are not. Almost certainly, the networking devices from the old cluster will still hang around (they even survive a server reboot!), which means that as you start bringing up your containers again, you are going to get increasing piles of “network sandbox join failed” errors.

This new nightmare is because on random hosts in /sys/class/net/ there are a bunch of old vx- and veth files (the networking device files) left over from the old cluster. Heaven help you track down which are the “old” ones to differentiate them from the ones in use by some of your running containers on the host. Like us, you’ll find numerous walk-throughs online (all inconsistent with each other) trying to “help” you accomplish this (why Docker should even put you through all this is incomprehensible).

We eventually figured out how to detect the bad ones and delete them, while leaving the running containers still running. The following will clean up your networking mess and thereby enable you to bring up all of your old-cluster containers:

First, on each of your hosts, check for these detritus vx- and veth files:

ip addr | grep " vx-\| veth" | grep " DOWN " | awk '{print $2}' | rev | cut -c2- | rev | cut -f1 -d"@"

If you get any results, then issue this command:

ip addr | grep " vx-\| veth" | grep " DOWN " | awk '{print $2}' | rev | cut -c2- | rev | cut -f1 -d"@" | xargs -L1 sudo ip link delete
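As a sanity check afterwards, the same detection pipeline should come back empty:

ip addr | grep " vx-\| veth" | grep " DOWN "   # no output means no leftover devices stuck in DOWN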

Fixed! All Docker networking detritus is removed, and you can proceed to resurrect your containers. Eventually, depending upon how many containers you have on however many hosts you have, you’ll get your entire empire back up in the new management cluster.

To be clear, because so many people vaguely imply that recreating your management cluster will leave your existing containers running beautifully, your previous cluster is functionally GONE (and Docker leaves a networking devices MESS behind)! It can be restarted as described above via the old swarm config; you don’t lose your cluster config. But the idea that this approach just keeps everything in the old cluster running wonderfully with no downtime is FALSE.

Understand this: If you create a new management cluster, your entire empire IS going down. This “fix” is the nuclear option.

Now, sadly, this approach appears to be the only way to recover from your management cluster going split-brain (which is a ridiculous state of affairs in the first place).

IMO, Docker Swarm is just not ready for prime-time as an enterprise-grade cluster/container approach. The fact that it is possible to trivially (through no apparent fault of your own) have your management cluster suddenly go brainless is an outrage. And “fixing” the problem by recreating your management cluster is NOT a FIX! It’s a forced recreation of your entire enterprise almost from scratch. This should never need to happen. But if you run Docker Swarm long enough, it WILL happen to you. And you WILL plunge into a Hell the scope of which is precisely defined by the size and scope of your containerization empire. In our case, this was half a night in Hell.

This event was the last straw for us. Moving to Kubernetes. Good luck to you hardy souls staying on Docker Swarm!

I lost two out of three managers and cannot do anything; even your suggested command fails:

# docker swarm init --force-new-cluster --advertise-addr 192.168.4.102
Error response from daemon: This node is not a swarm manager. Worker nodes can't be used to view or modify cluster state. Please run this command on a manager node or promote the current node to a manager.

But this node is a swarm manager; there’s just no quorum or leader. Why does it not force the new cluster?

Rerunning the command after a restart of the Docker service helped in my case.
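In other words, roughly this sequence on the node that still has local swarm state under /var/lib/docker/swarm (the advertise address is the one from the quoted command; use your own):

sudo systemctl restart docker
# After the restart the daemon picked up its manager state again and the command went through:
docker swarm init --force-new-cluster --advertise-addr 192.168.4.102
docker node ls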

Hi all!

We’re running a three node swarm with a single manager. For some reason our manager got demoted but is still in the swarm, and the swarm now has no manager. I tried the docker swarm init --force-new-cluster without success.

$ docker swarm init --force-new-cluster --advertise-addr eth1:2377
Error response from daemon: This node is not a swarm manager. Worker nodes can't be used to view or modify cluster state. Please run this command on a manager node or promote the current node to a manager.

Is there a way to recover without losing all our services and data?
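We found Docker’s swarm disaster-recovery docs, which suggest roughly the following, assuming a backup of /var/lib/docker/swarm taken while this node was still a manager (the backup path below is a placeholder), but we’d like to confirm before trying it:

sudo systemctl stop docker
sudo rm -rf /var/lib/docker/swarm
sudo cp -r /path/to/swarm-backup /var/lib/docker/swarm   # placeholder path to your backup
sudo systemctl start docker
docker swarm init --force-new-cluster --advertise-addr eth1:2377
docker node ls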

@cpuguy83 That didn’t work for me. Can you please have a look at my stackoverflow question and help me out? I’ve been stuck on it for a week and I’m really frustrated. Thanks!