moby: Unable to start dockerd on swarm manager. Error says "tocommit(150010) is out of range [lastIndex(78775)]. Was the raft log corrupted"
Description
I am unable to start docker on one of the manager nodes of my swarm (IMPORTANT: docker swarm mode, not docker swarm).
Output of docker version:
Client:
Version: 1.12.3
API version: 1.24
Go version: go1.6.3
Git commit: 6b644ec
Built: Wed Oct 26 22:01:48 2016
OS/Arch: linux/amd64
Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Output of docker info:
I’m unable to run this command because docker won’t start
Additional environment details (AWS, VirtualBox, physical, etc.): 7-node swarm with 3 manager nodes, running in a dedicated VPS on AWS. All nodes are running Ubuntu 16.04.1 LTS.
Output of dockerd:
INFO[0000] libcontainerd: new containerd process, pid: 1693
WARN[0000] containerd: low RLIMIT_NOFILE changing to max current=1024 max=65536
INFO[0001] [graphdriver] using prior storage driver "aufs"
INFO[0001] Graph migration to content-addressability took 0.00 seconds
WARN[0001] Your kernel does not support swap memory limit.
INFO[0001] Loading containers: start.
.................................................................................INFO[0001] Firewalld running: false
INFO[0001] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
INFO[0001] Loading containers: done.
INFO[0001] Listening for local connections addr=/var/lib/docker/swarm/control.sock proto=unix
INFO[0001] Listening for connections addr=[::]:2377 proto=tcp
WARN[0001] ignoring request to join cluster, because raft state already exists
INFO[0001] 52b75a6e6dd83823 became follower at term 2
INFO[0001] newRaft 52b75a6e6dd83823 [peers: [8475b40f5c3b344,197af42d1ac22e90,52b75a6e6dd83823], term: 2, commit: 78774, applied: 70000, lastindex: 78775, lastterm: 2]
INFO[0002] 52b75a6e6dd83823 [term: 2] received a MsgHeartbeat message with higher term from 197af42d1ac22e90 [term: 14]
INFO[0002] 52b75a6e6dd83823 became follower at term 14
PANI[0002] tocommit(150010) is out of range [lastIndex(78775)]. Was the raft log corrupted, truncated, or lost?
panic: (*logrus.Entry) (0x1d275e0,0xc824fda400)
goroutine 520 [running]:
panic(0x1d275e0, 0xc824fda400)
/usr/local/go/src/runtime/panic.go:481 +0x3e6
github.com/Sirupsen/logrus.Entry.log(0xc82004c1c0, 0xc8203c68d0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc8246d9a40, ...)
/usr/src/docker/vendor/src/github.com/Sirupsen/logrus/entry.go:113 +0x62c
github.com/Sirupsen/logrus.(*Entry).Panic(0xc82004da00, 0xc82527c3b8, 0x1, 0x1)
/usr/src/docker/vendor/src/github.com/Sirupsen/logrus/entry.go:158 +0x99
github.com/Sirupsen/logrus.(*Entry).Panicf(0xc82004da00, 0x20d2dc0, 0x5d, 0xc825267560, 0x2, 0x2)
/usr/src/docker/vendor/src/github.com/Sirupsen/logrus/entry.go:206 +0x139
github.com/coreos/etcd/raft.(*raftLog).commitTo(0xc82110d260, 0x249fa)
/usr/src/docker/vendor/src/github.com/coreos/etcd/raft/log.go:194 +0x1a6
github.com/coreos/etcd/raft.(*raft).handleHeartbeat(0xc820d3d2c0, 0x8, 0x52b75a6e6dd83823, 0x197af42d1ac22e90, 0xe, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/usr/src/docker/vendor/src/github.com/coreos/etcd/raft/raft.go:771 +0x44
github.com/coreos/etcd/raft.stepFollower(0xc820d3d2c0, 0x8, 0x52b75a6e6dd83823, 0x197af42d1ac22e90, 0xe, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/usr/src/docker/vendor/src/github.com/coreos/etcd/raft/raft.go:736 +0x119c
github.com/coreos/etcd/raft.(*raft).Step(0xc820d3d2c0, 0x8, 0x52b75a6e6dd83823, 0x197af42d1ac22e90, 0xe, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/usr/src/docker/vendor/src/github.com/coreos/etcd/raft/raft.go:564 +0x3e0
github.com/coreos/etcd/raft.(*node).run(0xc822c5bbd0, 0xc820d3d2c0)
/usr/src/docker/vendor/src/github.com/coreos/etcd/raft/node.go:310 +0x90e
created by github.com/coreos/etcd/raft.RestartNode
/usr/src/docker/vendor/src/github.com/coreos/etcd/raft/node.go:215 +0x2e4
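
For readers following the trace: the panic comes from the bounds check in raftLog.commitTo, reached through handleHeartbeat. The heartbeat from the term-14 leader asks this follower to commit index 150010 (the 0x249fa argument in the trace), but the follower's log ends at index 78775. Below is a minimal Go sketch of that check, using the numbers from the log above. The raftLog type here is a simplified, hypothetical stand-in rather than the vendored etcd type, and it returns an error where the real code panics.

```go
package main

import "fmt"

// raftLog is a hypothetical, stripped-down stand-in for the follower's log.
// It keeps only the two counters that matter for this check.
type raftLog struct {
	committed uint64 // highest index this node has marked committed
	lastIndex uint64 // index of the last entry this node actually holds
}

// commitTo follows the shape of the check seen in the stack trace, but
// returns an error instead of panicking so it can be run safely.
func (l *raftLog) commitTo(tocommit uint64) error {
	if l.committed >= tocommit {
		return nil // the commit index never moves backwards
	}
	if l.lastIndex < tocommit {
		return fmt.Errorf("tocommit(%d) is out of range [lastIndex(%d)]",
			tocommit, l.lastIndex)
	}
	l.committed = tocommit
	return nil
}

func main() {
	// commit and lastindex as reported by the newRaft log line.
	follower := &raftLog{committed: 78774, lastIndex: 78775}

	// 0x249fa is the commitTo argument in the trace, i.e. 150010.
	if err := follower.commitTo(0x249fa); err != nil {
		fmt.Println("would panic:", err)
	}
}
```

In other words, the follower is told that entries it does not have are already committed, which is why the message asks whether the log was corrupted, truncated, or lost.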
About this issue
- State: open
- Created 8 years ago
- Comments: 22 (10 by maintainers)
Do you have other managers? If so, I’d follow this process to re-add the node:
- Move /var/lib/docker/swarm out of the way on the affected node
- Rejoin the swarm with docker swarm join
- Remove the old node entry (docker node rm) afterwards, to clean up the node list.

@AashishAsh Go to https://github.com/moby/moby/issues/new to file a new issue (the “New Issue” button in the top right-hand corner of the page). There’s a template to follow when you do so.
Probably set the title of the issue to be something like “dockerd nil pointer dereference panic in agent.WalkTask”?
The template will ask how you produced this exception (e.g. do you know how to trigger it?), what happened (a good place to include the above log), the docker version information (it looks like the daemon was down when that happened, but it would still be useful to know which version of the daemon this is against), and the output of docker info, as well as any other system information you may have.