moby: could not resolve address of member ID
Hello, Description
I’m facing a weird issue with swarm and would like some assistance with it. I added a new host as a worker to a swarm of 3 managers (version 17.03) without any problem. Afterwards, a VMware rollback was performed on that worker following a test upgrade to Debian stretch. The rollback seems to have broken something in my swarm.
No data available from the swarm --> Error: rpc error: code = 4 desc = context deadline exceeded
I am not able to delete and recreate the swarm, as it is currently in use by some services.
Steps to reproduce the issue:
- Add host as a worker into a swarm (Debian 8.8)
- Take a VMware snapshot of that host
- Upgrade to Debian 9 (stretch)
- Roll back to the snapshot
- Promote the worker to manager, or add a new host as a manager
Describe the results you received: Now, when I try to promote the worker to manager, or even add a new host to my swarm as a manager, I get this in my logs every second:
Jul 18 14:23:51 docker-5 dockerd[710]: time="2017-07-18T14:23:51.286450834+02:00" level=warning msg="sending message to an unrecognized member ID 5c9f1beef700d602" raft_id=6b9c136b96a2f655
Jul 18 14:23:51 docker-5 dockerd[710]: time="2017-07-18T14:23:51.286554667+02:00" level=error msg="could not resolve address of member ID 5c9f1beef700d602" error="rpc error: code = 9 desc = grpc: the client connection is closing" raft_id=6b9c136b96a2f655
The ID in “could not resolve address of member ID 5c9f1beef700d602” is the same on both hosts that I tried to add as managers to my swarm, but I could not find out what it refers to.
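For anyone triaging the same messages, here is a minimal sketch of how the unknown member ID could be pulled out of the dockerd log lines and counted per reporting `raft_id`. It only assumes the `key="value"` layout visible in the two lines above; the helper names `parse_fields` and `unresolved_members` are made up for illustration and are not part of Docker.

```python
import re
from collections import Counter

# Matches the key="value" or key=value pairs seen in the dockerd log lines.
PAIR_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')
# Pulls the 16-hex-digit raft member ID out of the message text.
MEMBER_RE = re.compile(r'member ID ([0-9a-f]{16})')


def parse_fields(line):
    """Return the key/value fields of one dockerd log line as a dict."""
    return {key: value.strip('"') for key, value in PAIR_RE.findall(line)}


def unresolved_members(lines):
    """Count log lines per (unknown member ID, reporting raft_id) pair."""
    counts = Counter()
    for line in lines:
        fields = parse_fields(line)
        match = MEMBER_RE.search(fields.get("msg", ""))
        if match:
            counts[(match.group(1), fields.get("raft_id"))] += 1
    return counts
```

Feeding it the journal output of the affected managers (for example, lines read from a saved log file) shows whether every new manager complains about the same stale ID, which is what I observe here.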
When I add them, it takes a long time, as if something were timing out, and then the command ends normally. However, I cannot use any swarm command on those 2 managers.
Describe the results you expected:
A responsive, working new manager in my swarm.
Additional information you deem important (e.g. issue happens only occasionally):
Output of docker version:
Client:
Version: 17.03.1-ce
API version: 1.27
Go version: go1.7.5
Git commit: c6d412e
Built: Mon Mar 27 17:07:28 2017
OS/Arch: linux/amd64
Server:
Version: 17.03.1-ce
API version: 1.27 (minimum version 1.12)
Go version: go1.7.5
Git commit: c6d412e
Built: Mon Mar 27 17:07:28 2017
OS/Arch: linux/amd64
Experimental: false
Output of docker info:
Swarm: active
NodeID: yhpvwcm60oko20leg9z7vieop
Error: rpc error: code = 4 desc = context deadline exceeded
Is Manager: true
ClusterID:
Managers: 0
Nodes: 0
Orchestration:
Task History Retention Limit: 0
Raft:
Snapshot Interval: 0
Heartbeat Tick: 0
Election Tick: 0
Dispatcher:
Heartbeat Period: Less than a second
CA Configuration:
Expiry Duration: Less than a second
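To spot this state on other hosts, the output above can be checked mechanically. The following is a rough sketch that assumes the plain `Key: value` text as pasted here; `parse_swarm_info` and `manager_without_cluster_view` are hypothetical helper names, not Docker APIs. It flags a node that reports itself as a manager while showing zero managers or an `Error` line.

```python
def parse_swarm_info(text):
    """Turn 'Key: value' lines of pasted `docker info` text into a dict."""
    info = {}
    for raw in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = raw.strip().partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def manager_without_cluster_view(info):
    """True if the node claims to be a manager but reports no cluster data."""
    is_manager = info.get("Is Manager") == "true"
    no_managers = info.get("Managers") == "0"
    return is_manager and (no_managers or "Error" in info)
```

On the output pasted above this returns `True`, since `Is Manager` is `true` while `Managers` is `0` and an `Error` line is present.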
Additional environment details (AWS, VirtualBox, physical, etc.): VMware
Many thanks!
Best regards,
Trafle73
About this issue
- State: closed
- Created 7 years ago
- Comments: 16 (6 by maintainers)
Any update on this? I’m encountering the same issue with 17.09. Is there a way to clean up the stale raft ID?