moby: Removing stack that failed to fully deploy leaves behind stray networks
Description
I have a four-node swarm on Docker 1.13.0-rc5. I accidentally deployed a stack from a Compose file whose services all use images from a private registry, but without the --with-registry-auth option. The services failed to start, so I removed the stack.
While removing the stack, the docker client shows that it is removing the networks defined in the stack (all overlay/swarm in this case), as expected. But upon listing the networks on the nodes, some of the stack's networks have actually been left behind and now appear as overlay/local.
The stack cannot be deployed again until the stray networks are removed from all the nodes.
Steps to reproduce the issue:
Given a Docker Compose file with some networks defined, and services that use a private registry:
- Create a Docker Compose file with services whose images are on a private registry, and with some networks.
- Deploy a stack using the Compose file, but without the --with-registry-auth option.
- Remove the stack after the services fail to start due to missing images (they can't be pulled).
- List the networks on the swarm nodes.
Describe the results you received:
Some of the networks that were created as part of the stack deployment are left behind. Their driver is still overlay, but their scope has changed from swarm to local.
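On an affected node, the stray networks can be spotted by filtering `docker network ls` output for that driver/scope combination. A minimal sketch, run here against sample output mirroring the listing in this report (on a live node, pipe `docker network ls` instead of `echo "$sample"`):

```shell
# Sample `docker network ls` output mirroring this report; on a live node,
# replace `echo "$sample"` below with `docker network ls`.
sample='NETWORK ID     NAME                                  DRIVER    SCOPE
58721ada8da1   bridge                                bridge    local
fwe4wv5bc8md   converis-db-configs-master_converis   overlay   local
ks6pzoguzk3c   converis-db-configs-master_front      overlay   local
68i730xxf68a   ingress                               overlay   swarm'

# Stray networks from this bug are the ones whose driver is "overlay"
# but whose scope is "local"; print their names.
echo "$sample" | awk '$3 == "overlay" && $4 == "local" {print $2}'
# → converis-db-configs-master_converis
# → converis-db-configs-master_front
```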
Describe the results you expected:
The networks that were created as part of the stack deployment should all be removed.
Additional information you deem important (e.g. issue happens only occasionally):
Here’s the Docker Compose file I was trying to deploy, and output showing the problem.
networks:
converis: {}
front: {}
traefik_rms-dev:
external: true
services:
converis:
environment:
CONVERIS_HOST: converis-db-configs-master.rms-test.ucalgary.ca
CRA_HOST: converis-db-configs-master.rms-test.ucalgary.ca
image: docker.ucalgary.ca/tr/converis:5.9.8
networks:
converis: null
front: null
converis-db:
environment:
CONVERIS_HOST: converis-db-configs-master.rms-test.ucalgary.ca
CRA_HOST: converis-db-configs-master.rms-test.ucalgary.ca
image: docker.ucalgary.ca/rms/converis-db-configs:master
networks:
converis: null
converis-web:
environment:
CAS_LOGIN_URL: http://castestqa.ucalgary.ca/replicant/login
CAS_VALIDATE_URL: http://castestqa.ucalgary.ca/replicant/ucserviceValidate
CONVERIS_HOST: converis-db-configs-master.rms-test.ucalgary.ca
CRA_HOST: converis-db-configs-master.rms-test.ucalgary.ca
image: docker.ucalgary.ca/tr/converis-web:1.1.0
networks:
front: null
traefik_rms-dev: null
version: '3.0'
volumes: {}
→ ~ docker stack ls
NAME SERVICES
→ ~ docker network ls
NETWORK ID NAME DRIVER SCOPE
58721ada8da1 bridge bridge local
45ac998dbf08 docker_gwbridge bridge local
3c2e6ce7fecf host host local
68i730xxf68a ingress overlay swarm
5430cf1700b3 none null local
vur2f7vb8eue traefik_rms-dev overlay swarm
→ ~ docker stack deploy --compose-file /Users/kchuang/Downloads/converis-db-configs.yml converis-db-configs-master
Creating network converis-db-configs-master_default
Creating network converis-db-configs-master_converis
Creating network converis-db-configs-master_front
Creating service converis-db-configs-master_converis
Creating service converis-db-configs-master_converis-db
Creating service converis-db-configs-master_converis-web
→ ~ docker network ls
NETWORK ID NAME DRIVER SCOPE
58721ada8da1 bridge bridge local
svatalmcadi2 converis-db-configs-master_converis overlay swarm
k63biko0t0fm converis-db-configs-master_default overlay swarm
pfkak51i75hr converis-db-configs-master_front overlay swarm
45ac998dbf08 docker_gwbridge bridge local
3c2e6ce7fecf host host local
68i730xxf68a ingress overlay swarm
5430cf1700b3 none null local
vur2f7vb8eue traefik_rms-dev overlay swarm
(time passes, services fail to start)
→ ~ docker stack rm converis-db-configs-master
Removing service converis-db-configs-master_converis-db
Removing service converis-db-configs-master_converis-web
Removing service converis-db-configs-master_converis
Removing network converis-db-configs-master_converis
Removing network converis-db-configs-master_front
Removing network converis-db-configs-master_default
→ ~ docker network ls
NETWORK ID NAME DRIVER SCOPE
58721ada8da1 bridge bridge local
fwe4wv5bc8md converis-db-configs-master_converis overlay local
ks6pzoguzk3c converis-db-configs-master_front overlay local
45ac998dbf08 docker_gwbridge bridge local
3c2e6ce7fecf host host local
68i730xxf68a ingress overlay swarm
5430cf1700b3 none null local
vur2f7vb8eue traefik_rms-dev overlay swarm
→ ~ docker stack deploy --compose-file /Users/kchuang/Downloads/converis-db-configs.yml converis-db-configs-master
Creating network converis-db-configs-master_default
Creating service converis-db-configs-master_converis
Error response from daemon: network converis-db-configs-master_converis not found
Output of docker version:
Client:
Version: 1.13.0-rc5
API version: 1.25
Go version: go1.7.3
Git commit: 43cc971
Built: Thu Jan 5 00:43:46 2017
OS/Arch: linux/amd64
Server:
Version: 1.13.0-rc5
API version: 1.25 (minimum version 1.12)
Go version: go1.7.3
Git commit: 43cc971
Built: Thu Jan 5 00:43:46 2017
OS/Arch: linux/amd64
Experimental: true
Output of docker info:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 3
Server Version: 1.13.0-rc5
Storage Driver: devicemapper
Pool Name: docker-thinpool
Pool Blocksize: 524.3 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file:
Metadata file:
Data Space Used: 4.406 GB
Data Space Total: 65.28 GB
Data Space Available: 60.87 GB
Metadata Space Used: 1.126 MB
Metadata Space Total: 683.7 MB
Metadata Space Available: 682.5 MB
Thin Pool Minimum Free Space: 6.527 GB
Udev Sync Supported: true
Deferred Removal Enabled: true
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Library Version: 1.02.107-RHEL7 (2016-06-09)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Swarm: active
NodeID: 3vz5vqb96ab3yh6mrhhbnf5ki
Is Manager: true
ClusterID: 9p9qjvvrp991zitpixlm8dpm5
Managers: 4
Nodes: 4
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 3
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Node Address: 10.45.32.40
Manager Addresses:
10.45.32.40:2377
10.45.32.41:2377
10.45.32.42:2377
10.45.32.43:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 51371867a01c467f08af739783b8beafc154c4d7
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-327.36.3.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.2 (Maipo)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.51 GiB
Name: itpocnode01.ucalgary.ca
ID: OM7K:NNN7:75VJ:I34C:W5ZS:TCYH:ZGPH:SRXP:HINS:7O2W:IZQ4:QIGD
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-ip6tables is disabled
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Additional environment details (AWS, VirtualBox, physical, etc.):
This is on a four node swarm with Docker 1.13.0-rc5.
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Comments: 24 (11 by maintainers)
I can still reproduce this issue on 19.03.13 for Windows containers. Can you please reopen the issue?
Thanks @kinghuang. I will close this issue since it is resolved in 17.05. I still don't know which fix solved it, though.
Facing the same issue in Docker 17.03.1-ce. Intermittently, docker stack rm leaves a stray overlay network with local scope. This then causes containers of other apps to fail with "Address already in use"; it looks like Docker assigns an IP that is still reserved in the stray network. To resolve this, we need to manually disconnect the endpoints of the stray network and then remove it. Should we be expecting a fix for this?
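The manual workaround described above (disconnect the stray network's remaining endpoints, then remove the network) can be sketched as a dry run that just prints the commands to run on each affected node. The network name comes from this report; the endpoint name is a placeholder, since on a live node it would come from `docker network inspect`:

```shell
# Stray network name as shown by `docker network ls` (from this report)
NET=converis-db-configs-master_converis

# On a live node, the attached endpoint names would come from:
#   docker network inspect --format '{{range .Containers}}{{.Name}} {{end}}' "$NET"
# A placeholder endpoint is used here so the sketch stays a dry run.
ENDPOINTS="myapp_web.1.abcdef"

# Print the cleanup commands: force-disconnect each endpoint, then remove the network
for c in $ENDPOINTS; do
  echo "docker network disconnect --force $NET $c"
done
echo "docker network rm $NET"
```

Dropping the `echo`s turns this into the actual cleanup; it has to be repeated on every node that still lists the stray network.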