moby: docker swarm mode overlay can't access external internet
Description
I’m using swarm mode and a network overlay on AWS. On the AWS created docker-machines I can access the external network. Inside the docker container I cannot.
It is likely that I’m doing something wrong.
Steps to reproduce the issue:
VPC=...
REGION=...
SUBNET=...
ZONE=...
docker-machine create -d amazonec2 --amazonec2-vpc-id $VPC --amazonec2-region $REGION --amazonec2-zone $ZONE --amazonec2-instance-type t2.micro --amazonec2-subnet-id $SUBNET --amazonec2-security-group swarm-mode selenium-swarm-manager
# Get IP
docker-machine ssh selenium-swarm-manager ifconfig eth0
# Point docker client to the swarm manager
eval $(docker-machine env selenium-swarm-manager)
# Initialize the swarm
docker swarm init --advertise-addr <ip from above>
# Note the output to join nodes, as it will be used below
eval $(docker-machine env selenium-swarm-node1)
docker swarm join --token <token> <ip from above>:2377
eval $(docker-machine env selenium-swarm-node2)
docker swarm join --token <token> <ip from above>:2377
docker network create seleniumnet --driver overlay
docker service create --name hub --network seleniumnet -p 4444:4444 selenium/hub
# more docker service creates for same network
# connect to the container, wherever it was created
eval $(docker-machine env selenium-swarm-manager)
docker exec -it 9b299e486b0a bash
$ sudo apt-get update # note that it hangs
It now hangs. Note that the container doesn’t have access to ping or anything useful, which I don’t get.
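When ping is missing from an image, bash built-ins can still help separate a name-resolution failure from a connectivity failure. The following is only a minimal sketch: it assumes bash, coreutils timeout, and glibc getent exist inside the container (not verified for the selenium images), and the function names tcp_reachable and name_resolves are made up for this example.

```shell
#!/usr/bin/env bash

# tcp_reachable HOST PORT
# Succeeds if a TCP connection can be opened within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so neither ping nor curl is needed.
tcp_reachable() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# name_resolves HOST
# Succeeds if the system resolver (the one listed in /etc/resolv.conf)
# returns at least one address for HOST.
name_resolves() {
  getent hosts "$1" >/dev/null
}
```

Inside the container, running something like name_resolves archive.ubuntu.com and tcp_reachable 8.8.8.8 53 (both targets are only examples) would show whether the hang comes from DNS or from routing: if the TCP check by IP succeeds while the name lookup fails, the problem is resolution rather than external connectivity.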
Describe the results you received:
Hang
Describe the results you expected:
External HTTP requests. Note that this was tested other ways as well. Same result.
Additional information you deem important (e.g. issue happens only occasionally):
Output of docker version:
Docker version 1.12.2, build bb80604, experimental
Output of docker info:
Containers: 9
Running: 1
Paused: 0
Stopped: 8
Images: 5
Server Version: 1.12.2
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 79
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: overlay null bridge host
Swarm: active
NodeID: 1dac0q4m0qh2hp3ug33p5rnyr
Is Manager: true
ClusterID: 9jmu7r5rh9gu5rq8qw5j2ky3m
Managers: 1
Nodes: 3
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Heartbeat Tick: 1
Election Tick: 3
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Node Address: 10.0.0.127
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.2.0-18-generic
Operating System: Ubuntu 15.10
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 991.1 MiB
Name: selenium-swarm-manager
ID: QXFU:53LD:SQKN:7SZM:VYOK:2GYG:KOMB:OFYU:STLS:R5CN:A4AA:QPYP
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Labels:
provider=amazonec2
Insecure Registries:
127.0.0.0/8
About this issue
- State: closed
- Created 8 years ago
- Comments: 21 (1 by maintainers)
@fingermark Your observation that using --subnet=192.168.1.0/24 for the overlay fixes the problem is correct. The reason for the original problem is a bit involved, though.
Within a container in an EC2 instance on a private subnet, only the name resolution fails, but the external connectivity is just fine…
The address range of the VPC by default is 10.0.0.0/16, and the DNS server provisioned by AWS is at 10.0.0.2. The default address range of overlay networks also starts at 10.0.0.0/24. This is still fine because the docker overlay bridge is created inside a separate network namespace (this has a kernel version dependency, though). But for DNS resolution, the queries from the container are sent to the Docker embedded DNS server, 127.0.0.11 (the address in the container’s resolv.conf).
Though the DNS server runs as part of the docker daemon, it tries to do the routing to the DNS name server address, 10.0.0.2 here, inside the container’s network namespace. Since that address conflicts with the overlay address range, packets will not be sent to the DNS server. That is the reason why external connectivity works but only the resolution is failing.
To work around this there are two options…
Since there is no fix required here I think this issue can be closed.
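To make the conflict concrete, here is a minimal bash sketch that checks whether two IPv4 CIDR blocks overlap. The helper names ip_to_int and cidr_overlaps are hypothetical and not part of Docker or AWS tooling; the address ranges are the ones mentioned in this thread. It assumes plain dotted-quad input without leading zeros.

```shell
#!/usr/bin/env bash

# ip_to_int A.B.C.D
# Prints the IPv4 address as a 32-bit unsigned integer.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# cidr_overlaps CIDR1 CIDR2
# Exit status 0 if the two blocks share at least one address.
# Two CIDR blocks overlap exactly when they agree on the first
# min(prefix1, prefix2) bits, i.e. when one contains the other.
cidr_overlaps() {
  local ip1=${1%/*} len1=${1#*/} ip2=${2%/*} len2=${2#*/}
  local min=$(( len1 < len2 ? len1 : len2 ))
  local mask=$(( (0xFFFFFFFF << (32 - min)) & 0xFFFFFFFF ))
  local n1 n2
  n1=$(ip_to_int "$ip1")
  n2=$(ip_to_int "$ip2")
  (( (n1 & mask) == (n2 & mask) ))
}

# Compare the VPC range with the default overlay range and the workaround range.
for candidate in 10.0.0.0/24 192.168.1.0/24; do
  if cidr_overlaps 10.0.0.0/16 "$candidate"; then
    echo "$candidate overlaps the VPC range 10.0.0.0/16"
  else
    echo "$candidate is clear of the VPC range 10.0.0.0/16"
  fi
done
```

A range that comes out clear can then be passed when creating the overlay, for example docker network create --driver overlay --subnet=192.168.1.0/24 seleniumnet.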
@herbrandson I wound up starting over. I left the docker_gwbridge gateway alone and assigned my network overlay a subnet different from anything else: --subnet=192.168.1.0/24. I’m now able to access servers on the same network and externally.
Maybe it would help if I described exactly what I was looking to achieve:
I’m using AWS EC2 t2.micro instances and an AWS VPC with a private and a public subnet. How would I create the following using docker swarm mode?
Requirements
Hub Node (count: 1)
docker service create --name hub --network seleniumnet -p 4444:4444 --replicas 1 selenium/hub
Chrome Nodes (count: 50)
docker service create --network seleniumnet --endpoint-mode dnsrr --name chrome --mount type=bind,source=/dev/shm,target=/dev/shm -e HUB_PORT_4444_TCP_ADDR=hub -e HUB_PORT_4444_TCP_PORT=4444 --replicas 50 selenium/node-chrome:2.53.1-beryllium bash -c 'REMOTE_HOST=http://$HOSTNAME:5555 /opt/bin/entry_point.sh'
AWS VPC Details:
VPC (vpc-00000001): 10.0.0.0/16
public subnet (subnet-00000001): 10.0.0.0/24
private subnet (subnet-00000002): 10.0.1.0/24
Notes:
docker_gwbridge gets automatically created with:
It would be great if this could be stated in docker swarm create network documentation. I spent several hours figuring this out …
@fingermark Per your comments above, I ran the following commands
However, I’m still unable to launch services due to: failed to allocate gateway (1…
Did I miss a step?