moby: Docker swarm - encrypted network overlay - stops working.

Description: After creating a 3-node swarm (all managers) and then creating an encrypted overlay, we noticed that the overlay network drops out randomly.

Steps to reproduce the issue:

  1. Create a Docker swarm cluster of at least 3 nodes (a full command sketch follows this list)
  2. Create the overlay with:
docker network create --attachable --opt encrypted -d overlay networkname

NOTE: Making it attachable to test easily

  3. Start an alpine container (easy test) on 2 nodes:
docker run -it --rm --net=networkname alpine /bin/ash

  4. Find the IP (ifconfig) of each container and ping across.
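For reference, a rough end-to-end sketch of the steps above (the addresses and join token are placeholders, not values from this setup):

# On the first manager:
docker swarm init --advertise-addr 203.0.113.10

# On the other two nodes, join as managers with the token printed by
# `docker swarm join-token manager` on the first node:
docker swarm join --token <MANAGER-TOKEN> 203.0.113.10:2377

# On any manager, create the attachable encrypted overlay:
docker network create --attachable --opt encrypted -d overlay networkname

# On two different nodes, attach a test container:
docker run -it --rm --net=networkname alpine /bin/ash

# Inside each container, note the IP from ifconfig and ping the other container.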

Describe the results you received: It works, and then it randomly stops. The firewall allows everything between the 3 nodes (IP protocol 50 as well as all other traffic, any/any).
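As a rough illustration of that firewall setup (the peer address is a placeholder), the swarm ports plus ESP can be opened along these lines with iptables:

PEER=198.51.100.20
iptables -A INPUT -p tcp -s "$PEER" --dport 2377 -j ACCEPT   # cluster management
iptables -A INPUT -p tcp -s "$PEER" --dport 7946 -j ACCEPT   # node gossip
iptables -A INPUT -p udp -s "$PEER" --dport 7946 -j ACCEPT
iptables -A INPUT -p udp -s "$PEER" --dport 4789 -j ACCEPT   # VXLAN data plane
iptables -A INPUT -p esp -s "$PEER" -j ACCEPT                # IP protocol 50 for --opt encrypted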

Describe the results you expected: To work all the time 😃

Additional information you deem important (e.g. issue happens only occasionally): It happens almost at random. If you reboot, it starts working again.

Output of docker version:

Client:
 Version:      1.13.0
 API version:  1.25
 Go version:   go1.7.3
 Git commit:   49bf474
 Built:        Tue Jan 17 09:58:26 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.13.0
 API version:  1.25 (minimum version 1.12)
 Go version:   go1.7.3
 Git commit:   49bf474
 Built:        Tue Jan 17 09:58:26 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 8
Server Version: 1.13.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 34
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: active
 NodeID: 4y3mi5goxun18p0rif8hdrt5o
 Is Manager: true
 ClusterID: vcwzg0mebqw4kp58pz8ynm0cn
 Managers: 3
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: PUB#1
 Manager Addresses:
  PUB#1:2377
  PUB#2:2377
  PUB#2:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 2f7393a47307a16f8cee44a37b262e8b81021e3e
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-59-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 24
Total Memory: 100 GiB
Name: swarmhost01
ID: GVD4:VFPH:ELAN:X2CK:CLFZ:MFDC:C5LT:RLTU:DWKE:KDKY:HT6M:BAC2
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
 nfs=yes
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.): The environment is a mix of physical and virtual systems. We have also changed it around to be only virtual and only physical, with the same results. The systems are located in 3 different regions, on 3 different public IP ranges.


Most upvoted comments

Thanks @ventz, I found where the issue is and pushed the fix ^. It affects only the use case where the advertise address is outside of the box (the 1-1 NAT case), which is why I could not initially reproduce it. It happens during key rotations. As expected, reloading the node, or removing and restarting the container on the node, fixes it, but only until the next datapath key rotation happens.
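Until a build with the fix is available, a rough way to observe the rotation and apply the workaround described above (assuming systemd manages the engine):

# Watch the installed IPsec SAs; the SPIs change when the swarm rotates the datapath keys.
watch -n 5 'ip xfrm state | grep -E "src|spi"'

# Workaround from the comment above: reload the engine on the affected node
# (or remove and restart the container) until the next rotation.
systemctl restart docker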

@emcgee could you open a new ticket so that it doesn’t get lost? (This issue is closed, so comments easily get overlooked.)

So we’re seeing something quite similar to what is described above. It may also be related to https://github.com/moby/moby/issues/33133

We’ve noticed that we’re unable to send any data traffic through an encrypted overlay to an endpoint behind AWS/GCE NAT on Debian 9, Ubuntu 18.04, or anything with a kernel > 4.4.

The only modern Debian-based distribution that works is Ubuntu 16.04 with the 4.4 kernel.


Simple repro: node-1 on DigitalOcean, node-2 on GCE/AWS.

  1. Spin up a Debian Stretch host (4.9 kernel) on both DigitalOcean and GCE or AWS with the latest stable Docker-CE (18.03.1)

  2. Initialize the swarm on DigitalOcean node-1 using --advertise-addr external_ip

  3. By default, DigitalOcean has no firewall so node-1 is wide open. Open TCP 2377, TCP/UDP 7946, UDP 4789, Protocol 50 on node-2 in the AWS Security Group/GCP VPC Firewall Rules and join the swarm as a worker using --advertise-addr node-2-external-ip

  4. docker network create --attachable --driver overlay --opt encrypted encryption_test

  5. docker run -ti --rm --network encryption_test debian bash

If you do this with a non-encrypted overlay, traffic flows with no issues. We’re able to ping between containers and run iperf3; all is well. But on the encrypted overlay, the traffic simply won’t transmit.
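A rough sketch of that comparison, reusing the debian containers from the repro (the plain_test network name and the server container IP are placeholders):

# Create an unencrypted overlay alongside the encrypted one for comparison.
docker network create --attachable --driver overlay plain_test

# On each node, start a shell on the network under test:
docker run -it --rm --network encryption_test debian bash

# Inside each container, install the test tools, then run the same checks on both networks:
apt-get update && apt-get install -y iperf3 iputils-ping
iperf3 -s                        # in node-1's container
iperf3 -c <node-1-container-ip>  # in node-2's container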

ip xfrm state gives:

src 172.31.38.243 dst 207.106.235.21
	proto esp spi 0x3e7a19d6 reqid 13681891 mode transport
	replay-window 0
	aead rfc4106(gcm(aes)) 0xdcb885a138afc1d801f86a6b379dd22e3e7a19d6 64
	anti-replay context: seq 0x0, oseq 0xa, bitmap 0x00000000
	sel src 0.0.0.0/0 dst 0.0.0.0/0
src 207.106.235.21 dst 172.31.38.243
	proto esp spi 0x617bc0ba reqid 13681891 mode transport
	replay-window 0
	aead rfc4106(gcm(aes)) 0xdcb885a138afc1d801f86a6b379dd22e617bc0ba 64
	anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
	sel src 0.0.0.0/0 dst 0.0.0.0/0

We can see ESP traffic on the remote host (not the one doing the pinging) but there is no return:

18:31:24.652640 IP ec2-18-176-21-33.us-east-2.compute.amazonaws.com > hostname.mydomain.com: ESP(spi=0x3e7a19d6,seq=0x1), length 140
18:31:25.653615 IP ec2-18-176-21-33.us-east-2.compute.amazonaws.com > hostname.mydomain.com: ESP(spi=0x3e7a19d6,seq=0x2), length 140
18:31:26.654775 IP ec2-18-176-21-33.us-east-2.compute.amazonaws.com > hostname.mydomain.com: ESP(spi=0x3e7a19d6,seq=0x3), length 140
18:31:27.655940 IP ec2-18-176-21-33.us-east-2.compute.amazonaws.com > hostname.mydomain.com: ESP(spi=0x3e7a19d6,seq=0x4), length 140
18:31:28.657156 IP ec2-18-176-21-33.us-east-2.compute.amazonaws.com > hostname.mydomain.com: ESP(spi=0x3e7a19d6,seq=0x5), length 140
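For anyone reproducing this, a capture like the one above can be taken with something along these lines (the interface name is a placeholder):

# Capture only ESP (IP protocol 50) traffic on the public interface:
tcpdump -ni eth0 esp
# or, equivalently, by protocol number:
tcpdump -ni eth0 'ip proto 50'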

There seems to be a change in how the kernel processes this traffic on kernels > 4.4 - has anyone else seen this?

/cc @aboch @ventz

@aboch - That’s great! Thanks.

What do you think is the ETA for the next update that will include this?