moby: Problem with overlay network and DNS resolution between containers

Description

I noticed some problems with DNS resolution inside an overlay network with Swarm. DNS entries are not always updated automatically. I have 10 containers spread over 4 hosts running Ubuntu 16.04, connected by Swarm, and I created an overlay network for those containers. When I redeploy one of those containers (I stop the current one, rename it to OLD, and create a new one with the same name), the container will not always get the same IP as before (which is not a problem in itself). But it looks like the DNS entry is not always updated for the other containers in the network. The newly created container is then unreachable from the other ones.

My docker version is 1.13.0.

Steps to reproduce the issue:

  1. Create a Swarm architecture with multiple hosts
  2. Create an overlay network
  3. Deploy a few containers with specific names on each node and attach them to this network
  4. Remove one of these containers and recreate it with exactly the same name (see the command sketch below).
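
A minimal command sketch of these steps; the network name, container names, and image below are hypothetical, not taken from the original report:

# on a manager node: create the overlay network (--attachable is needed for plain containers)
docker network create -d overlay --attachable appnet

# on each node: start containers with fixed names on that network
docker run -d --name web1 --network appnet nginx:alpine
docker run -d --name web2 --network appnet nginx:alpine

# redeploy one of them: stop it, rename it to OLD, recreate it with the same name
docker stop web1
docker rename web1 web1_OLD
docker run -d --name web1 --network appnet nginx:alpine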

Describe the results you received: If the IP of this new container has changed, the DNS entry is not always updated automatically for the other containers. If you try to ping the new container's DNS name from the other containers, you will sometimes notice that the resolved IP is actually the IP of the previously removed container.

Describe the results you expected: DNS entries should be updated for all containers whenever a container's IP changes.
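
One way to observe the received (stale) behavior, using the same hypothetical names as in the sketch above:

# ask the embedded DNS from another container on the overlay network
docker exec web2 nslookup web1            # may still return the old IP
docker exec web2 ping -c 3 web1           # fails or reaches the previous address

# compare with the address the recreated container actually received
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' web1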

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Docker version 1.13.0, build 49bf474

Output of docker info:

Containers: 14
 Running: 10
 Paused: 0
 Stopped: 4
Images: 449
Server Version: 1.13.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 571
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: active
 NodeID: okiqm8slow52nm4rx8qt08rpc
 Is Manager: true
 ClusterID: 7b3cohqvxgp3q9qm19xq4dj97
 Managers: 2
 Nodes: 4
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 172.17.10.83
 Manager Addresses:
  172.17.1.224:2377
  172.17.10.83:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 2f7393a47307a16f8cee44a37b262e8b81021e3e
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-43-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 19.61 GiB
Name: piranha
ID: 3SY6:AAEL:NLUO:4BTD:U5ZK:AMWA:PNGQ:4ZVM:F7S4:7GFH:E2KG:V32H
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
 172.17.11.100:5000
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 5
  • Comments: 44 (14 by maintainers)

Most upvoted comments

I have a VPN between hosts (10.8.3.0/24)

Steps to reproduce the issue:

  1. Create a Swarm architecture with multiple hosts (I haven't specified an advertise address or other options)
  2. Create an attachable overlay network
  3. Deploy a container with docker-compose (whether on the same host as the manager or not) and attach it to this network (a compose sketch follows this list)
  4. The new container appears in nslookup container-name
  5. Remove one of these containers and create exactly the same one
  6. The container still appears in nslookup container-name
  7. Re-deploy the same container with the same name using docker-compose
  8. There are now two IP addresses for the same container name
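
A sketch of that setup; the network name, service name, and image are illustrative, and the compose file is written inline here only for brevity:

# create the attachable overlay network once, on a swarm node
docker network create -d overlay --attachable my-overlay

# minimal docker-compose.yml attaching a service to that pre-existing network
cat > docker-compose.yml <<'EOF'
version: "3.4"
services:
  app:
    image: nginx:alpine
    networks:
      - my-overlay
networks:
  my-overlay:
    external: true
EOF

docker-compose up -d                                         # step 3: deploy and attach
docker run --rm --network my-overlay busybox nslookup app    # steps 4/6/8: check DNS
docker-compose rm -sf app && docker-compose up -d            # steps 5/7: remove and re-deploy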

@Oleg-Arkhipov @bigfoot90 This is also a dup of https://github.com/moby/moby/issues/30134. As @thaJeztah said in https://github.com/moby/moby/issues/30134#issuecomment-403093090, the issue is fixed by https://github.com/docker/libnetwork/pull/2176 and is available in 18.06-rc1 and in the next stable release, 18.06, which will come soon.

After some days, I can confirm that Docker 17.12.1 is the last version that works correctly.

Hey @danielmrosa, good to hear that the new cluster is more stable.

Maybe one way to reproduce this problem is to create many problematic containers and let them restart by themselves, do many service updates using some problematic tags, and watch a mess happen 😃

We actually have a test internally that does exactly that. It spawns a bunch of services with a wrong command so that they keep spinning, with containers coming up and exiting immediately. I will take a look on Monday just to be sure.

Regarding the overlay, the limitation is performance and the time it takes to spin up services. The more services with a VIP, the more iptables and IPVS rules have to be configured. There is a PR open to address this that is in the review phase; we are aiming to have it in 18.06, but don't quote me on that just yet, I will be more precise as the patch gets merged.

@danielmrosa do you have any way to reproduce? The first check I would do is to verify that the container that was associated with the extra IP actually exited properly. I was reading that 18.03 had an issue where containers remained stuck, so maybe the cleanup did not happen yet because of that bug.

The first thing to check is the network inspect: do a docker network inspect -v <network id> on a node that has a container on that network. That will show the endpoint ID of the endpoint with the old IP. If you have the daemon in debug mode, you can grep for it and see if there was an error during the cleanup.
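
A rough sketch of that first check, assuming the daemon runs under systemd with debug logging enabled; the network ID and endpoint ID are placeholders:

# show the endpoints (names, IDs, IPs) this node still knows about for the network
docker network inspect -v <network id>

# look for the endpoint ID of the stale/old-IP entry in the daemon logs
journalctl -u docker | grep <endpoint id>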

If that is not the case, I would start taking a look at the NetworkDB state. I suggest the following greps on the daemon logs (sketched after the list):

  1. Outgoing queue length: grep "stats.*<network id>"; you will see a bunch of lines like:

Apr 03 10:46:38 ip-172-31-22-5 dockerd[1151]: time="2018-04-03T10:46:38.902639904Z" level=info msg="NetworkDB stats <hostname>(<node id>) - netID:3r6rqkvee3l4c7dx3c9fmf2a8 leaving:false netPeers:3 entries:12 Queue qLen:0 netMsg/s:0"

     netPeers should match the number of nodes that have a container on that network, entries is the number of entries in the database (it is not 1:1 with the containers), and qLen should always be 0 when the system is stable, spiking only when there are changes in the cluster.

  2. grep healthscore: this will show up only if nodes have connectivity issues; the higher the number, the worse the issue.

  3. grep "change state": this identifies state changes of NetworkDB nodes; maybe some nodes are not stable in the cluster.
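
Put together, and assuming the daemon logs go to journald (adjust if yours go elsewhere), the three greps could look roughly like this:

journalctl -u docker | grep "stats.*<network id>"   # NetworkDB stats: netPeers / entries / qLen
journalctl -u docker | grep healthscore             # only present when nodes have connectivity issues
journalctl -u docker | grep "change state"          # NetworkDB node state transitions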

If you use the diagnostic tool, you can also identify which node was the owner of the extra entry and, with the last grep, track back whether that node left the cluster at some point and why the cleanup did not happen. Let me know if you find a repro or how the debugging is going. If you want, you can also share the node logs with me and I can help take a look; I will need the information mentioned above anyway.

@bigfoot90 I confirm that I am able to reproduce the same bug (I found the same steps by myself and then this issue on GitHub) on Docker version 18.03.1-ce, build 9ee9f40. If I restart a container connected to an overlay network (everything is managed by Docker Compose, not by Swarm), with each new restart I get one more old IP in the DNS resolution of the hostname associated with that container.

In fact, docker network inspect -v network_name shows that stopping a container does not remove its entry; it just leaves it in place with a null Info field:

"Services": {
            "": {
                "VIP": "<nil>",
                "Ports": [],
                "LocalLBIndex": 0,
                "Tasks": [
                    {
                        "Name": "test_mysql_1",
                        "EndpointID": "32a943410b63d4762ff694611963f9f20264cd185419992b13c6502aa3f21704",
                        "EndpointIP": "10.0.0.10",
                        "Info": {
                            "Host IP": "192.168.1.104"
                        }
                    },
                    {
                        "Name": "test_nginx_1",
                        "EndpointID": "ce99e53df31f8cc14835901ca454517570a6a450a08b28d467c65a6a38c4b92c",
                        "EndpointIP": "10.0.0.9",
                        "Info": null
                    }
                ]
            }
        }

In this example nginx was stopped. Starting it again results in the addition of a new, correct entry, while the broken old one stays in place.

However, stopping all containers in the project (but not through docker-compose restart; there must be a distinct moment when all containers on the network are stopped) finally cleans up all entries.
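
In other words (service names hypothetical), roughly:

docker-compose restart nginx     # stale endpoint stays in the network DB
docker-compose stop              # all containers on the network down at the same time...
docker-compose up -d             # ...and the stale entries are finally gone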

Some feedback that may be useful: I migrated a node to swarm and kept some other services managed by the old docker-compose, probably a conceptual error. This made those containers appear as a "null" service from the networking perspective (docker network inspect -v), so they were not being cleaned up accordingly. Hope this helps!

I’m also having this issue when using docker swarm mode.

docker --version
Docker version 18.03.0-ce, build 0520e24

Service records (docker DNS) sometimes end up with old IP addresses from the previous service.

@fcrisciani thanks for your help, we are now specifying EndpointSpec to avoid this problem.
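
The comment does not say which EndpointSpec settings were used; as one hypothetical illustration, the endpoint mode can be pinned explicitly when creating the service (dnsrr here is only an example, not necessarily their choice):

# set the endpoint mode explicitly in the service's EndpointSpec
docker service create --name my-service --network my-overlay \
  --endpoint-mode dnsrr nginx:alpine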

We are happy to know that this will be fixed soon.

We will post here if we find another related issue.