moby: Docker for Mac 1.12.0 TCP DNS queries fail

Since Docker for Mac 1.12.0, truncated DNS responses cause DNS resolution to fail when resolving via 192.168.65.1:53, because the resolver cannot connect to that IP over TCP.

Output of docker version:

Client:
 Version:      1.12.0-rc3
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   91e29e8
 Built:        Sat Jul  2 00:09:24 2016
 OS/Arch:      darwin/amd64
 Experimental: true

Server:
 Version:      1.12.0-rc3
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   876f3a7
 Built:        Tue Jul  5 02:20:13 2016
 OS/Arch:      linux/amd64
 Experimental: true

Output of docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 1.12.0-rc3
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 0
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: overlay bridge host null
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.4.14-moby
Operating System: Alpine Linux v3.4
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 1.954 GiB
Name: moby
ID: 7NQP:NH7Q:G2HH:3TIS:3L6P:EPES:SS5I:YEIU:7XEB:ZU37:UX27:6RTI
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 17
 Goroutines: 29
 System Time: 2016-07-05T20:39:53.984107317Z
 EventsListeners: 1
No Proxy: *.local, 169.254/16
Username: gavinmroy
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.): This only happens when the DNS result is truncated, as can happen with auth.docker.io.

Please see https://forums.docker.com/t/auth-docker-io-on-192-168-65-1-53-no-such-host/16801/4 - It’s worth noting that there are multiple posts in the Docker for Mac forum about this issue.

I was hoping the forums were visible enough for reports there, but I’m guessing I should have gone here first.

Steps to reproduce the issue:

  1. docker pull alpine

Describe the results you received: Docker fails to pull the image due to DNS failure:

$ docker pull alpine
Using default tag: latest
Error response from daemon: Get https://registry-1.docker.io/v2/library/alpine/manifests/latest1: Get https://auth.docker.io/token?scope=repository%3Alibrary%2Falpine%3Apull&service=registry.docker.io: dial tcp: lookup auth.docker.io on 192.168.65.1:53: no such host

Describe the results you expected: Docker pulls the image.

Additional information you deem important (e.g. issue happens only occasionally):

  • This has something to do with the EDNS record sizes returned for auth.docker.io and the new Alpine 3.4 based moby image. Previous versions of moby on my network do not have the issue.
  • The error I get is a failure to connect to the embedded DNS server in Docker via TCP. On my network, I am getting truncated EDNS responses, which should cause the DNS resolver to fail over to TCP queries.
  • Alpine 3.4 does not have a problem with truncated responses on its own, so I’m guessing the main issue here is a lack of full EDNS/TCP support in the embedded Docker DNS server. This seems to break docker login, docker pull, or anything else that needs to resolve auth.docker.io, since the record seems to be longer than the embedded DNS server can handle.
  • When I am in the moby instance, I can not resolve auth.docker.io with the default settings:
moby:~# nslookup auth.docker.io
;; Truncated, retrying in TCP mode.
;; Connection to 192.168.65.1#53(192.168.65.1) for auth.docker.io failed: connection refused.

If I change the DNS server to 8.8.8.8, I do not have a problem:

moby:~# cat /etc/resolv.conf 
search local
nameserver 8.8.8.8
moby:~# nslookup auth.docker.io
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
auth.docker.io  canonical name = elb-registry.us-east-1.aws.dckr.io.
elb-registry.us-east-1.aws.dckr.io      canonical name = us-east-1-elbregis-10fucsvj1tcgy-133821800.us-east-1.elb.amazonaws.com.
Name:   us-east-1-elbregis-10fucsvj1tcgy-133821800.us-east-1.elb.amazonaws.com
Address: 52.73.165.108
Name:   us-east-1-elbregis-10fucsvj1tcgy-133821800.us-east-1.elb.amazonaws.com
Address: 52.203.219.86
Name:   us-east-1-elbregis-10fucsvj1tcgy-133821800.us-east-1.elb.amazonaws.com
Address: 52.71.80.248
Name:   us-east-1-elbregis-10fucsvj1tcgy-133821800.us-east-1.elb.amazonaws.com
Address: 54.172.251.194
Name:   us-east-1-elbregis-10fucsvj1tcgy-133821800.us-east-1.elb.amazonaws.com
Address: 52.71.245.229
Name:   us-east-1-elbregis-10fucsvj1tcgy-133821800.us-east-1.elb.amazonaws.com
Address: 54.164.225.120
Name:   us-east-1-elbregis-10fucsvj1tcgy-133821800.us-east-1.elb.amazonaws.com
Address: 52.72.94.105
Name:   us-east-1-elbregis-10fucsvj1tcgy-133821800.us-east-1.elb.amazonaws.com
Address: 54.174.255.71

I’m not quite sure why the results are being truncated, as the resolver for my laptop has a max-udp-size and edns-udp-size of 4096. While that is material to the cause of the truncated EDNS UDP packets, it is less of a problem than the fact that Docker’s embedded DNS server running at 192.168.65.1 doesn’t seem to be listening for DNS queries via TCP.
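For readers unfamiliar with how a resolver notices truncation, here is a minimal Python sketch of the wire-level pieces involved, based on the header layout in RFC 1035. It is not Docker's or Alpine's resolver code. The function names (build_query, is_truncated, frame_for_tcp) and the default query ID are hypothetical, chosen only for illustration.

```python
import struct

TC_FLAG = 0x0200  # "truncated" bit in the 16-bit DNS header flags field


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def is_truncated(response: bytes) -> bool:
    """Return True if the TC bit is set in a DNS message header."""
    if len(response) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_FLAG)


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte length, as DNS over TCP requires."""
    return struct.pack("!H", len(message)) + message
```

A client that sees is_truncated(reply) return True on a UDP reply is expected to resend the same query over TCP, framed as in frame_for_tcp. That retry is the step that fails here because nothing accepts the TCP connection.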

I would assume that if 192.168.65.1 were able to resolve DNS via TCP (which is the fallback behavior when a UDP DNS response comes back truncated), it would be connectable on port 53:

moby:~# exit

Welcome to Moby alpha
Kernel 4.4.14-moby on an x86_64 (/dev/ttyS0)

                        ##         .
                  ## ## ##        ==
               ## ## ## ## ##    ===
           /"""""""""""""""""___/ ===
      ~~~ {~~ ~~~~ ~~~ ~~~~ ~~~ ~ /  ===- ~~~
           \______ o           __/
             \    \         __/
              \____\_______/

moby login: root
Welcome to the Moby alpha, based on Alpine Linux.
moby:~# telnet 192.168.65.1 53
telnet: can't connect to remote host (192.168.65.1): Connection refused
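The telnet check above can also be scripted. The following is a small Python sketch of the same probe. The helper name tcp_port_open and the 2-second default timeout are assumptions made for this example, not part of any Docker tooling.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False
```

Calling tcp_port_open("192.168.65.1", 53) from inside the moby VM would be expected to return False in the situation described in this report, matching the "Connection refused" message from telnet.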

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 16
  • Comments: 31 (9 by maintainers)

Most upvoted comments

fixed 👍

$ docker run -it --rm --privileged --pid=host debian nsenter -t 1 -m -u -n -i sh
$ cat /etc/resolv.conf
search local
nameserver 192.168.65.1
nameserver 192.168.65.3

# change to 8.8.8.8
$ cat /etc/resolv.conf
search local
nameserver 8.8.8.8

No need to restart the Docker engine; it works fine now.

@SydOps, as a temporary workaround you can set your OS X to use Google’s public DNS 8.8.8.8. Works for me. That’s how I was able to get the alpine container. 😃

@djs55, here’s the info you requested:

docker version
Client:
 Version:      1.12.1-rc1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   7889dc7
 Built:        Fri Aug 12 18:35:53 2016
 OS/Arch:      darwin/amd64
 Experimental: true

Server:
 Version:      1.12.1-rc1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   7889dc7
 Built:        Fri Aug 12 18:35:53 2016
 OS/Arch:      linux/amd64
 Experimental: true

Verifying the bug happens:

docker run -it alpine:3.2 sh
Unable to find image 'alpine:3.2' locally
3.2: Pulling from library/alpine
bfc185be0245: Pulling fs layer
docker: error pulling image configuration: Get https://dseasb33srnrn.cloudfront.net/registry-v2/docker/registry/v2/blobs/sha256/49/4933271a21f1a3eb183cae296ce2f405c8e0852fb4c90eae577b430393d7ef36/data?Expires=1471566573&Signature=G6TdXQg1c-oPj0KuyrDtirqgM~clW-ohpzLOEBNij-Q8EvNtheaaxJWb4jZi~4fn07iWu1yddXwCMPnh23lRy6coXRpiyoEYT2AK6O7j57ewG~5QrK5G61TNSvmc-CzvXAIUwFpDw81WQSAzSoLt~VlwDc4HSxnxRY6fiKatZ2k_&Key-Pair-Id=APKAJECH5M7VWIS5YZ6Q: dial tcp: lookup dseasb33srnrn.cloudfront.net on 192.168.65.3:53: cannot unmarshal DNS message.
See 'docker run --help'.

Querying from OS X works:

# dig +tcp @192.168.88.1 google.com

; <<>> DiG 9.8.3-P1 <<>> +tcp @192.168.88.1 google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52163
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 4

;; QUESTION SECTION:
;google.com.            IN  A

;; ANSWER SECTION:
google.com.     299 IN  A   216.58.212.14

;; AUTHORITY SECTION:
google.com.     27901   IN  NS  ns3.google.com.
google.com.     27901   IN  NS  ns2.google.com.
google.com.     27901   IN  NS  ns1.google.com.
google.com.     27901   IN  NS  ns4.google.com.

;; ADDITIONAL SECTION:
ns3.google.com.     45642   IN  A   216.239.36.10
ns2.google.com.     27901   IN  A   216.239.34.10
ns1.google.com.     27901   IN  A   216.239.32.10
ns4.google.com.     27901   IN  A   216.239.38.10

;; Query time: 670 msec
;; SERVER: 192.168.88.1#53(192.168.88.1)
;; WHEN: Fri Aug 19 03:10:20 2016
;; MSG SIZE  rcvd: 180

Going into the container:

# docker run -it alpine sh
/ # apk update
fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/community/x86_64/APKINDEX.tar.gz

v3.4.3-3-g543f7af [http://dl-cdn.alpinelinux.org/alpine/v3.4/main]
v3.4.2-11-g9b41a63 [http://dl-cdn.alpinelinux.org/alpine/v3.4/community]
OK: 5968 distinct packages available
/ # apk add drill
(1/2) Installing ldns (1.6.17-r3)
(2/2) Installing drill (1.6.17-r3)
Executing busybox-1.24.2-r9.trigger
OK: 5 MiB in 13 packages
/ # drill -t google.com @192.168.65.1
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 24221
;; flags: qr rd ra ; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 4
;; QUESTION SECTION:
;; google.com.  IN  A

;; ANSWER SECTION:
google.com. 238 IN  A   216.58.212.14

;; AUTHORITY SECTION:
google.com. 27840   IN  NS  ns1.google.com.
google.com. 27840   IN  NS  ns4.google.com.
google.com. 27840   IN  NS  ns3.google.com.
google.com. 27840   IN  NS  ns2.google.com.

;; ADDITIONAL SECTION:
ns1.google.com. 27840   IN  A   216.239.32.10
ns4.google.com. 27840   IN  A   216.239.38.10
ns3.google.com. 45581   IN  A   216.239.36.10
ns2.google.com. 27840   IN  A   216.239.34.10

;; Query time: 5 msec
;; SERVER: 192.168.65.1
;; WHEN: Fri Aug 19 00:11:20 2016
;; MSG SIZE  rcvd: 180
/ #

Hi There,

Are there any updates on this issue?

I am having a similar experience running Docker 1.12.3 for Mac. DNS resolution does not seem to be reliable. It may work for some period of time but then becomes intermittent. Restarting Docker seems to fix the issue.

For now, I have just changed DNS servers to Google’s as @mitsuhiko suggested. Seems to work for the moment but not sure if this is because of the restart required to action his suggestion (which I would do in any case to get DNS working) or the different DNS servers.

Here’s a quick update on where we are (as of beta 28, due to be released this week)

What should work:

  • fallback to TCP should work using the same server-selection logic as UDP
  • the Mac’s “System Preferences -> Network -> Advanced -> DNS” server list should be used as upstream servers (with caveats, see below). This uses the system config database rather than the legacy /etc/resolv.conf file on the host.
  • search domain configuration is forwarded to the VM on start (which does also mean an application restart is needed to change it)
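To illustrate what "fallback to TCP using the same server-selection logic as UDP" could look like, here is a hedged Python sketch. It is not the Docker for Mac implementation. The function names and the idea of passing the UDP and TCP senders in as callables are assumptions made so the logic can be read and tested without a network.

```python
from typing import Callable, Optional, Sequence

# A transport takes (server, query bytes) and returns reply bytes, or None on failure.
Transport = Callable[[str, bytes], Optional[bytes]]


def response_truncated(response: bytes) -> bool:
    """True if the TC bit (0x02 in the third header byte) is set."""
    return len(response) >= 12 and bool(response[2] & 0x02)


def resolve_with_fallback(
    query: bytes,
    servers: Sequence[str],
    send_udp: Transport,
    send_tcp: Transport,
) -> Optional[bytes]:
    """Try each server over UDP; on truncation, retry the same server over TCP."""
    for server in servers:
        reply = send_udp(server, query)
        if reply is None:
            continue  # no answer from this server, move on to the next one
        if not response_truncated(reply):
            return reply
        tcp_reply = send_tcp(server, query)
        if tcp_reply is not None:
            return tcp_reply
    return None
```

The point of the design is that the TCP retry goes to the same server that produced the truncated UDP answer, so both protocols walk the server list in the same order.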

What we’re still working on:

  • if there are multiple upstream DNS servers (especially if more than 3) then the wrong servers are sometimes queried and this can cause lookup failures.
  • configurations where queries for a particular domain (e.g. *.corp) are sent to specific servers

Note that the /etc/resolv.conf file in the VM contains multiple virtual DNS IP addresses, but these are mapped onto the servers in “System Preferences -> Network -> Advanced -> DNS”. If you change the system preferences settings then it should affect lookups in the VM, even though the /etc/resolv.conf file doesn’t change.
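When comparing what the VM sees with what the Mac is configured to use, it can help to extract the entries from resolv.conf-style text programmatically. The sketch below is a simplified Python parser written for this thread. parse_resolv_conf is a hypothetical helper, and it handles only the nameserver and search keywords.

```python
from typing import Dict, List


def parse_resolv_conf(text: str) -> Dict[str, List[str]]:
    """Collect 'nameserver' and 'search' entries from resolv.conf-style text."""
    result: Dict[str, List[str]] = {"nameserver": [], "search": []}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comment lines
        fields = line.split()
        keyword, values = fields[0], fields[1:]
        if keyword == "nameserver" and values:
            result["nameserver"].append(values[0])
        elif keyword == "search":
            result["search"] = values  # a later search line replaces earlier ones
    return result
```

Fed the file contents shown earlier in this thread, the parser would list the virtual addresses (such as 192.168.65.1), not the upstream servers they are mapped to, which is why the file alone does not reveal which real server answered a query.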

I’m hoping to address some of the remaining issues over the next few beta cycles – I’ll keep you posted. Thanks for your reports and your patience!

As long as you have a local image (debian for example) I think you can connect to moby with the following command.

docker run -it --privileged --pid=host debian nsenter -t 1 -m -u -n -i sh

I think I am hitting the same thing here, but I’m not entirely sure yet about the mechanics. I noticed that if I set 8.8.8.8 as the DNS server on my Mac (Docker for Mac), the VM can resolve just fine. If I use my router’s IP, which runs a DNS server, the VM fails to resolve.

@djs55 There’s another use-case to this (although I’m not sure how relevant it is to this particular issue). If something (say, a VPN client) updates /etc/resolv.conf on the host machine, Docker is blissfully unaware of that change. Currently I’m “fixing” this issue by just logging into moby and forcing the VPN DNS as first in the list, but it would be awesome if it would poll for changes on a regular basis.

@favoretti this should work, modulo the issues that @djs55 mentioned - the DNS servers on Moby are just proxies to the Mac, which will use the latest information. If you have a specific feature request/issue around that that is not working, I recommend opening an issue on https://github.com/docker/for-mac/issues with details of exactly how to replicate.

Hi,

We’re having this issue while trying to log in to a private Docker registry. Will this PR make it to 1.12.1? Will you please add the roadmap label to this one? 😃