kind: Private DNS Fails on 0.8 with systemd-resolved

What happened: Deployments that reference a private registry that is only resolvable via a private DNS server still fail in 0.8. It is not clear to me what is serving DNS on the 172.18.0.1 interface. The systemd-resolved stub resolver listens on 127.0.0.53, and its authors are adamant that it should never be exposed on other interfaces. Relevant logs are below.

What you expected to happen: The new DNS approach in 0.8 would allow k8s to successfully resolve my private hosts and pull the container images.

How to reproduce it (as minimally and precisely as possible): Configure systemd-resolved to use a DNS server on a private network/VPN, where that same private network hosts a container registry.
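A minimal sketch of that configuration, assuming a WireGuard link named wg0 and a private DNS server at 10.10.0.2 (both illustrative; my real setup lives in systemd-networkd .network files):

# Point systemd-resolved at the private DNS server for the VPN link
# (equivalent to DNS=/Domains= in the wg0 .network file):
resolvectl dns wg0 10.10.0.2
resolvectl domain wg0 '~myprivatedomain.com'

# The host now resolves the registry through the 127.0.0.53 stub:
drill @127.0.0.53 registry.myprivatedomain.com

# Create a cluster and deploy anything that pulls from that registry:
kind create cluster
kubectl run bifrost --image=registry.myprivatedomain.com/project/core/bifrost:latest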

Anything else we need to know?: I can see that the node is pointed at the bridge interface (172.18.0.1), but the DNS server (proxy?) on that IP still does not resolve my private hosts.
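A rough way to see what the node is actually pointed at (the node name and commands below are illustrative; use kind get nodes for the real name, and iptables-save may or may not be present in the node image):

# The node's /etc/resolv.conf names the docker bridge gateway, not 127.0.0.53:
docker exec kind-control-plane cat /etc/resolv.conf

# If present, the NAT rules show where queries to that gateway get redirected
# (Docker's embedded DNS on a high port):
docker exec kind-control-plane iptables-save -t nat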

Environment: Arch-based Linux (Manjaro) connecting to a remote network via a WireGuard VPN. The VPN (and its DNS) are configured in systemd-networkd and systemd-resolved.

  • kind version: (use kind version): kind v0.8.0 go1.14.2 linux/amd64
  • Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-30T20:19:45Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info): Server Version: 19.03.8-ce
  • OS (e.g. from /etc/os-release):
NAME="Manjaro Linux"
ID=manjaro
ID_LIKE=arch
PRETTY_NAME="Manjaro Linux"

Pod events showing the failed image pull

  Warning  Failed     23m (x4 over 25m)    kubelet, ricks-local-control-plane  Failed to pull image "registry.myprivatedomain.com/project/core/bifrost:latest": rpc error: code = Unknown desc = failed to pull and unpack image "registry.myprivatedomain.com/project/core/bifrost:latest": failed to resolve reference "registry.myprivatedomain.com/project/core/bifrost:latest": failed to do request: Head https://registry.myprivatedomain.com/v2/project/core/bifrost/manifests/latest: dial tcp: lookup registry.myprivatedomain.com on 172.18.0.1:53: no such host

Checking DNS from my local host

➜ drill @127.0.0.53 registry.myprivatedomain.com
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 9308
;; flags: qr rd ra ; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; registry.myprivatedomain.com.	IN	A

;; ANSWER SECTION:
registry.myprivatedomain.com.	60	IN	A	10.192.5.119
registry.myprivatedomain.com.	60	IN	A	10.192.10.209


Most upvoted comments

For future Googlers of this issue:

Workarounds

I have found either of these two approaches to work:

  1. Remove the symlinked /etc/resolv.conf and create your own file containing your nameserver, then launch kind. This is a supported configuration: in this mode, systemd-resolved reads /etc/resolv.conf instead of managing it. Since /etc/resolv.conf only really exists for legacy reasons in the eyes of systemd-resolved, this is a reasonable approach.
cp /etc/resolv.conf /tmp/resolv.conf
echo "nameserver 10.10.0.2" >> /tmp/resolv.conf
sudo rm /etc/resolv.conf
sudo mv /tmp/resolv.conf /etc/
  2. Add a dns entry to your /etc/docker/daemon.json. I am not sure why this works, but it does. In a user-defined bridge network, like the one kind uses at 0.8+, a container's /etc/resolv.conf points at the bridged interface, and iptables then redirects those queries to a DNS server on some high random port. This is Docker's embedded DNS server, and it acts according to some undocumented magic. At the moment this works; there is no way to know whether it is intended behavior or whether it will last forever. (A quick verification sketch follows these steps.)

(if you already have an /etc/docker/daemon.json file):

cat /etc/docker/daemon.json | jq --arg ns "10.10.0.2" '. + { dns: [$ns] }' > /tmp/daemon.json && sudo mv /tmp/daemon.json /etc/docker/daemon.json

If you don’t have an /etc/docker/daemon.json file just create one with these contents:

{
  "dns": [
    "10.10.0.2"
  ]
}

then restart Docker: sudo systemctl restart docker
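A quick, illustrative check that the daemon.json workaround took effect (the busybox image and the "kind" docker network name are assumptions on my part):

# A container attached to the kind network should now resolve the private
# registry via Docker's embedded DNS, which forwards to 10.10.0.2:
docker run --rm --network kind busybox nslookup registry.myprivatedomain.com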

@BenTheElder - Thanks for looking into this. I agree with your assessment, even though I don’t like it. 😃

To summarize:

  • The systemd-resolved loopback resolver (127.0.0.53) is not directly reachable from any docker container, because it listens on the host's loopback interface, which lives in the host's network namespace and cannot be reached from inside a container (see the small illustration after this list).
    • We cannot point the containers at it directly, and adding a config option to do so won't change that.
  • The systemd-resolved loopback resolver should be reachable indirectly via the embedded DNS listener <-> dockerd on the host.
    • dockerd does not appear to respect the systemd-resolved settings on the host.
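To illustrate the first point with a throwaway container (busybox is an arbitrary choice): 127.0.0.53 inside the container is the container's own loopback, not the host's, so a query aimed at systemd-resolved never leaves the container and is expected to fail:

docker run --rm busybox nslookup registry.myprivatedomain.com 127.0.0.53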

To fix that last point, we’ll need work done upstream like https://github.com/moby/moby/issues/38243.

It doesn't make sense for kind to do anything else about this; we're not responsible for the docker daemon on the host, nor for the system resolver and its config. There isn't any reasonable workaround for this within kind.