kubernetes: kube-dns does not work due to DNS server lookup loop (/etc/resolv.conf contains 127.0.0.01 as the upstream nameserver)

kubedns-masq and sidecar containers crash after doing nslookup

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug @kubernetes/sig-network-bugs /sig area/dns

What happened: The dnsmasq and sidecar containers crash after doing nslookup on names that do not exist in kubedns.

What you expected to happen: The containers should not crash when a lookup fails.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?: See details below.

Environment:

  • Kubernetes version (use kubectl version): 1.6.4
  • Cloud provider or hardware configuration: Local machine running a K8s cluster on VMware Workstation 12 Pro (single node)
  • OS (e.g. from /etc/os-release): Ubuntu 17.04
  • Kernel (e.g. uname -a):

Linux ubuntumaster 4.10.0-19-generic #21-Ubuntu SMP Thu Apr 6 17:04:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools:
  • Others:

I am facing a strange issue. My setup is Kubernetes 1.6.4 on Ubuntu 17.04. I installed kube-dns using the YAML files from https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns/

Step 1

I then launch busybox using kubectl run -i --tty busybox --image=busybox -- sh

From that shell, nslookup kubernetes works. However, when I check the Docker logs for my sidecar container, I see the following:

ERROR: logging before flag.Parse: I0721 17:15:25.067730 1 main.go:48] Version v1.14.3-4-gee838f6
ERROR: logging before flag.Parse: I0721 17:15:25.067981 1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
ERROR: logging before flag.Parse: I0721 17:15:25.068191 1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
ERROR: logging before flag.Parse: I0721 17:15:25.068398 1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
ERROR: logging before flag.Parse:

Step 2

If I run nslookup against a name (say, nslookup ABCD) that I know does not exist in kubedns (as an A record), I see the following logs in my dnsmasq container:

I0718 17:26:33.738368 1 nanny.go:108] dnsmasq[13]: Maximum number of concurrent DNS queries reached (max: 150)
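The issue title attributes this behavior to a lookup loop caused by a loopback address being listed as the upstream nameserver in /etc/resolv.conf. A minimal Python sketch for checking that condition follows, assuming the resolv.conf contents are available as text. The function name loopback_nameservers and the sample addresses are hypothetical and not part of kube-dns or dnsmasq.

```python
import ipaddress


def loopback_nameservers(resolv_conf_text):
    """Return the nameserver entries in resolv.conf text that point at loopback."""
    flagged = []
    for raw in resolv_conf_text.splitlines():
        # Drop comments (resolv.conf accepts both '#' and ';') and whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue
        try:
            addr = ipaddress.ip_address(fields[1])
        except ValueError:
            # Not a parseable address (for example a zero-padded octet); skip it.
            continue
        if addr.is_loopback:
            flagged.append(fields[1])
    return flagged


# Hypothetical sample content, not taken from the reporter's machine.
sample = "nameserver 127.0.0.1\nnameserver 10.0.0.2\n"
print(loopback_nameservers(sample))
```

A non-empty result suggests that a DNS forwarder inheriting this file could end up forwarding unknown names back to a local resolver, which would be consistent with the "Maximum number of concurrent DNS queries reached" message above.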

At the same time, I see the following in the sidecar:

ERROR: logging before flag.Parse: I0720 02:09:37.975054 1 main.go:48] Version v1.14.3-4-gee838f6
ERROR: logging before flag.Parse: I0720 02:09:37.975140 1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
ERROR: logging before flag.Parse: I0720 02:09:37.975160 1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
ERROR: logging before flag.Parse: I0720 02:09:37.975200 1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
ERROR: logging before flag.Parse: W0720 02:12:50.090595 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:36473->127.0.0.1:53: i/o timeout
ERROR: logging before flag.Parse: W0720 02:13:03.607740 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:56334->127.0.0.1:53: i/o timeout
ERROR: logging before flag.Parse: W0720 02:13:10.609651 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:39976->127.0.0.1:53: i/o timeout
ERROR: logging before flag.Parse: W0720 02:13:23.644035 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:57226->127.0.0.1:53: i/o timeout
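To separate the informational startup entries from the dnsmasq timeout warnings in a dump like the one above, the log can be split on its repeated "ERROR: logging before flag.Parse:" prefix. The Python sketch below assumes the glog-style layout visible in these entries (severity letter, MMDD date, time, thread id, source file and line). The names parse_entries and count_by_severity are hypothetical helpers and not part of the sidecar.

```python
import re
from collections import Counter

# glog-style header: severity letter, MMDD, time, thread id, file:line, then "] message".
GLOG_RE = re.compile(
    r"(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<tid>\d+) (?P<src>[\w.]+:\d+)\] (?P<msg>.*)"
)
# The prefix precedes every entry, so it doubles as a separator even when
# the dump has lost its line breaks.
PREFIX_RE = re.compile(r"ERROR: logging before flag\.Parse:\s*")


def parse_entries(log_text):
    """Split a sidecar log dump into (severity, source, message) tuples."""
    entries = []
    for chunk in PREFIX_RE.split(log_text):
        m = GLOG_RE.match(chunk.strip())
        if m:
            entries.append((m.group("sev"), m.group("src"), m.group("msg").strip()))
    return entries


def count_by_severity(entries):
    """Count entries per severity letter (I, W, E, F)."""
    return Counter(sev for sev, _, _ in entries)
```

For example, sum(1 for _, _, msg in parse_entries(text) if msg.endswith("i/o timeout")) counts how many times the sidecar failed to read metrics from dnsmasq.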

Step 3

Here is the bummer: after 5 to 10 minutes, both my sidecar and dnsmasq containers crash and new containers are created.

A few other details about my K8s cluster: I am using VMware Workstation 12 Pro to run the cluster on a single node, I have enabled RBAC, and I am using client certificates for authentication.

SOS. Can someone point me in the right direction? I have spent a lot of time trying to figure this one out. Thanks in advance.

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 17 (8 by maintainers)

Most upvoted comments