rancher: Pods cannot resolve DNS in Rancher 2.0 cluster.

**Rancher versions:** Rancher: v2.0.2; Kubernetes (if applicable): v1.10.1

**Docker version:** Docker version 17.03.2-ce, build f5ec1e2

**Operating system and kernel:** (cat /etc/os-release, uname -r preferred)

 NAME="CentOS Linux"
 VERSION="7 (Core)"
 ID="centos"
 ID_LIKE="rhel fedora"
 VERSION_ID="7"
 PRETTY_NAME="CentOS Linux 7 (Core)"
 ANSI_COLOR="0;31"
 CPE_NAME="cpe:/o:centos:centos:7"
 HOME_URL="https://www.centos.org/"
 BUG_REPORT_URL="https://bugs.centos.org/"
 CENTOS_MANTISBT_PROJECT="CentOS-7"
 CENTOS_MANTISBT_PROJECT_VERSION="7"
 REDHAT_SUPPORT_PRODUCT="centos"
 REDHAT_SUPPORT_PRODUCT_VERSION="7"

**Type/provider of hosts:** (VirtualBox/Bare-metal/AWS/GCE/DO) Bare-metal

**Setup details:** (single node rancher vs. HA rancher, internal DB vs. external DB) HA rancher

**Environment Template:** (Cattle/Kubernetes/Swarm/Mesos)

Steps to Reproduce:

  1. Added the nodes manually from the Rancher 2.0 console with the following config: K8s version v1.10.1-rancher2-1, network provider Canal.
  2. I have tried the K8s DNS debugging steps from https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/ but cannot pinpoint the issue (see the log-check sketch after this list).
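
A sketch of the log check from that guide, assuming the default kube-dns deployment labels (k8s-app=kube-dns) and container names:

 # Check each kube-dns container for errors
 kubectl logs --namespace=kube-system -l k8s-app=kube-dns -c kubedns
 kubectl logs --namespace=kube-system -l k8s-app=kube-dns -c dnsmasq
 kubectl logs --namespace=kube-system -l k8s-app=kube-dns -c sidecar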

Results:

 kubectl get ep kube-dns --namespace=kube-system
 NAME       ENDPOINTS                                              AGE
 kube-dns   10.42.0.13:53,10.42.2.2:53,10.42.0.13:53 + 1 more...   10d

  kubectl get svc --namespace=kube-system
 NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
 kube-dns   ClusterIP   10.43.0.10   <none>        53/UDP,53/TCP   10d


kubectl get pods --namespace=kube-system
NAME                                  READY     STATUS    RESTARTS   AGE
canal-fkqxp                           3/3       Running   0          41m
canal-qtlxb                           3/3       Running   0          41m
canal-wv6t5                           3/3       Running   0          41m
kube-dns-787ddf5554-78vkj             3/3       Running   0          9d
kube-dns-787ddf5554-jkdr8             3/3       Running   0          3h
kube-dns-autoscaler-6c4b786f5-xlqd4   1/1       Running   0          10d

Busybox pod output

/ # nslookup kubernetes.default
 Server:    10.43.0.10
 Address 1: 10.43.0.10
 
 nslookup: can't resolve 'kubernetes.default'


 / # cat /etc/resolv.conf
nameserver 10.43.0.10
search default.svc.cluster.local svc.cluster.local cluster.local enterprisenet.org nielsen.com nielsenmedia.com
options ndots:5
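
Given the ndots:5 setting and the long search list above, one additional check (not part of the original report) is to look up the fully qualified service name with a trailing dot, which bypasses the search list, directly against the kube-dns service IP, e.g. from a busybox:1.28 pod:

 / # nslookup kubernetes.default.svc.cluster.local. 10.43.0.10

If the FQDN resolves but the short name does not, the problem is in search-path handling rather than in reaching kube-dns itself.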

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 26 (7 by maintainers)

Most upvoted comments

Okay cool. I haven't run any tests on it, but possibly:

# Host-facing interface (may be something other than eth0)
firewall-cmd --permanent --zone=trusted --add-interface=eth0
# Flannel VXLAN interface
firewall-cmd --permanent --zone=trusted --add-interface=flannel.1
# Your internal network
firewall-cmd --permanent --zone=trusted --add-source=192.168.1.0/24
# Pod CIDR
firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16
# Service CIDR
firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16
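
These rules are all added with --permanent, so they only take effect after a reload (or the next firewalld restart); the follow-up step would be:

# Apply the permanent rules without restarting firewalld
firewall-cmd --reload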

For all testers, please don't use busybox due to https://github.com/docker-library/busybox/issues/48 / https://bugs.busybox.net/show_bug.cgi?id=11161.

The only reliable way to test DNS using busybox is to use busybox:1.28, as the bug is not present in that version:

kubectl run busybox --image=busybox:1.28 --rm -ti --restart=Never -- nslookup kubernetes.default

Alternatively, use another image that has nslookup, or just ping (although the ping won't get a reply, the name will still resolve; see the sketch below). The original issue here was fixed by disabling firewalld, and there is a separate known issue with upgrading to v2.0.7 and higher when system namespaces were moved into a project before upgrading.
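
For example, the ping variant could look like this (hypothetical pod name; if DNS works, ping prints the resolved cluster IP even though it gets no replies before the deadline expires):

kubectl run pingtest --image=busybox:1.28 --rm -ti --restart=Never -- ping -c 1 -w 5 kubernetes.default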

You could try this article (in Chinese; it covers adding static DNS entries to CoreDNS) to add hosts to the cluster's CoreDNS: https://linux-rtdocs.readthedocs.io/en/latest/k8s/coreDNS添加静态DNS的方法/
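
A common way to do that is the CoreDNS hosts plugin; below is a minimal sketch, assuming the cluster runs CoreDNS with its Corefile stored in a ConfigMap named coredns in kube-system (this cluster appears to run kube-dns, so names may differ), with a placeholder hostname and IP:

# Open the Corefile for editing (assumes a ConfigMap named "coredns")
kubectl -n kube-system edit configmap coredns

# Inside the ".:53 { ... }" block, add a hosts section along these lines
# (placeholder IP and hostname):
#
#   hosts {
#       192.168.1.50 myhost.example.internal
#       fallthrough
#   }
#
# Then restart the CoreDNS pods (or wait for an automatic reload if the
# reload plugin is enabled) so the change takes effect.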

I have the same problem on nodes connected through a VPN (tinc).

I've seen weird issues with firewalld enabled; can you disable it?
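
For reference, a minimal sketch of disabling firewalld on CentOS 7 (run on each node):

# Stop firewalld now and prevent it from starting at boot
systemctl stop firewalld
systemctl disable firewalld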