coredns: k8s with CoreDNS that cannot access the internet

  • the version of CoreDNS: k8s.gcr.io/coredns:1.2.6
  • your Corefile
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       upstream
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}

  • logs, if applicable
 [ERROR] plugin/errors: 2 7402652896690598563.3431075457317150843. HINFO: unreachable backend: read udp 10.244.0.69:49291->114.114.114.114:53: i/o timeout
 [ERROR] plugin/errors: 2 7402652896690598563.3431075457317150843. HINFO: unreachable backend: read udp 10.244.0.69:51219->114.114.114.114:53: i/o timeout
  • My /etc/resolv.conf file content:
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 114.114.114.114
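
CoreDNS pods normally run with dnsPolicy: Default, so they inherit this node-level /etc/resolv.conf; that is how 114.114.114.114 ends up as CoreDNS's upstream. A quick way to confirm this (the deployment name here is the kubeadm default):

# print the DNS policy of the CoreDNS Deployment
kubectl -n kube-system get deployment coredns -o jsonpath='{.spec.template.spec.dnsPolicy}'
# expected output: Default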

This error happens when an application in my k8s cluster tries to access another service that is also inside the same k8s cluster (the service domain is *.namespace.svc.cluster.local).

The machine is in an offline environment, which means that when I try to ping 114.114.114.114 (a public DNS server in China):

[root@k8s ~]# ping 114.114.114.114
PING 114.114.114.114 (114.114.114.114) 56(84) bytes of data.

there is no reply at all, so that must be the cause of the problem.
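
Note that ping only tests ICMP; to confirm that DNS on port 53 is unreachable as well, the server can be queried directly (this assumes dig is installed on the node):

# query 114.114.114.114 directly on port 53 with a short timeout
dig @114.114.114.114 kubernetes.io +time=2 +tries=1
# in an offline environment this ends with:
#   ;; connection timed out; no servers could be reached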

However, the doc here says:

upstream [ADDRESS…] defines the upstream resolvers used for resolving services that point to external hosts (aka External Services, aka CNAMEs). If no ADDRESS is given, CoreDNS will resolve External Services against itself. ADDRESS can be an IP, an IP:port, or a path to a file structured like resolv.conf.
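
For illustration, giving the directive an explicit address would look like the sketch below (8.8.8.8 is only a placeholder, not something reachable from an offline environment):

kubernetes cluster.local in-addr.arpa ip6.arpa {
   pods insecure
   # resolve External Services (CNAMEs) against 8.8.8.8 instead of CoreDNS itself
   upstream 8.8.8.8
   fallthrough in-addr.arpa ip6.arpa
}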

So logically CoreDNS should resolve my in-cluster requests by itself, so why does this happen?

After that I tried removing the proxy . /etc/resolv.conf line, and also clearing all of the data in /etc/resolv.conf, but neither worked.

Can someone help me? Thanks!

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 26 (15 by maintainers)

Most upvoted comments

The machine is in an offline environment

Since you are in an offline environment and no upstream servers can be reached (by design), remove the proxy plugin and the upstream directive in the kubernetes plugin … I would also remove the loop plugin and the fallthrough directive, since there is no forwarding that can occur …

.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
    }
    prometheus :9153
    cache 30
    reload
    loadbalance
}
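
To apply this, edit the coredns ConfigMap; since the Corefile above keeps the reload plugin, CoreDNS picks up the change on its own after a short delay. The resource names and label below are the kubeadm defaults:

kubectl -n kube-system edit configmap coredns
# only needed if you also drop the reload plugin:
kubectl -n kube-system delete pod -l k8s-app=kube-dns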

There was a bug where unreachable upstreams caused the loop plugin to report false-positive loops (the random-name HINFO queries in your log are the loop plugin's self-check probes), but that was fixed in 1.2.6.

The NXDOMAIN responses in the log are because those names don't exist. For example, orderer.fabric-net.svc.cluster.local.svc.cluster.local. does not exist, so it returns NXDOMAIN. orderer.fabric-net.svc.cluster.local. looks more correct; that will return NOERROR if the service exists.
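
A quick way to check such a name from inside the cluster (busybox:1.28 is used here because nslookup is broken in newer busybox images):

kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never \
  -- nslookup orderer.fabric-net.svc.cluster.local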

You’ll see a series of weird lookups like this from your clients in Kubernetes, due to the search domains in a k8s cluster. The client tries the name in each search domain before trying the name by itself… e.g.

peer1-org1.fabric-net.svc.cluster.local.fabric-net.svc.cluster.local. ->  NXDOMAIN
peer1-org1.fabric-net.svc.cluster.local.svc.cluster.local. ->  NXDOMAIN
peer1-org1.fabric-net.svc.cluster.local.cluster.local. ->  NXDOMAIN
peer1-org1.fabric-net.svc.cluster.local. ->  NOERROR

The pattern above is normal to see in Kubernetes, and would not cause the failures you are seeing.
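
The underlying reason is the pod's resolver configuration. A typical pod /etc/resolv.conf in a kubeadm cluster looks like the sketch below (the nameserver IP and the first search domain are illustrative); with ndots:5, any name containing fewer than five dots is tried against each search domain before being tried as-is:

nameserver 10.96.0.10
search fabric-net.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

Appending a trailing dot (orderer.fabric-net.svc.cluster.local.) marks the name as fully qualified, so the search list is skipped entirely.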

/close

If this turns out to be a CoreDNS bug or issue, I'm happy for you to reopen.