rancher: Dnsmasq running on the host causes the Rancher DNS container to fail recursive DNS queries
Network services weren’t coming up, and "timeout getting IP" is a generic error message. It turns out dnsmasq was running on the agent host and bound to port 53.
metadata-1 logs
8/2/2017 7:21:33 PM time="2017-08-03T02:21:33Z" level=info msg="Subscribing to events"
8/2/2017 7:21:33 PM time="2017-08-03T02:21:33Z" level=fatal msg="Failed to subscribeGet https://<myRancherURI>/v2-beta: dial tcp: lookup <myRancherURI> on 127.0.0.1:53: read udp 127.0.0.1:38353->127.0.0.1:53: read: connection refused"
8/2/2017 7:21:38 PM time="2017-08-03T02:21:38Z" level=info msg="Subscribing to events"
8/2/2017 7:21:38 PM time="2017-08-03T02:21:38Z" level=fatal msg="Failed to subscribeGet https://<myRancherURI>/v2-beta: dial tcp: lookup <myRancherURI> on 127.0.0.1:53: read udp 127.0.0.1:57495->127.0.0.1:53: read: connection refused"
metadata-dns logs were empty
It seems the DNS service binds ports 6060, 80 and 53, so we should also document those in the agent host docs.
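A quick way to confirm that something on the agent host already holds one of those ports is to try binding it. The following is a minimal sketch, not part of Rancher: the helper name `port_in_use` is hypothetical, and probing ports below 1024 (such as 53 and 80) normally requires root, otherwise the bind fails with a permission error that the helper re-raises rather than misreporting.

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0", sock_type=socket.SOCK_DGRAM):
    """Return True if binding (host, port) fails because the address is taken.

    Hypothetical helper for illustration. SOCK_DGRAM probes UDP,
    SOCK_STREAM probes TCP.
    """
    with socket.socket(socket.AF_INET, sock_type) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Anything else (for example EACCES on a privileged port
            # without root) is not evidence of a conflict, so surface it.
            raise
        return False
```

Run as root on the agent host before starting the agent, `port_in_use(53)` and `port_in_use(53, sock_type=socket.SOCK_STREAM)` would indicate whether UDP or TCP port 53 is already taken. The same check can be repeated for 80 and 6060.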
About this issue
- State: closed
- Created 7 years ago
- Comments: 24 (18 by maintainers)
In order to skip the default check, you’ll need to run the docker run rancher/agent command with -e CATTLE_CHECK_NAMESERVER=false
@aiwantaozi
We cannot set any arbitrary DNS for the customer.
Let’s take a look at what we have now.
What we observed:
Analysis:
- Since the dnsmasq server is running on the host, /etc/resolv.conf is modified to point to 127.0.0.1.
- When the rancher-dns service starts, it picks up the host's /etc/resolv.conf and uses it inside the container. This is the standard behavior of a Docker container (unless something like Rancher updates the file later). So it uses 127.0.0.1 inside the container's network namespace, which won’t work because the host DNS server is in another network namespace.
- 127.0.0.1 can be rancher-dns itself if it listens on all IPs. I think the listening-address behavior changed in 1.6.10.
Solution:
We can prohibit the use of 127.0.0.1 in the host's /etc/resolv.conf by checking for it when rancher-dns starts.
If the user wants to run dnsmasq on the host, it seems we can recommend adding DNSMASQ_EXCEPT=lo to /etc/default/dnsmasq so that dnsmasq does not make itself the only DNS server on the host. Reference: https://superuser.com/questions/894513/resolv-conf-keeps-getting-overwritten-when-dnsmasq-is-restarted-breaking-dnsmas
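The proposed startup check could look roughly like the sketch below. This is only an illustration of the idea, not the check Rancher actually ships: the function name `loopback_nameservers` is hypothetical, and it treats any loopback address (all of 127.0.0.0/8 and ::1) as a problem, which is an assumption that goes slightly beyond the literal 127.0.0.1 discussed above.

```python
import ipaddress


def loopback_nameservers(resolv_conf_text):
    """Return the nameserver entries in resolv.conf text that are loopback.

    Hypothetical helper for illustration only.
    """
    found = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue
        # Drop an IPv6 zone suffix such as "%eth0" before parsing.
        addr = fields[1].split("%", 1)[0]
        try:
            if ipaddress.ip_address(addr).is_loopback:
                found.append(fields[1])
        except ValueError:
            # Not a valid IP address; ignore the entry.
            continue
    return found
```

On a host, the text would come from reading /etc/resolv.conf. A non-empty result means a container that inherits the file would send its queries to its own loopback interface instead of to the host's resolver.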