moby: Container using network mode host does not get its resolv.conf updated when the host's resolv.conf is updated (using systemd-resolved)
Description
Because the resolv.conf is not updated on the container, it stops having access to the internet when the host / device changes networks. I saw https://github.com/docker/for-linux/issues/889 which mentions that it is supposed to be updated automatically but I actually can’t find where this is mentioned in https://docs.docker.com/v17.09/engine/userguide/networking/default_network/configure-dns/.
Is the container’s resolv.conf not following the host’s a bug, something that was never implemented, or intended behavior?
Reproduce
Start a long running container on your laptop (which is using systemd-resolved), then move to a different network with different DNS servers. Notice that the resolv.conf inside the container is now wrong.
Expected behavior
resolv.conf on container should match the updated host’s resolv.conf
docker version
Client:
Version: 24.0.5
API version: 1.43
Go version: go1.20.6
Git commit: ced0996600
Built: Wed Jul 26 21:44:58 2023
OS/Arch: linux/amd64
Context: default
Server:
Engine:
Version: 24.0.5
API version: 1.43 (minimum version 1.12)
Go version: go1.20.6
Git commit: a61e2b4c9c
Built: Wed Jul 26 21:44:58 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.7.2
GitCommit: 0cae528dd6cb557f7201036e9f43420650207b58.m
runc:
Version: 1.1.8
GitCommit:
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client:
Version: 24.0.5
Context: default
Debug Mode: false
Server:
Containers: 7
Running: 1
Paused: 0
Stopped: 6
Images: 14
Server Version: 24.0.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: true
Native Overlay Diff: false
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 0cae528dd6cb557f7201036e9f43420650207b58.m
runc version:
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.4.8-zen1-1-zen
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 13.5GiB
Name: Lenovo-Yoga-7
ID: 3PMN:VRXJ:C3R6:RFC2:ZLXJ:OJJU:OFKE:DQLW:YBC6:YYWQ:EHPI:WDWG
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Additional Info
No response
About this issue
- Original URL
- State: open
- Created a year ago
- Reactions: 1
- Comments: 19 (14 by maintainers)
Heh, I knew which PR you linked to without clicking the link. I opened that PR when I was somewhat cleaning up the resolvconf package, which had become over-engineered and complex over the years (still more to do there!).

How --network=host handles DNS

So for the --network=host case, the situation is somewhat similar to the “default bridge” case, but for different reasons:

- With --network=host, the container doesn’t have its own networking namespace, so “localhost “inside” the container === localhost “outside” the container”; from a networking perspective, there is no “inside” or “outside” the container, they’re exactly the same.

But here’s where the “fun” starts: while the “networking” namespace is the same, the filesystem (mount namespace) is still separate, and we still need to configure the container so that processes inside the container know what resolver to use:

- /etc/resolv.conf inside the container is a file that needs to be present inside the container

A logical approach would be to bind-mount the host’s /etc/resolv.conf into the container, but that had some challenges:

- /etc/resolv.conf on the host, depending on the system configuration, may be a symlink (bind-mounting that inside the container would try to resolve the symlink’s target inside the container)
- /etc/resolv.conf on the host may be “modified” (the topic of this ticket); bind-mounting files uses the file’s inode, which can be problematic because most software updating files will use a copy file -> update copy -> (delete, and) replace original file sequence, in which case the container would still be holding a mount for the deleted file (so the copy from before the update).
- /etc/resolv.conf inside the container is writable, and we don’t want the container to be able to modify the file on the host (which would be the case if we’d bind-mount the host’s /etc/resolv.conf).

So, for these reasons, we (again) need a COPY of the host’s /etc/resolv.conf (or whatever that’s symlinked to) for each container, and make sure that

Which brings us back to “square one” (described in my “bridge” comment from earlier) 😂
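The inode problem with bind-mounting can be reproduced without Docker. Below is a minimal Python sketch, assuming a POSIX system. An open file handle stands in for the bind mount (both keep referring to the original inode), and atomic_replace is a hypothetical helper that mimics the copy -> update -> replace pattern described above.

```python
import os
import tempfile


def atomic_replace(path, new_content):
    """Hypothetical helper: write a new copy, then rename it over the original."""
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as tmp:
        tmp.write(new_content)
    os.replace(tmp_path, path)  # the path now refers to the new file's inode


def demo():
    """Show that a holder of the old inode does not see an atomic replace."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "resolv.conf")
        with open(path, "w") as f:
            f.write("nameserver 192.168.1.1\n")
        inode_before = os.stat(path).st_ino

        # Stand-in for the bind mount: keeps referring to the original inode.
        holder = open(path)

        atomic_replace(path, "nameserver 10.0.0.1\n")
        inode_after = os.stat(path).st_ino

        with holder:
            seen_by_holder = holder.read()
        with open(path) as f:
            seen_by_path = f.read()

    return {
        "inode_changed": inode_before != inode_after,
        "seen_by_holder": seen_by_holder,
        "seen_by_path": seen_by_path,
    }
```

After the replace, anything that opens the path sees the new nameserver, while the holder of the original inode still reads the old one. That is the same situation a container would be in with a bind-mounted /etc/resolv.conf.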
Reconfiguring the “embedded DNS”
So this is something I need to look into, and what came up when I discussed this with @akerouanton.

While writing my earlier comment, my assumption was that the embedded DNS itself has no real configuration:

- ... /etc/resolv.conf on the host)
- ... 127.0.0.53 (if systemd-resolved is in use)
- ... systemd-resolved handle the forwarding to “upstream” resolvers.

However, this MAY not be the case (this is something I need to look into / verify), and it’s possible that the embedded DNS is also using more than that, and may be reading systemd-resolved’s UPSTREAM DNS resolvers to configure what it should use. This would mean that dynamically switching networks would also prevent the embedded DNS from using the correct DNS. And if that’s the case, that’s probably something that should be fixed.
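To make the difference between the local stub and the “upstream” resolvers concrete, here is a small Python sketch. It is not the daemon's code: the function names are made up for illustration, and the rule "if only loopback resolvers are listed, read the upstream file instead" is an assumption about one reasonable approach. The second argument is meant to hold the content of /run/systemd/resolve/resolv.conf, where systemd-resolved lists its upstream servers.

```python
import ipaddress


def parse_nameservers(text):
    """Return the addresses from 'nameserver' lines, in file order."""
    servers = []
    for line in text.splitlines():
        fields = line.split()
        # Comment lines start with '#' or ';' and never match "nameserver".
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def is_loopback(address):
    """True for addresses such as 127.0.0.53 or ::1."""
    try:
        # Drop an IPv6 zone suffix such as "%eth0" before parsing.
        return ipaddress.ip_address(address.split("%", 1)[0]).is_loopback
    except ValueError:
        return False


def uses_local_stub(host_conf):
    """True if every listed resolver is a loopback address (e.g. 127.0.0.53)."""
    servers = parse_nameservers(host_conf)
    return bool(servers) and all(is_loopback(s) for s in servers)


def snapshot_upstreams(host_conf, upstream_conf):
    """Assumed rule: use the upstream file when the host only lists a stub."""
    source = upstream_conf if uses_local_stub(host_conf) else host_conf
    return [s for s in parse_nameservers(source) if not is_loopback(s)]
```

A configuration that only names the stub keeps working across network changes, because systemd-resolved does the forwarding. A snapshot of the upstream list is only correct until the host moves to another network.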
systemd-resolved can actually run in different modes based on the contents of resolv.conf (see https://man.archlinux.org/man/systemd-resolved.8#/ETC/RESOLV.CONF ). The recommended way is for resolv.conf to be a symlink to /run/systemd/resolve/stub-resolv.conf, but resolv.conf can also be maintained by something else (like NetworkManager), in which case systemd-resolved acts as a consumer of that file rather than managing it. I think that is how it’s set up on my system. I don’t remember how or why I configured it that way, but it’s been working perfectly for me for a while, and it is not really wrong, just not the recommended way.

For docker itself, no. Customisations can be made through the --dns, --dns-opt, --add-host, --hostname etc. options, and those are made when the container is created (so would not require the file to be writable).

But having these files (/etc/hosts, /etc/resolv.conf, /etc/hostname) writable is a feature that was added at some point, so 🤷‍♂️; see

Admittedly, I think most of the requests were for /etc/hosts to be writable, but there may have been some cases where either the user, or software they were running, required (expected) those files to be writable.

From the description, I think this is for the “default” bridge network.
When using the default bridge, the “legacy” networking stack (pre “custom networks”) is used:

- ... resolv.conf
- ... (127.0.0.53) cannot be used, and instead, systemd-resolved’s “upstream” DNS servers are read from /run/systemd/resolve/resolv.conf, and included in the container’s resolv.conf
- ... resolv.conf at time of creation, which is to allow the user to edit the file (in which case, docker will no longer update it, to prevent changes made by the user from being reverted)
- ... resolv.conf copies of all containers are re-created (skipping those that were modified by the user)

This flow originated from the very early beginnings of Docker, and was designed with the assumption that the daemon would run in a server environment (no dynamic IP and/or networks), and before systemd-resolved existed (having a localhost / 127.0.0.x resolver was an “exception”, not the “rule”).

It may be clear that a lot of complexity is involved here, with quite a few parts where things can go wrong (looking up systemd-resolved’s upstreams). Dynamically updating the resolv.conf for each container could be an option, but I guess the challenge would be to decide what should trigger this; alternatively, maybe we can do this on a reload (systemctl reload docker.service to trigger re-generating resolv.conf).

The better solution would probably be to remove the legacy code-path, and always use the embedded DNS; I opened a ticket for that once;
The legacy code-path still exists (IIRC) for a few reasons, but I think most of those should no longer be a concern (and I’d love to get rid of the two distinct implementations):

- ... resolv.conf and /etc/hosts), and making the default bridge use the embedded DNS could break those tools. I’m not sure if that’s something we should be really concerned about, as anything inside /var/lib/docker is considered to be exclusively accessed by the daemon, so any tool making other assumptions is depending on “undocumented” behavior.
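The “skip copies that were modified by the user” part of the legacy flow can be sketched with a stored checksum. This is an illustration in Python, not the daemon's implementation: the .hash sidecar file, the use of SHA-256, and the function names are all assumptions chosen for the example.

```python
import hashlib


def _digest(data):
    return hashlib.sha256(data).hexdigest()


def write_copy(path, content):
    """Write the container's copy in place and record its checksum next to it."""
    # Opening with "wb" truncates the existing file, so the inode is unchanged.
    with open(path, "wb") as f:
        f.write(content)
    with open(path + ".hash", "w") as f:  # hypothetical sidecar file
        f.write(_digest(content))


def refresh_copy(path, new_content):
    """Rewrite the copy unless it was edited since it was last generated.

    Returns True if the file was updated, False if it was left alone.
    """
    with open(path, "rb") as f:
        current = f.read()
    with open(path + ".hash") as f:
        recorded = f.read().strip()
    if _digest(current) != recorded:
        return False  # content differs from what was generated: keep the user's edits
    write_copy(path, new_content)
    return True
```

Something like refresh_copy could be called for every container from whatever trigger is chosen, for example the reload discussed above.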