moby: docker in qemu in docker: dockerd hangs entire system

Steps to reproduce:

  1. Prepare an Alpine Linux qcow2 or use this one
  2. Build this Dockerfile to get my qemu build (docker build -f Dockerfile -t qemu .)
  3. Ensure you are in the kvm group (or have write access to /dev/kvm) on the host
  4. Run the following Docker command:
docker run \
    -it \
    --device /dev/kvm \
    --mount type=tmpfs,destination=/var/tmp \
    -v $(pwd):/base:rw \
    -p 127.0.0.1:8022:8022 qemu \
    qemu-system-x86_64 \
        -m 2048 \
        -net nic,model=virtio \
        -net user,hostfwd=tcp::8022-:22 \
        -cpu host \
        -enable-kvm \
        -drive file=/base/alpine.img.qcow2,media=disk,if=virtio \
        -nographic

This will boot the Alpine Linux image with qemu, and you should see the kernel logs in your docker window. The image is set up so you can SSH in from the outside:

ssh -p 8022 build@localhost

There’s no password for this user and it has sudo access. Run the following commands to reproduce the dockerd issue:

  1. sudo apk add docker
  2. sudo service docker start

Wait a moment and you should see the whole system hang. You will have to docker kill this container from the outside.

I’ve done this while tailing dmesg and docker.log and found nothing useful. It just hangs.

I have also had similar issues when running docker on an Arch Linux guest.

Output of docker version:

Client:
 Version:      18.03.1-ce
 API version:  1.37
 Go version:   go1.10.1
 Git commit:   20527e6d83
 Built:        Sun May 20 19:20:18 2018
 OS/Arch:      linux/amd64
 Experimental: false
 Orchestrator: swarm

About this issue

  • State: closed
  • Created 6 years ago

Most upvoted comments

I was curious, so I poked around a little bit.

Are you sure qemu is hanging, and it’s not just the network? If I follow your instructions, my ssh session does indeed hang, but I can still access the qemu monitor with Ctrl-A C, and run the quit command rather than docker kill.

Instead, if I run

ssh -p 8022 build@localhost "sudo sed -i 's,^#ttyS0,ttyS0,' /etc/inittab && sudo kill -HUP 1"

to enable a console login, then log in and run

sudo apk add docker
sudo service docker start

from the console, the rest of the system (apart from ssh) appears to be fine. Since the log mentioned the docker0 bridge, I tried deleting it on the guest with

sudo ip link del docker0

and then ssh starts to work again.

Maybe there is some conflict between the host docker networking and the guest docker networking? If I run the guest docker as sudo dockerd --bip 172.18.0.1/16 (the host one was 172.17.0.1/16 on my system), that seems to fix it and I can successfully start containers in the guest.
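One way to check for this kind of clash is to compare the two bridge ranges directly. The following is a minimal sketch in plain shell; the helper names ip_to_int and cidr_overlap are made up for illustration, and the addresses are the ones quoted above.

```shell
# ip_to_int A.B.C.D: print the dotted quad as a single 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on "." into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# cidr_overlap ADDR/LEN ADDR/LEN: succeed if the two ranges overlap.
cidr_overlap() {
    a_ip=${1%/*}; a_len=${1#*/}
    b_ip=${2%/*}; b_len=${2#*/}
    # Two CIDR ranges overlap exactly when they agree under the shorter prefix.
    len=$(( a_len < b_len ? a_len : b_len ))
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$a_ip") & mask )) -eq $(( $(ip_to_int "$b_ip") & mask )) ]
}

# Host bridge vs. guest bridge with both daemons left at the default.
if cidr_overlap 172.17.0.1/16 172.17.0.1/16; then echo "default/default: overlap"; fi
# Host bridge vs. guest bridge after starting the guest with --bip 172.18.0.1/16.
if ! cidr_overlap 172.17.0.1/16 172.18.0.1/16; then echo "default/--bip: no overlap"; fi
```

The actual bridge address on either side can be read with ip -4 addr show docker0.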

So I think you could resolve this for sr.ht by running the host dockerd with a non-default bridge address, and then the guest dockerd should work with the default configuration.
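On the host, the bridge address can also be set persistently through the bip key in /etc/docker/daemon.json instead of a command-line flag. A minimal sketch, assuming 10.200.0.1/24 is unused on the host; that address and the helper name bridge_config are illustrative and do not come from the thread.

```shell
# bridge_config ADDR/LEN: print a daemon.json fragment that moves docker0.
bridge_config() {
    printf '{\n  "bip": "%s"\n}\n' "$1"
}

# Assumed example range; pick any range that does not overlap 172.17.0.0/16.
bridge_config 10.200.0.1/24
```

The printed fragment would need to be merged into any existing daemon.json by hand, and the host dockerd restarted for it to take effect.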

That was it 😄

https://builds.sr.ht/~sircmpwn/job/47477

https://builds.sr.ht/api/jobs/47477/manifest

Now I just need to get this fix packaged nicely to minimize annoyance to builds.sr.ht users. Thanks for your help!

Progress

QEMU emulator version 2.8.1 without docker appears to work, with >20 minutes of uptime.

QEMU emulator versions 3.0.0 and 3.1.0 crash when run inside docker.

QEMU + gdb remote

To build QEMU with symbols, add --disable-strip to the ./configure invocation in the Dockerfile.

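Attaching gdb remotely to qemu in docker requires forwarding the gdb port out of the container. Below, -s opens a gdbserver on TCP port 1234 and -S keeps the CPU stopped at startup until gdb continues it.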

docker run \
    -it \
    --device /dev/kvm \
    --mount type=tmpfs,destination=/var/tmp \
    -v $(pwd):/base:rw \
    -p 127.0.0.1:8022:8022 \
    -p 127.0.0.1:1234:1234 qemu \
    qemu-system-x86_64 \
        -S -s \
        -m 2048 \
        -net nic,model=virtio \
        -net user,hostfwd=tcp::8022-:22 \
        -cpu host \
        -enable-kvm \
        -drive file=/base/alpine.img.qcow2,media=disk,if=virtio \
        -nographic

Building dockerd with symbols

git clone https://git.alpinelinux.org/aports
cd aports/community/docker
vi APKBUILD

In APKBUILD change options="!check" to options="!check !strip"
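The same edit can be scripted rather than made in vi. A minimal sketch, assuming the options line reads exactly as quoted above; the helper name keep_symbols is made up for illustration.

```shell
# keep_symbols FILE: add !strip to the options line so abuild keeps debug symbols.
keep_symbols() {
    # Only rewrites a line that is exactly options="!check"; anything else is left alone.
    sed -i 's/^options="!check"$/options="!check !strip"/' "$1"
}

# Run from aports/community/docker, as in the steps above.
if [ -f APKBUILD ]; then
    keep_symbols APKBUILD
    grep '^options=' APKBUILD
fi
```

If the options line in the checked-out APKBUILD differs, the sed expression matches nothing and the file is unchanged, so the grep output is worth checking before building.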

abuild checksum && abuild -r

gdb remote into QEMU

$ gdb
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
0x000000000000fff0 in ?? ()
(gdb) c
^C
Program received signal SIGINT, Interrupt.
0xffffffff984a9dfd in ?? ()
(gdb) bt
#0  0xffffffff984a9dfd in ?? ()
#1  0xffffffff984a9b62 in ?? ()
#2  0xffffffff98e124c0 in ?? ()
#3  0x0000000000000000 in ?? ()

Even when qemu looks hung, gdb does not stop until a ctrl-c is entered. The backtrace seems to be the same as when the system is idle and is consistent across runs.

The addresses in the gdb backtrace don’t appear to match any symbol in /proc/kallsyms, qemu, or dockerd.
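A backtrace address rarely equals a symbol's start address, so an exact search of /proc/kallsyms will usually miss it. A rough sketch of a nearest-preceding-symbol lookup follows; it assumes the guest's kallsyms uses 16-digit lowercase hex addresses (as on x86_64) and is read as root, since unprivileged reads may show all zeros. The helper name nearest_symbol is made up for illustration.

```shell
# nearest_symbol ADDR [FILE]: print the closest symbol at or below ADDR.
nearest_symbol() {
    addr=$(printf '%s' "$1" | sed 's/^0x//' | tr 'A-F' 'a-f')
    awk -v addr="$addr" '
        BEGIN { while (length(addr) < 16) addr = "0" addr }   # pad to kallsyms width
        {
            a = $1 ""    # force string comparison; fixed-width hex sorts lexicographically
            if (a <= addr && a > best) { best = a; name = $3 }
        }
        END { if (name != "") print best, name }
    ' "${2:-/proc/kallsyms}"
}

# Inside the guest, as root, with the top frame from the backtrace above:
#   nearest_symbol 0xffffffff984a9dfd
```

Whether this resolves the frames above is untested here. It can only help if the kallsyms file comes from the same boot of the guest kernel, because the kernel may load at a different address on each boot.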

Other notes

/var/log/messages shows no errors after the crash/hang

Jan  4 06:52:31 build authpriv.notice sudo:    build : TTY=pts/0 ; PWD=/home/build ; USER=root ; COMMAND=/sbin/service docker start
Jan  4 07:23:11 build syslog.info syslogd started: BusyBox v1.28.4

Starting dockerd with --iptables=false doesn’t seem to help.

Once it is hung, additional ssh connections cannot be made.

$ sudo docker container stats <container_id>
CONTAINER ID        NAME                 CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
<container_id>      suspicious_taussig   0.86%               528.6MiB / 3.366GiB   15.34%              58.8MB / 397kB      71.5MB / 509MB      3

The stats command seems to indicate that the VM is still running

$ sudo docker container cp <container_id>:/var/log/dmesg /tmp
Error: No such container:path: <container_id>:/var/log/dmesg

Trying to get a copy of dmesg off the crashed container failed

todo 😴

run QEMU 3.1.0 outside of docker and/or run QEMU 2.8.1 inside docker

Progress

After the container hangs, running docker container stop …, relaunching with the same qcow file, and then running date && sudo service docker start followed by watch -n 1 date seems to “work” for a bit. Sometimes it ran for as little as 00:00 and other times for up to 02:43 before crashing.

It doesn’t seem to crash if docker isn’t running. Also, in one test, it stayed crashed for at least 42 minutes.

top, vmstat, etc. seem to indicate that it isn’t memory. /proc/sys/kernel/random/entropy_avail values as high as 2149 and as low as 90 seem to make no difference.

I guess gdb is next 😞

QEMU Console Log

QEMU terminates upon execution of docker container stop …


   OpenRC 0.35.5.87b1ff59c1 is starting up Linux 4.14.59-0-vanilla (x86_64)

 * /proc is already mounted
 * Mounting /run ... * /run/openrc: creating directory
 * /run/lock: creating directory
 * /run/lock: correcting owner
 * Caching service dependencies ... [ ok ]
 * Remounting devtmpfs on /dev ... [ ok ]
 * Mounting /dev/mqueue ... [ ok ]
 * Mounting security filesystem ... [ ok ]
 * Mounting debug filesystem ... [ ok ]
 * Mounting persistent storage (pstore) filesystem ... [ ok ]
 * Starting busybox mdev ... [ ok ]
 * Loading hardware drivers ... [ ok ]
 * Loading modules ... [ ok ]
 * Setting system clock using the hardware clock [UTC] ... [ ok ]
 * Checking local filesystems  .../dev/vda3: clean, 15556/786432 files, 297938/3143424 blocks
/dev/vda1: recovering journal
/dev/vda1: clean, 24/25688 files, 26852/102400 blocks
 [ ok ]
 * Remounting root filesystem read/write ... [ ok ]
 * Remounting filesystems ... [ ok ]
 * Activating swap devices ... [ ok ]
 * Mounting local filesystems ... [ ok ]
 * Configuring kernel parameters ... [ ok ]
 * Creating user login records ... [ ok ]
 * Wiping /tmp directory ... [ ok ]
 * Setting hostname ... [ ok ]
 * Setting keymap ... [ ok ]
 * Starting networking ... *   lo ... [ ok ]
 *   eth0 ...udhcpc: started, v1.28.4
udhcpc: sending discover
udhcpc: sending select for 10.0.2.15
udhcpc: lease of 10.0.2.15 obtained, lease time 86400
 [ ok ]
 * Starting busybox syslog ... [ ok ]
 * Initializing random number generator ... [ ok ]
 * Starting sshd ... [ ok ]
qemu-system-x86_64: terminating on signal 15