moby: Docker 1.10.3 fails to create network bridge on start up

I seem to have the #18113 issue with Docker 1.10.1!

It occurs mostly when using a virtual machine (VirtualBox) with more than 3 GB of memory. In that case Docker fails to start most of the time and leaves behind a /var/lib/docker/network/files/local-kv.db. As long as that file exists, I cannot start Docker. As soon as I remove it, everything is fine. I have some more hints to reproduce the problem below.
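A minimal sketch of the manual recovery described above, assuming it runs as root inside the VM. The function name remove_stale_network_db is hypothetical; the default path is the one from this report.

```shell
#!/bin/bash
# Sketch only: remove the leftover local-kv.db so Docker can be started again.
# remove_stale_network_db is a hypothetical helper name, not part of Docker.

remove_stale_network_db() {
    # Default to the path reported in this issue; allow an override for testing.
    local db_path="${1:-/var/lib/docker/network/files/local-kv.db}"
    if [ -e "$db_path" ]; then
        rm -f -- "$db_path" && echo "removed $db_path"
    else
        echo "nothing to remove at $db_path"
    fi
}
```

Calling remove_stale_network_db without arguments and then running service docker start corresponds to the manual steps above.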

I have the following setup:

  • Host:
    • Ubuntu: 15.10
    • 4.2.0-27-generic #32-Ubuntu SMP Fri Jan 22 04:49:08 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
    • Vagrant: 1.7.4 (Default on Ubuntu 15.10)
    • VirtualBox: 5.0.14 (Default on Ubuntu 15.10)
  • Create a Vagrant virtual machine with VirtualBox:
    • Ubuntu 14.04 (LTS)
    • 3.13.0-77-generic #121-Ubuntu SMP Wed Jan 20 10:50:42 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  • Inside of the VM
    • Install Puppet (3.4.3) with some modules (apt, docker, epel, etckeeper, stdlib)
    • Start Docker via Puppet manifest: latest Puppet-Docker (garethr-docker) sets up Docker version 1.10.1, build 9e83765

I have a Jenkins that starts some of those virtual machines. Most of them need only 2048 MB of assigned RAM. Yesterday I needed some larger instances and set the memory size for one of them to 4096 MB. This one almost always fails to start Docker and leaves the mentioned local-kv.db. Why only “almost”, why not always?

I tried a few ideas:

  • At first I ran the Puppet-based setup under strace, something like strace -f -o /tmp/docker-problem -ttt puppet apply docker-setup.pp. Under strace the problem never occurred, though I also never saw any access to local-kv.db in the trace (I guess this happens in the container?).
  • Then I tried different memory sizes:
    • 3072 MB: always works
    • 3584 MB: worked sometimes (1 out of 3)
    • 3800 MB: failed (only one try)
    • 4095 MB (trying to be slightly below 4 GB): worked (only one try)
    • 4096 MB: always fails
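
The memory-size experiment above could be repeated with a small harness like the following. This is only a sketch: try_memory_sizes and vagrant_runner are hypothetical names, and the runner assumes a Vagrantfile that reads the VM memory size from a VM_MEMORY environment variable, which is not part of the original setup.

```shell
#!/bin/bash
# Sketch only: run one trial per memory size and print a result line for each.

# try_memory_sizes RUNNER SIZE...
# RUNNER is any command or function that takes a size in MB and exits 0
# if Docker came up in a VM of that size.
try_memory_sizes() {
    local runner="$1"
    shift
    local size
    for size in "$@"; do
        if "$runner" "$size"; then
            echo "$size MB: ok"
        else
            echo "$size MB: FAILED"
        fi
    done
}

# Example runner (assumption: the Vagrantfile honours VM_MEMORY and
# provisioning starts Docker). It checks the Upstart job state inside the
# guest and always destroys the VM afterwards.
vagrant_runner() {
    local rc=0
    VM_MEMORY="$1" vagrant up >/dev/null 2>&1 || rc=1
    if [ "$rc" -eq 0 ]; then
        vagrant ssh -c "status docker 2>/dev/null | grep -q 'process '" \
            >/dev/null 2>&1 || rc=1
    fi
    vagrant destroy -f >/dev/null 2>&1
    return "$rc"
}
```

A run such as try_memory_sizes vagrant_runner 3072 3584 4096 would then print one line per size. Since the failure is intermittent, each size should be tried several times before drawing conclusions.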

Maybe it’s something completely different and has nothing to do with memory sizes?

I am attaching one of the local-kv.db files (bzip2-compressed, inside a zip) for further investigation: local-kv.db.zip

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 30 (12 by maintainers)

Most upvoted comments

@dvorak: As I pointed out, my CI setup is driven by a Puppet module. However, my workaround is the following; maybe it also works in your case?

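# Workaround for https://github.com/docker/docker/issues/20312
# $dir must point to the directory that contains test.sh (see below).
sleeptime=10   # seconds to wait between checks
maxtries=5     # give up after this many restart attempts
try=0
echo "Sleeping ${sleeptime}s until Docker is up and running (first)"
sleep "$sleeptime"
# Retry until test.sh reports a running daemon or the retry budget is spent.
until "$dir/test.sh" || test "$try" -ge "$maxtries"
do
    # Remove the leftover network database, then start the daemon again.
    sudo /bin/rm -f /var/lib/docker/network/files/local-kv.db
    sudo service docker start
    try=$((try + 1))
    echo "Sleeping ${sleeptime}s until Docker is up and running (try #$try)"
    sleep "$sleeptime"
done
echo "Sleeping ${sleeptime}s until Docker is up and running (final)"
sleep "$sleeptime"
# Final check: the exit status of test.sh decides whether the setup continues.
"$dir/test.sh"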

The mentioned test.sh in $dir is simply

#!/bin/bash

set -eu

# "status" queries the Upstart job; a running job reports "process <pid>".
if status docker 2>/dev/null | grep -q 'process '
then
    echo "Docker is up and running"
else
    echo "Docker is not running!" >&2
    exit 1
fi

If it does not work after n (=5) retries, I give up and stop the overall CI box setup. Usually this works (sometimes only after 3 or 4 retries, but it works).

On 23.02.2016, at 15:43, dvorak notifications@github.com wrote:

Also, if I delete the docker0 bridge (ip link del docker0) then the daemon starts up fine afterwards. We have a workaround for the issue, but that doesn’t help in CI environments where this is happening unpredictably.
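
The bridge-deletion workaround quoted above can be wrapped in a small helper. A sketch, assuming root privileges; the function name remove_docker_bridge and the IP_CMD override (useful for dry runs) are hypothetical.

```shell
#!/bin/bash
# Sketch only: delete the docker0 bridge if it exists, so the daemon can
# recreate it on the next start.

remove_docker_bridge() {
    local bridge="${1:-docker0}"
    # IP_CMD lets a caller substitute a stub for the real "ip" binary.
    local ip_cmd="${IP_CMD:-ip}"
    if "$ip_cmd" link show "$bridge" >/dev/null 2>&1; then
        "$ip_cmd" link del "$bridge" && echo "deleted $bridge"
    else
        echo "$bridge not present"
    fi
}
```

After remove_docker_bridge has run, starting the daemon again (service docker start) matches the sequence described in the quoted comment.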