moby: Kernel Panic // dirperm1 breaks the protection by the permission bits on the lower branch

I experienced the above-mentioned kernel panic several times while running mysqldump backups (script attached), started by cron, on Debian Jessie (Kernel 3.16 / Docker Engine 1.10.1).

docker_mysql_crash_20101003.txt

Output of docker version:

Client:
 Version:      1.10.1
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   9e83765
 Built:        Thu Feb 11 19:09:42 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.10.1
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   9e83765
 Built:        Thu Feb 11 19:09:42 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 65
 Running: 65
 Paused: 0
 Stopped: 0
Images: 45
Server Version: 1.10.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: xfs
 Dirs: 201
 Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 12
Total Memory: 126.1 GiB
Name: node01
ID: UBAS:SXY3:QBSF:C5P7:KMPH:FKB2:SGDV:3HRX:A4GQ:FGHL:YQLI:BK4P
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support

Provide additional environment details (AWS, VirtualBox, physical, etc.): physical

List the steps to reproduce the issue:

#!/bin/bash
# Dump the magento database from every running container whose name matches "mysql".
DATE=$(date "+%Y-%m-%d")

for container in $(docker ps --filter="name=mysql" --format "{{.Names}}"); do
    echo "${container}"
    docker run --rm --volumes-from "${container}" -v /var/backup/scripts:/var/lib/mysql sameersbn/mysql mysqldump -uroot --single-transaction --default-character-set=utf8 magento | gzip > "/var/backup/mysql/${container}_${DATE}.sql.gz"
done

Unfortunately, the crash does not happen on every run.

Describe the results you received: kernel panics, hardware reset needed

Describe the results you expected:

Provide additional info you think is important:

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 38 (19 by maintainers)

Most upvoted comments

@bitliner OK, so that is a lockup in the emulated Ethernet card under Vagrant. Are you running it with VirtualBox? This is not really a kernel bug at all; it is virtualisation code trying to emulate physical hardware and not doing it well enough. This is often an issue, especially with timing. Paravirtualised (PV) network drivers work better in VMs than emulated hardware drivers.

@bitliner as @cpuguy83 says, a kernel panic is not a bug in Docker; it is a bug in the kernel (and often a security issue as well). We cannot fix it by changing Docker. We test on the major distributions, especially RHEL 7, Ubuntu LTS, and Debian stable, and on recent long-term-support stable kernels (currently 4.4.x).

Generally it is best to open a new issue, as your kernel panic is very unlikely to be the same as someone else’s. This issue is particularly confusing as it references a generic aufs message which many people find in their kernel logs and is unrelated.

@bitliner If you have a specific issue, please open it so we can take a look.

A kernel panic is always a bug in the kernel. The dirperm1 setting is not causing a kernel panic; it is just a warning from the aufs kernel module that has no real effect on Docker’s use of aufs.

In the stack trace above, dm_calculate_queue_limits seems to be causing the panic. I assume dm here is devicemapper. Based on the logs, however, the Docker instance is using aufs, not devicemapper, which suggests that the kernel bug is not being triggered by anything Docker is doing directly.
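To double-check which storage driver a daemon is actually using, the "Storage Driver" line of docker info can be pulled out on its own. The following is only a rough sketch: the helper name storage_driver is made up for illustration, and it assumes the plain-text docker info layout shown in the report above.

```shell
#!/bin/bash
# Hypothetical helper: read `docker info` text on stdin and print the
# value of the first "Storage Driver:" line (e.g. "aufs").
storage_driver() {
    sed -n 's/^ *Storage Driver: *//p' | head -n 1
}

# Intended use on a Docker host (not executed here):
#   docker info 2>/dev/null | storage_driver
```

If this prints aufs while the panic trace points into dm_* functions, it may be worth checking what else on the host uses the kernel's device-mapper layer, since LVM and dm-crypt sit on the same subsystem.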

You should always make sure to run the latest version of the kernel provided by your distro. These updates contain bug fixes for exactly this kind of problem. In some cases we can work around kernel bugs, but in most cases they need to be fixed in the kernel. We do work with kernel maintainers from the various distros to help make sure these things get patched and backported as necessary.
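One rough way to see whether a host is behind is to compare the running kernel release with the newest release the distro offers. The sketch below is an illustration only: kernel_is_older is a hypothetical helper, it relies on GNU sort -V for version ordering, and the candidate release has to be looked up by hand from the distro's package tools. On some distros the release string stays the same across security updates, so uname -v may also need checking.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) when the running kernel release
# sorts strictly before the candidate release under GNU `sort -V`.
kernel_is_older() {
    local running="$1" candidate="$2"
    [ "$running" != "$candidate" ] &&
        [ "$(printf '%s\n%s\n' "$running" "$candidate" | sort -V | head -n 1)" = "$running" ]
}

# Intended use; the second argument is an assumed example value:
#   kernel_is_older "$(uname -r)" "4.4.0-21-generic" && echo "kernel update available"
```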

If you are on the latest kernel provided by your distro and you still experience a panic, it is best to report it to the distro. You can of course also report it in docker/docker if it is reproducible using Docker, and we can figure out what might need to be done. Also, big thanks to the Canonical, Red Hat, Debian, and other distro teams for being awesome at fixing kernel issues and backporting the fixes.

I’m seeing the same error on 16.04, but only when using the --userns-remap flag.