moby: Unattended upgrades that reload apparmor profiles break Engine 1.13, requiring a restart
(If you’re in a hurry, scroll down to “Steps to reproduce the issue”.)
Symptoms
This has happened to me twice already: I spin up a 5-node Ubuntu 16.04 cluster on EC2 and do some work on it; then the day after, I log in again, and the cluster seems functional, except that I can’t start new containers. Existing containers seem fine, but any attempt to start a new container results in:
$ docker run -ti alpine sh
Unable to find image 'alpine:latest' locally
latest: Pulling from library/alpine
Digest: sha256:1354db23ff5478120c980eca1611a51c9f2b88b61f24283ee8200bf9a54f2e5c
Status: Downloaded newer image for alpine:latest
docker: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:359: container init caused \"apparmor failed to apply profile: no such file or directory\"".
The “interesting” part is that this behavior is consistent across the whole cluster; i.e. all my nodes appear to be botched.
Running a privileged container, however, works fine. What could it be?!? 🔎 🪲
Restarting the Engine (or, more drastically, rebooting the node) fixes the issue.
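(For the record, on these systemd-based Ubuntu 16.04 nodes the restart is just the usual:)

```
$ sudo systemctl restart docker   # the Engine reloads its docker-default profile when it starts
```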
Analysis
I dynamically enabled debug logging (through the config file + `SIGHUP`) and looked at the Engine’s logs while trying to start a container, but nothing special showed up. (If I compare a “sane” Engine and a broken one, the first line that differs is the one mentioning the apparmor error.)
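For reference, this is roughly what that looks like (a minimal sketch; it assumes the daemon reads the default `/etc/docker/daemon.json` and that you have no other settings in there yet):

```
$ echo '{"debug": true}' | sudo tee /etc/docker/daemon.json
$ sudo kill -HUP $(pidof dockerd)   # the daemon re-reads its config on SIGHUP
```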
The kernel logs indicate that the `docker-default` profile doesn’t exist anymore:
[96551.680251] audit: type=1400 audit(1480734916.438:79): apparmor="DENIED" operation="change_onexec" info="label not found" error=-2 profile="unconfined" name="docker-default" pid=9524 comm="runc:[2:INIT]"
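You can double-check which profiles the kernel currently knows about; on a broken node, `docker-default` is indeed missing from the list (assuming the stock apparmor tooling is installed):

```
$ sudo aa-status | grep docker                                   # empty on a broken node
$ sudo cat /sys/kernel/security/apparmor/profiles | grep docker  # same check without aa-status
```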
This prompted me to try to start a container with `--security-opt apparmor=unconfined`, and that worked! Same if I specify another “OK-ish” AppArmor profile, e.g. `--security-opt apparmor=lxc-container-default`.
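In other words, both of these work on a node that otherwise refuses to start containers:

```
$ docker run -ti --security-opt apparmor=unconfined alpine sh
$ docker run -ti --security-opt apparmor=lxc-container-default alpine sh
```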
We’re getting closer.
Looking at the kernel logs, I saw a blast of mysterious (i.e. unsolicited) messages like the following:
Dec 2 07:41:17 ubuntu kernel: [26112.997698] audit: type=1400 audit(1480664477.757:17): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/sbin/dhclient" pid=15189 comm="apparmor_parser"
Dec 2 07:41:17 ubuntu kernel: [26112.998057] audit: type=1400 audit(1480664477.757:18): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=15189 comm="apparmor_parser"
Dec 2 07:41:17 ubuntu kernel: [26112.998338] audit: type=1400 audit(1480664477.757:19): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-helper" pid=15189 comm="apparmor_parser"
...
What could be happening at 7:41 to cause a merry-go-round of apparmor profiles?
I looked at FBI_Surveillance_Van data to figure out if I had been sleepwalking to the console and issuing random commands to my nodes, but found nothing.
However, according to the logs, `dpkg` has been busy at that time:
dpkg.log:2016-12-02 07:41:16 status installed vim-tiny:amd64 2:7.4.1689-3ubuntu1.2
dpkg.log:2016-12-02 07:41:17 configure linux-headers-4.4.0-51:all 4.4.0-51.72 <none>
dpkg.log:2016-12-02 07:41:17 status unpacked linux-headers-4.4.0-51:all 4.4.0-51.72
dpkg.log:2016-12-02 07:41:17 status half-configured linux-headers-4.4.0-51:all 4.4.0-51.72
dpkg.log:2016-12-02 07:41:17 status installed linux-headers-4.4.0-51:all 4.4.0-51.72
dpkg.log:2016-12-02 07:41:17 configure linux-headers-4.4.0-51-generic:amd64 4.4.0-51.72 <none>
dpkg.log:2016-12-02 07:41:17 status unpacked linux-headers-4.4.0-51-generic:amd64 4.4.0-51.72
dpkg.log:2016-12-02 07:41:17 status half-configured linux-headers-4.4.0-51-generic:amd64 4.4.0-51.72
dpkg.log:2016-12-02 07:41:17 status installed linux-headers-4.4.0-51-generic:amd64 4.4.0-51.72
dpkg.log:2016-12-02 07:41:17 configure linux-image-virtual:amd64 4.4.0.51.54 <none>
dpkg.log:2016-12-02 07:41:17 status unpacked linux-image-virtual:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 status half-configured linux-image-virtual:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 status installed linux-image-virtual:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 configure linux-headers-generic:amd64 4.4.0.51.54 <none>
dpkg.log:2016-12-02 07:41:17 status unpacked linux-headers-generic:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 status half-configured linux-headers-generic:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 status installed linux-headers-generic:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 configure linux-headers-virtual:amd64 4.4.0.51.54 <none>
dpkg.log:2016-12-02 07:41:17 status unpacked linux-headers-virtual:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 status half-configured linux-headers-virtual:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 status installed linux-headers-virtual:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 configure linux-virtual:amd64 4.4.0.51.54 <none>
dpkg.log:2016-12-02 07:41:17 status unpacked linux-virtual:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 status half-configured linux-virtual:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 status installed linux-virtual:amd64 4.4.0.51.54
dpkg.log:2016-12-02 07:41:17 configure python3-cryptography:amd64 1.2.3-1ubuntu0.1 <none>
dpkg.log:2016-12-02 07:41:17 status unpacked python3-cryptography:amd64 1.2.3-1ubuntu0.1
dpkg.log:2016-12-02 07:41:17 status half-configured python3-cryptography:amd64 1.2.3-1ubuntu0.1
dpkg.log:2016-12-02 07:41:17 status installed python3-cryptography:amd64 1.2.3-1ubuntu0.1
dpkg.log:2016-12-02 07:41:17 configure liblxc1:amd64 2.0.5-0ubuntu1~ubuntu16.04.3 <none>
dpkg.log:2016-12-02 07:41:17 status unpacked liblxc1:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status half-configured liblxc1:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status installed liblxc1:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 configure lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3 <none>
dpkg.log:2016-12-02 07:41:17 status unpacked lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status unpacked lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status unpacked lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status unpacked lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status unpacked lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status unpacked lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status unpacked lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status unpacked lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status unpacked lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status half-configured lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 status installed lxc-common:amd64 2.0.5-0ubuntu1~ubuntu16.04.3
dpkg.log:2016-12-02 07:41:17 trigproc libc-bin:amd64 2.23-0ubuntu4 <none>
dpkg.log:2016-12-02 07:41:17 status half-configured libc-bin:amd64 2.23-0ubuntu4
dpkg.log:2016-12-02 07:41:17 status installed libc-bin:amd64 2.23-0ubuntu4
dpkg.log:2016-12-02 07:41:18 startup packages configure
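(The excerpt above is simply a grep of `/var/log/dpkg.log`; something like this should surface the same kind of activity on your own nodes, adjusting the timestamp of course:)

```
$ grep '2016-12-02 07:41' /var/log/dpkg.log*
```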
I suspect that unattended upgrades have pulled some package that included AppArmor profiles, and as a result, all AppArmor profiles have been flushed and reloaded, discarding the `docker-default` profile in the process.
Specifically, I suspect ~~Colonel Mustard, in the INT gate, with the task_struct~~ `lxc-common`. (But really, I think anything pulling AppArmor profiles will do!)
Steps to reproduce the issue:
This gives us the following steps to reproduce (on Ubuntu 16.04 with 1.13-rc2); a terminal transcript follows the list:

- Install Docker on Ubuntu 16.04
- Run your favorite container (e.g. `docker run alpine echo ♥`)
- Move fast and break all the things with `sudo dpkg-reconfigure lxc-common`
- Run your favorite container again
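In terminal form (assuming Docker 1.13-rc2 is already installed on the node):

```
$ docker run alpine echo ♥
$ sudo dpkg-reconfigure lxc-common   # stand-in for what the unattended upgrade did
$ docker run alpine echo ♥           # this one fails with the error below
```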
Describe the results you received:
I get one ♥, and then the following error message:
container_linux.go:247: starting container process caused "process_linux.go:359: container init caused \"apparmor failed to apply profile: no such file or directory\""
docker: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:359: container init caused \"apparmor failed to apply profile: no such file or directory\"".
Describe the results you expected:
I expected two ♥
Additional information you deem important (e.g. issue happens only occasionally):
The issue will randomly break your Engine subrepticiously¹ via undercover, randomly-timed, unattended upgrades.
¹Did I just make up that word? 📖 👀
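(If you want to check whether your own nodes are set up to do that, on a stock Ubuntu 16.04 install the relevant switch and log should be these:)

```
$ cat /etc/apt/apt.conf.d/20auto-upgrades                         # APT::Periodic::Unattended-Upgrade "1" means it's on
$ sudo less /var/log/unattended-upgrades/unattended-upgrades.log  # shows what ran, and when
```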
Output of `docker version`:
Client:
Version: 1.13.0-rc2
API version: 1.25
Go version: go1.7.3
Git commit: 1f9b3ef
Built: Wed Nov 23 06:32:39 2016
OS/Arch: linux/amd64
Server:
Version: 1.13.0-rc2
API version: 1.25
Minimum API version: 1.12
Go version: go1.7.3
Git commit: 1f9b3ef
Built: Wed Nov 23 06:32:39 2016
OS/Arch: linux/amd64
Experimental: false
Output of `docker info`:
Containers: 13
Running: 0
Paused: 0
Stopped: 13
Images: 3
Server Version: 1.13.0-rc2
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 45
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: active
NodeID: b0vdva852jomk6kbvlklh23qc
Is Manager: true
ClusterID: bianshirja3n5k5qrt4jmvtng
Managers: 3
Nodes: 5
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 3
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Node Address: 172.31.42.62
Manager Addresses:
0.0.0.0:2377
172.31.42.58:2377
172.31.42.60:2377
172.31.42.62:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 51371867a01c467f08af739783b8beafc154c4d7
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.4.0-47-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.858 GiB
Name: node2
ID: EFCV:WAAA:U4AO:7Q5A:NLIS:EXZ5:LHSB:G5FB:37ZK:ELVE:7MQ2:NUQW
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Labels:
provider=generic
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Additional environment details (AWS, VirtualBox, physical, etc.):
I can provide unlimited supplies of broken instances on demand.
I can confirm that the problem only shows up on 1.13; not on 1.12.
I can’t confirm or deny whether this is related to #26823.
I can also provide 🍪 or suitably adequate bribes if that gets fixed before my workshop next week (or even if it doesn’t, because one should never pass on an opportunity to show 💖 to our benevolent maintainers, may their path be strewn with one thousand rose petals).
Thank you ✨
About this issue
- State: closed
- Created 8 years ago
- Comments: 16 (10 by maintainers)
I had exactly the same issue a few days ago; `docker version`:
Running on:
It happened on all cluster nodes, with the same update of the lxc packages.
For those who encounter the same problem: there is no need to restart containers; just run a new container (any container) on each node, and it loads the AppArmor profile.
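If it helps, that can be scripted across the cluster, e.g. something like this (node names are placeholders for your own):

```
$ for node in node1 node2 node3 node4 node5; do ssh "$node" docker run --rm alpine true; done
```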