moby: setcap not supported on BTRFS storage backend

Description

I would like to set particular capabilities on a file while building a new image. The build fails at the setcap step if the host uses BTRFS as the storage backend, but succeeds when the host uses OverlayFS2 (the overlay2 driver) as the storage backend.

Steps to reproduce the issue:

  1. Create a Dockerfile which sets a capability on a regular file (e.g. setcap 'cap_sys_resource+ep' /bin/ls)
  2. Build the Dockerfile on a host where Docker uses BTRFS as the storage backend
  3. The build fails when trying to execute the setcap command
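
The steps above can be sketched as a small shell helper that writes a minimal Dockerfile. This is only an illustration, not the Dockerfile from the report: the helper name `write_repro_dockerfile` is hypothetical, and it assumes that `setcap` comes from the `libcap2-bin` package on ubuntu:16.04.

```shell
# Hypothetical helper: writes a minimal reproduction Dockerfile into "$1".
write_repro_dockerfile() {
  dir="$1"
  mkdir -p "$dir"
  cat > "$dir/Dockerfile" <<'EOF'
FROM ubuntu:16.04
# Assumption: setcap is provided by the libcap2-bin package.
RUN apt-get update && apt-get install -y libcap2-bin
RUN setcap 'cap_sys_resource+ep' /bin/ls
EOF
}

# Usage (requires a running Docker daemon):
#   write_repro_dockerfile ./setcap-repro
#   docker build ./setcap-repro
```

Building the same directory on two hosts with different daemon configurations should make it easier to see which setting the failure follows.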

Describe the results you received:

This was done with a Dockerfile that is based on the ubuntu:16.04 image, installs ntp, and tries to set capabilities on the file /usr/sbin/ntpd. The following is displayed:

Step 4/6 : RUN setcap 'cap_net_bind_service,cap_sys_time,cap_sys_resource,cap_sys_nice=+ep' /usr/sbin/ntpd
 ---> Running in 75025f7b6071
Failed to set capabilities on file `/usr/sbin/ntpd' (Invalid argument)
The value of the capability argument is not permitted for a file. Or the file is not a regular (non-symlink) file
The command '/bin/sh -c setcap 'cap_net_bind_service,cap_sys_time,cap_sys_resource,cap_sys_nice=+ep' /usr/sbin/ntpd' returned a non-zero code: 1

Of course /usr/sbin/ntpd is a regular file: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked. In addition, if I install ntp on Ubuntu 16.04 on a bare-metal host whose root filesystem is BTRFS, the same setcap command succeeds:

$ getcap /usr/sbin/ntpd
$ sudo setcap cap_net_bind_service,cap_sys_time,cap_sys_resource,cap_sys_nice=+ep  /usr/sbin/ntpd
$ getcap /usr/sbin/ntpd
/usr/sbin/ntpd = cap_net_bind_service,cap_sys_nice,cap_sys_resource,cap_sys_time+ep
$ df -Th /usr/sbin/ntpd
Filesystem     Type   Size  Used Avail Use% Mounted on
/dev/sda4      btrfs   23G   11G   12G  49% /

Describe the results you expected:

This is the expected result (from another host which uses OverlayFS2 as storage backend over an ext4 partition):

Step 4/6 : RUN setcap 'cap_net_bind_service,cap_sys_time,cap_sys_resource,cap_sys_nice+ep' /usr/sbin/ntpd
 ---> Running in 72ec9ec7d0ce
 ---> a5ed9956ab95
Removing intermediate container 72ec9ec7d0ce

Additional information you deem important (e.g. issue happens only occasionally):

It works with this backend:

Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true

But fails with this one:

Storage Driver: btrfs
 Build Version: Btrfs v4.4
 Library Version: 101

Output of docker version:

$ docker version
Client:
 Version:      1.13.0
 API version:  1.25
 Go version:   go1.7.3
 Git commit:   49bf474
 Built:        Tue Jan 17 09:58:26 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.13.0
 API version:  1.25 (minimum version 1.12)
 Go version:   go1.7.3
 Git commit:   49bf474
 Built:        Tue Jan 17 09:58:26 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

$ docker system info
Containers: 8
 Running: 4
 Paused: 0
 Stopped: 4
Images: 39
Server Version: 1.13.0
Storage Driver: btrfs
 Build Version: Btrfs v4.4
 Library Version: 101
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 2f7393a47307a16f8cee44a37b262e8b81021e3e
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
 userns
Kernel Version: 4.8.0-34-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.55 GiB
Name: mercury
ID: 
Docker Root Dir: /var/lib/docker/235536.235536
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

Both tests were run on physical boxes (bare metal). One (BTRFS) is x86_64 (Ubuntu 16.04 with kernel 4.8), the other (OverlayFS2 over ext4) is 32-bit ARM (Raspberry Pi, Raspbian Jessie with custom kernel 4.9). Both run the same version of Docker (1.13.0). Here is the docker system info output for the ARM host:

$ docker system info
Containers: 5
 Running: 4
 Paused: 0
 Stopped: 1
Images: 94
Server Version: 1.13.0
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 2f7393a47307a16f8cee44a37b262e8b81021e3e
init version: 949e6fa
Security Options:
 apparmor
Kernel Version: 4.9.5-v7-lowlat-tick-rtc1307+
Operating System: Raspbian GNU/Linux 8 (jessie)
OSType: linux
Architecture: armv7l
CPUs: 4
Total Memory: 969.9 MiB
Name: venus
ID: 
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No cpuset support
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

I want to set capabilities because I basically have the problem described in #8460, and I want to work around it while waiting for ambient capabilities to be implemented.

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 15 (14 by maintainers)

Most upvoted comments

Hi @estesp

I’ve now tested setting capabilities on a file in a Dockerfile with the overlayfs2 backend over ext4, but with user namespaces enabled. In this configuration it fails exactly as it does with the BTRFS backend when user namespaces are enabled.

So the problem in this case was not the storage backend but the user namespace, which means it is related to the bug I mentioned.

Therefore I think, if I’m correct, that part of this bug report is a duplicate (a specialisation) of bug #1916. Unless Justin reports that using --cap-add <...> should have granted the capability in the user namespace, and that my “hack” should therefore have worked, we could close this issue.
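
Since the two hosts differed in both storage driver and user namespace setting, it may help to pull both out of the docker info output before comparing results. The following is a minimal sketch, not an official tool: the function name `summarize_docker_info` is hypothetical, and it assumes the plain-text layout printed by Docker 1.13.0 as shown in this report (unindented "Storage Driver:" and "Security Options:" headers, with one indented entry per security option).

```shell
# Hypothetical helper: reads `docker info` text on stdin and prints the
# storage driver and whether "userns" is listed under Security Options.
# Assumes the Docker 1.13.0 output layout quoted in this report.
summarize_docker_info() {
  awk '
    /^Storage Driver:/   { driver = $3 }
    /^Security Options:/ { in_sec = 1; next }
    in_sec && /^ /       { if ($1 == "userns") userns = "yes"; next }
    in_sec               { in_sec = 0 }   # first unindented line ends the list
    END { printf "storage=%s userns=%s\n", driver, (userns ? userns : "no") }
  '
}

# Usage (requires a running Docker daemon):
#   docker info | summarize_docker_info
```

Run against the two outputs quoted above, this would report userns enabled on the BTRFS host and not on the overlay2 host, which matches the conclusion that the user namespace, rather than the storage backend, is the relevant difference.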