moby: Problem with linux kernel 4.8
BUG REPORT INFORMATION
Description: Running bash in a Docker container from the debian:wheezy image fails on Linux kernel 4.8. The same command works on Linux kernel 4.7.
docker run -it debian:wheezy bash
vagrant@debian-testing:~$ echo $?
139
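The exit status 139 shown above follows the usual shell convention of 128 plus a signal number, so it corresponds to signal 11 (SIGSEGV on Linux), consistent with the segfault mentioned in the commits linked to this issue. A minimal sketch for decoding such a status; the function name describe_exit_status is made up for illustration and is not part of Docker:

```shell
#!/bin/sh
# Decode a shell exit status. By convention, a value above 128 means the
# process was terminated by signal number (status - 128).
describe_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l <number> prints the signal's short name
        echo "killed by signal $sig ($(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}

describe_exit_status 139
```

Running it right after the failing docker run, with "$?" as the argument, gives the same information without hard-coding the number.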
Steps to reproduce the issue:
- vagrant up
# installs Debian testing with Linux 4.8; I've uploaded Vagrantfile.txt and bootstrap.sh.txt to set up the Vagrant box
- vagrant ssh
# enters the Vagrant box, which is still running kernel 4.7
a. docker run -it debian:wheezy bash # works, since the running kernel is still 4.7
b. sudo reboot
- vagrant ssh
# enters the Vagrant box, now running kernel 4.8
a. docker run -it debian:wheezy bash
b. echo $? # prints 139
Additional information you deem important (e.g. issue happens only occasionally): bootstrap.sh.txt Vagrantfile.txt
Output of docker version:
Client:
Version: 1.12.3
API version: 1.24
Go version: go1.6.3
Git commit: 6b644ec
Built: Wed Oct 26 21:45:16 2016
OS/Arch: linux/amd64
Server:
Version: 1.12.3
API version: 1.24
Go version: go1.6.3
Git commit: 6b644ec
Built: Wed Oct 26 21:45:16 2016
OS/Arch: linux/amd64
Output of docker info:
Containers: 2
Running: 0
Paused: 0
Stopped: 2
Images: 1
Server Version: 1.12.3
Storage Driver: devicemapper
Pool Name: docker-8:1-262977-pool
Pool Blocksize: 65.54 kB
Base Device Size: 10.74 GB
Backing Filesystem: ext4
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 352.4 MB
Data Space Total: 107.4 GB
Data Space Available: 7.905 GB
Metadata Space Used: 860.2 kB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.147 GB
Thin Pool Minimum Free Space: 10.74 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.133 (2016-08-15)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: overlay host null bridge
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.8.0-1-amd64
Operating System: Debian GNU/Linux stretch/sid
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 492.3 MiB
Name: debian-testing
ID: BVRA:DBPA:GCAW:Z6LO:BIEE:I64Q:DFSB:53O2:VQFA:OVCH:CZOB:T2PX
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8
Additional environment details (AWS, VirtualBox, physical, etc.): vagrant with fujimakishouten/debian-stretch64 image
About this issue
- State: closed
- Created 8 years ago
- Comments: 15 (10 by maintainers)
Commits related to this issue
- Check for LEGACY_VSYSCALL_* options Chosing LEGACY_VSYSCALL_NONE (over NATIVE or EMULATE) will mean that binaries using eglibc <= 2.13 will not run (segfault). Fixes #28705. Signed-off-by: Ian Camp... — committed to ijc/moby by ijc 8 years ago
- Check for LEGACY_VSYSCALL_* options Chosing LEGACY_VSYSCALL_NONE (over NATIVE or EMULATE) will mean that binaries using eglibc <= 2.13 will not run (segfault). Fixes #28705. Signed-off-by: Ian Camp... — committed to vieux/docker by ijc 8 years ago
- Check for LEGACY_VSYSCALL_* options Chosing LEGACY_VSYSCALL_NONE (over NATIVE or EMULATE) will mean that binaries using eglibc <= 2.13 will not run (segfault). Fixes #28705. Signed-off-by: Ian Camp... — committed to xianlubird/docker by ijc 8 years ago
- Use bosh-dns which allows us to remove consul Given that our VM is using a 4.9.x kernel we need to have vsyscall=emulate as a kernel cmdline argument. Otherwise bosh-dns will segfault because it is c... — committed to cloudfoundry-attic/cfdev by xtreme-stevehiehn 6 years ago
- Use bosh-dns which allows us to remove consul Given that our VM is using a 4.9.x kernel we need to have vsyscall=emulate as a kernel cmdline argument. Otherwise bosh-dns will segfault because it is c... — committed to cloudfoundry-attic/cfdev by dprotaso 6 years ago
- Updated default redis image tag Debian 10 "buster" (Linux kernel 4.19) cannot run Debian 7 "wheezy" (glibc 2.13) based container image. - https://github.com/tianon/docker-brew-debian/issues/55 - htt... — committed to groovenauts/mcrain by minimum2scp 4 years ago
- make - autocommit — committed to EpicMorg/docker by stamepicmorg 2 years ago
- https://github.com/moby/moby/issues/28705 — committed to EpicMorg/docker by stamepicmorg 2 years ago
Something similar was reported to Debian in Debian #845085 which also points to a forum post and https://github.com/tianon/docker-brew-debian/issues/55 (/cc @tianon).
Comparing my local /boot/config-4.7.0-1-amd64 and /boot/config-4.8.0-1-amd64 (I’m still running 4.7, haven’t had a chance to reboot yet), the most interesting thing I see is:
Those are described in linux/arch/x86/Kconfig. In particular:
So it would be worth trying booting with each of vsyscall=emulate and vsyscall=native (in two independent tests).
CONFIG_LEGACY_VSYSCALL_NATIVE should be considered a dangerous setting: it provides an ASLR-bypassing target with usable ROP gadgets.
CONFIG_LEGACY_VSYSCALL_NONE is the safest, but it sounds like you have to deal with pre-2.13 glibcs. In that case, the remaining option is fine:
CONFIG_LEGACY_VSYSCALL_EMULATE carries some risk of ASLR bypass, even just by providing a known-good place to read a known value from memory.
I would strongly recommend that CONFIG_LEGACY_VSYSCALL_NONE be used, and that systems requiring emulation be booted with “vsyscall=emulate”.
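To see which mode a given machine ends up with, one can look at both the kernel build config and the boot command line, since a vsyscall= parameter overrides the compiled-in default. The following is a rough sketch only: the function name vsyscall_mode and its output wording are invented here, and it assumes the config file uses the CONFIG_LEGACY_VSYSCALL_* names discussed in this thread.

```shell
#!/bin/sh
# Print the vsyscall mode implied by a kernel config file ($1) and a kernel
# command line string ($2). The body runs in a subshell so that "set -f" and
# the helper variables do not leak into the caller.
vsyscall_mode() (
    set -f  # no globbing while splitting the command line into words
    override=
    for arg in $2; do
        case $arg in
            vsyscall=*) override=${arg#vsyscall=} ;;  # last occurrence wins here
        esac
    done
    if [ -n "$override" ]; then
        echo "$override (kernel command line)"
    else
        # Pick the CONFIG_LEGACY_VSYSCALL_<MODE>=y line, if there is one
        built_in=$(sed -n 's/^CONFIG_LEGACY_VSYSCALL_\([A-Z]*\)=y$/\1/p' "$1" \
            | head -n 1 | tr '[:upper:]' '[:lower:]')
        echo "${built_in:-unknown} (build default)"
    fi
)

# On a running system, using the config path style quoted in this thread:
if [ -r "/boot/config-$(uname -r)" ] && [ -r /proc/cmdline ]; then
    vsyscall_mode "/boot/config-$(uname -r)" "$(cat /proc/cmdline)"
fi
```

A result of "none" from either source is the situation in which the debian:wheezy container in this report fails to start.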
@ijc25 I need a way to react to a GitHub comment with more than one heart – thanks so much for chasing this down and dropping info about it in all the places I’ve seen it reported before I was even awake! 😄 ❤️ ❤️
IMO, trying to convince Ben to delay this change until stretch+1 is just delaying the inevitable – I think our efforts would probably be better spent documenting this change and how to override the behavior. 😅
Here’s how I fixed Alpine. I hope this helps anyone else struggling with this issue.
Edit /boot/grub/grub.cfg. Add vsyscall=emulate at the end of the first menuentry. Then reboot.
Example:
Note that on GRUB2 the grub.cfg file is meant to be generated automatically by the update-grub scripts, so your changes will be overwritten if/when these run.
Instead, you should edit /etc/default/grub and add the option vsyscall=emulate to the end of GRUB_CMDLINE_LINUX_DEFAULT. It should look something like this:
GRUB_CMDLINE_LINUX_DEFAULT="quiet vsyscall=emulate"
Then run sudo update-grub and reboot your computer.
Should we add a warning to the check-config script?
AIUI (mainly based on the Kconfig help) it’s a security “related” thing because the old setting involves some non-ASLR code in every process address space (vsyscall used to be at a fixed address), so disabling it improves things by getting rid of that.
Older (e)glibc (<=2.13 according to the Debian kernel changelog) is not compatible since it doesn’t know about the new dynamic vsyscall address mechanisms and only knows the static one. Looks like CentOS 6 and Debian Wheezy both have old enough libc to be affected.
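The 2.13 cut-off quoted above can be turned into a small version check. This is only a sketch: the function name glibc_needs_vsyscall is hypothetical, the threshold is taken from this comment rather than verified independently, and the version strings in the loop are example inputs that would have to come from the image's package metadata in practice.

```shell
#!/bin/sh
# Succeed (exit status 0) when the given (e)glibc version is 2.13 or older,
# i.e. when it falls in the range this thread reports as affected.
glibc_needs_vsyscall() {
    major=${1%%.*}     # "2.11.3" -> "2"
    rest=${1#*.}       # "2.11.3" -> "11.3"
    minor=${rest%%.*}  # "11.3"   -> "11"
    [ "$major" -lt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -le 13 ]; }
}

# Example inputs only
for v in 2.13 2.19; do
    if glibc_needs_vsyscall "$v"; then
        echo "glibc $v: affected when the kernel runs with vsyscall=none"
    else
        echo "glibc $v: not affected"
    fi
done
```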
Since Wheezy is now oldstable I suppose that was deemed a reasonable cut off point, especially since there is a command line escape hatch. I wasn’t involved/paying attention when this change was made though, so I don’t know what the probability of deferring the change for another Debian release would be.
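The command line escape hatch can be applied through /etc/default/grub as suggested earlier in the thread. Below is a hedged sketch of scripting that edit; the function name add_vsyscall_emulate is made up, and it assumes GRUB_CMDLINE_LINUX_DEFAULT is set on a single line with a double-quoted value. It keeps a .bak copy of the file and does nothing if a vsyscall= option is already present.

```shell
#!/bin/sh
# Append vsyscall=emulate to GRUB_CMDLINE_LINUX_DEFAULT in the GRUB defaults
# file given as $1, unless that line already carries a vsyscall= option.
add_vsyscall_emulate() {
    grub_defaults=$1
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*vsyscall=' "$grub_defaults"; then
        return 0  # already configured, leave the file alone
    fi
    # Insert the option just before the closing double quote; keep a backup
    sed -i.bak \
        's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 vsyscall=emulate"/' \
        "$grub_defaults"
}

# Typical use (needs root; review the file before rebooting):
#   add_vsyscall_emulate /etc/default/grub && update-grub && reboot
```

The function takes the file path as an argument so it can be tried on a copy of /etc/default/grub first.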