moby: direct-lvm with xfs causes Docker to hang when disk is full

Output of docker version:

Client:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5/1.9.1
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5/1.9.1
 Built:        
 OS/Arch:      linux/amd64

Output of docker info (on a fresh host; an affected host does not respond to docker info):

Containers: 1
Images: 6
Server Version: 1.9.1
Storage Driver: devicemapper
 Pool Name: docker-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 107.4 GB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 100.7 MB
 Data Space Total: 9.437 GB
 Data Space Available: 9.337 GB
 Metadata Space Used: 77.82 kB
 Metadata Space Total: 25.17 MB
 Metadata Space Available: 25.09 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.93-RHEL7 (2015-01-28)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.1.17-22.30.amzn1.x86_64
Operating System: Amazon Linux AMI 2015.09
CPUs: 1
Total Memory: 995.6 MiB
Name: ip-172-31-34-29
ID: GO67:QAUR:ZN7D:LBC6:M366:GT2C:BQTV:F3UE:IMGD:YFNF:T37L:OQWC

Provide additional environment details (AWS, VirtualBox, physical, etc.): Amazon Linux AMI 2015.09.2 with a second EBS volume at /dev/xvdb, configured for direct LVM (using devicemapper) by docker-storage-setup. By default, docker-storage-setup sizes the thin pool at 40% of the available disk with an auto-extend policy; however, the same behavior is exhibited when auto-extend is disabled and when lv_when_full is changed to error.
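
For reference, the docker-storage-setup configuration involved is roughly of this shape (illustrative values using the standard option names, not the exact file from the affected hosts):

  # /etc/sysconfig/docker-storage-setup (illustrative, not the exact file used)
  DEVS=/dev/xvdb                  # second EBS volume handed to docker-storage-setup
  VG=docker                       # volume group created for the thin pool
  DATA_SIZE=40%FREE               # initial thin-pool size (the 40% default mentioned above)
  AUTO_EXTEND_POOL=yes            # set to "no" for the no-auto-extend variant
  POOL_AUTOEXTEND_THRESHOLD=60    # pool usage % at which lvm grows the pool
  POOL_AUTOEXTEND_PERCENT=20      # how much to grow it by each time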

List the steps to reproduce the issue:

  1. Configure Docker as described above (direct LVM with devicemapper using xfs)
  2. Run containers to progressively consume disk (I had reasonable results with docker run ubuntu dd if=/dev/urandom of=sample.txt bs=64M count=1 iflag=fullblock) until the disk is full (alternately, run my repro script; a rough equivalent is sketched below)
  3. Try to do something with Docker (any of docker ps, docker info, docker run, docker stop, docker rm, docker images, docker rmi)
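
A rough equivalent of the repro script (not the exact one; the image and file names are arbitrary):

  #!/bin/bash
  # Keep starting containers that each write ~64 MB of random data.
  # Containers are not removed, so each writable layer stays allocated in the
  # thin pool until the pool runs out of space.
  while true; do
      docker run ubuntu dd if=/dev/urandom of=sample.txt bs=64M count=1 iflag=fullblock || break
  done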

Describe the results you received: Docker commands hang indefinitely. dmesg fills with things like the following:

[ 3029.535199] device-mapper: thin: 253:2: reached low water mark for data device: sending event.
[ 3029.589706] device-mapper: thin: 253:2: switching pool to out-of-data-space mode
[ 3037.584626] vethac84d34: renamed from eth0
[ 3037.588339] docker0: port 2(vethd9e484d) entered disabled state
[ 3037.622172] docker0: port 2(vethd9e484d) entered disabled state
[ 3037.626293] device vethd9e484d left promiscuous mode
[ 3037.629420] docker0: port 2(vethd9e484d) entered disabled state
[ 3089.592091] device-mapper: thin: 253:2: switching pool to read-only mode
[ 3089.598147] Buffer I/O error on dev dm-4, logical block 144077, lost async page write
[ 3089.603304] Buffer I/O error on dev dm-4, logical block 144078, lost async page write
[ 3089.608669] Buffer I/O error on dev dm-4, logical block 144079, lost async page write
[ 3089.614565] Buffer I/O error on dev dm-4, logical block 144080, lost async page write
[ 3089.620112] Buffer I/O error on dev dm-4, logical block 144081, lost async page write
[ 3089.624919] Buffer I/O error on dev dm-4, logical block 144082, lost async page write
[ 3089.633141] Buffer I/O error on dev dm-4, logical block 144083, lost async page write
[ 3089.638389] Buffer I/O error on dev dm-4, logical block 144084, lost async page write
[ 3089.645177] Buffer I/O error on dev dm-4, logical block 144085, lost async page write
[ 3089.650294] Buffer I/O error on dev dm-4, logical block 144086, lost async page write
[ 3089.664084] XFS (dm-4): metadata I/O error: block 0x2c230 ("xfs_buf_iodone_callbacks") error 5 numblks 16
[ 3089.684777] XFS (dm-5): metadata I/O error: block 0x2c230 ("xfs_buf_iodone_callbacks") error 5 numblks 16
[ 3089.695953] XFS (dm-4): Unmounting Filesystem
[ 3089.703478] XFS (dm-4): metadata I/O error: block 0x2c230 ("xfs_buf_iodone_callbacks") error 5 numblks 16
[ 3089.710627] XFS (dm-5): metadata I/O error: block 0x2c230 ("xfs_buf_iodone_callbacks") error 5 numblks 16
...
[ 3150.336222] XFS (dm-6): metadata I/O error: block 0x2c230 ("xfs_buf_iodone_callbacks") error 5 numblks 16
[ 3240.180102] INFO: task docker:3312 blocked for more than 120 seconds.
[ 3240.184544]       Tainted: G            E   4.1.17-22.30.amzn1.x86_64 #1
[ 3240.189691] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3240.195639] docker          D ffff880037d03d48     0  3312      1 0x00000000
[ 3240.200654]  ffff880037d03d48 ffff88003e321980 ffff88003745b300 0000000000000246
[ 3240.206939]  ffff880037d04000 ffff88003d6aabe0 ffff880037e6cdc0 ffff880037e6cde8
[ 3240.212569]  ffff880037e6cd90 ffff880037d03d68 ffffffff814df3e7 ffff88003d6aabe0
[ 3240.218299] Call Trace:
[ 3240.219963]  [<ffffffff814df3e7>] schedule+0x37/0x90
[ 3240.223179]  [<ffffffffa048fb79>] xfs_ail_push_all_sync+0xa9/0xe0 [xfs]
[ 3240.227811]  [<ffffffff810aafc0>] ? prepare_to_wait_event+0x110/0x110
[ 3240.233513]  [<ffffffffa047b2e9>] xfs_unmountfs+0x59/0x170 [xfs]
[ 3240.239390]  [<ffffffffa047bdab>] ? xfs_mru_cache_destroy+0x6b/0x90 [xfs]
[ 3240.247251]  [<ffffffffa047dd36>] xfs_fs_put_super+0x36/0x90 [xfs]
[ 3240.252770]  [<ffffffff811ccb06>] generic_shutdown_super+0x76/0x100
[ 3240.256723]  [<ffffffff811ccea7>] kill_block_super+0x27/0x70
[ 3240.260913]  [<ffffffff811cd1b9>] deactivate_locked_super+0x49/0x80
[ 3240.265629]  [<ffffffff811cd7ae>] deactivate_super+0x4e/0x70
[ 3240.269797]  [<ffffffff811e9a63>] cleanup_mnt+0x43/0x90
[ 3240.274904]  [<ffffffff811e9b02>] __cleanup_mnt+0x12/0x20
[ 3240.279970]  [<ffffffff81085b97>] task_work_run+0xb7/0xf0
[ 3240.284063]  [<ffffffff81013bb1>] do_notify_resume+0x51/0x80
[ 3240.288751]  [<ffffffff814e35bc>] int_signal+0x12/0x17

Describe the results you expected: Docker itself would continue to be responsive, but operations that consume space (pulling images, writing inside a container, etc.) would fail. This is the behavior observed when I switch to ext4 instead of xfs.
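
(For anyone wanting to make the same comparison: one way to select the backing filesystem is the devicemapper dm.fs storage option. Roughly, with the caveat that the daemon options file location varies by distro — on Amazon Linux it is typically /etc/sysconfig/docker:

  # switch the devicemapper base filesystem to ext4 for newly created base devices
  OPTIONS="--storage-driver devicemapper --storage-opt dm.fs=ext4"

The option only affects base devices created after the change.)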

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 2
  • Comments: 24 (19 by maintainers)

Most upvoted comments

This “issue” couldn’t be more improperly categorized. By asserting “direct-lvm with xfs causes Docker to hang when disk is full” you’ve shown willful ignorance about why XFS degenerates to busy waiting when the thin-pool runs out of space. And the reason boils down to improper configuration and reckless use of a finite resource.

First, let me say that we’d love for this not to be an issue, but the fact of the matter is that this is a pitfall that has always existed, and it is why it is imperative for docker (and the admin) to be aware of:

  1. the expected rate of change and use of free space
  2. how much space is currently available

Admins cannot stuff 10 pounds into a 5-pound bag (of DM thinp on XFS) and expect things to gracefully halt with a message of “oh, you wanted performance (which XFS has over ext4), etc., but you didn’t reason through proper resource allocation for your use… please take a moment to think and act accordingly”. Simply put: care must be taken to properly provision and then monitor free space utilization over time. XFS’s existing behavior of spinning while waiting for more space just happens to make this requirement much more cut-throat.
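
Concretely, watching the pool’s fill level from the host is trivial, e.g. (the pool/VG names below are taken from the docker info output above and will differ on other setups):

  # data/metadata usage of the thin pool, via lvm or straight from device-mapper
  lvs -o lv_name,data_percent,metadata_percent docker/docker-pool
  dmsetup status docker-docker--pool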

Docker being aware of how much dm-thin-pool space is free and refusing to spin up new containers, etc. is a perfectly reasonable (and overdue) response to avoid the more problematic fault handling (e.g. XFS’s busy waiting for more space that never gets added). Just because ext4 can tolerate this case more gracefully doesn’t mean users should be left to keep making the same mistakes over and over, and, in doing so, exposing themselves to error-handling bugs that even ext4 is known to have had (at least historically).

In a properly provisioned, monitored and managed docker deployment this failure case should never present itself. NEVER. If it does, then the admin has failed. And docker quite likely has failed by allowing the admin enough rope to fail. Regardless of which underlying filesystem is used.

Apologies if my frustration came through… I’ve not been able to keep up with how each project documents the importance of properly configuring, monitoring and managing the free space that is relied on for forward progress.

But for docker setups that use DM thin-provisioning, the admin should:

  1. use lvm to setup the thin-pool that docker uses for container storage
  2. configure lvm so that it will resize the thin-pool from free space in the parent VG once the configured lvm.conf threshold is crossed (a sketch of this setup follows the list)
  3. monitor and replenish the parent VG’s free space; otherwise, the next time the thin-pool crosses the low-water mark, the auto-resize will have no more space to pull from
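
For illustration, that setup looks roughly like this (the device, VG/LV names and thresholds are examples, not a prescription):

  # 1. dedicate a device and carve a thin pool out of it for docker
  pvcreate /dev/xvdb
  vgcreate docker /dev/xvdb
  lvcreate --wipesignatures y -n thinpool docker -l 95%VG
  lvcreate --wipesignatures y -n thinpoolmeta docker -l 1%VG
  lvconvert -y --zero n -c 512K --thinpool docker/thinpool --poolmetadata docker/thinpoolmeta

  # 2. have dmeventd auto-extend the pool from free space in the VG:
  #    put something like this in /etc/lvm/profile/docker-thinpool.profile ...
  #      activation {
  #          thin_pool_autoextend_threshold = 80
  #          thin_pool_autoextend_percent = 20
  #      }
  #    ... then attach the profile and confirm the pool is being monitored
  lvchange --metadataprofile docker-thinpool docker/thinpool
  lvs -o+seg_monitor docker/thinpool

  # 3. keep the VG topped up: watch its free space and vgextend it before the pool needs to grow again
  vgs docker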

This “issue” couldn’t be more improperly categorized. By asserting “direct-lvm with xfs causes Docker to hang when disk is full” you’ve shown willful ignorance about why XFS degenerates to busy waiting when the thin-pool runs out of space.

My apologies. I’ve attempted to learn as much as I can about XFS, devicemapper, LVM, and so forth recently and my (unwilling) ignorance is clearly coming across.

And the reason boils down to improper configuration and reckless use of a finite resource.

This sounds true. Regardless, this is the recommended configuration in Docker’s documentation on running devicemapper.

Admins cannot stuff 10 pounds into a 5 pound bag (of DM thinp on XFS) and expect things to gracefully halt […]

Also sensible. I think what’s missing is that xfs has been the default since Docker 1.9.1, and this behavior is not documented anywhere for people who are unfamiliar with these pitfalls.