containerd: Pulling image fails - failed to unmount temp mount - device or resource busy: unknown

Description

I'm trying to install Mirantis Kubernetes Engine on RHEL 7.9. Every time I run ctr i pull <image>, it fails with the unmount error shown in the output below.

Steps to reproduce the issue:

  1. ctr i pull docker.io/mirantis/ucp-hyperkube:3.3.9
ctr i pull docker.io/mirantis/ucp-hyperkube:3.3.9
docker.io/mirantis/ucp-hyperkube:3.3.9:                                           resolved       |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:c33dd9a587560191b55d6c6831738f977f4452505199651756ba491a34b66dcb: exists         |++++++++++++++++++++++++++++++++++++++|
config-sha256:77e43fa8731d2a8655e921fd049f5c113075df0e1f428167155bafdc17c961df:   exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:44c5ded36b57e206e3c8c6ee7891f61db2afc1129ce970a953b839e5a2422e44:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:f621385815828b7b813f6bb3a9558a0fdf1a47053b9788454c901d9acce3fe92:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:b5ee2f548017637d9b88ff01690e9c6ee19bc2a8d3eca7d2e99de32d02f22520:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c804e25559599c2c08ce025eca93de83fdbd117a92251e9ac9b51a591d2aaaa6:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:20be0bce70df6fe6559c9a1dc211895c1a7f67a5f5e921c62879734fd54323ca:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:7847f72fd3610206d55e593bd31f246aefdf24cc23871727955c8a213f8f4e75:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:ba09177b274d4003591a0997e64a688a97c9fb1ccb75e99e5e835d69a2ac60a0:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:72187ecbd309203499a9ad877c9717298039ca8c2a3db48164b1de4c6c69d0c9:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4c8fa809f39139b264c2aeb552b663b7be0251f334ad0528ee66949969585377:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:6f1e41900c4abd1b502f871096aade786cf82ba999181425f8eec1d0b3e9c1ae:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:57671312ef6fdbecf340e5fed0fb0863350cd806c92b1fdd7978adbd02afc5c3:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:3c040335d2d6b2f692aa1e061fabb7a36f1907fc0ff2a2b0171d12c6195334fa:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:5e9250ddb7d0fa6d13302c7c3e6a0aa40390e42424caed1e5289077ee4054709:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:6d5336bd2edf97742c3b91459ed76ab5393b47109686658d49c4ccb12d7c42e9:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:be0d493fa65e37bba9bcdfa4e10a87e19deb1ab8cb64d1c414eca47d8f624da1:    exists         |++++++++++++++++++++++++++++++++++++++|
layer-sha256:345e3491a907bb7c6f1bdddcf4a94284b8b6ddd77eb7d93f09432b17b20f2bbe:    exists         |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.6 s                                                                    total:   0.0 B (0.0 B/s)
unpacking linux/amd64 sha256:c33dd9a587560191b55d6c6831738f977f4452505199651756ba491a34b66dcb...
INFO[0006] apply failure, attempting cleanup             error="failed to extract layer sha256:ccdbb80308cc5ef43b605ac28fac29c6a597f89f5a169bbedbb8dec29c987439: failed to unmount /group/app/containerd/tmpmounts/containerd-mount093668013: failed to unmount target /group/app/containerd/tmpmounts/containerd-mount093668013: device or resource busy: unknown" key="extract-906700158-sFgo sha256:ccdbb80308cc5ef43b605ac28fac29c6a597f89f5a169bbedbb8dec29c987439"
ctr: failed to extract layer sha256:ccdbb80308cc5ef43b605ac28fac29c6a597f89f5a169bbedbb8dec29c987439: failed to unmount /group/app/containerd/tmpmounts/containerd-mount093668013: failed to unmount target /group/app/containerd/tmpmounts/containerd-mount093668013: device or resource busy: unknown

After it fails, the directory still shows it is mounted:

/dev/mapper/rhel-group on /group/app/containerd/tmpmounts/containerd-mount093668013 type xfs (rw,relatime,seclabel,attr2,inode64,noquota)

Thing is, /dev/mapper/rhel-group is already mounted at a higher mount point:

/dev/mapper/rhel-group on /group type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
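To check for leftover temporary mounts like the one above without reading the full mount output by eye, the mount table can be filtered by path. This is a minimal sketch, not part of containerd: the function name `find_tmpmounts` is hypothetical, and it assumes input lines in the `/proc/self/mounts` layout (device, mount point, filesystem type, options), which differs from the `mount` command layout quoted above. Mount points containing spaces are escaped in that file and are not handled here.

```python
from typing import Iterable, List, Tuple


def find_tmpmounts(mount_lines: Iterable[str], tmp_root: str) -> List[Tuple[str, str]]:
    """Return (device, mountpoint) pairs for mounts located under tmp_root.

    mount_lines is assumed to use the /proc/self/mounts field order:
    device, mountpoint, fstype, options, dump, pass.
    """
    prefix = tmp_root.rstrip("/") + "/"
    found = []
    for line in mount_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mountpoint = fields[0], fields[1]
        if mountpoint.startswith(prefix):
            found.append((device, mountpoint))
    return found
```

On the affected host, this could be called with the open file `/proc/self/mounts` as `mount_lines` and `/group/app/containerd/tmpmounts` (the directory from the error message) as `tmp_root`.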

Describe the results you expected:

Pull completes successfully.

What version of containerd are you using:

containerd containerd.io 1.3.9 ea765aba0d05254012b0b9e595e995c09186427f

Any other relevant information (runC version, CRI configuration, OS/Kernel version, etc.):

runc --version
runc version 1.0.0-rc10
commit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
spec: 1.0.1-dev
crictl info
$ crictl info

uname -a
Linux localhost 3.10.0-1160.24.1.el7.x86_64 #1 SMP Thu Mar 25 21:21:56 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 6
  • Comments: 40 (11 by maintainers)

Most upvoted comments

We faced a similar error. It turns out it was our security software.

@encbladexp We’re using K8s (K3s, to be more specific). In our case, the issue was fixed by adding the following exclusion in Defender: sudo mdatp exclusion folder add --path "/var/lib/rancher/k3s/agent/containerd/tmpmounts/*"

I assume that a similar workaround can be found for Docker Engine.

I'm facing the same issue with containerd 1.4.4 on Red Hat 7.9. I don’t see any security tool running on this setup that could interrupt the unmount. Adding MountFlags to the containerd service file does not help either, because it brings down other core pods such as kube-proxy after a restart. Does anybody have a workaround for this issue?

Hi, are we talking about Microsoft Defender as the “security software”? @emosbaugh

I also reproduced this issue with Microsoft Defender on RHEL 8.6.

Yes. What we saw (using strace) was that containerd got EBUSY when calling umount2 on the directory. Disabling the security scanning software bypassed the issue. We only saw the issue with larger images.

My theory is that when /var/lib/containerd/tmpmounts/containerd-mount* is created and mounted, the scanning software starts scanning it but doesn’t finish before containerd times out on the EBUSY errors.
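One way to test a theory like this is to look for processes that have files open under the temporary mount while the pull is failing. The following is a rough sketch under stated assumptions: `find_holders` is a hypothetical helper, it only inspects the `fd` symlinks under `/proc`, and it will miss other ways of keeping a mount busy (a working directory, a memory mapping, or a reference held inside the kernel). It needs root to see other users' processes.

```python
import os
from typing import Dict, List


def find_holders(path: str, proc_root: str = "/proc") -> Dict[int, List[str]]:
    """Map PID -> open file paths located at or under `path`.

    Reads the /proc/<pid>/fd symlinks; processes that exit mid-scan or
    that we are not allowed to inspect are silently skipped.
    """
    base = path.rstrip("/")
    prefix = base + "/"
    holders: Dict[int, List[str]] = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process gone or permission denied
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were scanning
            if link == base or link.startswith(prefix):
                holders.setdefault(int(entry), []).append(link)
    return holders
```

If the returned dictionary contains the PID of a scanner process during a failing pull, that would support the theory. An empty result does not rule it out, for the reasons listed above.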

One improvement here would be to wait longer when trying to unmount. It seems reasonable to allow scans of container filesystems.
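The "wait longer" idea can be sketched as a retry loop that keeps trying while the unmount reports EBUSY and re-raises any other error immediately. This is an illustration only and not containerd's code (containerd is written in Go); the names `libc_umount` and `unmount_with_retry`, and the default of 50 attempts with a 0.1 s delay, are assumptions chosen for the example. The real unmount calls `umount2(2)` through libc via `ctypes`, which is Linux only and requires root, so the unmount function is injectable for testing.

```python
import ctypes
import errno
import os
import time
from typing import Callable


def libc_umount(target: str) -> None:
    """Call umount2(target, 0) via libc; raises OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.umount2(os.fsencode(target), 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), target)


def unmount_with_retry(
    target: str,
    unmount: Callable[[str], None] = libc_umount,
    attempts: int = 50,      # illustrative value, not containerd's setting
    delay: float = 0.1,      # seconds between attempts, also illustrative
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Unmount `target`, retrying while the kernel reports EBUSY.

    Returns the number of attempts used. Any error other than EBUSY,
    or EBUSY on the final attempt, is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            unmount(target)
            return attempt
        except OSError as exc:
            if exc.errno != errno.EBUSY or attempt == attempts:
                raise
            sleep(delay)
    raise ValueError("attempts must be at least 1")
```

A longer total wait gives a scanner more time to release the mount, at the cost of slower failure when the mount is busy for some other reason.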