moby: overlayfs fails to run container with a strange file checksum error

I started using the “overlay” storage driver today and am seeing odd behavior in my containers (CentOS 6).

  1. yum install <pkg> in a Dockerfile always fails, but running it again in the same container always succeeds.
  2. tail -f <file> doesn’t show appended content.

For 1, this simple Dockerfile always fails at the second yum install (for perl):

FROM centos:centos6

RUN yum install -y sudo
RUN yum install -y perl

with this error:

Step 2 : RUN yum install -y perl
 ---> Running in 21669e152088
...
Rpmdb checksum is invalid: dCDPT(pkg checksums): perl-version.x86_64 3:0.77-136.el6_6.1 - u

INFO[0018] The command [/bin/sh -c yum install -y perl] returned a non-zero code: 1

However, running the same command twice in the same container always succeeds.

FROM centos:centos6

RUN yum install -y sudo
RUN (yum install -y perl || yum install -y perl)

Step 2 : RUN (yum install -y perl || yum install -y perl)
 ---> Running in 5bf1177b4437
...

Rpmdb checksum is invalid: dCDPT(pkg checksums): perl-version.x86_64 3:0.77-136.el6_6.1 - u
  Installing : 4:perl-5.10.1-136.el6_6.1.x86_64                             6/6
Loaded plugins: fastestmirror
Setting up Install Process
Loading mirror speeds from cached hostfile
 * base: mirror.vastspace.net
 * extras: mirror.vastspace.net
 * updates: mirror.vastspace.net
Package 4:perl-5.10.1-136.el6_6.1.x86_64 already installed and latest version
Nothing to do
 ---> 2671034175e9
Removing intermediate container 5bf1177b4437
Successfully built 2671034175e9

Even if I change perl to another package, I get the same behavior.

For 2, I noticed that tail -f <application.log> first shows the last 10 lines of the log file but then doesn’t show any appended content. If I press Ctrl+C and run it again, it shows the most recent content.
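
One plausible reading of the tail -f symptom, given the overlayfs copy-up behavior described further down in this thread, is that tail keeps reading from a descriptor that refers to the lower-layer file while new data lands in a copied-up file. This is an assumption, not something confirmed in the report. A minimal Python 3 sketch of how a follower could detect that its descriptor no longer matches the path (the helper name fd_is_stale is made up for illustration):

```python
import os


def fd_is_stale(fd, path):
    """Return True if `path` no longer refers to the file that `fd` has open.

    Hypothetical helper: compares (device, inode) of the open descriptor
    with (device, inode) of whatever currently sits at `path`.
    """
    try:
        on_disk = os.stat(path)
    except FileNotFoundError:
        # The path is gone, so the descriptor cannot match it.
        return True
    opened = os.fstat(fd)
    return (opened.st_dev, opened.st_ino) != (on_disk.st_dev, on_disk.st_ino)
```

A follower that sees True here would reopen the path, which is roughly what pressing Ctrl+C and rerunning tail does by hand.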

I have only tried the official centos:centos6 image so far, and the above behavior is 100% reproducible. I didn’t try other images.

Docker Host:

CoreOS latest alpha with an ext4 root file system and the “overlay” driver. I’m running it on EC2; I recreated the instance once, but the problem persists.

core@core-01 ~ $ cat /etc/os-release
NAME=CoreOS
ID=coreos
VERSION=561.0.0
VERSION_ID=561.0.0
BUILD_ID=
PRETTY_NAME="CoreOS 561.0.0"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"

Docker Version and Info:

core@core-01 ~ $ docker --version
Docker version 1.4.1, build 5bc2ff8-dirty

core@core-01 ~ $ docker info
Containers: 1
Images: 4
Storage Driver: overlay
Execution Driver: native-0.2
Kernel Version: 3.18.2
Operating System: CoreOS 561.0.0
CPUs: 1
Total Memory: 3.68 GiB
Name: core-01
ID: TM2L:NIZI:CUGT:EUKX:ACP3:EOFX:YQVR:CTL7:R7L2:ILQF:6E3H:HAHQ

Kernel Config:

core@core-01 ~ $ wget https://raw.githubusercontent.com/docker/docker/master/contrib/check-config.sh
...
2015-01-19 11:09:00 (133 MB/s) - 'check-config.sh' saved [4476/4476]

core@core-01 ~ $ chmod a+x check-config.sh
core@core-01 ~ $ ./check-config.sh
info: reading kernel config from /proc/config.gz ...

Generally Necessary:
- cgroup hierarchy: properly mounted [/sys/fs/cgroup]
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_DEVPTS_MULTIPLE_INSTANCES: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_MACVLAN: enabled
- CONFIG_VETH: enabled
- CONFIG_BRIDGE: enabled
- CONFIG_NF_NAT_IPV4: enabled
- CONFIG_IP_NF_FILTER: enabled
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled
- CONFIG_NF_NAT: enabled
- CONFIG_NF_NAT_NEEDED: enabled
- CONFIG_POSIX_MQUEUE: enabled

Optional Features:
- CONFIG_MEMCG_SWAP: enabled
- CONFIG_RESOURCE_COUNTERS: enabled
- CONFIG_CGROUP_PERF: enabled
- Storage Drivers:
  - "aufs":
    - CONFIG_AUFS_FS: missing
    - CONFIG_EXT4_FS_POSIX_ACL: enabled
    - CONFIG_EXT4_FS_SECURITY: enabled
  - "btrfs":
    - CONFIG_BTRFS_FS: enabled
  - "devicemapper":
    - CONFIG_BLK_DEV_DM: enabled
    - CONFIG_DM_THIN_PROVISIONING: enabled
    - CONFIG_EXT4_FS: enabled
    - CONFIG_EXT4_FS_POSIX_ACL: enabled
    - CONFIG_EXT4_FS_SECURITY: enabled
  - "overlay":
    - CONFIG_OVERLAY_FS: enabled

About this issue

  • State: closed
  • Created 9 years ago
  • Reactions: 4
  • Comments: 76 (32 by maintainers)

Most upvoted comments

There is a yum plugin that works around this; simply add this to your Dockerfile:

RUN yum install -y yum-plugin-ovl

Tested on CentOS 6/7 and RHEL 6/7.

http://man7.org/linux/man-pages/man1/yum-ovl.1.html

Enjoy Docker/overlay 😃

I found that running rpm --rebuilddb before any yum installs is a decent workaround for this issue, as @porjo mentioned above. Also, I have not seen this problem when using AUFS.

I have the same issue with Docker version 17.03.1-ce, build c6d412e, and CentOS 7.

RUN rpm --rebuilddb && yum install -y XXX

This resolved the issue.

If you’re unable to install the yum plugin because of dependency conflicts (e.g. with an older base image), you can work around this with touch /var/lib/rpm/*. See the bug report for details.
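
For images where a shell glob is awkward, the same idea can be sketched in Python. This is a rough equivalent under stated assumptions, not the plugin's implementation: the helper name touch_all is made up, and it relies on the expectation that a timestamp update on overlayfs triggers a copy-up of the file into the upper layer before yum opens it. Like the shell glob, it skips dotfiles.

```python
import os


def touch_all(directory):
    """Update atime/mtime of every non-hidden regular file in `directory`.

    Hypothetical helper mirroring `touch <directory>/*`; returns the list
    of paths that were touched.
    """
    touched = []
    for name in sorted(os.listdir(directory)):
        if name.startswith("."):
            continue  # a shell `*` glob does not match dotfiles either
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            os.utime(path, None)  # None means "set both times to now"
            touched.append(path)
    return touched
```

In a Dockerfile this would run in the same RUN step as the yum command, for example by calling touch_all('/var/lib/rpm') from a small script added to the image.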

I use CoreOS and have just switched from Btrfs to ext4, and I am having the same issue. I was also seeing errors like this:

error: rpmdb: BDB0689 Packages page 122 is on free list with type 7
error: rpmdb: BDB0061 PANIC: Invalid argument
error: db5 error(-30973) from dbcursor->c_put: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: error(-30973) adding header #163 record
error: rpmdb: BDB0060 PANIC: fatal region error detected; run recovery
error: db5 error(-30973) from dbcursor->c_close: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: rpmdb: BDB0060 PANIC: fatal region error detected; run recovery
error: db5 error(-30973) from db->sync: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery

I found that inserting RUN rpm --rebuilddb prior to yum install -y <pkg> resolved both errors; however, that should not be necessary.

@xavierbaude I also get the checksum error when installing yum-plugin-ovl 😢

@jasonmp85 I’ve just pushed an update for the OL6 and OL7 latest images. In the future, ping me if you have any issues with the OL base images and I can address them. Our turnaround time is usually pretty quick once we’re made aware of an issue. 😃

So what’s the current guidance for fixing this? Still touch /var/lib/rpm/*?

It’s one thing when I’m building images on my local box, but for automated builds in Docker Hub, this is really annoying, and I have no clue how to fix it.

@jakirkham Please try this.

#!/usr/bin/env python
# Workaround for CentOS 6 yum overlay issue (docker/docker#10180)
# Based on yum-utils-1.1.31-34.el7 (GPLv2+)
#
# Example Dockerfile:
#  FROM centos:centos6
#  ADD this.py /
#  RUN /this.py && yum install -y pkg1
#  RUN /this.py && yum install -y pkg2

from os import walk, path, fstat

def _stat_ino_fp(fp):
    """
    Get the inode number from file descriptor
    """
    return fstat(fp.fileno()).st_ino


def get_file_list(rpmpath):
    """
    Enumerate all files in a directory
    """
    for root, _, files in walk(rpmpath):
        for f in files:
            yield path.join(root, f)


def for_each_file(files, cb, m='rb'):
    """
    Open each file with mode specified in `m`
    and invoke `cb` on each of the file objects
    """
    if not files or not cb:
        return []
    ret = []
    for f in files:
        with open(f, m) as fp:
            ret.append(cb(fp))
    return ret


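def do_detect_copy_up(files):
    """
    Open the files first R/O, then R/W and count unique
    inode numbers. Opening in append mode ('ab') is a write
    open that leaves the existing contents untouched.
    """
    num_files = len(files)
    lower = for_each_file(files, _stat_ino_fp, 'rb')
    upper = for_each_file(files, _stat_ino_fp, 'ab')
    # A file that was copied up is expected to report a different inode
    # on the second open, so the surplus of unique inodes over the file
    # count is the number of copy-ups.
    diff = set(lower + upper)
    return len(diff) - num_files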

def main():
    rpmdb_path = '/var/lib/rpm'
    try:
        files = list(get_file_list(rpmdb_path))
        copied_num = do_detect_copy_up(files)
        print("ovl: Copying up (%i) files from OverlayFS lower layer" % copied_num)
    except Exception as e:
        print("ovl: Error while doing RPMdb copy-up:\n%s" % e)

if __name__ == '__main__':
    main()

The latest CentOS 7 image includes the workaround yum-plugin-ovl:

$ docker run -it --rm library/centos@sha256:8dcd2ec6183f3f4a94d4f9552ce76091624760edefcaa39a9e04441f9e2ad9f6 sh -c "rpm -qa | grep yum"
yum-plugin-fastestmirror-1.1.31-34.el7.noarch
yum-utils-1.1.31-34.el7.noarch
yum-metadata-parser-1.1.4-10.el7.x86_64
yum-3.4.3-132.el7.centos.0.1.noarch
yum-plugin-ovl-1.1.31-34.el7.noarch

Note that the latest tagged image does not:

$ docker run -it --rm centos:centos7.2.1511 sh -c "rpm -qa | grep yum"
yum-metadata-parser-1.1.4-10.el7.x86_64
yum-3.4.3-132.el7.centos.0.1.noarch
yum-plugin-fastestmirror-1.1.31-34.el7.noarch

I’ll open a PR that adds a note to the documentation.

@thaJeztah Do you think this issue is closable after that PR?

I also hit this issue on Linux fedora 4.0.4-301.fc22.x86_64.

The original “Rpmdb checksum is invalid” occurs because of an overlayfs limitation.

In general, data inconsistency can be demonstrated by doing the following:

  • pick a file that has not yet been copied up to the upper layer
  • open it twice: once with O_RDONLY (fd-ro) and once with O_RDWR (fd-rw)

Writing to fd-rw and then reading from fd-ro results in a data mismatch.
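
The steps above can be sketched in Python as follows. This is a minimal illustration, not code from the patch: the helper name ro_rw_views_agree and the payload are made up, and the function appends to the file it is given, so point it only at a scratch copy. On a filesystem without this limitation it should return True; on an affected overlayfs mount, with a file that still lives only in the lower layer, it would be expected to return False.

```python
import os


def ro_rw_views_agree(path, payload=b"written-via-fd-rw\n"):
    """Append `payload` through a read-write fd and check whether a
    read-only fd opened *earlier* on the same path can see it.

    Hypothetical helper; it modifies the file at `path`.
    """
    fd_ro = os.open(path, os.O_RDONLY)  # opened first (lower layer on overlayfs)
    try:
        fd_rw = os.open(path, os.O_RDWR | os.O_APPEND)  # write open (copy-up point)
        try:
            os.write(fd_rw, payload)
            os.fsync(fd_rw)
            size = os.fstat(fd_rw).st_size
        finally:
            os.close(fd_rw)
        # Read the tail of the file through the older, read-only descriptor.
        tail = os.pread(fd_ro, len(payload), size - len(payload))
    finally:
        os.close(fd_ro)
    return tail == payload
```

Running it against a copy of a file from the image's lower layer, inside a container on the affected kernel, is one way to check whether a given host shows the mismatch.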

The problem with yum occurs because a file in /var/lib/rpm is first opened read-only (from the ‘lower’ layer); the same file is subsequently opened read-write (resulting in a copy-up to ‘upper’) and is updated with new data. The contents read through the read-only fd don’t match the new data, which produces the ‘checksum is invalid’ complaint. Subsequent yum operations work because the /var/lib/rpm files now exist in the ‘upper’ layer and all the fds come from the ‘upper’ layer.

We’ve created a kernel patch for overlayfs to solve this problem and are currently testing it. Please give it a whirl and see if it solves your problem - appreciate your feedback.

The source is here: https://github.com/portworx/overlayfs