buildah: invalid tar header: unknown

A GKE cluster (1.12.7-gke.10) using the cos_containerd node image fails to run containers built with buildah and pushed to GCR.

Failed to pull image "gcr.io/gke-clusters/testing:latest": rpc error: code = Unknown desc = failed to pull and unpack image "gcr.io/gke-clusters/testing:latest": failed to unpack image on snapshotter overlayfs: failed to extract layer sha256:f1b5933fe4b5f49bbe8258745cf396afe07e625bdab3168e364daf7c956b6b81: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount799987480: archive/tar: invalid tar header: unknown
➜ cat Dockerfile 
FROM alpine
RUN date

Projects/cluster1/test on 🐳 v18.09.5 
➜ buildah version
Version:         1.8.2
Go Version:      go1.12.4
Image Spec:      1.0.0
Runtime Spec:    1.0.0
CNI Spec:        0.4.0
libcni Version:  v0.7.0-rc2
Git Commit:      e23314b1
Built:           Fri May 10 09:23:56 2019
OS/Arch:         linux/amd64

Projects/cluster1/test on 🐳 v18.09.5 
➜ cat test.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: default
spec:
  containers:
  - name: test
    image: gcr.io/gke-clusters/testing:latest

Steps to reproduce the issue:

  1. Build an image using the Dockerfile above with buildah
  2. Push to GCR (or maybe any other container registry?)
  3. kubectl apply -f test.yaml on k8s cluster
  4. Use kubectl describe to view warning event
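
The report does not give the exact commands for these steps, so here is a minimal sketch of one way to spell them out. The use of `buildah bud`, the `docker://` push destination, and the helper name `repro_commands` are assumptions for illustration; the image, manifest, and pod names come from the files shown above. The helper only prints the commands and does not run them.

```python
import shlex


def repro_commands(image, manifest="test.yaml", pod="test"):
    """Return the four reproduction steps as argv lists, in order.

    The exact invocations are assumed; the report only describes the steps.
    """
    return [
        # Step 1: build the image from the Dockerfile in the current directory.
        ["buildah", "bud", "-t", image, "."],
        # Step 2: push the image to the registry named in its reference.
        ["buildah", "push", image, f"docker://{image}"],
        # Step 3: create the pod on the cluster.
        ["kubectl", "apply", "-f", manifest],
        # Step 4: inspect the pod events for the pull failure.
        ["kubectl", "describe", "pod", pod],
    ]


if __name__ == "__main__":
    # Dry run: print each command as a copy-pasteable shell line.
    for argv in repro_commands("gcr.io/gke-clusters/testing:latest"):
        print(shlex.join(argv))
```

Keeping the steps as argv lists makes it easy to swap in a different registry or pod name when checking whether the failure is specific to GCR.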

Describe the results you received:

Pod fails to spin up.

Describe the results you expected:

I’d expect the pod to pull and run the container.

Output of rpm -q buildah or apt list buildah:

➜ yay -s buildah
2 aur/buildah-git r1330.391a5bea-1 (+1 0.00%) 
    A tool which facilitates building OCI images
1 community/buildah 1.8.2-1 (5.4 MiB 23.6 MiB) (Installed)
    A tool which facilitates building OCI images
==> Packages to install (eg: 1 2 3, 1-3 or ^4)
==> ^C

Output of buildah version:

Version:         1.8.2
Go Version:      go1.12.4
Image Spec:      1.0.0
Runtime Spec:    1.0.0
CNI Spec:        0.4.0
libcni Version:  v0.7.0-rc2
Git Commit:      e23314b1
Built:           Fri May 10 09:23:56 2019
OS/Arch:         linux/amd64

Output of cat /etc/*release:

NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="0;36"
HOME_URL="https://www.archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"

Output of uname -a:

Linux dell 5.0.9-arch1-1-ARCH #1 SMP PREEMPT Sat Apr 20 15:00:46 UTC 2019 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

# This file is is the configuration file for all tools
# that use the containers/storage library.
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver
driver = "overlay"

# Temporary storage location
runroot = "/var/run/containers/storage"

# Primary Read/Write location of container storage
graphroot = "/var/lib/containers/storage"

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
]

# Size is used to set a maximum size of the container image.  Only supported by
# certain container storage drivers.
size = ""

# Path to an helper program to use for mounting the file system instead of mounting it
# directly.
#mount_program = "/usr/bin/fuse-overlayfs"

# OverrideKernelCheck tells the driver to ignore kernel checks based on kernel version
override_kernel_check = "true"

# mountopt specifies comma separated list of extra mount options
mountopt = "nodev"

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to UIDs/GIDs as they should appear outside of the container, and
# the length of the range of UIDs/GIDs.  Additional mapped sets can be listed
# and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and the a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped container-level ID,
# until all of the entries have been used for maps.
#
# remap-user = "storage"
# remap-group = "storage"

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper.
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem.
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base.
# device.
# mkfsarg = ""

# use_deferred_removal marks devicemapper block device for deferred removal.
# If the thinpool is in use when the driver attempts to remove it, the driver 
# tells the kernel to remove it as soon as possible. Note this does not free
# up the disk space, use deferred deletion to fully remove the thinpool.
# use_deferred_removal = "True"

# use_deferred_deletion marks thinpool device for deferred deletion.
# If the device is busy when the driver attempts to delete it, the driver
# will attempt to delete device every 30 seconds until successful.
# If the program using the driver exits, the driver will continue attempting
# to cleanup the next time the driver is used. Deferred deletion permanently
# deletes the device and all data stored in device will be lost.
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"

# If specified, use OSTree to deduplicate files with the overlay backend
ostree_repo = ""

# Set to skip a PRIVATE bind mount on the storage home directory.  Only supported by
# certain container storage drivers
skip_mount_home = "false"

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 63 (29 by maintainers)

Most upvoted comments

Hi @akospinter, thanks for reaching out. The issue is being tracked in https://github.com/containers/image/issues/733 which is not yet fixed.

Looks like containers/image#563 is merged now. Does that unblock buildah? Just hit this myself for the first time w/ some Pi cluster exploration I’m doing.

We should be merging the latest version of containers/image this week. As soon as https://github.com/containers/image/pull/718 is merged, I will open a PR for Buildah and then work with @TomSweeneyRedHat to get a release out.

@marshallford, no updates yet but thanks for the reminder. I will add this back to my TODO and work with @mtrmac to push it over the finish line.

Yes, it has been vendored into Buildah v1.9.x.

I faced the same issue today with podman 2.1.1/fedora-32 and gcr.io on GKE/cos/containerd. Exporting to tarball and pushing via docker worked.

OK, so I got this working, but I had to pass the -D option to buildah push for it to work properly. With that option supplied, the built images work with recent k3s/containerd in my cluster.

Buildah v1.11.3 has been released and includes fixes for this issue. Thanks to everybody involved!

I can now reproduce with @vrothberg’s DockerHub image; let me see if I can understand what the applier method is choking on with that layer. I looked through the commit history and nothing stands out as changing behavior here between 1.1.7 and today, but hopefully some debugging will help figure it out.

Docker Hub is transforming the OCI image into a Docker v2 one, while the default registry:2 preserves the OCI image. But even when building a Docker v2 image (i.e., buildah bud --format=docker) and pushing it to a local registry, containerd still doesn’t like it:

ctr: failed to extract layer sha256:a91695c2d5c558e126464a6e8229d3062ad78db76ec5ab685f1150d0f5929ca0: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount564865765: unexpected EOF: unknown

I pushed the same image to docker.io/valentinrothberg/image:docker. Containerd fails with the same error trying to pull that image from Docker Hub, so the issue does not seem to be related to OCI images or some other side-effects from Docker Hub.

@nalind, @rhatdan, do you have any suspicion of what could possibly go south here? It would be nice to have containerd >= 1.2 and Buildah working together.
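
To see which format a registry actually serves, one option is to look at the raw manifest. The following is a minimal sketch, assuming the manifest JSON has already been fetched (for example with `skopeo inspect --raw`, which is not used in this thread). The function name `describe_manifest` and the sample manifest are hypothetical; the media type strings are the standard OCI and Docker v2 schema 2 values. An OCI manifest may omit `mediaType`, in which case this sketch reports `unknown`.

```python
import json

OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"
DOCKER_V2_MANIFEST = "application/vnd.docker.distribution.manifest.v2+json"


def describe_manifest(raw):
    """Classify a raw image manifest and list its layer media types."""
    manifest = json.loads(raw)
    media_type = manifest.get("mediaType", "")
    if media_type == DOCKER_V2_MANIFEST:
        fmt = "docker-v2"
    elif media_type == OCI_MANIFEST:
        fmt = "oci"
    else:
        # Missing or unrecognized top-level media type.
        fmt = "unknown"
    layers = [layer.get("mediaType") for layer in manifest.get("layers", [])]
    return fmt, layers


if __name__ == "__main__":
    # Hypothetical manifest, trimmed to the fields this sketch reads.
    sample = json.dumps({
        "schemaVersion": 2,
        "mediaType": DOCKER_V2_MANIFEST,
        "layers": [
            {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip"},
        ],
    })
    print(describe_manifest(sample))
```

Comparing the result for the same tag on Docker Hub and on a local registry would show whether the manifest was converted on the way in.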

I see that https://github.com/containers/image/issues/733 is fixed now. Has this fix been vendored into buildah yet?

@marshallford, I had time to look into it and to think about how we can address this specific issue but I did not have the time to implement it. I opened https://github.com/containers/image/issues/733 to track it. I can’t give an ETA for a fix since there are many things on my table at the moment. Maybe others will pick up https://github.com/containers/image/issues/733. I will close the issue here to move the discussion over to containers/image.

Well, maybe some other year. My 2 cents is that a blocking issue should never be a reason for major refactoring.

The feature to edit the MIME types correctly just doesn’t exist in c/image. This is not some gratuitous big refactoring that we are stubbornly insisting on when we could easily have committed a one-line fix; at least, I’m not aware of any correct one-line fix we could have made.
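
One reading of this thread, which is an assumption rather than something stated outright here, is that a layer's declared media type and its actual compression can disagree, and that a consumer trusting the media type then reads compressed bytes as a plain tar stream. The sketch below illustrates that failure mode with the Python standard library only. The helper names are hypothetical, the gzip check is a simple magic-byte heuristic, and Python's `tarfile` error stands in for the Go `archive/tar` error in the report.

```python
import gzip
import io
import tarfile

GZIP_MAGIC = b"\x1f\x8b"

# Layer media types from the OCI image spec and Docker v2 schema 2.
GZIP_TYPES = {
    "application/vnd.oci.image.layer.v1.tar+gzip",
    "application/vnd.docker.image.rootfs.diff.tar.gzip",
}
TAR_TYPES = {"application/vnd.oci.image.layer.v1.tar"}


def media_type_matches(media_type, blob):
    """Heuristic: does the declared media type agree with the blob's bytes?"""
    is_gzip = blob[:2] == GZIP_MAGIC
    if media_type in GZIP_TYPES:
        return is_gzip
    if media_type in TAR_TYPES:
        return not is_gzip
    raise ValueError(f"unhandled media type: {media_type}")


def make_layer():
    """Build a tiny uncompressed tar archive to stand in for a layer."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"hello\n"
        info = tarfile.TarInfo("hello.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def opens_as_plain_tar(blob):
    """Try to read the blob as an uncompressed tar, as a strict consumer would."""
    try:
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r:"):
            return True
    except tarfile.ReadError:
        return False


if __name__ == "__main__":
    layer = make_layer()
    compressed = gzip.compress(layer)
    # Mislabelled layer: gzip bytes declared as a plain tar.
    print(media_type_matches("application/vnd.oci.image.layer.v1.tar", compressed))
    print(opens_as_plain_tar(compressed))
```

A check like this could be run against a pulled layer blob to tell a mislabelled layer apart from a genuinely corrupt one.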

podman v1.4.2 has Buildah v1.9.0 vendored into it. Please try this out.