containerd: "failed to copy: io: read/write on closed pipe" on `ctr images push` while pushing large images

Description

While pushing a large container image to a registry using

ctr images push ghcr.io/akhilerm/private-testing:ci container-registry.oracle.com/database/express:21.3.0-xe

I hit the following error:

failed to copy: io: read/write on closed pipe

Notes:

  • The issue occurs randomly, so it's not easily reproducible. Out of 10 tries, I hit the issue only twice.
  • Seen only while uploading to ghcr.io. Couldn't reproduce on Docker Hub or quay. (Again, this may be down to randomness.)
  • A similar issue has been reported in moby/buildkit#3347 and docker/build-push-action#761.

Steps to reproduce the issue

  1. sudo ctr content fetch --all-platforms container-registry.oracle.com/database/express:21.3.0-xe
  2. ctr images push ghcr.io/akhilerm/testing-gha:io-failure container-registry.oracle.com/database/express:21.3.0-xe
$ ctr images ls
REF                                                      TYPE                                                      DIGEST                                                                  SIZE      PLATFORMS                                                                       LABELS
container-registry.oracle.com/database/express:21.3.0-xe application/vnd.docker.distribution.manifest.v2+json      sha256:016d1a2becd9c9b9bfb683eebf3aa092527fe1354ace5b23691e75759f301bed 3.3 GiB   linux/amd64                                                                     -

Reproduced using this image, which is 3.3 GiB in size.

I will add more information to this issue if it can be reproduced more easily; currently, testing requires uploading the 3.3 GiB image, which takes a long time.
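Because the failure is intermittent (2 of 10 pushes in the notes above), a small loop that repeats the push and counts only the runs failing with this specific message can give a rough failure rate. The following is a minimal sketch in POSIX sh; the helper name `count_closed_pipe_failures` is hypothetical and not part of `ctr`, and the match relies only on the error string quoted in this report.

```shell
# Hypothetical helper (not part of ctr): run a command N times and print how
# many runs failed with the "closed pipe" error string from this report.
count_closed_pipe_failures() {
    attempts=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Capture stdout and stderr together; the exit status alone does not
        # say which error occurred.
        if ! output=$("$@" 2>&1); then
            case $output in
                *"io: read/write on closed pipe"*) failures=$((failures + 1)) ;;
            esac
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example, using the refs from the reproduction steps above:
#   count_closed_pipe_failures 10 ctr images push \
#       ghcr.io/akhilerm/testing-gha:io-failure \
#       container-registry.oracle.com/database/express:21.3.0-xe
```

Each attempt uploads the full image, so a run of 10 attempts is slow; the count is only an estimate of how often the error appears.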

Describe the results you received and expected

Expected the image to be pushed to the registry successfully; instead, the push intermittently fails with the error above.

What version of containerd are you using?

412ca496dc8fc44394d96f9b1bfb7c8c9b70f951

Any other relevant information

$ runc --version
runc version 1.1.4
commit: v1.1.4-0-g5fd4c4d
spec: 1.0.2-dev
go: go1.18.9
libseccomp: 2.5.3

$ uname -a
Linux ams-hz-ubu-055 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Show configuration if it is related to CRI plugin.

$ cat /etc/containerd/config.toml
#   Copyright 2018-2022 Docker Inc.

#   Licensed under the Apache License, Version 2.0 (the "License");
#   you may not use this file except in compliance with the License.
#   You may obtain a copy of the License at

#       http://www.apache.org/licenses/LICENSE-2.0

#   Unless required by applicable law or agreed to in writing, software
#   distributed under the License is distributed on an "AS IS" BASIS,
#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#   See the License for the specific language governing permissions and
#   limitations under the License.

#disabled_plugins = ["cri"]

#root = "/var/lib/containerd"
#state = "/run/containerd"
#subreaper = true
#oom_score = 0

#[grpc]
#  address = "/run/containerd/containerd.sock"
#  uid = 0
#  gid = 0

#[debug]
#  address = "/run/containerd/debug.sock"
#  uid = 0
#  gid = 0
#  level = "info"

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 24
  • Comments: 20 (4 by maintainers)

Most upvoted comments

#7985 should resolve the issues entirely.

The fix is client-side; no registry-side changes on GHCR will be necessary.

We are facing the same error too:

#22 ERROR: failed to push ghcr.io/atlanhq/atlas-master:latest: failed to copy: io: read/write on closed pipe
------
 > exporting to image:
------
ERROR: failed to solve: failed to push ghcr.io/atlanhq/atlas-master:latest: failed to copy: io: read/write on closed pipe
Error: buildx failed with: ERROR: failed to solve: failed to push ghcr.io/atlanhq/atlas-master:latest: failed to copy: io: read/write on closed pipe

Tried retrying the job, but we still get the same error.

I’m using Docker Setup Buildx and having this issue too. But I don’t think it’s changed buildkit versions since October. Could it be something else?

No, BuildKit 0.10.6 uses containerd v1.6.3, and the https://github.com/containerd/containerd/pull/6995 change first appears in containerd v1.6.9. BuildKit 0.11 uses containerd v1.6.14, which contains this change.

Had this issue a couple of times today. Reran 3 times and it finally succeeded.
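The rerun-until-it-succeeds workaround described in the comments can be scripted. Below is a minimal sketch in POSIX sh, assuming a plain retry is acceptable as a stopgap until a containerd build with the fix is in use; the function name `retry` and the `RETRY_DELAY` variable are hypothetical, not part of `ctr` or BuildKit.

```shell
# Hypothetical wrapper: retry a command up to MAX_ATTEMPTS times.
# RETRY_DELAY (seconds between attempts, default 5) is an assumed knob.
retry() {
    max_attempts=$1
    shift
    attempt=1
    while :; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-5}"
    done
}

# Example, using the refs from the reproduction steps in this report:
#   retry 3 ctr images push \
#       ghcr.io/akhilerm/testing-gha:io-failure \
#       container-registry.oracle.com/database/express:21.3.0-xe
```

This retries on any non-zero exit status, not only on the closed-pipe error, so a persistent failure (for example bad credentials) simply costs the extra attempts before the wrapper gives up.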