buildkit: panic: failed to get edge

I’m not sure which buildctl command triggered the panic. We are building a few different images in parallel. The general command format is:

buildctl build \
  --progress=plain \
  --frontend=dockerfile.v0 \
  --local context="." \
  --local dockerfile="." \
  --opt target="production" \
  --opt build-arg:GIT_BRANCH="$BUILDKITE_BRANCH" \
  --opt build-arg:GIT_COMMIT_SHA="$BUILDKITE_COMMIT" \
  --opt build-arg:GIT_COMMIT_MESSAGE="$BUILDKITE_MESSAGE" \
  --import-cache ref=zzzzz/cache:production-zzzz-master,type=registry \
  --import-cache ref=zzzz/cache:production-zzzz-master,type=registry \
  --export-cache ref=zzzz/cache:production-zzzz-master,mode=max,type=registry \
  --output "name=zzzz/zzzz:production-bc6e10840eb7d7652c2cf120199d15e3623f65ba-amd64","push=true","type=image"
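
For context, the builds are launched from the CI step roughly like this; the extra targets, tags, and registry refs below are placeholders rather than the real pipeline:

#!/usr/bin/env bash
# Hypothetical sketch of how several images end up building in parallel
# against the same buildkitd; names and targets are placeholders.
set -euo pipefail

build_image() {
  local target="$1" tag="$2"
  buildctl build \
    --progress=plain \
    --frontend=dockerfile.v0 \
    --local context="." \
    --local dockerfile="." \
    --opt target="$target" \
    --import-cache ref=zzzz/cache:"$target"-master,type=registry \
    --export-cache ref=zzzz/cache:"$target"-master,mode=max,type=registry \
    --output name=zzzz/zzzz:"$tag",push=true,type=image
}

# several images are built concurrently in one CI step
build_image production "production-${BUILDKITE_COMMIT}-amd64" &
build_image worker "worker-${BUILDKITE_COMMIT}-amd64" &
build_image migrations "migrations-${BUILDKITE_COMMIT}-amd64" &
wait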

Version:

buildkitd --version
buildkitd github.com/moby/buildkit v0.9.0 c8bb937807d405d92be91f06ce2629e6202ac7a9

Config:

cat /etc/buildkit/buildkitd.toml 
[worker.oci]
  max-parallelism = 3

[worker.containerd]
  max-parallelism = 3

Error:

panic: failed to get edge

goroutine 81 [running]:
github.com/moby/buildkit/solver.(*pipeFactory).NewInputRequest(0x4004a17f00, 0x0, 0x1095e98, 0x4127e9ce40, 0x4081d9c180, 0x41259347a8, 0x1)
        /src/solver/scheduler.go:354 +0x1e0
github.com/moby/buildkit/solver.(*edge).createInputRequests(0x4115bec280, 0x2, 0x4004a17f00, 0x0, 0x18c91d0)
        /src/solver/edge.go:809 +0x280
github.com/moby/buildkit/solver.(*edge).unpark(0x4115bec280, 0x411272a020, 0x1, 0x1, 0x4004a17e30, 0x0, 0x0, 0x18c91d0, 0x0, 0x0, ...)
        /src/solver/edge.go:360 +0x12c
github.com/moby/buildkit/solver.(*scheduler).dispatch(0x400024f030, 0x4115bec280)
        /src/solver/scheduler.go:136 +0x3a0
github.com/moby/buildkit/solver.(*scheduler).loop(0x400024f030)
        /src/solver/scheduler.go:104 +0x1ac
created by github.com/moby/buildkit/solver.newScheduler
        /src/solver/scheduler.go:35 +0x1a0

About this issue

  • State: open
  • Created 3 years ago
  • Comments: 23 (11 by maintainers)

Most upvoted comments

Thank you @coryb, that reproduction was super helpful (we’ve been trying to track this down for a while). I’ve sent a PR here that might fix it: #3953

Ran into this today on v0.9.2. Some info about what we do:

  • don’t use the --no-cache flag
  • build about 40 images in parallel on a big box
  • new box is spun up for each build and then terminated after
  • use registry cache via a local registry container pointed at S3 (roughly the setup sketched after this list)
  • multiple boxes can be running builds for different branches, each using the same cache
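
The "local registry container pointed at S3" part is roughly this kind of setup (a sketch; bucket, region, and credentials are placeholders):

# local registry backed by S3, used for buildkit cache import/export (sketch)
docker run -d --name cache-registry -p 5000:5000 \
  -e REGISTRY_STORAGE=s3 \
  -e REGISTRY_STORAGE_S3_REGION=us-east-1 \
  -e REGISTRY_STORAGE_S3_BUCKET=example-buildkit-cache \
  -e REGISTRY_STORAGE_S3_ACCESSKEY="$AWS_ACCESS_KEY_ID" \
  -e REGISTRY_STORAGE_S3_SECRETKEY="$AWS_SECRET_ACCESS_KEY" \
  registry:2

# builds then point --import-cache/--export-cache at it, e.g.
#   --import-cache ref=localhost:5000/cache:production-master,type=registry
#   --export-cache ref=localhost:5000/cache:production-master,mode=max,type=registry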

It doesn’t usually happen, but today it happened on two separate occasions, for about 5 of the images each time. On both occasions it was these two logs together:

importing cache manifest from <image name redacted>
error: failed to solve: failed to get edge: inconsistent graph state

We have 3 retries for each image; about 4 of the images succeeded on one of the retries, while one image failed all 3 retries.
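
The per-image retry is roughly along these lines (simplified sketch; the real flags are passed through and names are placeholders):

# retry a single image build up to 3 times (sketch)
retry_build() {
  local attempt
  for attempt in 1 2 3; do
    if buildctl build --progress=plain --frontend=dockerfile.v0 \
         --local context="." --local dockerfile="." "$@"; then
      return 0
    fi
    echo "build failed (attempt ${attempt}/3)" >&2
  done
  return 1
}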