buildx: Using gha cache with `mode=max` fails with 400 error
Hello, I have recently started using `cache-to "type=gha,mode=max,scope=..."` to cache all the layers, and the following error persists only for specific builds (consistently failing on the same ones). After removing `mode=max` the issue went away, but obviously not everything is cached.
Failing builds with mode=max:
- https://github.com/hertzg/rtl_433_docker/actions/runs/1455213857
- https://github.com/hertzg/rtl_433_docker/actions/runs/1451049139
- https://github.com/hertzg/rtl_433_docker/actions/runs/1446674498
Passing build after removing mode=max:

The error from the failing builds:
#130 exporting cache
#130 preparing build cache for export
#130 preparing build cache for export 121.0s done
#130 writing layer sha256:06be072867b08bb9aef2e469533d0d0d6a85f97c2aabcaf5103d78c217977918
#130 writing layer sha256:06be072867b08bb9aef2e469533d0d0d6a85f97c2aabcaf5103d78c217977918 0.1s done
#130 writing layer sha256:08980434c63c0751ea69086b9c8282a5b2784f78643934cf62036ae07e94b943
#130 writing layer sha256:08980434c63c0751ea69086b9c8282a5b2784f78643934cf62036ae07e94b943 0.1s done
...
redacted to keep it short
...
#130 writing layer sha256:d91065fb02477295b9823f300711ac850313d3b6a81a6ca1f8d214f8700b8b2e
#130 writing layer sha256:d91065fb02477295b9823f300711ac850313d3b6a81a6ca1f8d214f8700b8b2e 4.0s done
#130 ERROR: error writing layer blob: error committing cache 37887: failed to parse error response 400: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request</h2>
<hr><p>HTTP Error 400. The request is badly formed.</p>
</BODY></HTML>
: invalid character '<' looking for beginning of value
------
> exporting cache:
------
error: failed to solve: error writing layer blob: error committing cache 37887: failed to parse error response 400: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request</h2>
<hr><p>HTTP Error 400. The request is badly formed.</p>
</BODY></HTML>
: invalid character '<' looking for beginning of value
Error: buildx failed with: : invalid character '<' looking for beginning of value
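For context, the trailing `invalid character '<' looking for beginning of value` message is what Go's `encoding/json` reports when it is asked to decode something that is not JSON; here the cache client appears to have received GitHub's HTML 400 page and tried to parse it as a JSON error response. A minimal illustration (not code from buildx or go-actions-cache):

```go
// Reproduces the trailing JSON error seen in the log above: decoding an HTML
// error page with encoding/json fails on the leading '<'.
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	htmlBody := []byte(`<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">`)

	var errResp struct {
		Message string `json:"message"` // hypothetical error-response shape
	}
	err := json.Unmarshal(htmlBody, &errResp)
	fmt.Println(err) // invalid character '<' looking for beginning of value
}
```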
About this issue
- State: closed
- Created 3 years ago
- Comments: 39 (12 by maintainers)
Commits related to this issue
- Disable Docker cache mode=max for some layers in GitHub Actions to avoid timeouts. May be related to https://github.com/docker/buildx/issues/841 — committed to brefphp/aws-lambda-layers by mnapoli a year ago
- Disable Docker cache mode=max entirely in GitHub Actions to avoid timeouts. May be related to https://github.com/docker/buildx/issues/841 and https://github.com/moby/buildkit/issues/2804 — committed to brefphp/aws-lambda-layers by mnapoli a year ago
- Disable Docker cache for some images to avoid timeouts. May be related to https://github.com/docker/buildx/issues/841 and https://github.com/moby/buildkit/issues/2804 — committed to brefphp/aws-lambda-layers by mnapoli a year ago
- Disable Docker cache mode=max for some layers in GitHub Actions to avoid timeouts. May be related to https://github.com/docker/buildx/issues/841 — committed to crashdev226/aws-lambda-layers by crashdev226 a year ago
- Disable Docker cache mode=max entirely in GitHub Actions to avoid timeouts. May be related to https://github.com/docker/buildx/issues/841 and https://github.com/moby/buildkit/issues/2804 — committed to crashdev226/aws-lambda-layers by crashdev226 a year ago
- Disable Docker cache for some images to avoid timeouts. May be related to https://github.com/docker/buildx/issues/841 and https://github.com/moby/buildkit/issues/2804 — committed to crashdev226/aws-lambda-layers by crashdev226 a year ago
If you’re using the cache toolkit module, then reads and writes are handled in separate calls (`restoreCache` vs `saveCache`). But `restoreCache` checks if the cache exists and downloads the content. Inside `restoreCache`, it makes a call to `cacheHttpClient.getCacheEntry` that can be used to just check if the record exists. Although this is getting into some of the internal implementation, perhaps we need to add another top-level function alongside `restoreCache` and `saveCache` that just checks if the cache exists. This `GET` operation also has a higher rate limit (900), although now that the seal/reserve limit was increased (300 -> 700), it’s not much higher 😄 @t-dedah can we please look into increasing this limit?

https://github.com/tonistiigi/go-actions-cache/pull/12
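For illustration, here is a rough Go sketch of the kind of existence-only check being discussed, along the lines of what a client such as go-actions-cache could call. The endpoint path, API version header, and environment variables mirror what the @actions/cache toolkit uses internally, but treat all of them as assumptions rather than a documented public API:

```go
// Sketch of an existence-only lookup against the Actions cache service:
// check whether a key resolves to an entry without reserving a key or
// downloading the archive. Not a documented public API; details assumed.
package ghacache

import (
	"context"
	"fmt"
	"net/http"
	"net/url"
	"os"
	"strings"
)

// CacheExists reports whether any of the given keys resolve to a cache entry
// for the given version.
func CacheExists(ctx context.Context, keys []string, version string) (bool, error) {
	base := os.Getenv("ACTIONS_CACHE_URL")      // injected into the runner env, conventionally ends with "/"
	token := os.Getenv("ACTIONS_RUNTIME_TOKEN") // scoped token, also injected by the runner

	q := url.Values{}
	q.Set("keys", strings.Join(keys, ","))
	q.Set("version", version)

	req, err := http.NewRequestWithContext(ctx, http.MethodGet,
		base+"_apis/artifactcache/cache?"+q.Encode(), nil) // path assumed from toolkit internals
	if err != nil {
		return false, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Accept", "application/json;api-version=6.0-preview.1")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusOK:
		return true, nil // an entry exists for one of the keys
	case http.StatusNoContent:
		return false, nil // no matching entry
	default:
		return false, fmt.Errorf("unexpected status %d from cache service", resp.StatusCode)
	}
}
```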
@t-dedah @dhadka I’ll try to give some more background on where I think our requests come from.
Let’s say the user is doing a build that touches, for example, 50 blobs/layers in total (with the GitHub cache, users usually export all intermediate layers as well, not only the final result layers, so this can grow fast). If that build is mostly cached and, say, only 1 layer was updated, we will create a new “manifest” blob with links to these 50 layers. We will push the manifest and the new layer, but most importantly we still need to make a request for each of the old 49 blobs just to check that they still exist. We make a request for all of them, GitHub answers that the record exists, and we can continue. This needs to happen on every build, and some repositories make a lot of builds in a single workflow, or very complex (multi-platform) builds with lots of layers. Even if many builds mostly share the same layers, we need to check them all on each build.
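Schematically, the request pattern described above looks like the sketch below. This is not the actual buildkit exporter code; the helper names (`keyExists`, `uploadBlob`, `uploadManifest`, `cacheKeyFor`) are illustrative stand-ins:

```go
// Illustrates why request count scales with the total number of layers, not
// with the number of changed layers: every blob referenced by the new
// manifest must be confirmed to still exist in the cache backend.
package ghacache

import "context"

// Hypothetical helpers standing in for the real upload / existence-check
// calls (names are illustrative, not buildkit's actual API).
func uploadBlob(ctx context.Context, digest string) error       { return nil }
func uploadManifest(ctx context.Context, layers []string) error { return nil }
func keyExists(ctx context.Context, key string) (bool, error)   { return true, nil }
func cacheKeyFor(digest string) string                          { return "layer-" + digest }

// ExportCache walks every layer referenced by the manifest being exported.
func ExportCache(ctx context.Context, layers []string, changed map[string]bool) error {
	for _, digest := range layers { // e.g. 50 layers in total
		if changed[digest] {
			if err := uploadBlob(ctx, digest); err != nil { // the 1 updated layer
				return err
			}
			continue
		}
		// Each of the remaining 49 unchanged layers still costs one
		// rate-limited API call just to confirm the record exists.
		ok, err := keyExists(ctx, cacheKeyFor(digest))
		if err != nil {
			return err
		}
		if !ok {
			if err := uploadBlob(ctx, digest); err != nil {
				return err
			}
		}
	}
	// Finally push the new manifest that links all 50 layers together.
	return uploadManifest(ctx, layers)
}
```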
If there were an endpoint we could use just to check that a cache key exists, one that is not rate-limited (or has a much higher limit), it would be much less likely that we hit these limits. That endpoint would not need to provide a download link or reserve a key like the current requests do. It could also be a batch endpoint that checks multiple keys together. Maybe even just an endpoint to list all current keys would be manageable, as keys are small and should not take much room even if there are a lot of them.
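Purely to make that ask concrete, a client-side sketch of what such a batch existence check might look like; no endpoint like this exists in the Actions cache API today, so the path and payload below are invented:

```go
// Hypothetical batch existence check: one request reports which of the given
// cache keys currently exist. Endpoint and payload are invented to illustrate
// the proposal above.
package ghacache

import (
	"bytes"
	"context"
	"encoding/json"
	"net/http"
)

type existsBatchRequest struct {
	Keys []string `json:"keys"`
}

type existsBatchResponse struct {
	Exists map[string]bool `json:"exists"` // key -> currently present?
}

// ExistsBatch posts many keys at once and returns which ones the cache
// service still knows about, replacing one call per blob with one call total.
func ExistsBatch(ctx context.Context, baseURL, token string, keys []string) (map[string]bool, error) {
	payload, err := json.Marshal(existsBatchRequest{Keys: keys})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		baseURL+"_apis/artifactcache/caches/exists", bytes.NewReader(payload)) // invented path
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var out existsBatchResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return out.Exists, nil
}
```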
Another way would be for us to somehow remember that a cache key existed and not re-check it within some time window. But for that we would need some kind of external storage/database where we could keep these records.
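A minimal sketch of that idea, in-memory only, so it shows the mechanism but not the external storage shared across builds that the comment says would actually be needed:

```go
// seenKeys remembers when each cache key was last confirmed to exist, so the
// per-build existence checks can be skipped within a TTL. In-memory only:
// it helps a single process, not separate CI runs.
package ghacache

import (
	"sync"
	"time"
)

type seenKeys struct {
	mu   sync.Mutex
	ttl  time.Duration
	seen map[string]time.Time
}

func newSeenKeys(ttl time.Duration) *seenKeys {
	return &seenKeys{ttl: ttl, seen: make(map[string]time.Time)}
}

// shouldCheck reports whether key needs a fresh existence check against the
// cache service (never seen, or last confirmation older than the TTL).
func (s *seenKeys) shouldCheck(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	last, ok := s.seen[key]
	return !ok || time.Since(last) > s.ttl
}

// markExists records that key was just confirmed to exist.
func (s *seenKeys) markExists(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.seen[key] = time.Now()
}
```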
Looks like this is a private repository.
@hertzg apologies for the delay, but the good news is that we are rolling out a change to increase the rate limit threshold. This should allow the buildx action to cache layers with a higher level of parallelization. cc @t-dedah to keep you in the loop when the rollout is complete.
Would love to hear back on whether this helps in reducing the `429`s.

@hertzg we will be looking into relaxing the rate limit. But we need to carefully evaluate the extra load it will bring on the system. I will update you with an ETA.
@hertzg We will probably not do another patch release just for this unless something else comes up as well. You can use the master branch image (or pin it to a digest for safety). cc @crazy-max
@dhadka That could be the issue indeed. Thanks for the pointer. I’ll update that logic.