kaniko: Long builds fail with "UNAUTHORIZED: \"authentication required\""
steps to reproduce
mkdir -p ./kaniko-issue
cat > ./kaniko-issue/Dockerfile <<EOF
FROM debian:stable-slim
RUN sleep 1
EOF
# This will work as expected
docker run -v `pwd`/kaniko-issue/:/workspace \
gcr.io/kaniko-project/executor:latest \
-c /workspace -f Dockerfile \
-d registry.example.com/does/not:matter \
--tarPath /workspace/tarball.tar
cat > ./kaniko-issue/Dockerfile <<EOF
FROM debian:stable-slim
RUN sleep 360
EOF
# This will fail with "UNAUTHORIZED: \"authentication required\""
docker run -v `pwd`/kaniko-issue/:/workspace \
gcr.io/kaniko-project/executor:latest \
-c /workspace -f Dockerfile \
-d registry.example.com/does/not:matter \
--tarPath /workspace/tarball.tar
# rm -r ./kaniko-issue
additional observations
I also ran tcpdump on the network interface of the container. I saw quite a bit of traffic at the start (I assume from pulling the image) and a single, short TLS connection to index.docker.io after the sleep was done.
The issue seems to be gone (or at least takes substantially longer to arise) if I replace debian:stable-slim with any image from my Harbor (private Docker registry) instance.
working theory
My working theory of the underlying cause, based on those two observations, is that kaniko tries to fetch the image config of the base image using an expired bearer token. This config would normally be extended and included in the tarball or pushed to the registry.
About this issue
- State: closed
- Created 6 years ago
- Reactions: 7
- Comments: 45 (2 by maintainers)
Commits related to this issue
- try to reproduce #245 in CI — committed to JensGutermuth/kaniko by JensGutermuth 6 years ago
- try to reproduce #245 in CI — committed to JensGutermuth/kaniko by JensGutermuth 6 years ago
- Try a multi-stage build in https://github.com/GoogleContainerTools/kaniko/issues/245#issuecomment-410062791 @MnrGreg describes a multi-stage build failing in the same way. Maybe this can be reproduced... — committed to JensGutermuth/kaniko by JensGutermuth 6 years ago
- try to reproduce #245 in CI — committed to JensGutermuth/kaniko by JensGutermuth 6 years ago
- Try a multi-stage build in https://github.com/GoogleContainerTools/kaniko/issues/245#issuecomment-410062791 @MnrGreg describes a multi-stage build failing in the same way. Maybe this can be reproduced... — committed to JensGutermuth/kaniko by JensGutermuth 6 years ago
- fully qualify images in dockerfile test for #245 — committed to JensGutermuth/kaniko by JensGutermuth 6 years ago
- Stage source image retrieving before tarball save. Fixes GoogleContainerTools/kaniko#245 Signed-off-by: tzununbekov <t.zununbekov@gmail.com> — committed to tzununbekov/kaniko by tzununbekov 6 years ago
- Update go-containerregistry dependency #245 — committed to ianberinger/kaniko by ianberinger 6 years ago
I’m impacted by this as well.
Thanks @tzununbekov for the pointer. I’ve debugged this further: it seems to be caused by https://github.com/google/go-containerregistry not refreshing expired Bearer tokens. Bearer tokens expire after a fixed duration (300 s for docker.io). If a stage takes longer than that, saving the stage (which involves fetching the compressed source image layers) fails.
I’ve created a PR that fixes this: https://github.com/google/go-containerregistry/pull/283.
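To make the timing concrete, here is a minimal sketch of the expiry arithmetic. It assumes the token is fetched once, when the base image is pulled, and is never refreshed (the behaviour described above), and it uses the 300 s lifetime reported for docker.io. The function name token_valid_after and its arguments are hypothetical and only for illustration; this is not kaniko or go-containerregistry code.

```shell
#!/bin/sh
# Hypothetical helper: is a bearer token issued at t=0 still valid once a
# build stage that takes $1 seconds has finished?
# $2 is the token lifetime in seconds (300 for docker.io, per the comment above).
# Assumption: the token is never refreshed during the build.
token_valid_after() {
  stage_seconds=$1
  token_lifetime=$2
  if [ "$stage_seconds" -lt "$token_lifetime" ]; then
    echo "valid"
  else
    echo "expired"
  fi
}

token_valid_after 1 300     # the "RUN sleep 1" build: prints "valid"
token_valid_after 360 300   # the "RUN sleep 360" build: prints "expired"
```

In practice the clock starts when the token is issued during the pull, so the real threshold is slightly below the stage duration alone, but the two builds from the report fall clearly on either side of it.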
Works for me, thanks everyone!
@akhmerov Replace /root/.docker with /kaniko/.docker in .gitlab-ci.yml; this issue has nothing to do with your issue.
I can confirm the same. Good job guys!
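For anyone hitting the same thing as @akhmerov, the path change can be sketched as a small shell function for the script part of a GitLab CI job. It assumes GitLab's predefined CI_REGISTRY, CI_REGISTRY_USER and CI_REGISTRY_PASSWORD variables. The function name write_kaniko_auth and the DOCKER_CONFIG_DIR override are hypothetical names introduced here so the snippet can be tried outside CI.

```shell
#!/bin/sh
# Write registry credentials to /kaniko/.docker/config.json instead of
# /root/.docker/config.json, as suggested above.
# DOCKER_CONFIG_DIR is a hypothetical override used only for local testing.
# Note: values containing double quotes or backslashes would need JSON escaping.
write_kaniko_auth() {
  dir="${DOCKER_CONFIG_DIR:-/kaniko/.docker}"
  mkdir -p "$dir"
  printf '{"auths":{"%s":{"username":"%s","password":"%s"}}}\n' \
    "$CI_REGISTRY" "$CI_REGISTRY_USER" "$CI_REGISTRY_PASSWORD" \
    > "$dir/config.json"
}
```

In a job, write_kaniko_auth would be called before invoking /kaniko/executor, so the credentials are in place when the build starts.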
@pieterlange it looks like the fix to the underlying issue is being reviewed now in go-containerregistry.
Once that’s merged we can get the dependency updated here fairly easily.
Awesome, I’m going to go ahead and close this issue since it seems like #388 fixed it. If anyone experiences this again please comment on this thread or open another issue!
I can confirm this fixed it for me!
@yurrriq make sure the token doesn’t expire server-side either. GitLab CI’s expiry also happens to be 5 minutes, but you can easily raise that in the admin console.
#388 fixed my long builds.
Edit: It looks like I spoke too soon…
Update: Extending the authorization token duration in GitLab’s container registry settings, per @pieterlange’s suggestion, fixed it. Thanks, everyone!
@tstromberg can you assign this issue to someone else so this fix gets shipped without undue delay? This makes kaniko unusable for building containers that take >5m to build.
@AndreasBieber, perhaps you need to replace /root/ with /kaniko/, as described on this potentially relevant GitLab issue.
yup, it should be in gcr.io/kaniko-project/executor:latest