kaniko: Long builds fail with "UNAUTHORIZED: \"authentication required\""

steps to reproduce

mkdir -p ./kaniko-issue

cat > ./kaniko-issue/Dockerfile <<EOF
FROM debian:stable-slim
RUN sleep 1
EOF
# This will work as expected
docker run -v `pwd`/kaniko-issue/:/workspace \
    gcr.io/kaniko-project/executor:latest \
    -c /workspace -f Dockerfile \
    -d registry.example.com/does/not:matter \
    --tarPath /workspace/tarball.tar

cat > ./kaniko-issue/Dockerfile <<EOF
FROM debian:stable-slim
RUN sleep 360
EOF
# This will fail with "UNAUTHORIZED: \"authentication required\""
docker run -v `pwd`/kaniko-issue/:/workspace \
    gcr.io/kaniko-project/executor:latest \
    -c /workspace -f Dockerfile \
    -d registry.example.com/does/not:matter \
    --tarPath /workspace/tarball.tar

# rm -r ./kaniko-issue

additional observations

I also ran tcpdump on the container's network interface. I saw quite a bit of traffic at the start (I assume this was the image pull) and a single, short TLS connection to index.docker.io after the sleep finished.
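For reference, the capture was roughly the following; the docker0 bridge and the port-443 filter are specific to my setup, so adjust as needed.

# Watch the container's registry traffic on the default Docker bridge,
# limited to TLS connections (the registry API is HTTPS).
sudo tcpdump -i docker0 -n 'tcp port 443'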

The issue seems to be gone (or at least takes substantially longer to arise) if I substitute debian:stable-slim with any image from my Harbor (private Docker registry) instance.

working theory

Based on those two observations, my working theory is that kaniko tries to fetch the base image's config using an expired bearer token. This config would normally be extended and included in the tarball or pushed to the registry.
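As a sanity check on the token lifetime, Docker Hub's auth service reports how long a pull token stays valid. A small sketch, assuming jq is installed and using the scope for an anonymous pull of library/debian:

# Request an anonymous pull token for library/debian and print its lifetime;
# at the time of writing this prints 300 (seconds).
curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/debian:pull" \
    | jq '.expires_in'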

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 7
  • Comments: 45 (2 by maintainers)

Most upvoted comments

I’m impacted by this as well.

Thanks @tzununbekov for the pointer. I've debugged this further; it seems to be caused by https://github.com/google/go-containerregistry not refreshing expired Bearer tokens. Bearer tokens expire after a fixed duration (300 s for docker.io). If a stage takes longer than that, saving the stage (which involves fetching the compressed source image layers) fails.
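The failure can be reproduced outside kaniko with a stale token. A minimal sketch, assuming jq is available and using the same 360 s wait as the Dockerfile above:

# Fetch a pull token, wait past its 300 s lifetime, then reuse it;
# the manifest request comes back 401 instead of 200.
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/debian:pull" | jq -r '.token')
sleep 360
curl -s -o /dev/null -w '%{http_code}\n' \
    -H "Authorization: Bearer $TOKEN" \
    -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
    https://registry-1.docker.io/v2/library/debian/manifests/stable-slim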

I’ve created a PR that fixes this: https://github.com/google/go-containerregistry/pull/283.

Works for me, thanks everyone!

@akhmerov Replace /root/.docker with /kaniko/.docker in your .gitlab-ci.yml; this issue has nothing to do with yours.
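For anyone hitting the same thing, a minimal sketch of what that change amounts to inside the kaniko image, using GitLab's predefined CI variables (the paths and variable names are the usual GitLab setup, not anything specific to this issue):

# Write registry credentials where kaniko reads them:
# /kaniko/.docker/config.json, not /root/.docker/config.json.
mkdir -p /kaniko/.docker
cat > /kaniko/.docker/config.json <<EOF
{"auths":{"$CI_REGISTRY":{"username":"$CI_REGISTRY_USER","password":"$CI_REGISTRY_PASSWORD"}}}
EOF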

I can confirm the same. Good job guys!

@pieterlange it looks like the fix to the underlying issue is being reviewed now in go-containerregistry.

Once that’s merged we can get the dependency updated here fairly easily.

Awesome, I’m going to go ahead and close this issue since it seems like #388 fixed it. If anyone experiences this again please comment on this thread or open another issue!

I can confirm this fixed it for me!

@yurrriq make sure the token doesn't expire server-side either. GitLab CI's expiry also happens to be 5 minutes, but you can easily raise that in the admin console.
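If you would rather script that than click through the admin UI, the setting appears to be exposed through GitLab's application settings API; the attribute name container_registry_token_expire_delay (value in minutes), the host, and the admin token below are my assumptions, not taken from this issue.

# Raise the container registry authorization token lifetime to 30 minutes.
curl -s --request PUT \
    --header "PRIVATE-TOKEN: $GITLAB_ADMIN_TOKEN" \
    "https://gitlab.example.com/api/v4/application/settings?container_registry_token_expire_delay=30"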

#388 fixed my long builds.

Edit: It looks like I spoke too soon…

error building image: getting stage builder for stage 1: no token in bearer response:
{"errors":[{"code":"DENIED","message":"access forbidden"}],"http_status":403}

Update: Extending the authorization token duration in GitLab's container registry settings, per @pieterlange's suggestion, fixed it. Thanks, everyone!

@tstromberg can you assign this issue to someone else so this fix gets shipped without undue delay? This makes kaniko unusable for building containers that take >5m to build.

@AndreasBieber, perhaps you need to replace /root/ with /kaniko/, as described in this potentially relevant GitLab issue.

yup, it should be in gcr.io/kaniko-project/executor:latest