docker-node: [BUG - CrashLoopBackOff] Node 16.15.0 -> 16.15.1 AWS Kubernetes pod startup

Environment

  • Platform: AWS
  • Docker Version: tried 20.10.16 and 20.10.11 (macOS)
  • Node.js Version: 16.15.1
  • Image Tag: 16.15-alpine3.15

Expected Behavior

Successful startup of a pod that uses this image as its base image. (A cached 16.15.0 image worked without problems.)

Current Behavior

The pod crashes with CrashLoopBackOff and no log messages. We only have the following:

    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    254
      Started:      Tue, 14 Jun 2022 11:07:23 +0200
      Finished:     Tue, 14 Jun 2022 11:07:24 +0200
    Ready:          False
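
When the pod description shows only an exit code, two kubectl queries can sometimes surface more detail. The following is a minimal sketch, not something taken from the report: the function names are hypothetical, the pod is assumed to be in the current namespace (add -n NAMESPACE otherwise), and the crashing container is assumed to be the first one in the pod. In this issue the logs were reported as empty, so the second query may print nothing.

```shell
# Print the exit code of the last terminated run of the pod's first container.
# $1: pod name
last_exit_code() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
}

# Print whatever the previous (crashed) container instance wrote before exiting.
# $1: pod name
previous_logs() {
  kubectl logs "$1" --previous
}
```

Usage would look like `last_exit_code my-pod` followed by `previous_logs my-pod`, where my-pod is a placeholder name.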

Possible Solution

Roll back that change?

Steps to Reproduce

  • Build a simple Docker image with this as the base image.
  • Deploy it to an AWS cluster.
  • The pod crashes (the older version 16.15.0 works).

Additional Information

We built the Docker image on macOS, Linux, and Windows. The result is the same everywhere: the old version runs, the new version fails.

EDIT: Where it all began: https://github.com/nodejs/docker-node/commit/194a775693fd40598a1bafd4858e063c24efeb42

About this issue

  • State: open
  • Created 2 years ago
  • Reactions: 4
  • Comments: 21 (2 by maintainers)

Most upvoted comments

In our case, we managed to fix this issue. We were using a multi-stage Docker build (install, builder, distribution) based on node:16.14-alpine3.15. To address security vulnerabilities (CVE-2022-2097, CVE-2022-29458), we had to update to node:16.16-alpine3.15.

The fix was to explicitly install npm 8.5.0, a downgrade, in the distribution image in our Dockerfile. Every npm version above 8.5.0 that we tried either reproduced the issue or surfaced other problems.

We therefore pin npm to exactly 8.5.0:

RUN npm install -g npm@8.5.0 --save-exact

Our Dockerfile previously looked like this:

# INSTALL CONTAINER
FROM node:16.16-alpine3.15 AS install
...

# BUILDER CONTAINER
FROM node:16.16-alpine3.15 AS builder
...

# RUNTIME CONTAINER
FROM node:16.16-alpine3.15 AS distribution
...

We then changed it to pin npm to 8.5.0 in every stage. Our Dockerfile now looks like this, and the issue is fixed:

# INSTALL CONTAINER
FROM node:16.16-alpine3.15 AS install
RUN npm install -g npm@8.5.0 --save-exact
...

# BUILDER CONTAINER
FROM node:16.16-alpine3.15 AS builder
RUN npm install -g npm@8.5.0 --save-exact
...

# RUNTIME CONTAINER
FROM node:16.16-alpine3.15 AS distribution
RUN npm install -g npm@8.5.0 --save-exact
...

This approach has resolved the problem for now. I hope it helps others fix theirs, but we understand that it is only a workaround, and we hope the node image will ship a properly working version of npm.
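
One way to confirm that a pin like this took effect is to ask the built image which npm it ships. This is a minimal sketch under assumptions: the function names and the image tag my-app:latest are hypothetical, and the image is assumed to be available to the local Docker daemon.

```shell
# Print the npm version bundled in a given image.
# $1: image tag
npm_version_in_image() {
  docker run --rm "$1" npm --version
}

# Succeed only if the image ships exactly the expected npm version.
# $1: image tag, $2: expected npm version
check_npm_pin() {
  actual=$(npm_version_in_image "$1") || return 1
  [ "$actual" = "$2" ]
}
```

For example, `check_npm_pin my-app:latest 8.5.0 && echo pinned` prints pinned only when the versions match. If npm itself fails inside the image, the docker run call returns a non-zero status and the check fails as well.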

Same here.

There was no issue in Node v16.15.0 with npm 8.5.5. In v16.15.1 with npm 8.11.0, the Pods crash with a CrashLoopBackOff error and no logs, only exit code 254.

Thanks to @propattern for the workaround; it has also worked in our environments.

However, upon further investigation we managed to fix the problem without downgrading npm, by changing our Dockerfile to run as the node user provided by the Docker image. More documentation on this, including other use cases, is here: https://github.com/nodejs/docker-node/blob/main/docs/BestPractices.md#non-root-user

TL;DR: Use the provided node user

FROM node:16-alpine
# ...
# ...
# At the end, set the user to use when running this image
USER node
CMD ["node", "src/start.js"]
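
To check which user a container actually runs as, something like the following could be used. It is a sketch with hypothetical function names and placeholder image and pod names. Note the assumption behind the second helper: a Kubernetes securityContext can override the image's USER, so the user inside the cluster may differ from the image default.

```shell
# Print the name of the default user of an image (expected: node after USER node).
# $1: image tag
image_user() {
  docker run --rm "$1" id -un
}

# Print the uid/gid a running pod's container is using.
# Plain `id` is used because the uid may have no passwd entry in the image.
# $1: pod name
pod_user() {
  kubectl exec "$1" -- id
}
```

Typical calls would be `image_user my-app:latest` and `pod_user my-pod`. The second one only works while the container is up, so it may not be usable on a pod that exits immediately.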

Looks like this problem is still occurring on the latest 16.16.0-alpine tag.

I can confirm it’s not a CMD issue either. I pulled the image locally and it runs npm start normally. I suspect npm v8.11.0 causes the issue. We noticed ERESOLVE issues similar to those in the thread "Node v16.15.1 (npm v8.11.0) breaks some builds" and in npm/cli#4998. Although we resolved the peerDependencies issues and built the image, the container still exits with code 254 and no output.

@LaurentGoderre I have the same issue: npm is not functional at all in the latest version. It is not related to the working directory, since even npm -v returns only a newline and a non-zero exit code (I believe it was 248 for me, though). I found this issue while searching for a solution, and downgrading helped. It is not related to the command or entrypoint, since it is reproducible from the command line when you ‘terminal into’ the pod. Node works fine, and everything else seems to work fine. It also does not seem to be a permission issue, since I made sure the project directory and /tmp have the same owner as the user I was running commands as. The exact same image works just fine under the same non-root user on my local Docker (node with uid and gid changed to 999, and also under the default 1000 uid/gid).
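
The npm -v check described above can be captured in a small helper that records both the exit status and whether anything was printed. This is a sketch only; the function name probe is hypothetical, and it works for any command, not just npm.

```shell
# Run a command, then report its exit status and whether it printed anything.
# Command substitution strips trailing newlines, so a lone newline counts as empty.
# The body runs in a subshell, so the variables do not leak into the caller.
probe() (
  if out=$("$@" 2>&1); then
    status=0
  else
    status=$?
  fi
  if [ -z "$out" ]; then
    echo "exit=$status output=empty"
  else
    echo "exit=$status output=present"
  fi
)
```

Inside the affected container, `probe npm -v` would then show the failing status on one line, while `probe node -v` gives a baseline for a command that works.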