nodejs.org: SLOW OR FAILED (500 ERROR) NODE.JS DOWNLOADS

Edited by the Node.js Website Team

⚠️ PLEASE AVOID CREATING DUPLICATE ISSUES

Learn more about this incident at https://nodejs.org/en/blog/announcements/node-js-march-17-incident

tl;dr: The Node.js website team is aware of ongoing issues with intermittent download instability.

More Details: https://github.com/nodejs/build/issues/1993#issuecomment-1699231827

Original Issue Below


  • URL: https://nodejs.org/dist/latest-v16.x/node-v16.14.1-linux-x64.tar.xz (or basically any specific file, as opposed to just browsing the dirs, which mostly works)
  • Browser version: Firefox 98.0.0, Firefox Nightly 100.0a1 (2022-03-15), curl 7.47.0 (Travis CI), curl 7.68.0 (my local machine) (Doesn’t feel like a browser issue, feels like a server issue…)
  • Operating system: Ubuntu 20.04 / Linux Mint 20

When trying to get files off of nodejs.org/dist/... or nodejs.org/download/..., I get a server error.

“500 Internal Server Error”

(error page served by nginx)

Full error page (HTML snippet):
<html>
<head><title>500 Internal Server Error</title></head>
<body bgcolor="white">
<center><h1>500 Internal Server Error</h1></center>
<hr><center>nginx</center>
</body>
</html>

Browsing around the dirs, like https://nodejs.org/dist/latest-v16.x/, seems to work. Also, downloading really small files such as https://nodejs.org/dist/latest-v16.x/SHASUMS256.txt seems to work sporadically, whereas downloading tarballs doesn’t seem to work.

Given that the outage seems sporadic: maybe it’s a resource-exhaustion issue on the server side? Running out of RAM or something? I don’t know.

Edit to add: The error page seems to be served by Cloudflare, according to the server: cloudflare response header visible in the browser dev tools. So I guess this is a Cloudflare issue? Actually, that’s probably not what that header means: Cloudflare proxies all of nodejs.org, so it may just be relaying an error from the origin.
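One way to check which layer answered a failing request is to inspect the response headers from the command line. A minimal sketch with curl, assuming the headers Cloudflare normally adds (server, cf-ray, cf-cache-status) are present; a 500 with cf-cache-status: HIT would mean the error response itself was cached:

# Do a real GET (HEAD may succeed even when the full download fails),
# discard the body, and print only the interesting response headers.
curl -sS -o /dev/null -D - \
  https://nodejs.org/dist/latest-v16.x/node-v16.14.1-linux-x64.tar.xz \
  | grep -iE '^(HTTP|server|cf-ray|cf-cache-status)'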

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 8
  • Comments: 61 (32 by maintainers)


Most upvoted comments

Seeing node-gyp failures while installing some Node.js packages:

gyp ERR! stack Error: 500 status code downloading checksum

The error reproduces with wget:

wget https://nodejs.org/download/release/v8.17.0/SHASUMS256.txt       
--2023-07-31 18:38:31--  https://nodejs.org/download/release/v8.17.0/SHASUMS256.txt
Resolving nodejs.org (nodejs.org)... 104.20.23.46, 104.20.22.46
Connecting to nodejs.org (nodejs.org)|104.20.23.46|:443... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2023-07-31 18:38:52 ERROR 500: Internal Server Error.
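Not a fix, but for CI jobs bitten by these intermittent 500s, curl’s built-in retry handling can serve as a stopgap; a sketch using the same URL as above (curl treats 500/502/503/504 responses as transient for --retry, at least in recent versions, so check the curl on your runner):

# Retry the download a few times with a delay; -f makes curl exit non-zero
# if the final attempt still returns an HTTP error.
curl -fSL --retry 5 --retry-delay 10 -o SHASUMS256.txt \
  https://nodejs.org/download/release/v8.17.0/SHASUMS256.txt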

@ovflowd

This is closed because it’s a known issue, and we’re working on it.

I think it would better communicate your intent if you closed this issue after you have solved it.

Seems to work nominally now.

One suggestion: since this issue was ongoing for some time and the status page showed no anomaly while it occurred, I would suggest adding a check that downloads a few random binary versions, so that if the issue occurs again it is easily spotted and reflected on the status page.
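A minimal sketch of what such a probe could look like, assuming a hand-picked set of artifact URLs (the versions below are taken from this thread, not from the team’s real monitoring config):

#!/bin/sh
# Fetch a few concrete release artifacts (not just directory listings, which
# kept working during the outage) and flag any non-200 response.
urls="
https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz
https://nodejs.org/dist/v20.8.1/node-v20.8.1-darwin-arm64.tar.gz
https://nodejs.org/download/release/v8.17.0/SHASUMS256.txt
"
status=0
for url in $urls; do
  code=$(curl -sS -o /dev/null --max-time 120 -w '%{http_code}' "$url")
  if [ "$code" != "200" ]; then
    echo "DOWN: $url returned HTTP $code"
    status=1
  fi
done
exit $status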

We can close this issue. We did an enormous amount of work with #5149 and I’m convinced this issue is largely solved for now. And for the main website, this will also (pretty much) not be a thing anymore.

We are still continuously working with small changes here and there to improve things and figure out other ideas to better our infra.

I have Cloudflare LB error notices spanning from ~2:40 am UTC to ~3:40 am UTC, which I suppose correlates with the latest 16.x release: the primary server was overloaded and traffic switched to the backup. Users may have gotten the 500 while hitting the primary before it switched.

I don’t have a good explanation beyond that. I’m not sure what caused the overload; I don’t believe that server actually has trouble serving, but we have witnessed some weird I/O issues connected with rsync that may all have happened at the same time. Perhaps during the next release someone should be on the server watching for weirdness.

Seen on an Azure DevOps agent:

Downloading: https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz
##[error]Aborted
##[warning]Content-Length (44417128 bytes) did not match downloaded file size (8204861 bytes).
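A truncated transfer like this can be caught before it poisons a build by checking the tarball against the published SHASUMS256.txt rather than trusting the byte count; a sketch using the version from the log above (GNU coreutils sha256sum assumed):

# Download the tarball and its checksum file, then verify; a short read
# (Content-Length mismatch) makes the sha256 check fail loudly.
ver=v18.17.1
file="node-$ver-linux-x64.tar.gz"
curl -fSLO "https://nodejs.org/dist/$ver/$file"
curl -fSL -o SHASUMS256.txt "https://nodejs.org/dist/$ver/SHASUMS256.txt"
grep " $file\$" SHASUMS256.txt | sha256sum -c -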

I actually just filed a parallel ticket for that one, @pughpugh, before I saw this: https://github.com/nodejs/nodejs.org/issues/5818. Not sure why Cloudflare is caching 500s.

Also seeing intermittent failures in our CI builds:

14:04:14 npm ERR! gyp http GET https://nodejs.org/download/release/v15.14.0/node-v15.14.0-headers.tar.gz
14:04:14 npm ERR! gyp http 500 https://nodejs.org/download/release/v15.14.0/node-v15.14.0-headers.tar.gz

With the latest changes to the website build infrastructure and the way we serve the website (https://github.com/nodejs/build/issues/3366), this issue should now have minimal impact and probably not even surface anymore.

I think I’m going to bounce nginx.

This was happening continually today at https://nodejs.org/dist/v20.8.1/node-v20.8.1-darwin-arm64.tar.gz

Any update on why this is closed?

and happening again 😦

Downloading: https://nodejs.org/dist/v16.16.0/node-v16.16.0-darwin-x64.tar.gz
##[error]Aborted
##[warning]Content-Length (30597385 bytes) did not match downloaded file size (5075560 bytes).

We were facing this as well for quite some time. The only mitigation we found is to use the cached Node.js that ships with the container 😕 Not the best thing in the world, but it is what it is: https://github.com/microsoft/fluentui/pull/29552
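That workaround boils down to preferring whatever Node.js the CI image already ships and only reaching out to nodejs.org when the version doesn’t match. A rough sketch of that check (the pinned version is just an example, not taken from the linked PR):

# Use the Node.js baked into the container image if it matches the pin;
# fall back to downloading (with retries) only when it doesn't.
want=v18.17.1                      # example pin
have=$(node --version 2>/dev/null || echo none)
if [ "$have" = "$want" ]; then
  echo "Using Node.js $have from the container image"
else
  echo "Container has $have, need $want, downloading..."
  curl -fSL --retry 5 -o node.tar.gz \
    "https://nodejs.org/dist/$want/node-$want-linux-x64.tar.gz"
fi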

gyp ERR! stack Error: 500 status code downloading checksum

We’ve identified an issue on Cloudflare that is simply disabling the cache for anything under /download; we’re waiting for someone with write access to make the changes on Cloudflare.

Got this issue today (screenshots of the failing pipeline run were attached).

After re-running the pipeline twice, the download succeeded on the third try.

Makes me think that it might be driven more by CI runners always pulling down the latest for a release, so there is a massive swarm at the beginning of the release at the same time as the cache invalidation

I also think the global cache invalidation as part of a trigger is the cause. Too much spiky traffic.

Wonder if delaying the index.tab update till the cache is hydrated might alleviate it.
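A rough sketch of that idea: fetch the new release’s most commonly pulled artifacts through the CDN first, and only publish the updated index afterwards, so CI runners don’t stampede onto a cold cache (the file list and the publish step are placeholders, not the project’s actual release tooling):

# Hypothetical release step: warm the CDN cache before index.tab goes live.
ver=v20.8.1                        # placeholder version
for f in "node-$ver-linux-x64.tar.gz" \
         "node-$ver-darwin-arm64.tar.gz" \
         "node-$ver-headers.tar.gz" \
         SHASUMS256.txt; do
  curl -fsSL -o /dev/null "https://nodejs.org/dist/$ver/$f" && echo "warmed $f"
done
# ...then update index.tab / index.json (publish step not shown).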

Makes me think that it might be driven more by CI runners always pulling down the latest for a release, so there is a massive swarm at the beginning of the release at the same time as the cache invalidation

It’s interesting we did not get reports after 18.x was launched. I’d expect if it was driven by downloads we might have seen something yesterday.

@Trott a few links are still 500ing for me

This seems to be happening again; https://iojs.org/dist/index.json is 500ing, and v12.22.11 just went out.

(after about 5-10 minutes, the 500s seemed to stop, but it’s still worth looking into)

@SheetJSDev I think the https://iojs.org/dist site is backed by https://nodejs.org/dist behind the scenes. (Using the Network pane of browser Dev Tools to inspect the traffic, the responses are served from nodejs.org, even when visiting iojs.org/dist.)

Edit: Actually, just parts of it show as coming from nodejs.org; some resources say they’re from io.js. 🤷

Edit again: I note that the root of the io.js site (https://iojs.org/) redirects to nodejs.org… But also, I can’t see where in your CI run it visibly redirected to nodejs.org. I just see that the server error caused nvm to fail to download and install io.js.

And at that… I’ll hide my own comment as off-topic.