gatsby: error UNHANDLED EXCEPTION write EPIPE while running on Netlify
Preliminary Checks
- This issue is not a duplicate. Before opening a new issue, please search existing issues: https://github.com/gatsbyjs/gatsby/issues
- This issue is not a question, feature request, RFC, or anything other than a bug report directly related to Gatsby. Please post those things in GitHub Discussions: https://github.com/gatsbyjs/gatsby/discussions
Description
When building with the latest Gatsby (4.9.1) on Netlify, I get this error:
4:35:23 PM: success write out requires - 0.005s
4:36:27 PM: success Building production JavaScript and CSS bundles - 63.377s
4:37:03 PM: success Building HTML renderer - 36.060s
4:37:03 PM: success Execute page configs - 0.380s
4:37:03 PM: success Caching Webpack compilations - 0.001s
4:37:03 PM: error (node:2227) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 end listeners added to [PassThrough]. Use emitter.setMaxListeners() to increase limit
4:37:03 PM: (Use `node --trace-warnings ...` to show where the warning was created)
4:37:07 PM: success run queries in workers - 3.599s - 37/37 10.28/s
4:37:09 PM: success Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs - 120.528s - 5/5 0.04/s
4:37:09 PM: error UNHANDLED EXCEPTION write EPIPE
4:37:09 PM:
4:37:09 PM:
4:37:09 PM: Error: write EPIPE
4:37:09 PM:
4:37:09 PM: - child_process:864 ChildProcess.target._send
4:37:09 PM: node:internal/child_process:864:20
4:37:09 PM:
4:37:09 PM: - child_process:737 ChildProcess.target.send
4:37:09 PM: node:internal/child_process:737:19
4:37:09 PM:
4:37:09 PM: - index.js:291 WorkerPool.sendMessage
4:37:09 PM: [repo]/[gatsby-worker]/dist/index.js:291:19
4:37:09 PM:
4:37:09 PM: - worker-messaging.ts:22
4:37:09 PM: [repo]/[gatsby]/src/utils/jobs/worker-messaging.ts:22:22
4:37:09 PM:
4:37:09 PM:
4:37:09 PM: not finished Merge worker state - 0.058s
4:37:09 PM: error Command failed with exit code 1.
4:37:09 PM: info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
4:37:09 PM:
4:37:09 PM: ────────────────────────────────────────────────────────────────
4:37:09 PM: "build.command" failed
4:37:09 PM: ────────────────────────────────────────────────────────────────
4:37:09 PM:
4:37:09 PM: Error message
4:37:09 PM: Command failed with exit code 1: yarn postinstall && yarn build:incremental
4:37:09 PM:
4:37:09 PM: Error location
4:37:09 PM: In build.command from netlify.toml:
4:37:09 PM: yarn postinstall && yarn build:incremental
4:37:09 PM:
I cleared my Netlify cache before running this and this seems like a new issue as per https://github.com/gatsbyjs/gatsby/issues/33738#issuecomment-1055595579
Reproduction Link
tbd
Steps to Reproduce
Following steps outlined here: https://github.com/gatsbyjs/gatsby/issues/33738
Expected Result
For the site to build as it normally did before I updated packages.
Actual Result
Consistently fails
Environment
System:
OS: macOS 12.0.1
CPU: (12) x64 Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
Shell: 5.8 - /bin/zsh
Binaries:
Node: 17.4.0 - /var/folders/tb/qb7x5sw53vngt06y9g92csn80000gp/T/yarn--1646432031346-0.9788514151471666/node
Yarn: 1.22.17 - /var/folders/tb/qb7x5sw53vngt06y9g92csn80000gp/T/yarn--1646432031346-0.9788514151471666/yarn
npm: 8.3.1 - ~/.nvm/versions/node/v17.4.0/bin/npm
Languages:
Python: 2.7.18 - /usr/bin/python
Browsers:
Chrome: 99.0.4844.51
Safari: 15.1
Config Flags
GATSBY_EXPERIMENTAL_PAGE_BUILD_ON_DATA_CHANGES=true
Links/References
https://answers.netlify.com/t/error-unhandled-exception-write-epipe/52650
https://answers.netlify.com/t/gatsby-v4-works-locally-but-timed-out-on-netlify/46339/2
https://answers.netlify.com/t/gatsby-v4-works-locally-but-timed-out-on-netlify/46339/21
About this issue
- State: closed
- Created 2 years ago
- Reactions: 26
- Comments: 103 (52 by maintainers)
Commits related to this issue
- disable webp image generation https://github.com/gatsbyjs/gatsby/issues/35055#issuecomment-1060887627 https://www.gatsbyjs.com/docs/reference/built-in-components/gatsby-plugin-image/#customizing-the-... — committed to rdela/rdela.com by rdela 2 years ago
- pin gatsby 4.7.2, update renovate config https://github.com/gatsbyjs/gatsby/issues/35055 https://docs.renovatebot.com/configuration-options/ — committed to rdela/rdela.com by rdela 2 years ago
- Pin to 4.7.2 to avoid EPIPE errors image resize https://github.com/gatsbyjs/gatsby/issues/35055#issuecomment-1060887627 — committed to jwsy/thescienceseed-gatsby-cms by jwsy 2 years ago
- ci: set GATSBY_CONCURRENT_DOWNLOAD to 15 See https://answers.netlify.com/t/gatsby-v4-works-locally-but-timed-out-on-netlify/46339/23 and https://github.com/gatsbyjs/gatsby/issues/35055 — committed to screendriver/echooff.dev by screendriver 2 years ago
- ci: reduce memory usage when building page See also https://github.com/gatsbyjs/gatsby/issues/35055#issuecomment-1064501847 — committed to screendriver/echooff.dev by screendriver 2 years ago
- rolling back versions because of https://github.com/gatsbyjs/gatsby/issues/35055 — committed to christopher-besch/homepage by christopher-besch 2 years ago
- test https://github.com/gatsbyjs/gatsby/issues/35055 — committed to theapplegates/gat-foundation-apr14 by theapplegates 2 years ago
- chore: bump Gatsby to 4.13 fixes: https://github.com/gatsbyjs/gatsby/issues/35055\#issuecomment-1113150550 — committed to maxpou/maxpou.fr by maxpou 2 years ago
- chore: bump Gatsby to x.15 (#139) * chore: bump Gatsby to 4.13 fixes: https://github.com/gatsbyjs/gatsby/issues/35055\#issuecomment-1113150550 * chore: bump everything to x.15 — committed to maxpou/maxpou.fr by maxpou 2 years ago
- upstream issue with gatsby preventing netlify builds fixed now https://github.com/gatsbyjs/gatsby/issues/35055#issuecomment-1119298825 — committed to rdela/rdela.com by rdela 2 years ago
- rm ignore gatsby in renovate config (#547) * upstream issue with gatsby preventing netlify builds "resolved" now https://github.com/gatsbyjs/gatsby/issues/35055#issuecomment-1119298825 PR https... — committed to rdela/rdela.com by rdela 2 years ago
Hey folks, we just released changes from https://github.com/gatsbyjs/gatsby/pull/35513 - huge thanks to @ascorbic for debugging, figuring out possible solutions and actually implementing one!
The above change is available in:
- @latest: gatsby@4.13.1
- @next: gatsby@4.14.0-next.2
Please do try it out and let us know whether you still see those kinds of errors with either of the above versions of `gatsby`.
Further progress. It is a race condition. `waitUntilAllJobsComplete` waits for `hasActiveJobs` to resolve, which happens when `activeJobs` is `0`. This happens inside `enqueueJob` here: https://github.com/gatsbyjs/gatsby/blob/00220f4247408dfc519d6f64c0f3262baf1cf316/packages/gatsby/src/utils/jobs/manager.ts#L314-L319

Once that resolves, then the workerPool is restarted:
https://github.com/gatsbyjs/gatsby/blob/00220f4247408dfc519d6f64c0f3262baf1cf316/packages/gatsby/src/commands/build.ts#L298-L300

The trouble is that at this point, the final `JOB_COMPLETED` message has not dispatched. The problem is where this happens, inside `initJobsMessagingInMainProcess`. The job is created here by dispatching `createJobV2FromInternalJob` and then waiting for the result. https://github.com/gatsbyjs/gatsby/blob/00220f4247408dfc519d6f64c0f3262baf1cf316/packages/gatsby/src/utils/jobs/worker-messaging.ts#L20-L25

The problem is that `hasActiveJobs` has already resolved by this point, because it happened inside `enqueueJob` (which was itself called when dispatching `createJobV2FromInternalJob`). This means that the workerPool is already being restarted at this point, as we’re about to dispatch an end message to a worker that is currently shutting down.

There are a few ways we could fix this. My favoured one would be to dispatch `JOB_COMPLETED` inside `runJob`, rather than waiting for Redux. This would ensure that the message has been sent before the activeJobs are decremented and hasActiveJobs is resolved. If this sounds reasonable I can open a PR to move this from `initJobsMessagingInMainProcess` into `runJob` (probably something like `runLocalWorker(worker[job.name], job).then(() => sendTheJobCompletedMessageOrSomething())`). What do you reckon?

OK, so I’ve spent some time looking at this, and it appears to be a race condition. In the section you quoted, @pieh … it looks like there are `JOB_COMPLETED` messages being dispatched after `restart()` has been called. I forked `gatsby-worker` and added a load of logging, and you can see it here. The message “about to end worker” is added at the beginning of `workerPool.end()` here. You can see that the `JOB_COMPLETED` message is dispatched afterwards.

If I put a 1 second await inside `end()` then it works fine. See here, where the message is dispatched while it waits before restarting.

I’m not sure why `waitUntilAllJobsComplete` is resolving before that final job is completed, but that seems to be the issue.

Gosh, that’s annoying - import hell 😃
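The ordering problem described in the race-condition analysis above can be reproduced in a toy model. This is only a sketch under stated assumptions: `createPool`, `run`, and the `sendBeforeResolve` flag are hypothetical names, the `setTimeout` stands in for the Redux round trip, and none of this is Gatsby's real code. It shows that if the pool restarts as soon as the "no active jobs" promise resolves, a later `JOB_COMPLETED` send hits a closed channel, while sending before resolving avoids that.

```javascript
// Toy model of the race. All names here are hypothetical, not Gatsby's API.
function createPool() {
  let open = true;
  const log = [];
  return {
    log,
    send(message) {
      // A closed IPC channel is what surfaces as "write EPIPE" in the build log.
      if (!open) throw new Error("write EPIPE");
      log.push(message);
    },
    restart() {
      open = false; // the worker starts shutting down and its channel closes
      log.push("restart");
    },
  };
}

async function run({ sendBeforeResolve }) {
  const pool = createPool();
  let resolveIdle;
  const idle = new Promise((resolve) => (resolveIdle = resolve));

  async function runJob() {
    await Promise.resolve(); // stand-in for the actual image processing
    if (sendBeforeResolve) pool.send("JOB_COMPLETED"); // the proposed ordering
    resolveIdle(); // models activeJobs reaching 0
  }

  // Build command: restart the pool as soon as jobs look finished.
  const restarted = idle.then(() => pool.restart());

  // Messaging layer: waits for the job result, then notifies the worker.
  const notified = runJob().then(async () => {
    await new Promise((resolve) => setTimeout(resolve, 0)); // stand-in for the Redux round trip
    if (!sendBeforeResolve) pool.send("JOB_COMPLETED"); // the original ordering
  });

  const results = await Promise.allSettled([restarted, notified]);
  return { log: pool.log, failed: results.some((r) => r.status === "rejected") };
}

run({ sendBeforeResolve: false }).then((result) => console.log("original ordering:", result));
run({ sendBeforeResolve: true }).then((result) => console.log("send first:", result));
```

In this model the original ordering reports `failed: true` with only `restart` in the log, while the send-first ordering logs `JOB_COMPLETED` before `restart`.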
This sounds like a very pragmatic approach and I approve 😃
Yup - I might rename those variables to better match what they represent (so it’s not confusing for future readers) but overall this is effectively what I had in mind.
I think the PR should go in regardless of whether it fixes the issue completely - it just makes sense on its own - and if that is not sufficient we will continue exploring, of course
I’ve spent some time investigating this (thanks @browniebroke for the great repro) and have found out a few things.

First, I don’t think it’s memory: I can reproduce it every time on the browniebroke site using Netlify’s enterprise High Performance builds, which have 32GB available, where I have allocated 20GB to node. It fails at the same point, whether using regular builds or high perf. It fails on the same image each time, but removing that image means it fails on another image, so I’m not sure if that’s the issue.

The error can be prevented from aborting the build by adding an error-handler as the second argument here: https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby-worker/src/index.ts#L406 I don’t know if this is ok. The message being sent is `JOB_COMPLETED`, which fails because the worker has closed the channel. When I test it, the generated site seems fine. If this is acceptable, I’m happy to open a PR. What do you think, @wardpeet?

I would certainly like to find out what happened after 4.7.2 that caused this. I don’t know if the error is in `gatsby-plugin-sharp`, `gatsby-worker` or something else. One possible clue is that on my M1 Mac I have very occasionally managed to have it fail around the same place, and this time it has a different error. I don’t know if it’s connected, but it is interesting that it happens in a similar place:

Thanks everyone and thank you @ascorbic!
I’ll be closing this issue now as it’s been resolved.
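For reference, the error-handler workaround mentioned in the investigation relies on Node's `subprocess.send(message, callback)` behaviour: when a callback is supplied, a failed write is reported to the callback instead of being emitted as an `'error'` event. A minimal sketch follows, assuming a hypothetical helper name `sendSafely` and a stub worker in place of a real child process; it is not the code in `gatsby-worker`.

```javascript
// `worker` stands for a Node ChildProcess, or anything with the same
// send(message, callback) shape. `sendSafely` is a hypothetical helper name.
function sendSafely(worker, message, onError) {
  worker.send(message, (error) => {
    // With a callback present, a failed write lands here rather than
    // surfacing as an unhandled 'error' event on the child process.
    if (error && onError) onError(error);
  });
}

// Stub worker whose channel is already closed, used only to exercise the helper.
const closedWorker = {
  send(message, callback) {
    const error = new Error("write EPIPE");
    error.code = "EPIPE";
    callback(error);
  },
};

const swallowed = [];
sendSafely(closedWorker, { type: "JOB_COMPLETED" }, (error) => swallowed.push(error.code));
console.log(swallowed);
```

This only hides the failed send. It does not change the ordering that causes the message to be sent to a worker that is already shutting down.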
I have been experiencing this issue as well while migrating a site from gatsby 2 to 4. Builds work locally but fail 9 times out of 10 on netlify.
I have tried pinning gatsby to 4.1.3, 4.6.2, 4.7.2, and 4.9.3, as suggested in the various threads dealing with this issue, all without success.
I have also tried various values for env variables GATSBY_CONCURRENT_DOWNLOAD, GATSBY_CPU_COUNT, and NODE_OPTIONS = --max-old-space-size=4096 without success.
I have upgraded my netlify build image and profiled my local memory usage to be sure memory use was not spiking over 3GB and have significantly decreased bundle size by optimizing images stored in the repo rather than in a CMS.
I have also experimented with rolling back versions of gatsby-plugin-image, gatsby-plugin-sharp, gatsby-transformer-sharp, and sharp under the theory that this issue has something to do with processing static images (similar to benlavalley’s experience here https://github.com/gatsbyjs/gatsby/issues/33738#issuecomment-1055595579) (I maintain another site with very few static images and it has been on gatsby 4.4 with no issue for months).
The options at this point seem to be rolling back to Gatsby 3, rearchitecting the site to host all images in a CMS, or finding a replacement for Netlify.
I just put it in the Netlify env variables like this
Thanks for all the work! My site is now running smoothly on 4.13.1
Multiple successful builds on private server, issue seems to be resolved, thank you @ascorbic and @pieh!
I just tested it out on a site that had issues with this bug with all versions prior to 4.13.1. I did three deploys in total and have not experienced any issues yet. Seems to work great!
looks great on my end too! Thank you so much @ascorbic @pieh seeing you guys triage and discuss was illuminating, all the love ❤️
I updated Gatsby to v4 last year and after 4.13.1 for the first time in months I’ve been able to run a clean deploy without any crazy routine of clearing content from my site. Awesome work!
I have had this problem since starting in 4.11.2 (I’m new to gatsby and netlify and couldn’t figure out if I was doing something wrong!) and had to resort to building locally until this very moment. The release of 4.13.1 was the first time I could build successfully on their servers!
Thank you! I’ve built several times successfully on Netlify.
Thank you! Works for us, too. Tried to bail to Cloudflare Pages, but they don’t have the same type of support for Gatsby functions 😅
Fantastic work @ascorbic and @pieh! Thank you!
Also now having success with 4.13.1 where the build was mostly failing in Netlify before (I rolled back to 4.7.2 while this was being looked at). Thanks!
I’ll re-open this issue (it was auto closed as the linked PR was merged), but we do want to hear back from you folks whether the issue was resolved with the above versions for you.
Will keep it open for a week and, if there are no new reports, will close it then
I’m going to continue looking at this tomorrow, so hopefully I can work out a fix. If so I’ll open a PR. I’m always happy to contribute a fix if I can.
All you have to do is `yarn add gatsby@4.7.2` or `npm install gatsby@4.7.2`, or you could make this change in your package.json:
"gatsby": "4.7.2",
You can keep all your plugins on the latest version (4.9.3).
Updated to v4.13.1 and builds are successful every time.

Thanks @ascorbic for your work on this.
So far so good! My v4 branch that was failing consistently now seems to be building consistently! Thank you @ascorbic!
I’ve had some decent success with gatsby 4.11.x by explicitly unsetting AVIF generation in sharp’s config
gatsby-config.js
Depending on how much static content you have, I found one thing I can do is reduce it significantly which allows my site to build and deploy. I then check it all back in, and because of what I suspect to be build caching, my site is then able to deploy.
It’s not just about images it seems, but the overall amount of content Gatsby is processing.
Because Netlify states they aren’t capable of changing memory limits, and Gatsby V4 has been out in the wild now for quite some time and it doesn’t seem like it will be reducing its own memory usage, I’m probably going to have to bail on Netlify as well 😦
gotcha. thank you!
Update: I’ve seen this issue on and off for a few weeks, so no promises that this is a surefire fix, but I added the following environment variable to force the process to reduce memory usage and it seems to help:
NODE_OPTIONS = --max-old-space-size=4096
I wish that someone on their support team would have told me this or suggested another workaround. Our build times have also been fluctuating wildly so not sure if there are other variables at play.
I have a draft PR up. Does the approach look ok? I’ve not tested it properly yet which is why it’s still a draft, but I’ll look into it tomorrow.
https://github.com/gatsbyjs/gatsby/pull/35513
I think we should export a function returning the current promise from `worker-messaging.ts` (the promise can be recreated later on, so we need a getter to get the new one and not just the original one), but we can await it inside the existing `waitUntilAllJobsComplete` (`Promise.all`?) - this would delay `waitUntilAllJobsComplete` potentially by one or 2 `setImmediate` ticks (that currently happen after job finish and before messages are sent?) so should be fine and not really noticeable, and it will just be nice to share it with other places that await jobs (feels much safer than doing it just in that one spot)

Thanks for looking at this. I don’t think the forced ending is the issue. If you look at these logs you can see that the forced kill has not happened yet when the error occurs. The problem is that the message catches the worker during the shutdown process. It sends `END`, which removes the listener to allow the graceful stop. When it starts to shut down, node closes the ipc channel, meaning that any future writes will cause the EPIPE error, so it doesn’t have the opportunity to complete as the `JOB_COMPLETE` message fails. Allowing a graceful exit is the problem, because that exit is what triggers the attempt to dispatch the message, but by that time the channel has closed.

The solution would be to find a way to ensure that either `END` isn’t dispatched until `JOB_COMPLETE` has been sent, or that the event listener isn’t removed until it has. Alternatively find another way to resolve the promise that doesn’t require dispatching `JOB_COMPLETE`.

Haha. Ten seconds before you. Where do you think would be a good place for this?
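The "getter for the current promise" idea discussed above could look roughly like the following. This is a sketch only: `trackWorkerMessage` and `getPendingWorkerMessages` are hypothetical names, the 10 ms timer stands in for the delay between a job finishing and its message being sent, and it assumes the messaging layer registers its promise when the job is handed off rather than when it finishes. It is not the implementation that was merged.

```javascript
// Hypothetical sketch; only waitUntilAllJobsComplete is a name taken from the thread.
let pendingWorkerMessages = Promise.resolve();

// Called by the messaging layer when it hands a job off, with a promise that
// settles once the completion message has been sent to the worker.
function trackWorkerMessage(sendPromise) {
  // Replace the shared promise so it also covers the new message.
  pendingWorkerMessages = Promise.all([pendingWorkerMessages, sendPromise]);
  return sendPromise;
}

// Getter: callers always receive the latest promise, not a stale reference.
function getPendingWorkerMessages() {
  return pendingWorkerMessages;
}

async function waitUntilAllJobsComplete(hasActiveJobs) {
  await hasActiveJobs; // jobs themselves are done
  await getPendingWorkerMessages(); // and their messages have been sent
}

// Usage: the message is sent 10 ms after the job itself completes.
const events = [];
const jobDone = Promise.resolve();
trackWorkerMessage(
  jobDone
    .then(() => new Promise((resolve) => setTimeout(resolve, 10)))
    .then(() => events.push("JOB_COMPLETED sent"))
);
const finished = waitUntilAllJobsComplete(jobDone).then(() => {
  events.push("safe to restart pool");
  console.log(events);
});
```

The wait here is sequential rather than a single `Promise.all`, so that the getter is read only after the jobs have finished; whether that matters in the real code base is an open question.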
It appears incontrovertible this error relates to memory usage. And many claim that freezing Gatsby at version 4.7.0 works.
Dumb question: has anyone identified the precise diff between 4.7.0 and next versions that changes the memory usage so much?
Would be great to solve this. Really frustrating.
I had the same issue and I’ve been trying to fix it in different ways, but I didn’t want to roll Gatsby back to a previous version. Currently I’m using Gatsby 4.12.1. @wahidshafique Try using the “Essential Gatsby” Netlify plugin with the latest version (3.0.0). This helped me to solve the issue.
I ran into the same issue and “solved” it by not using Netlify to compile my Gatsby website. Instead I use this GitHub workflow:
It compiles everything into the `deploy` branch, which Netlify publishes. I gave Netlify these build settings:
`true` is a program that according to its man page:
This at least works as an intermediate solution until the problem is fixed.
Ugh. Scrap that. No, it still fails.
The relevant change for the `MDB_PROBLEM` could be that we updated LMDB from v1 to v2: https://github.com/gatsbyjs/gatsby/pull/34576

The lmdb version is no longer locked to 2.2.1, it’s `^2` now

Thanks for the investigation!
The pessimist in me thinks this is a feature, not a bug.
`sharp` v0.30.1, which is upgraded from v0.29.3 in 326a483bc01c5a3e433e3a82fd52c92a9f6467d5, requires `libvips` v8.12.2. ^1 Netlify’s build environment comes with `libvips` v8.9.1. ^2 ~Maybe upgrading to sharp v0.30.1 (326a483bc01c5a3e433e3a82fd52c92a9f6467d5) brought an incompatibility?~ Update: This seems to be not right. Overriding the resolution of sharp to v0.29.3 did not solve the problem.

Thanks for the update. I also tried some of the suggestions like reducing the size of the images and nothing worked.

I think that the solution for now is to pin Gatsby to v4.7
I also have the same issue on a relatively small blog type site I administer for a friend that uses NetlifyCMS, Gatsby and builds and deploys on Netlify.
Currently with v4.6.2 it builds fine both locally and on Netlify.

Updating to v4.9.2 it builds fine locally but the Deploy Preview results in the same UNHANDLED EXCEPTION write EPIPE error.

Here’s the output from `gatsby info` locally

I have tried some needed maintenance but with no success:
Gatsby cache with Essential Gatsby

Will stick with v4.6.2 for now, but it is another instance of v4.9.x consistently failing to build on Netlify.

My fix for this, as advised by Netlify, was to pin to "gatsby": "4.7". I agree with @t2ca about reproduction constraints here. I don’t have any capacity at the moment, but I was thinking that the Netlify build image could be used. I believe @benlavalley already tried this in some capacity (and Ben if you’re reading this, it’d be awesome if you have the repro available somewhere). If I recall, the 3GB memory constraint was the thing that was causing failed builds, and Netlify has fluctuating allocation for the runners, so you could get north of that guarantee and see no issues (hence the intermittent nature of this problem)

I can confirm that I also have this issue with v4.9.1 and I previously experienced it with v4.8.0
Update: The current solution that appears to work for most people is to pin Gatsby to v4.7.2: run `yarn add gatsby@4.7.2` or `npm install gatsby@4.7.2`, or make this change in your package.json file:
"gatsby": "4.7.2",
You can keep all your Gatsby plugins on the latest version (4.11.1).