build: Improve Error DX
There have been some comments around error boundaries & general feedback to devs when errors occur.
Here is an example of a plugin with broken code.
module.exports = {
  name: 'netlify-plugin-one',
  onInit: () => {
    console.log(thing.what) // undefined ref
  },
}
or
module.exports = {
  name: 'netlify-plugin-one',
  onInit: () => {
    throw new Error('http://www.nooooooooooooooo.com/')
  },
}
Currently errors materialize like so:


Are there ways we can improve upon this?
About this issue
- State: closed
- Created 4 years ago
- Comments: 20 (18 by maintainers)
Chiming in to +1 these suggestions! I think it’s imperative we more clearly surface these errors both in the logs and the UI to avoid user confusion and churn, like @jlengstorf said:
Could we get this work prioritized as part of the #project-build-plugins-ui scope?
Once we know how this is going to work on the backend, it’d be super helpful to get an issue opened in https://github.com/netlify/netlify-react-ui/issues so we can send the UI work through our typical process: design can take these suggestions and formalize a solution, then we can send it into frontend implementation (@drewm will lead that work), Copy Club, etc. 🙂
all makes sense, and I’m on board. A couple wishlist items that I’d really like to see in place to avoid frustration/confusion later on:
1. Make it one-click to remove and rebuild when a plugin fails
this makes sense, and I’ll make a request since it sounds like we’re already building all of this functionality:
could we make the uncaught error message say something like this?
clicking the button would both turn off the build plugin and restart the deploy
2. Log unexpected failures so we can identify broken plugins/opportunities to coach plugin devs
can we make sure we’re logging the failures in an analyzable way? this would help us flag and remove plugins that are outright broken, and gives us potential to provide actionable feedback to developers (your plugin has X unexpected failures with the message “Y”; consider using utils.build.failPlugin to avoid these failures)
stoked to be talking about this!
I have a lot of different things that I think might be worth discussion, so let me know if we need to break these out into sub-discussions
at a high level, the main issues I see are:
logs are hard to read and full of unhelpful output
the logs are made of about 95% junk in terms of useful information — we could leave out almost all of it and I’d have the same idea of what’s going on
because we output so much junk, when a build fails and I go to the log and click the “jump to bottom” arrow, I don’t see the actual error; I see the Build script returned non-zero exit code: 1
someone who is familiar with how Netlify works and has all the context that “oh, no, don’t look at the last error, scroll up and look for the error before that” will figure this out, but new customers may not get that far; they might just contact support and/or churn because “Netlify isn’t stable”
which leads to the next point:
failures look the same no matter where they came from
right now, when you look at the app, a failure just looks like a failure — there’s no way to tell why the thing failed without digging into the logs
this means that a build plugin makes our UI look the same way as a Netlify build error — how would our customers know not to ask support when they have a half dozen failures in a row because some plugin changed in the repo and they weren’t aware?
an idea to improve it
if we use our existing badge markup, we could potentially just add another clarifier to failed builds:
this would require build plugins to capture errors and communicate them out of the build
there is no indication in the UI that the error wasn’t Netlify’s fault
a failed build should add some kind of big blinking marquee at the top of the build log — not just in the log — that something went wrong
right now the visual language at the top of the build log is so similar that I actually thought it was the same output between successful/failed builds until I looked just now
the build should capture errors — especially if those errors come from build plugins — and expose them in a big-ass red box so people know exactly what went wrong:
there is no way to recover from build plugin errors
this may not be feasible, but in a perfect world we’d be running build plugins in a way that if they failed we could just say, “okay, this build plugin’s busted, don’t run any more of its lifecycles and let’s keep rolling with this build”
I’m happy to be convinced otherwise, but by default I would think we’d want build plugins to fail in a recoverable way, meaning we don’t actually fail a build if the plugin errors out — we’d just expose warnings and ship the site without the build plugin
for plugins like build time speedup plugins, this maintains great DX vs. costing even more time (i.e. I can ignore the failed builds until I have room to breathe and dig in vs. needing to drop everything because the plugin broke and our site won’t build and there’s something critical we need to deploy)
if a plugin should definitely fail the build if it errors out — for compliance checks or perf budgets, for example — they would explicitly opt in to fail builds on error with a setting (ideally we have a utility that they call so they can conditionally full bail, kind of like ESLint warnings vs. errors)
in very simplified pseudo-code, the logic might look like this:
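One way that opt-in logic could be sketched in JavaScript. Everything here is an assumption made for illustration: `runPlugins`, `FailBuildError`, and the `onPostBuild` handler are hypothetical names, and `utils.build.failBuild` is only a stand-in for whatever utility ends up letting a plugin bail on the whole build.

```javascript
// Hypothetical runner; none of these names come from the Netlify Build codebase.
class FailBuildError extends Error {}

// Stand-in for a utility a plugin calls to explicitly opt in to failing the build.
const utils = {
  build: {
    failBuild(message) {
      throw new FailBuildError(message)
    },
  },
}

function runPlugins(plugins, events) {
  const disabled = new Set() // plugins that already crashed during this build
  const warnings = []

  for (const event of events) {
    for (const plugin of plugins) {
      const handler = plugin[event]
      if (disabled.has(plugin.name) || typeof handler !== 'function') continue
      try {
        handler({ utils })
      } catch (error) {
        // Explicit opt-in: the plugin asked for the build to stop.
        if (error instanceof FailBuildError) throw error
        // Anything else: warn, skip this plugin's remaining events, keep building.
        disabled.add(plugin.name)
        warnings.push(`${plugin.name} failed in ${event}: ${error.message}`)
      }
    }
  }
  return { warnings }
}

// The broken plugin from the top of this issue, plus a later handler that must not run.
const broken = {
  name: 'netlify-plugin-one',
  onInit: () => {
    console.log(thing.what) // undefined ref, throws a ReferenceError
  },
  onPostBuild: () => {
    throw new Error('should be skipped')
  },
}

const result = runPlugins([broken], ['onInit', 'onPostBuild'])
console.log(result.warnings)
```

With this shape the build finishes with a single warning for `netlify-plugin-one`, while a compliance or perf-budget plugin that calls `utils.build.failBuild(...)` still stops everything.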
this is A Lot™, so let me know if you want to dig into these separately or get on a call to discuss
@jlengstorf you ruined a flawlessly executed swoop-and-poop by saying nice things at the end! 😂
Clarifications
My proposal was only addressing the question of builds continuing when plugins unexpectedly fail; the rest of your error communication recommendations have been converted to tickets for implementation.
Here are the ways plugins can fail, and the expected/proposed result:
- utils.build.failPlugin()
- utils.build.failBuild()
We should probably provide a stack trace in the deploy header as well, but still finalizing details on that.
Remaining issue: should unexpected plugin failures fail the build?
The problem with allowing builds to continue by default is that a plugin can do anything to the cloned repo, including putting it in an unstable state. If we allow builds to continue after an unexpected plugin failure, we don’t know what incomplete actions took place and what the impact on the resulting build will be. We’re planning on wrapping plugins in try/catch as you would expect.
The only thing we seem to differ on is whether to make build failure opt-in or opt-out when a plugin fails (throws an uncaught error) unexpectedly. I’ll distill the specific reasoning for proposing opt-out to help the conversation stay focused:
utils.build.failPlugin allows plugin authors to explicitly signal that the build can continue safely even if their plugin fails unexpectedly - we can also shout about this in the plugin authoring docs to encourage it when possible
There was also plugin author feedback that they’d prefer failure-by-default for unexpected errors, but I don’t recall who said that. @verythorough may know.
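A rough sketch of how the opt-out behaviour proposed in this comment could be handled by a runner. The `utils.build.failPlugin` and `utils.build.failBuild` names come from this thread, but their signatures, the `PluginFailure` and `BuildFailure` classes, and `runHandler` are assumptions made up for the example, not the actual implementation.

```javascript
// Hypothetical signal classes; the real utils.build implementation may differ.
class PluginFailure extends Error {}
class BuildFailure extends Error {}

const utils = {
  build: {
    failPlugin(message) {
      throw new PluginFailure(message)
    },
    failBuild(message) {
      throw new BuildFailure(message)
    },
  },
}

// Runs one plugin handler and reports what the build should do next.
function runHandler(pluginName, handler) {
  try {
    handler({ utils })
    return { action: 'continue' }
  } catch (error) {
    if (error instanceof PluginFailure) {
      // The author signalled that the build can safely go on without this plugin.
      return { action: 'continue', warning: `${pluginName}: ${error.message}` }
    }
    // failBuild() and uncaught errors both stop the build (failure by default);
    // `expected` records whether the plugin stopped it on purpose.
    const expected = error instanceof BuildFailure
    return { action: 'fail', expected, error: `${pluginName}: ${error.message}` }
  }
}

const soft = runHandler('cache-plugin', ({ utils }) => utils.build.failPlugin('cache miss'))
const hard = runHandler('budget-plugin', ({ utils }) => utils.build.failBuild('perf budget exceeded'))
const crash = runHandler('netlify-plugin-one', () => {
  throw new Error('http://www.nooooooooooooooo.com/')
})
console.log(soft.action, hard.action, crash.action)
```

Keeping an `expected` flag on the result is one way the UI could later tell a deliberate `failBuild()` apart from a plugin that simply crashed.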
I think as a discussion issue, it makes sense to close this now. If there are remaining smaller tasks within the comments that haven’t been addressed, we should file new, focused issues for them.
100% agreed on both insights, thank you for this.
This issue has a lot of great feedback; we can use it as a parent for plugin errors in the UI. Raw mechanisms on the build side are in place via #735.
agreed — I buried this toward the end of my original comment, but I think build plugins should have to opt in to kill the build
having an explicit way to fail builds also overcomes the next point:
instead of something arbitrary that may or may not be build-related, e.g.
we would have a more structured error using whatever falls out of #161, e.g.
this would presumably allow us to capture those errors in a way that’s easier to surface in the UI (or at the very least allow us to make that improvement under the hood of the failWithError util later on with no user-facing API changes)
thanks, @ehmicky and @lesliecdubs!
in my mind this needs to be solved: if a build plugin fails, we should roll back and continue the build as if the plugin wasn’t installed. that could be brute-forced by restarting the deploy and removing the plugin config, or (ideally; maybe in the future) treating builds as immutable objects or something so that build plugins aren’t mutating the build, but rather creating a modified copy that we can throw away if there are errors
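To make the "modified copy we can throw away" idea concrete, here is a minimal sketch. It assumes the build state is a plain JSON-serializable object, which is a big simplification of a real cloned repo, and `applyPlugin` and the `run` method are hypothetical names.

```javascript
// Hypothetical: the build state is a plain object that can be cloned cheaply.
function applyPlugin(state, plugin) {
  // The plugin only ever sees a deep copy, never the original state.
  const draft = JSON.parse(JSON.stringify(state))
  try {
    plugin.run(draft)
    return { state: draft, error: null } // success: keep the modified copy
  } catch (error) {
    return { state, error } // failure: drop the draft, keep the untouched original
  }
}

const initial = { files: ['index.html'] }

// A plugin that leaves the state half-modified before crashing.
const broken = {
  name: 'netlify-plugin-one',
  run: (draft) => {
    draft.files.push('half-written.tmp')
    throw new Error('boom')
  },
}

// A plugin that completes normally.
const working = {
  name: 'netlify-plugin-two',
  run: (draft) => {
    draft.files.push('sitemap.xml')
  },
}

const afterBroken = applyPlugin(initial, broken)
const afterWorking = applyPlugin(afterBroken.state, working)
console.log(afterBroken.state.files, afterWorking.state.files)
```

The crash leaves `initial` untouched, so the build can carry on as if the broken plugin had never been installed.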
I would argue that between this and upstream failures requiring developers to drop everything and make code changes because deploys stopped working, overlooking failures is the better outcome
the UI should make it apparent that something has gone wrong, and ideally we’d be able to send out an error notification email and/or digest to make sure people see the errors