lighthouse: How to audit a 404 (not-found) HTML page with Lighthouse

Hi,

When I try to run an audit on a 404 page, Lighthouse complains that the request was faulty and that it can’t continue.

runtimeError:
      { code: 'ERRORED_DOCUMENT_REQUEST',
        message:
         'Lighthouse was unable to reliably load the page you requested. Make sure you are testing the correct URL and that the server is properly responding to all requests. (Status code: 404)' },

I wonder why there is this restriction? Is there a way to bypass this and actually test a 404 HTML page?

Thank you

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 6
  • Comments: 19 (2 by maintainers)

Most upvoted comments

Thanks for the tip.

It would be useful to bypass this without writing workaround code. There could be an argument like --ignore-not-found or similar.

Can you explain what you mean, please?

Sure. You mentioned that if you host your site in such a way that navigating directly to a URL doesn’t result in the page loading (and instead gives a 404) then Lighthouse can’t measure it. I was saying if this is true, then a user can’t visit it either.

I don’t observe what you are describing, so maybe we aren’t talking about the same thing?

Are you saying that your 404 error page is a client-side redirect to the underlying page (or a copy of your SPA)? If so, I see how we were talking past each other 😃 This still isn’t advisable from the SEO side of things (or from the performance side of things if a redirect), but your point is taken that it’s better served by an audit failure rather than a fatal error.

I’m inclined to agree that this should be a top-level warning rather than a fatal error, or at a minimum that there should be an optional flag to opt out. It’s just a question of who has the bandwidth to work on it.

If you host your SPA website through Google Cloud Storage, you will get a 404 response for every page that is not the main page, which makes a Lighthouse audit impossible for every page other than the main page.

  • We could have Node default to throwing an error on a bad status code, but introduce a flag to suppress that behavior. CDT (Chrome DevTools) would just ignore it by default because it isn’t automated.
  • OR: just make this a warning 😃
  • It seems like LHCI (Lighthouse CI) should be the place for such a check/fatal error, if we had to have it.

So, let’s remove the fatal error and just have this be a warning.

I think we should add a flag to disable checking the status code for Node and the CLI.

@paulirish I’d prefer a flag in DevTools to override the “is 404” check. I don’t want to configure the back-end to respond with 200 for every temporary page I’m actively developing in Angular.

An SPA hosted in an S3 bucket behind CloudFront always gives a 404 response for every route. Lighthouse should have a provision for auditing 404 responses, because they are not actually 404s: the frontend app is governing the routes.

Thanks for filing @ondrejsevcik!

I wonder why there is this restriction?

We received a very high number of complaints that Lighthouse is incorrect when the actual problem was that users were unwittingly auditing a 403/404/500 page instead of what they wanted to audit. For most users, an error status code almost always means that whatever they were trying to audit isn’t working correctly.

Is there a way to bypass this and actually test a 404 HTML page?

You can’t bypass this directly within Lighthouse. To audit a 404 HTML page you’d need to serve the page with a 200 status code (either by creating such a route on your server, or through request interception before Lighthouse sees the response, similar to https://github.com/GoogleChrome/lighthouse/issues/4376#issuecomment-361486901).
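
For anyone who cannot add such a route on the real server, one possible approximation of the "serve it with a 200" idea is a small local pass-through proxy. The following is only a sketch and not something Lighthouse provides: the names `fetch_ignoring_status`, `make_handler` and `serve`, the upstream address and the port are all made up for illustration. It only handles GET, and it rewrites the status of every response to 200 (sub-resources included), so results for anything other than the error page itself should be read with care.

```python
import http.server
import urllib.error
import urllib.request


def fetch_ignoring_status(url: str) -> tuple[bytes, str]:
    """Return (body, content_type) for url, even if the server answers 4xx/5xx."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.read(), resp.headers.get("Content-Type", "text/html")
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object, so the error page body is readable.
        return err.read(), err.headers.get("Content-Type", "text/html")


def make_handler(upstream: str):
    """Build a handler class that re-serves `upstream` content with status 200.

    `upstream` is a hypothetical base URL such as "http://localhost:8080".
    """

    class Always200Handler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            body, content_type = fetch_ignoring_status(upstream + self.path)
            self.send_response(200)  # replace whatever status the upstream sent
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the console quiet

    return Always200Handler


def serve(upstream: str, port: int) -> None:
    """Run the pass-through proxy on 127.0.0.1:port until interrupted."""
    server = http.server.HTTPServer(("127.0.0.1", port), make_handler(upstream))
    server.serve_forever()
```

With `serve("http://localhost:8080", 9000)` running (both values are placeholders), pointing Lighthouse at `http://127.0.0.1:9000/some-missing-page` should load the same 404 markup, but with a 200 status that Lighthouse accepts.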

“Required” is the wrong word. I meant that it is needed to solve the problem for the people in this issue; I’m assuming (and have asked whether) they might not be using the CLI.