lighthouse: OOM in ImageElements gatherer

We were getting OOMs in LR, and I managed to confidently bisect down to the commit where #11188 was merged (it's "core(image-elements): collect CSS sizing, ShadowRoot, & position").

node lighthouse-cli http://cosmetiqon.gr/ --only-audits=unsized-images -G

Here’s one URL where this can sometimes OOM, though I definitely can’t get an OOM locally. I’m not entirely sure which context is getting the OOM… the page or Lighthouse.

I do know that if I comment out these lines…

https://github.com/GoogleChrome/lighthouse/blob/e0f7d5107e022ba96105c28fdcc54d865f29a221/lighthouse-core/gather/gatherers/image-elements.js#L353-L355

…the imageGatherer takes 2s instead of 28s.

I attempted to do some memory debugging but didn't get too far. This still requires a bit of investigation.


Similar efforts: #7274 #9818

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 16 (6 by maintainers)

Most upvoted comments

The isCss check is a significant speed-up, enough to reland this change. https://github.com/GoogleChrome/lighthouse/issues/11289

A quick follow-up we will do is to sort the elements by image size, then apply a sensible time budget to fetching source rules. https://github.com/GoogleChrome/lighthouse/pull/11340#issuecomment-682259063
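A minimal sketch of what that follow-up could look like, assuming the gatherer has the ImageElements array with displayedWidth/displayedHeight and some fetchSourceRules(driver, devtoolsNodePath, element) helper (the helper name, budget value, and loop shape are illustrative, not the exact upstream code):

// Sketch: prioritize the largest displayed images and stop fetching source
// rules once a time budget is exhausted, so one slow page can't blow up runtime.
const SOURCE_RULE_BUDGET_MS = 5000; // illustrative budget

async function fetchSourceRulesWithBudget(driver, elements, fetchSourceRules) {
  // Largest displayed area first, since those have the biggest CLS impact.
  const sorted = [...elements].sort(
    (a, b) => b.displayedWidth * b.displayedHeight - a.displayedWidth * a.displayedHeight
  );

  const start = Date.now();
  for (const element of sorted) {
    if (Date.now() - start > SOURCE_RULE_BUDGET_MS) break; // budget exhausted
    await fetchSourceRules(driver, element.devtoolsNodePath, element);
  }
}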

I did some exploring on this issue, but I couldn't find (and don't think I have) access to LR, so this is all from observing what happens on my local machine:

I don't know if the slowdown from getMatchedStylesForNode is related to the OOM issue, but my intuition is that they might be two separate things to consider, especially after reading about the performance issues previously encountered when using getMatchedStylesForNode.

In the font-size audit, as far as I can tell, we optimize how many times we actually call getMatchedStylesForNode. That's not something I did when I wrote unsized-images, because I didn't realize how slow getMatchedStylesForNode can be. To improve the runtime of unsized-images by reducing calls to getMatchedStylesForNode, one optimization I think we should include is to change

if (!element.isInShadowDOM) {

to

if (!element.isInShadowDOM && !element.isCss) {

since we don't currently check CSS background-images in unsized-images anyway, and cssWidth/cssHeight aren't as relevant to background-images because of background-repeat and parent element sizing.
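For context, a sketch of how that guard might sit in the gatherer's per-element loop (the loop shape and the fetchSourceRules name are assumptions here, not the exact upstream code):

for (const element of elements) {
  // unsized-images doesn't flag CSS background-images, so skip them (and
  // shadow-DOM nodes) before paying for a getMatchedStylesForNode round trip.
  if (!element.isInShadowDOM && !element.isCss) {
    await fetchSourceRules(driver, element.devtoolsNodePath, element);
  }
}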

Additionally, I agree with @patrickhulce about

finding this data for the largest X displayed dimension images since those will have largest CLS impact

or other workarounds that can reduce the total calls to getMatchedStylesForNode.

I noticed that a large number of the ImageElements on http://cosmetiqon.gr/ had the same src because they were the default GIF for the site's lazy-loaded images. There might be potential here to reduce the calls to getMatchedStylesForNode, e.g. caching the CSS.GetMatchedStylesForNodeResponse for sibling nodes that have the same CSS class (which might make the OOM worse), or not calling getMatchedStylesForNode on lazy-loaded images outside of the viewport (not sure if this is behavior we'd want to encourage).
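A rough sketch of the caching idea, assuming we can derive some per-element cache key (e.g. its class list); this is not the upstream implementation and, as noted, the retained responses are themselves a memory cost:

// Memoize matched-style responses for elements that share a cache key.
// Fewer protocol calls, but the cached responses could make the OOM worse.
const matchedRulesCache = new Map();

async function getMatchedStylesCached(driver, nodeId, cacheKey) {
  if (cacheKey && matchedRulesCache.has(cacheKey)) {
    return matchedRulesCache.get(cacheKey);
  }
  const matchedRules = await driver.sendCommand('CSS.getMatchedStylesForNode', {nodeId});
  if (cacheKey) matchedRulesCache.set(cacheKey, matchedRules);
  return matchedRules;
}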

As for the OOM issue, I had some leads I could think of:

  1. When running unsized-images on http://cosmetiqon.gr/ there were a handful of errors like
method <= browser ERR:error DOM.pushNodeByPathToFrontend  +16s

that disappear after adding && !element.isCss. Is there a possibility of a memory leak caused by too many errors?

  2. I double-checked #11188 to see if I had unknowingly added a memory leak somewhere. I couldn't find anything obvious, but I also believe we should make the following change:
const matchedRules = await driver.sendCommand('CSS.getMatchedStylesForNode', {
  nodeId: nodeId,
});

to

const matchedRules = await driver.sendCommand('CSS.getMatchedStylesForNode', {
  nodeId,
});

In the worst case this somehow causes a circular reference since nodeId was declared earlier; in the best case this is just a nit.

  3. I ran ndb on my local machine with breakpoints at the start (snapshots 1 & 5) and end (snapshots 2 & 6) of async afterPass(passContext, loadData) within image-elements.js. [screenshot: heap snapshots] At face value there was a reduction in memory used, and it was hard for me to find anything that looked like a promising lead about where the OOM came from, so this OOM feels more insidious the more time I spend on it.

  4. I estimated the ballpark memory cost if, for some reason, we were storing / not deallocating all the matched rules found throughout running image-elements.js:
  • I JSON.stringified instances of matchedRules and found sizes ranging from ~30,000 chars/bytes to ~200,000 chars/bytes, with the median falling between ~100,000–150,000 chars/bytes for a page such as http://cosmetiqon.gr/ (rough measurement sketched below this list)

  • Based on a 150,000-byte ballpark for each instance of matchedRules on http://cosmetiqon.gr/, and the fact that it has ~100 ImageElements, we get 150,000 * 100 -> ~15 MB

  • I do not know whether this is a reasonable use of memory/cache when running Lighthouse or LR, or whether we actually keep all matchedRules alive; I'll check ndb later to see if this happens locally
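The per-response size numbers above came from something like the following rough measurement (string length ≈ bytes for a mostly-ASCII payload), not from a proper heap profile:

// Rough ballpark of how much one matched-styles response would cost if retained.
async function estimateMatchedRulesBytes(driver, nodeId) {
  const matchedRules = await driver.sendCommand('CSS.getMatchedStylesForNode', {nodeId});
  return JSON.stringify(matchedRules).length;
}
// ~150,000 bytes per response × ~100 ImageElements ≈ 15 MB if every response were kept alive.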

Once we have some URLs we can start digging in more…

I'd start by adding timing marks around these three areas (sketched after the list):

  1. DOM.pushNodeByPathToFrontend
  2. CSS.getMatchedStylesForNode
  3. the calls to getEffectiveSizingRule
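A sketch of what those timing marks could look like, using plain timestamps rather than Lighthouse's own logging helpers; the driver/element variables and the getEffectiveSizingRule signature are assumptions here:

// Wrap a call with a timing mark so slow protocol methods stand out in the logs.
async function timed(label, fn) {
  const start = Date.now();
  try {
    return await fn();
  } finally {
    console.log(`${label} took ${Date.now() - start}ms`);
  }
}

// Inside the gatherer loop, around the three suspect areas:
const {nodeId} = await timed('DOM.pushNodeByPathToFrontend', () =>
  driver.sendCommand('DOM.pushNodeByPathToFrontend', {path: element.devtoolsNodePath}));
const matchedRules = await timed('CSS.getMatchedStylesForNode', () =>
  driver.sendCommand('CSS.getMatchedStylesForNode', {nodeId}));
const width = await timed('getEffectiveSizingRule', async () =>
  getEffectiveSizingRule(matchedRules, 'width'));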

It’s unfortunate we are working from a devtools node path here. Perhaps there’s a way to grab the DOM snapshot (must verify that the width/height properties from the snapshot don’t include intrinsic image sizes) and then connect that data to the image element we scraped.

Either:

  1. via constructing the devtools node path from the snapshot or
  2. for each of the unsized images in the snapshot (this limits the amount of work), get the remote object id and determine its node path in the page. Tweak the artifact to have a boolean isExplicitlySized (no need to get the actual size, the audit doesn't care): ImageElements would have isExplicitlySized set to true iff the snapshot for that element has a value for width and height (see the sketch after this list).
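A minimal sketch of option 2 using DOMSnapshot.captureSnapshot to read computed width/height per IMG node; the exact snapshot field layout should be double-checked, and (per the caveat above) the computed values may still reflect intrinsic image sizes:

// Capture a snapshot with just the computed styles we care about.
const {documents, strings} = await driver.sendCommand('DOMSnapshot.captureSnapshot', {
  computedStyles: ['width', 'height'],
});

for (const doc of documents) {
  const {nodeIndex, styles} = doc.layout;
  for (let i = 0; i < nodeIndex.length; i++) {
    if (strings[doc.nodes.nodeName[nodeIndex[i]]] !== 'IMG') continue;
    // styles[i] holds string indices in the same order as computedStyles above.
    const [width, height] = styles[i].map(idx => strings[idx]);
    // Hypothetical artifact flag; needs verification against intrinsic sizing.
    const isExplicitlySized = width !== 'auto' && height !== 'auto';
  }
}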

After discussing with @lemcardenas, we're going to revert #11217 and #11188 before we ship 6.3.0.