puppeteer: load never fires when setting setRequestInterceptionEnabled to true (and disable JS)

https://github.com/GoogleChrome/puppeteer/pull/565 fixed most of the cases I found where pages never report the load event in puppeteer, but I have found another URL that breaks in the same way, though only after calling setRequestInterceptionEnabled(true).

Environment

Puppeteer: 0.11.0. Node: 7.10.1. OS: macOS Sierra.

Reduced test case (updated)

JS disabled + request interception => never loads

This is the actual problem behind this issue. When I originally posted, I missed the critical detail that JS needs to be disabled for the load to never fire:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    ignoreHTTPSErrors: true,
    args: ['--disable-setuid-sandbox', '--no-sandbox']
  });
  const page = await browser.newPage();

  // Both settings are needed to reproduce the hang.
  await page.setJavaScriptEnabled(false);
  await page.setRequestInterceptionEnabled(true);
  // Let every intercepted request through unchanged.
  page.on('request', intercepted => intercepted.continue());

  await page.goto('https://apartmentsdoralfl.com/');
  console.log('page load DONE'); // never happens
  await browser.close();
})();

NOTE: the URL I’m using in this test case is not the one I had when I originally raised the issue. The original URL now works (along with about 80% of the problematic URLs I had), but the one listed here does not.
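
While debugging a case like this, it can help to bound the wait so the script reports the problem instead of sitting on an unresolved promise. The following is only a minimal sketch and not part of Puppeteer: `withTimeout` is a hypothetical helper name, and the 30-second budget in the usage comment is an arbitrary assumption. It races any promise (for example the one returned by `page.goto`) against a timer.

```javascript
// Hypothetical helper (not a Puppeteer API): rejects if `promise` has not
// settled within `ms` milliseconds.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms
    );
  });
  // Whichever settles first wins; always clear the timer so Node can exit.
  return Promise.race([promise, timeout]).then(
    value => { clearTimeout(timer); return value; },
    error => { clearTimeout(timer); throw error; }
  );
}

// Usage inside the test case above (assumed 30s budget):
//   await withTimeout(page.goto('https://apartmentsdoralfl.com/'), 30000, 'goto');
```

This does not fix the missing load event; it only turns a silent hang into an error that can be caught and logged.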


Thanks for all the hard work on puppeteer, it’s shaping up for me to be able to launch a new major version of Penthouse with it replacing phantomjs. 👍

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 21 (7 by maintainers)

Most upvoted comments

@aslushnikov Thanks for taking a look 👋 If it helps at all with prioritization, @pocketjoso’s work to use Puppeteer in Penthouse is important for the ecosystem as tools like critical are relying on it for a shift over to Chrome headless.

@pocketjoso ok thanks, I can repro now, will take a look.

@aslushnikov Thanks for the update. I see that this works, but why is the behavior so different (i.e. it works “as expected”, without changing the load trigger) when I don’t use setRequestInterceptionEnabled?

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    ignoreHTTPSErrors: true,
    args: ['--disable-setuid-sandbox', '--no-sandbox']
  });
  const page = await browser.newPage();

  await page.setJavaScriptEnabled(false);

  // With these two lines enabled load doesn't fire.
  // With them commented it does.
  // await page.setRequestInterceptionEnabled(true);
  // page.on('request', interceptedRequest => interceptedRequest.continue());

  await page.goto('https://apartmentsdoralfl.com/');

  console.log('page load DONE');
  browser.close();
})();

Granted, I had a look at this site’s waterfall and yeah… it’s messy, and I can see how it causes problems. I just wish that enabling request interception didn’t come with hard-to-understand side effects like this…
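
One way to narrow down which part of a messy waterfall is holding things up is to record requests that start but never finish. This is a rough diagnostic sketch, not something from the maintainers: `trackPendingRequests` is a hypothetical helper, and it assumes the page emits `request`, `requestfinished` and `requestfailed` events. Depending on the Puppeteer version, `url` on a request may be a property or a method, so the sketch handles both.

```javascript
// Hypothetical diagnostic helper: remembers which requests have started but
// not yet finished or failed. `emitter` is anything with an `.on()` method;
// the event names are assumed to match Puppeteer's Page events.
function trackPendingRequests(emitter) {
  const pending = new Set();
  // Accept `url` as either a plain property or a method.
  const urlOf = req => (typeof req.url === 'function' ? req.url() : req.url);
  emitter.on('request', req => pending.add(req));
  emitter.on('requestfinished', req => pending.delete(req));
  emitter.on('requestfailed', req => pending.delete(req));
  // Returns the URLs of requests that are still in flight.
  return () => Array.from(pending, urlOf);
}

// Usage (assumption: called before page.goto, next to the interception handler):
//   const pendingUrls = trackPendingRequests(page);
//   setTimeout(() => console.log('still pending:', pendingUrls()), 10000);
```

Comparing the pending list with and without request interception enabled might show whether a specific request is stuck waiting on `continue()`.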