puppeteer: Hash-only navigation doesn't work
Since hash-only navigation causes no network requests and does not fire a load event, the following gets stuck:
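const puppeteer = require('puppeteer');
(async() => {
  let browser = await puppeteer.launch();
  let page = await browser.newPage();
  // Full navigation: fires the load event, so this resolves.
  await page.goto('https://example.com');
  // Hash-only navigation: no load event, so this never resolves.
  await page.goto('https://example.com#ohh'); // <== stuck here
  await browser.close();
})();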
About this issue
- State: closed
- Created 7 years ago
- Reactions: 41
- Comments: 27 (8 by maintainers)
Commits related to this issue
- Support anchor navigation This patch teaches the following methods to support anchor navigation: - `page.goto` - `page.waitForNavigation` - `page.goBack` - `page.goForward` Fixes #257. — committed to aslushnikov/puppeteer by aslushnikov 7 years ago
- feat(Chromium): roll Chromium to r548690 This roll includes: - https://crrev.com/548598 - DevTools: implement Page.setBypassCSP method - https://crrev.com/548690 - DevTools: introduce Page.navigatedW... — committed to aslushnikov/puppeteer by aslushnikov 6 years ago
- feat(Chromium): roll Chromium to r548690 (#2323) This roll includes: - https://crrev.com/548598 - DevTools: implement Page.setBypassCSP method - https://crrev.com/548690 - DevTools: introduce Page.... — committed to puppeteer/puppeteer by aslushnikov 6 years ago
- fix(Page): support anchor navigation This patch fixes puppeteer navigation primitives to work with same-document navigation. Same-document navigation happens when document's URL is changed, but docu... — committed to aslushnikov/puppeteer by aslushnikov 6 years ago
- fix(Page): support anchor navigation (#2338) This patch fixes puppeteer navigation primitives to work with same-document navigation. Same-document navigation happens when document's URL is change... — committed to puppeteer/puppeteer by aslushnikov 6 years ago
- Allow navigating to websites using hash, https://github.com/GoogleChrome/puppeteer/issues/257 — committed to thealphadollar/salvator by thealphadollar 6 years ago
@Means88 This works as a workaround:
https://github.com/GoogleChromeLabs/puppeteer-examples/blob/master/hash_navigation.js shows how to listen for hashchange events and react accordingly. You might be able to extract ideas from that for a workaround.
Switching between different versions of Puppeteer…
Result:
page.goto('same.url#different_hash', {waitUntil: 'networkidle'}) ✓ PASSED
page.goto('same.url#different_hash', {waitUntil: 'networkidle0'}) ✗ FAILED
page.goto('same.url#different_hash', {waitUntil: 'networkidle0'}) ✗ FAILED
So it looks like there is a regression in the latest versions, or I don’t understand the new options…
And with the history API
BTW, page.url() returns the original url, is it a feature or a bug?
This is related to a problem in Chrome’s protocol. In short, hash navigation won’t trigger any of the page initialization events.
A puppeteer problem is that page.goto FORCES you to attempt to listen to one of those events (unless there’s some undocumented configuration option), timing out with an error if (when) they never come. The only reliable way around this is to set a low timeout, catch (and disregard) the error that will come, and then manually check if the page has loaded with your own logic.
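The low-timeout approach described above could be sketched roughly as follows. This is only an illustration: `gotoIgnoringTimeout` and `isReady` are hypothetical names, the 2000 ms default is arbitrary, and recognising a timeout by the error's name or message is an assumption that may differ between Puppeteer versions.

```javascript
// Sketch only: `gotoIgnoringTimeout` and `isReady` are hypothetical names.
// `page` is expected to be a Puppeteer Page (anything with a goto method works).
async function gotoIgnoringTimeout(page, url, isReady, timeout = 2000) {
  try {
    // Short timeout: for a hash-only change the awaited event may never fire.
    await page.goto(url, { timeout });
  } catch (err) {
    // Assumption: a timeout is recognisable by its name or message.
    const isTimeout = Boolean(err) &&
      (err.name === 'TimeoutError' || /timeout/i.test(String(err.message)));
    // Anything that is not a timeout is a real failure, so rethrow it.
    if (!isTimeout) throw err;
  }
  // Manual check with your own logic, e.g. compare the hash or look for an element.
  return isReady(page);
}
```

A call might look like `await gotoIgnoringTimeout(page, 'https://example.com#ohh', async (p) => (await p.evaluate(() => location.hash)) === '#ohh')`. Only timeouts are swallowed, so other navigation failures still surface.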
Is there some way around this?
Is there an upstream bug that we can link to?
The incompatibility with History means that sites using react-router will have issues using page.url(), page.waitForNavigation(), etc.
Here are some of my workarounds:
I use page.waitForSelector() instead of page.waitForNavigation() if possible.
I use these two functions for dealing with the URL
and these for history:
based on https://github.com/GoogleChrome/puppeteer/blob/7d18275fb981e01cec4a4fbac61a9c66e46947bc/lib/Page.js#L532-L533
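For readers who want a starting point, helpers of this general kind might look like the sketch below. These are not the commenter's original functions: every name here (`currentUrl`, `waitForUrl`, `historyBack`, `historyForward`) is hypothetical, and the approach simply assumes that reading `window.location` and calling `window.history` inside the page via `page.evaluate` reflects hash and History API changes.

```javascript
// Hypothetical helpers, not the commenter's original code.
// Read the URL from inside the page rather than relying on page.url().
async function currentUrl(page) {
  return page.evaluate(() => window.location.href);
}

// Poll the in-page URL until `predicate` accepts it or the timeout elapses.
async function waitForUrl(page, predicate, { timeout = 5000, interval = 100 } = {}) {
  const deadline = Date.now() + timeout;
  while (Date.now() < deadline) {
    const url = await currentUrl(page);
    if (predicate(url)) return url;
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
  throw new Error(`URL did not match within ${timeout}ms`);
}

// Step through history inside the page instead of page.goBack()/page.goForward().
async function historyBack(page) {
  await page.evaluate(() => window.history.back());
}

async function historyForward(page) {
  await page.evaluate(() => window.history.forward());
}
```

For example, `await historyBack(page); await waitForUrl(page, (url) => !url.includes('#'));` would step back and then wait for the hash to disappear. Polling is used here because it does not depend on any navigation event firing.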
@onamission thanks! It works for me.
I am not sure if this is completely relevant to this thread, but I just created a script that allows us to navigate to hashes on a page to take various screenshots. The reason I question the relevance is because we have some JS working in the background that assists our navigation, so I don’t know if this script would work without the page JS or not. Anyway, here is what works for me (sorry, it is node6):
Using the browser inspector and Charles, it appears that the only network traffic this creates is the initial call to the server. After that has rendered, the network goes quiet.
This is a solution for the issue I was trying to solve that led me to #491 – which brought me here.
I hope this helps someone.