docusaurus: Docusaurus v2 doesn't allow for "mypagename.html" links
We migrated, and now a series of issues is being raised because none of our 7-year-old links ending in .html
resolve correctly.
This is the main issue tracking this: https://github.com/facebook/watchman/issues/798
A couple of kind souls have submitted PRs to change links elsewhere: https://github.com/facebook/watchman/pull/806 https://github.com/facebook/watchman/pull/801
but this is really a docusaurus issue. How can we get this fixed?
About this issue
- State: closed
- Created 4 years ago
- Comments: 30 (11 by maintainers)
Commits related to this issue
- nodejs page should be /nodejs.html, not /nodejs see https://github.com/facebook/docusaurus/issues/2697 — committed to slorber/watchman by slorber 4 years ago
- feat: docs pathname frontmatter (for #2697) — committed to slorber/docusaurus by slorber 4 years ago
- feat: docs pathname frontmatter (for #2697) — committed to slorber/docusaurus by slorber 4 years ago
- feat(v2): introduce docs slug in front matter (#2771) * feat: docs pathname frontmatter (for #2697) * feat: docs pathname frontmatter (for #2697) * chore: comment typo * feat: add slug front... — committed to facebook/docusaurus by slorber 4 years ago
- Fix broken .html pages with redirect Summary: The upgrade to Docusarus 2 broke all our old links, including ones referenced in the Pysa blog post: https://engineering.fb.com/security/pysa/ The issue... — committed to facebook/pyre-check by gbleaney 4 years ago
Hi all,
Here are a few comments, proposals and questions I have
Valid urls of Watchman
I understand that a URL like this should work:
Does it mean that BOTH URLs should work?
I don’t know how the Watchman site worked before. Is it still online somewhere to check?
Hosting on GitHub Pages
Watchman is hosted on GitHub Pages. As far as I know, it’s not possible to do any server-side redirect on this hosting solution.
Also, for GitHub Pages to serve a non-404 answer, the file actually has to exist on the FS with the html extension.
On other platforms like Netlify, it would have been possible to drop in a simple _redirects file and handle this.
It might be a good idea to start using a custom domain, which would allow more flexibility to change the underlying hosting solution without too much pain.
Using .html extension in document id
If Watchman just needs the /nodejs.html page, and not the /nodejs page, it’s possible to use .html as a suffix in document ids.
Using a file like filename.html.md also works.
Duplicating the pages
Creating 2 HTML pages for the same document could be a portable solution.
The duplicate pages would have a canonical URL pointing to the main page so that search engines know which page is the main one.
Is it worth redirecting in this case? Or can the browser just stay on the non-canonical page if it serves the correct content?
404 + redirecting
This looks like the solution @lex111 implemented here: https://github.com/facebook/docusaurus/pull/2704 which was not merged for SEO reasons related to serving a 404.
I found this note here: https://github.com/rafrex/spa-github-pages
I’d prefer not to do that as well but that remains an option.
What do you think?
Hi @wez, sorry for the delay.
Just wanted to let you know that the fixes you need are already merged on master and will be released soon in 2.0.0-alpha.57.
What you will need to do on Watchman:
- Add a slug to each doc, with the extension you want; it will be used for the main/canonical/SEO URL.
- Use the client redirects plugin (see the plugin doc).
Your configuration should look like:
And if a /docs/nodejs.html page exists, then going to /docs/nodejs will redirect to /docs/nodejs.html.
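The configuration snippet from the original comment did not survive in this copy of the thread. As a rough sketch only, assuming the plugin is @docusaurus/plugin-client-redirects and that its toExtensions option behaves as the plugin doc describes, a setup matching the behavior described above might look like the following. Please check the option names against the plugin doc for the version you are on.

```javascript
// docusaurus.config.js (fragment; other required site fields are omitted)
// Assumption: the plugin package name and the `toExtensions` option are
// taken from the plugin documentation, not from this thread.
module.exports = {
  plugins: [
    [
      '@docusaurus/plugin-client-redirects',
      {
        // For every existing page whose path ends in .html
        // (e.g. /docs/nodejs.html), also emit a lightweight redirect
        // from the extension-less path (e.g. /docs/nodejs).
        toExtensions: ['html'],
      },
    ],
  ],
};
```

The redirect files are only generated by the production build, so this has no visible effect in dev mode.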
I see. So it should work to build the site first without the legacy docs folder, then manually add the legacy docs afterwards in a subfolder of ‘build’…correct?
We’ll also make a docusaurus serve command and recommend a way to test a production build locally => https://github.com/facebook/docusaurus/issues/3062
Sorry that your experience wasn’t as great as it should have been, and for the time lost giving it a try 😞
I didn’t document that the plugin worked only for the production build in the initial alpha 58 release, sorry about that. It is currently documented in the master branch here, but I didn’t backport it to the alpha 58 doc.
Next time you give it a try, please reach out on Discord, I’ll be there to help.
It is possible to test locally, but it still involves the production build.
You can run the docusaurus build command (normally via yarn build), then serve the build folder locally with any HTTP server (I’d recommend serve, a very simple one, no need for Apache or whatever), and open http://localhost:5000 in a browser.
It is not so simple to make this work with yarn start, because the redirect files are lightweight and not part of the Docusaurus client-side routing system (SPA based on React / ReactRouter). We should be able to redirect to the correct page asap, without needing to wait for the React and Docusaurus JS infra to download.
It may be possible to generate those lightweight redirect files before spawning the webpack dev server, but that would probably decrease the startup speed of the project in dev mode.
@slorber Yeah, that’s fine with me! Thanks for looking at this!
@slorber This plugin by @lex111 might be helpful - https://github.com/single-spa/single-spa.js.org/blob/master/website/src/plugins/docusaurus-plugin-redirects/src/index.js
Hi @wez @JoelMarcey
I understand that we are looking for a portable solution, and not really willing to leverage hosting platform configuration.
That means that we must write these 2 files to disk if we don’t want a 404 status code from GitHub Pages:
I’m looking at how to make this work and I see 2 solutions:
Note, it seems possible to trigger a client-side redirection with an HTML tag as well. Google seems to understand it as a redirect (despite it not being recommended).
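To make the idea of a lightweight redirect file concrete, here is a minimal sketch. It assumes the HTML tag in question is a meta refresh tag, and it adds a canonical link so that search engines can tell which URL is the main one. The names createRedirectPage and escapeHtmlAttr are hypothetical helpers for illustration, not Docusaurus APIs.

```javascript
// Hypothetical helper: escape a value for use inside a double-quoted
// HTML attribute.
function escapeHtmlAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: build a small static HTML page that sends the
// browser to `toUrl` without loading any React or Docusaurus JS.
function createRedirectPage(toUrl) {
  const safeUrl = escapeHtmlAttr(toUrl);
  return [
    '<!DOCTYPE html>',
    '<html>',
    '  <head>',
    '    <meta charset="UTF-8">',
    // Assumption: a meta refresh with a 0 second delay acts as the redirect.
    `    <meta http-equiv="refresh" content="0; url=${safeUrl}">`,
    // Tells search engines which URL is the canonical one.
    `    <link rel="canonical" href="${safeUrl}">`,
    '  </head>',
    '  <body>',
    // Plain link as a fallback if the refresh is not followed.
    `    <a href="${safeUrl}">Redirecting to ${safeUrl}</a>`,
    '  </body>',
    '</html>',
  ].join('\n');
}

console.log(createRedirectPage('/nodejs.html'));
```

A build step could then write the returned string to the path that should redirect, for example with fs.writeFileSync, so that the static host finds a real file there and does not answer with a 404.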
It should be possible to provide more advanced configuration, like:
@wez, if we succeed in making nodejs.html the main page, is it ok to have a simple client-side redirect from “/nodejs” to “/nodejs.html”?
According to this comment: https://github.com/facebook/watchman/issues/798#issuecomment-619300064
It’s not totally clear to me what you have tried, but it looks like you tried this:
nodejs/index.html.md
What I’m suggesting is nodejs.html.md as the filename. If you specify the document id in the frontmatter, you need to use id: nodejs.html. In the Watchman docs I can see the nodejs doc has a frontmatter id (the filename actually has no effect on the pathname).
I’m able to get this working locally (it should also work for GH pages). I opened an example PR for the Watchman website here: https://github.com/facebook/watchman/pull/812
If we validate that the workaround works:
- Is id: nodejs.html the API we want to recommend for this use case? (It’s a bit weird to me; we should probably be able to customize the path of each doc completely?)
- We should decide if we want a document pathname customization feature. If we start migrating the Watchman docs to the id: nodejs.html workaround and then ship a clean way to solve this use case 1 week later, we’d have to migrate the Watchman site from the workaround to the clean solution.