gatsby: Gatsby does not resolve / find unicode URLs encoded with encodeURI

Description

Gatsby does not support pages with a path containing unicode characters and encoded with encodeURI. The development server (gatsby develop) will fail to find these pages, while the production build (gatsby build) will fail to find them if the service worker plugin (gatsby-plugin-offline) is enabled.

This was previously discussed in this issue; however, it was closed by the Gatsby bot, so I am reopening it, as it is a crucial bug for me.

Steps to reproduce

  1. Create a new gatsby project from the default starter.
  2. Add a new page component in src/components/page3.js:
import React from "react"
import { Link } from "gatsby"

import Layout from "./layout"
import SEO from "./seo"

const ThirdPage = () => (
  <Layout>
    <SEO title="Page three" />
    <h1>Hi from the third page</h1>
    <p>Welcome to page 3</p>
    <Link to="/">Go back to the homepage</Link>
  </Layout>
)

export default ThirdPage
  3. Add this to gatsby-node.js:
const path = require('path')

exports.createPages = ({ actions }) => {
    const { createPage } = actions
    createPage({
        path: encodeURI("/page-שלוש/"), // this is "three" in Hebrew
        component: path.resolve('./src/components/page3.js'),
    })
}
  4. Run gatsby develop.
  5. Try navigating to the page via either http://localhost:8000/page-שלוש/ or http://localhost:8000/page-%D7%A9%D7%9C%D7%95%D7%A9/, and you’ll see nothing comes up.
  6. Add gatsby-plugin-offline in gatsby-config.js (simply un-comment it).
  7. Run gatsby build && gatsby serve.
  8. Once again try navigating to the aforementioned URLs (port 9000), and again you’ll see nothing comes up.
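For reference, the two URLs used in the navigation steps are the same path in two forms: the second is what encodeURI produces from the first. A minimal sketch in plain Node.js (no Gatsby involved; the variable names are only illustrative):

```javascript
// The Unicode path passed to encodeURI in gatsby-node.js.
const unicodePath = "/page-שלוש/";

// encodeURI percent-encodes the UTF-8 bytes of each Hebrew letter
// and leaves "/" and "-" untouched.
const encodedPath = encodeURI(unicodePath);
console.log(encodedPath); // "/page-%D7%A9%D7%9C%D7%95%D7%A9/"

// Decoding the encoded form gives back the original Unicode path.
console.log(decodeURI(encodedPath) === unicodePath); // true
```

So whichever form is typed into the address bar, both refer to the page registered in createPages.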

Expected result

Encoded URLs should be resolved and found correctly.

Actual result

URLs are not found.

Environment

gatsby info --clipboard:

  System:
    OS: Linux 4.4 Ubuntu 16.04.6 LTS (Xenial Xerus)
    CPU: (4) x64 Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
    Shell: 4.3.48 - /bin/bash
  Binaries:
    Node: 10.16.0 - ~/n/bin/node
    npm: 6.10.3 - ~/n/bin/npm
  Languages:
    Python: 2.7.12 - /usr/bin/python
  npmPackages:
    gatsby: ^2.13.70 => 2.13.70
    gatsby-image: ^2.2.9 => 2.2.9
    gatsby-plugin-manifest: ^2.2.5 => 2.2.5
    gatsby-plugin-offline: ^2.2.6 => 2.2.6
    gatsby-plugin-react-helmet: ^3.1.3 => 3.1.3
    gatsby-plugin-sharp: ^2.2.12 => 2.2.12
    gatsby-source-filesystem: ^2.1.9 => 2.1.9
    gatsby-transformer-sharp: ^2.2.6 => 2.2.6
  npmGlobalPackages:
    gatsby-cli: 2.6.13

Note that this is a WSL installation on Windows 10.0.17134.799.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 32 (11 by maintainers)

Most upvoted comments

I found a local fix, but I really think it should be fixed on gatsby.

To make a long story short - you should use decodeURIComponent instead of encodeURI since you’re naming a filename and not a URL path. Taking the example from above this should work:

const path = require('path')

exports.createPages = ({ actions }) => {
    const { createPage } = actions
    createPage({
        path: decodeURIComponent("/page-שלוש/"), // this is "three" in Hebrew
        component: path.resolve('./src/components/page3.js'),
    })
}

I think that gatsby should apply decodeURIComponent on file paths by itself to avoid similar issues.
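To illustrate how the decoding functions behave on the path from this ticket, here is a small sketch in plain JavaScript. It only shows standard language behaviour and makes no claim about how Gatsby matches paths internally; the variable names are illustrative.

```javascript
const rawPath = "/page-שלוש/";
const percentEncoded = encodeURI(rawPath);

// Decoding a path that contains no percent-escapes is a no-op,
// so wrapping an already-Unicode string is harmless.
const decodedFromRaw = decodeURIComponent(rawPath);

// Decoding the percent-encoded form yields the same Unicode path.
const decodedFromEncoded = decodeURIComponent(percentEncoded);

console.log(decodedFromRaw === rawPath);     // true
console.log(decodedFromEncoded === rawPath); // true

// The two decoders differ only on reserved characters such as "/":
// decodeURI keeps "%2F" escaped, decodeURIComponent turns it into "/".
const keptEscaped = decodeURI("/a%2Fb/");
const fullyDecoded = decodeURIComponent("/a%2Fb/");
console.log(keptEscaped);  // "/a%2Fb/"
console.log(fullyDecoded); // "/a/b/"
```

If a slug may legitimately contain an escaped reserved character, decodeURI is the more conservative of the two.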

@gatsbybot @wardpeet Nope, not stale. Still happening, and the reproduction information is available in this ticket.

@btk encodeURI() should not break the page. There shouldn’t be a special reason to use it, as it is a standard JavaScript method. Moreover, creating a page with Unicode characters (such as Turkish script) but without this method will fail to load the page on MS Edge (see #17556).

I have this issue when using locales from Prismic - I have a slug with cyrillic characters, i.e. /bg/за-нас (shown as /bg/%D0%B7%D0%B0-%D0%BD%D0%B0%D1%81) and when navigating to that page, Gatsby says it cannot be found, even though it shows that exact page url below.

EDIT: I didn’t read above efforts properly 🤦‍♂️ - I tried decodeURI(slug) and that seems to have fixed my issue.
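A rough sketch of that decodeURI(slug) workaround, using the Cyrillic slug from this comment. The helper name safeDecodeSlug is hypothetical (it is not part of Gatsby or Prismic); the try/catch is there because decodeURI throws a URIError on malformed escape sequences.

```javascript
// Hypothetical helper: decode a percent-encoded slug, falling back to
// the original string if it contains a malformed escape sequence.
function safeDecodeSlug(slug) {
  try {
    return decodeURI(slug);
  } catch (err) {
    if (err instanceof URIError) {
      return slug; // leave malformed input untouched
    }
    throw err;
  }
}

console.log(safeDecodeSlug("/bg/%D0%B7%D0%B0-%D0%BD%D0%B0%D1%81")); // "/bg/за-нас"
console.log(safeDecodeSlug("/bg/за-нас"));                          // "/bg/за-нас"
console.log(safeDecodeSlug("/bg/%"));                               // "/bg/%"
```

The decoded value is what would then be passed as path to createPage.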

@machineghost The docs have been updated in those two sections to reflect this limitation. Cheers for helping me find those limitations. I spent A LOT of time wondering why those URLs were coming up as 404.

Nothing to do with Edge, it happens on Chrome.

I've got the same issue here: our client (Gatsby / WordPress API hosted on Netlify) created a URL with an accent in WordPress, then decided to remove the accent, created a new URL, and asked us to redirect the old one for SEO purposes. We used this in gatsby-node.js:

createRedirect({
  fromPath: "/réseaux-sociaux",
  toPath: "/reseaux-sociaux/",
  isPermanent: true,
})

When we try to access “/réseaux-sociaux”, a 404 is displayed for a few ms before being replaced by a blank page. We also noticed that every page whose path contains a unicode character displays a blank page instead of a 404.

We have built several websites with Gatsby and did not face this issue a few months ago. We tried downgrading both the gatsby and react versions, but that did not resolve anything.

We also tried this before we found out that the issue wasn’t related to the redirects. It didn’t work:

createRedirect({
  fromPath: encodeURI("/réseaux-sociaux"),
  toPath: "/reseaux-sociaux/",
  isPermanent: true,
})

@roadwig I believe it’s a problem with Edge, not Gatsby. The whole reason I first started encoding my URLs is because Edge was failing to load them.

Edit: Reading through your issue, perhaps this is indeed a problem with Gatsby. Hard to say if Gatsby is to blame or Edge. Regardless, this definitely seems related.