remark: Fenced code block lang and meta not separated by curly bracket

Subject of the issue

When splitting the code block flag into lang and meta, remark does not account for this case:

```js{foo=1 bar=2 baz=3}
console.log("hello world");
```

According to the current implementation, only spaces and tabs are considered as separators:

https://github.com/remarkjs/remark/blob/8d8e3bcfbc1c89b8eae9119703f13414269ebf73/packages/remark-parse/lib/tokenize/code-fenced.js#L230-L245

Not taking { into account leads to an issue in Prettier. For example, a flag like mylang{foo="hello   world"} is split into the lang mylang{foo="hello and the meta world"}, instead of mylang and {foo="hello   world"}. When Prettier stitches lang and meta back together, the result becomes mylang{foo="hello world"}, i.e. the extra spaces are lost. See Playground.
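
The lost spaces can be reproduced with a minimal sketch. The helpers `splitOnWhitespace` and `stitch` are hypothetical stand-ins for a whitespace-only split and for joining lang and meta with a single space; they are not the actual remark or Prettier code.

```javascript
// Hypothetical stand-in for a whitespace-only split: lang is everything up
// to the first space or tab, meta is the rest without leading whitespace.
function splitOnWhitespace(info) {
  const index = info.search(/[ \t]/);
  if (index === -1) return { lang: info, meta: null };
  const meta = info.slice(index).replace(/^[ \t]+/, "");
  return { lang: info.slice(0, index), meta: meta || null };
}

// Hypothetical stand-in for a formatter that joins lang and meta with one space.
function stitch({ lang, meta }) {
  return meta ? `${lang} ${meta}` : lang;
}

const info = 'mylang{foo="hello   world"}';
const parts = splitOnWhitespace(info);
console.log(parts.lang);    // mylang{foo="hello
console.log(parts.meta);    // world"}
console.log(stitch(parts)); // mylang{foo="hello world"}
```

The run of three spaces falls between lang and meta, so it is discarded by the split and replaced by a single space when the two halves are joined.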

If accepting { as a separator between lang and meta counts as a bugfix, I’m happy to open a PR that adds it. Doing so would spare Prettier the need to hack the lang value – this workaround dates back to remark v5, i.e. before #345.

Prettier currently does const langMatch = node.lang.match(/^[\w-]+/);, so perhaps remark could do the same. I can’t imagine a scenario in which lang would not match this regex 🤔
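
As a rough sketch of what a regex-based split could look like: the helper name `splitByLangPattern` and its return shape are assumptions for illustration only, and only the regex itself comes from the Prettier snippet quoted above.

```javascript
// Hypothetical helper: take the leading run of word characters and hyphens
// as lang, and whatever follows (minus surrounding whitespace) as meta.
function splitByLangPattern(info) {
  const langMatch = info.match(/^[\w-]+/);
  if (!langMatch) return { lang: null, meta: info.trim() || null };
  const lang = langMatch[0];
  const meta = info.slice(lang.length).trim();
  return { lang, meta: meta || null };
}

console.log(splitByLangPattern('mylang{foo="hello   world"}'));
console.log(splitByLangPattern("js{foo=1 bar=2 baz=3}"));
console.log(splitByLangPattern("js title=demo"));
```

With this approach the whole braced part stays in meta, so the spaces inside the quoted value are never touched by the split.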

Your environment

  • OS: web
  • Packages: remark-parse v8.0.3 via prettier v2.1.1
  • Env: browser

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 18 (12 by maintainers)

Most upvoted comments

Awesome! And no problem. And glad that you understand!

Hi @wooorm, sorry for disappearing. After thinking about this delimiter problem again, I’m leaning towards agreeing with you now.

Indeed, space+tab is probably the only unambiguous set of characters we can pick as a delimiter. If we say it’s also {, then why not any arbitrary range of UTF-8 characters? The debates would never be settled, and different parser implementations would diverge from each other more and more.

I’m going to try to persuade the Prettier folks to stop supporting non-whitespace delimiters (I was the one who introduced them). Not sure how soon this will land, though, because it is a breaking change. Thank you for taking the time for this conversation! 🙌