element-web: Can't escape > at the start of message with \ in the new composer

Description

The old composer sent \>hello as >hello, but now it sends \>hello.

Version information

  • Platform: web

For the web app:

  • Browser: Firefox
  • URL: self-hosted 1.5.0

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Reactions: 6
  • Comments: 16 (7 by maintainers)

Commits related to this issue

Most upvoted comments

While having /plain as an option is fine when you plan on avoiding markdown in multiple places, having \ as an option to escape a single instance would be more efficient, and is used in other applications for that reason

Markdown parsing seems quite odd/inconsistent now. Looks like backslash escapes only work if unescaped markdown is used elsewhere in the same message?

This makes it different from every other markdown/commonmark implementation I’ve ever seen, and having to use /plain specifically for the case where no other markdown is used, rather than the standard escape sequences that should work everywhere, feels very inconsistent to me.


The *quick* \*brown\* fox jumps over the lazy dog.

renders

The quick *brown* fox jumps over the lazy dog.

Which is fine.


The quick \*brown\* fox jumps over the lazy dog.

renders

The quick \*brown\* fox jumps over the lazy dog.

which is weird.


The quick* \*brown\* fox jumps over the lazy dog.

renders

The quick* \*brown\* fox jumps over the lazy dog.

which is weird.

Since the old markdown parser has been removed completely, there is no longer any straightforward workaround. As before, requiring /plain in these cases is weird.

So there’s a few things:

  • Should we rename this issue or create a new issue? It’s clear that the issue is far wider-reaching than simply escaping a block quote. All completely-escaped messages are interpreted as non-markdown, thus showing the backslashes where they shouldn’t.
  • Is there any desire to fix this? It’s clearly a bug: part of the code thinks a fully-escaped message is markdown, while another part thinks it’s not.
  • I’m happy to create a pull request, but I’m not sure what the scope of the change should be. I’ve already tracked down why this occurs, and a fairly sledgehammer-y approach at fixing it, but I don’t know the full implications. I’d assume someone more familiar with the codebase would have a better idea.

It boils down to the serialisation not performing a Markdown-to-HTML conversion if it thinks the message is plaintext: https://github.com/matrix-org/matrix-react-sdk/blob/6f6d6b096a0e752f0ad649d70cd541008334c634/src/editor/serialize.js#L38-L44

Unfortunately, .isPlainText() relies on the CommonMark node type of text as seen in: https://github.com/matrix-org/matrix-react-sdk/blob/6f6d6b096a0e752f0ad649d70cd541008334c634/src/Markdown.js#L71-L92

The problem is CommonMark describes nodes as text after parsing escapes. It thinks \*test\* is *test* and a text node. Which is correct: the post-parse node is unformatted text. Unfortunately, Riot then takes that to mean the pre-parse input should be displayed as-is, which is incorrect.

I think this looks like just an optimisation, and the easy fix would be to disable the optimisation by not using .isPlainText() at all. Anything that goes through the Markdown path should always use .toHTML() or similar to get the post-parse output. This would not affect /plain as it does not go down this code path at all.