element-web: Can't escape > at the start of message with \ in the new composer

Description

The old composer sent \>hello as >hello, but now it sends \>hello.

Version information

Platform: web

For the web app:

Browser: Firefox
URL: self-hosted 1.5.0

About this issue

Original URL
State: closed
Created 5 years ago
Reactions: 6
Comments: 16 (7 by maintainers)

Commits related to this issue

Fix escaped markdown passing backslashes through Fixes https://github.com/vector-im/riot-web/issues/11230 Signed-off-by: Bob Rao <bob+git@vulpin.com> — committed to BobVul/matrix-react-sdk by BobVul 4 years ago

Most upvoted comments

While having /plain as an option is fine when you plan on avoiding markdown in multiple places, having \ as an option to escape a single instance would be more efficient, and is used in other applications for that reason

BurnyBoi on Oct 27, 2019

Markdown parsing seems quite odd/inconsistent now. Looks like backslash escapes only work if unescaped markdown is used elsewhere in the same message?

This makes it different from every other markdown/commonmark implementation I’ve ever seen, and having to use /plain specifically for the case where no other markdown is used, rather than the standard escape sequences that should work everywhere, feels very inconsistent to me.

The *quick* \*brown\* fox jumps over the lazy dog.

renders

The quick *brown* fox jumps over the lazy dog.

Which is fine.

The quick \*brown\* fox jumps over the lazy dog.

renders

The quick \*brown\* fox jumps over the lazy dog.

which is weird.

The quick* \*brown\* fox jumps over the lazy dog.

renders

The quick* \*brown\* fox jumps over the lazy dog.

which is weird.

BobVul on Nov 21, 2019

Since the old markdown parser has been removed completely, there is no longer any straightforward workaround. As before, requiring /plain in these cases is weird.

So there’s a few things:

Should we rename this issue or create a new issue? It’s clear that the issue is far wider-reaching than simply escaping a block quote. All completely-escaped messages are interpreted as non-markdown, thus showing the backslashes where they shouldn’t.
Is there any desire to fix this? It’s clearly a bug: part of the code thinks a fully-escaped message is markdown, while another part thinks it’s not.
I’m happy to create a pull request, but I’m not sure what the scope of the change should be. I’ve already tracked down why this occurs, and a fairly sledgehammer-y approach at fixing it, but I don’t know the full implications. I’d assume someone more familiar with the codebase would have a better idea.

It boils down to the serialisation not performing a Markdown-to-HTML conversion if it thinks the message is plaintext: https://github.com/matrix-org/matrix-react-sdk/blob/6f6d6b096a0e752f0ad649d70cd541008334c634/src/editor/serialize.js#L38-L44

Unfortunately, .isPlainText() relies on the CommonMark node type of text as seen in: https://github.com/matrix-org/matrix-react-sdk/blob/6f6d6b096a0e752f0ad649d70cd541008334c634/src/Markdown.js#L71-L92

The problem is CommonMark describes nodes as text after parsing escapes. It thinks \*test\* is *test* and a text node. Which is correct: the post-parse node is unformatted text. Unfortunately, Riot then takes that to mean the pre-parse input should be displayed as-is, which is incorrect.

I think this looks like just an optimisation, and the easy fix would be to disable the optimisation by not using .isPlainText() at all. Anything that goes through the Markdown path should always use .toHTML() or similar to get the post-parse output. This would not affect /plain as it does not go down this code path at all.

BobVul on Jan 31, 2020