djot: New vs continuing paragraph after block quote or other set-off content
Here are two kinds of texts we might want to distinguish:
paragraph content
> block quote
continuation of paragraph
vs
paragraph content
> block quote
new paragraph
A deficiency of Markdown is that there is no way to distinguish these cases. The problem is reduced if one renders in a format that does not indent new paragraphs, because then there is no visual distinction between the cases. But they are semantically different and can be distinguished, e.g., in print output with indented paragraphs. There should be a way to distinguish them in the source.
The problem is not raised only by block quotes but occurs also with set-off equations, images, tables, code, and lists.
I recently found myself creating a pandoc Lua filter that implements the following syntax for the “continued paragraph case”:
paragraph content
> block quote
_ continuation of paragraph
(The filter just inserts a LaTeX \noindent command where the _ is.) This is not too bad actually. It would be nice if djot had some way of making the distinction.
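For illustration only, here is a minimal Python sketch of the same idea, working on pandoc's JSON AST rather than as a Lua filter. It is not the filter mentioned above. It assumes the marker arrives as a `Str` inline containing `_` followed by a `Space` at the start of a top-level `Para`; the function name `mark_continuations` is made up for this example.

```python
# Sketch: replace a leading "_ " marker in a paragraph with a raw LaTeX
# \noindent, operating on the "blocks" list of a pandoc JSON document.

# Raw LaTeX inline in pandoc's JSON encoding: [format, text].
NOINDENT = {"t": "RawInline", "c": ["latex", "\\noindent "]}


def mark_continuations(blocks):
    """Return a new block list; only top-level Para blocks are examined."""
    result = []
    for block in blocks:
        inlines = block.get("c") if block.get("t") == "Para" else None
        if (
            inlines
            and len(inlines) >= 2
            and inlines[0] == {"t": "Str", "c": "_"}
            and inlines[1].get("t") == "Space"
        ):
            # Drop the marker and the space, then prepend the raw command.
            block = {"t": "Para", "c": [NOINDENT] + inlines[2:]}
        result.append(block)
    return result


# In-memory demo: one continuation paragraph, one ordinary paragraph.
demo = [
    {"t": "Para", "c": [{"t": "Str", "c": "_"}, {"t": "Space"},
                        {"t": "Str", "c": "continuation"}]},
    {"t": "Para", "c": [{"t": "Str", "c": "new"}]},
]
for block in mark_continuations(demo):
    print(block["c"][0])
```

To use something like this as a pandoc JSON filter, one would read the document from standard input with `json.load`, apply the function to `doc["blocks"]`, and write the result to standard output. Paragraphs nested inside other blocks are deliberately left alone in this sketch.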
About this issue
- State: open
- Created 2 years ago
- Reactions: 2
- Comments: 15 (10 by maintainers)
ok, thank you.
RE the bigger question, some things to consider:
I would say that Pandoc solves the problem of translation between so many different input and output forms by defining an IR that is syntax-independent and more or less a semantic superset of those syntaxes. Then the question becomes whether representing paragraphs that span block quotes (or other elements, see below) is universal or common enough to warrant complicating Pandoc’s IR.
An old W3C www-html list discussion: Re: Lists within Paragraphs. An excerpt:
The HTML spec’s ultimate answer admits that paragraphs might logically span block elements, but that it doesn’t apply to the HTML standard:
Allowing paragraphs to span/nest block elements provides, I think, a cleaner and more consistent solution to “tight lists”. For example, the following would be a tight list because each list item contains exactly a single element (a paragraph):
The current CommonMark solution has flaws, as can be seen by comparing
with
Both should be treated as loose lists since the second item in each contains block sequences, but CommonMark’s determination is based on the existence or lack thereof of blank lines in the source, not logical structure.
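The structural rule argued for here can be sketched in a few lines of Python. This is a hypothetical model, not CommonMark's or pandoc's actual data structure: each list item is represented as a list of block-type names, and a paragraph that spans a nested block counts as a single `"Para"`.

```python
# Sketch: decide tight vs. loose from logical structure, not blank lines.

def is_tight(items):
    """A list is tight when every item holds exactly one block, a paragraph."""
    return all(len(item) == 1 and item[0] == "Para" for item in items)


# Every item is a single paragraph: tight.
tight_list = [["Para"], ["Para"]]

# The second item contains a sequence of blocks: loose.
loose_list = [["Para"], ["Para", "BulletList", "Para"]]

print(is_tight(tight_list))
print(is_tight(loose_list))
```

Under this rule, whether the source happens to contain blank lines between items plays no role in the decision.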
I hope this is helpful. Please let me know if you’ve had enough! It just happens to be a question I’ve been trying to tackle myself.
Re. @bpj's suggestion about indenting: would this cause a problem with putting lists between paragraphs? That is, with a list you may (and typically do) indent the list marker. Is there a difference between a list that’s its own paragraph vs a list that’s in the midst of a paragraph?
One could use a single dot on a line as a “connector” that says: the following normally-block-level thing is to be considered as part of the current paragraph. Then your A is
and your B is
and so on. Of course, this would require figuring out an AST model that actually permits this sort of thing. And some (most?) output formats just won’t allow a list or a block quote to be part of a paragraph: in HTML for example, a p element can only contain “phrasing content.”
Including lists in paragraphs is an important use case for the kind of writing that I do, at least, and neither the suggestion of a leading _ nor the suggestion of indentation works well for this case.
I would need to distinguish between all of the following:
A:
B:
C:
D:
Leading underscore works to distinguish A from C. But not to distinguish A from B, nor C from D. It catches part of the A/D distinction.
Indentation doesn’t work for any of them.
An alternative design is a convention that there’s a div for “multi-paragraphs” that contain multiple block elements. It’s ugly but accurate:
would denote option A.
This would be tool-specific, but that’s perhaps OK: I think the need for this kind of thing tends to arise in long-form scientific writing more than in smaller, casual documents, so having a Googleable solution like this seems acceptable. This also remains compatible with the various ASTs out there.
Thinking about a syntax for “anyhow, as I was saying”, I was going to suggest ..., as in:
But that causes a pretty big indent, and ... already automatically gets you a “…” in djot, and it might cause problems when the author wants an actual ellipsis.
The leading underscore is ok, but it also makes me think of italics.
Since “and” is at least somewhat close to “anyhow, as I was saying”, maybe &?
I like that one because & is not currently used for any other djot syntax, _, which I think is a desirable characteristic for this bit of markup.
I mean that if what follows the blockquote is a continuation, the blockquote is indented == the blockquote is embedded in a paragraph.
vs.
I hope that is clearer.
Why not indentation for an embedded blockquote? That seems the most intuitive to me.