markdown: Parsing >>>>>>… takes quadratic time and eventually crashes with recursion error

>>> import markdown, timeit
>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 100)).autorange(); dur / num
0.004344235559692607
>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 200)).autorange(); dur / num
0.016907797248859425
>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 300)).autorange(); dur / num
0.05682634799741208
>>> markdown.markdown(">" * 331)
…
RecursionError: maximum recursion depth exceeded while calling a Python object

Same results in 3.0.1 and master.

About this issue

Original URL
State: closed
Created 5 years ago
Comments: 18 (12 by maintainers)

Commits related to this issue

Limit depth of blockquotes using Python's recursion limit. If the Python stack comes within 100 frames of the recursion limit, then the nesting limit of blockquotes is met. Any remaining text, includ... — committed to waylan/markdown by waylan 4 years ago
Limit depth of blockquotes using Python's recursion limit. (#991) If the Python stack comes within 100 frames of the recursion limit, then the nesting limit of blockquotes is met. Any remaining text... — committed to Python-Markdown/markdown by waylan 4 years ago

Most upvoted comments

However, if I do this:

def get_remaining_stack_frames(depth=0):
    """ Return remaining number of stack frames before limit it hit."""
    try:
        return get_remaining_stack_frames(depth + 1)
    except RecursionError:
        return depth

def nearing_recursion_limit():
    """Return true if current stack depth is withing 100 of maximum limit."""
    return get_remaining_stack_frames() < 100

then my times are:

>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 100)).autorange(); dur / num
0.013232544999999974
>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 200)).autorange(); dur / num
0.0269508400000003
>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 300)).autorange(); dur / num
0.04743795999999989

<del>The difference is negligible.</del> However, the example which counts up (to the max) is slower (relatively) that the example which counts down (the stack) for smaller size documents. As smaller documents are the norm, I think that is probably the more sensible way to go.

Whoops, my first attempt at both had the line which called the code commented out. I have since edited this comment and the last one to provide accurate results. The previous solution is noticeably better than this one.

waylan on Jun 26, 2020

Python-Markdown does not require a space between >s too:

>>> print(markdown.markdown(">>>foo"))
<blockquote>
<blockquote>
<blockquote>
<p>foo</p>
</blockquote>
</blockquote>
</blockquote>

mitya57 on Mar 6, 2019