markdown: Parsing >>>>>>… takes quadratic time and eventually crashes with recursion error

>>> import markdown, timeit
>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 100)).autorange(); dur / num
0.004344235559692607
>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 200)).autorange(); dur / num
0.016907797248859425
>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 300)).autorange(); dur / num
0.05682634799741208
>>> markdown.markdown(">" * 331)
…
RecursionError: maximum recursion depth exceeded while calling a Python object

Same results in 3.0.1 and master.

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 18 (12 by maintainers)

Commits related to this issue

Most upvoted comments

However, if I do this:

def get_remaining_stack_frames(depth=0):
    """ Return remaining number of stack frames before limit it hit."""
    try:
        return get_remaining_stack_frames(depth + 1)
    except RecursionError:
        return depth

def nearing_recursion_limit():
    """Return true if current stack depth is withing 100 of maximum limit."""
    return get_remaining_stack_frames() < 100

then my times are:

>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 100)).autorange(); dur / num
0.013232544999999974
>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 200)).autorange(); dur / num
0.0269508400000003
>>> num, dur = timeit.Timer(lambda: markdown.markdown(">" * 300)).autorange(); dur / num
0.04743795999999989

<del>The difference is negligible.</del> However, the example which counts up (to the max) is slower (relatively) that the example which counts down (the stack) for smaller size documents. As smaller documents are the norm, I think that is probably the more sensible way to go.

Whoops, my first attempt at both had the line which called the code commented out. I have since edited this comment and the last one to provide accurate results. The previous solution is noticeably better than this one.

Python-Markdown does not require a space between >s too:

>>> print(markdown.markdown(">>>foo"))
<blockquote>
<blockquote>
<blockquote>
<p>foo</p>
</blockquote>
</blockquote>
</blockquote>