markdown: Raw HTML is throwing an exception

The following markdown:

<div class="row" markdown="1">
<div class="col-md-6" markdown="1">
**SomeText**
</div>

<div class="col-md-6" markdown="1">

**blod text**  
<small>(<i class="fa fa-arrow-left"></i> small)</small>

<div class="barchart" markdown="1">
* item1
* item2
</div>

more text

</div>
</div>

is failing with this:

Traceback (most recent call last):
  File "/home/ikus060/workspace/PDSL/markdown.git/markdown/test_tools.py", line 117, in test
    output = markdown(input, **kwargs)
  File "/home/ikus060/workspace/PDSL/markdown.git/markdown/core.py", line 391, in markdown
    return md.convert(text)
  File "/home/ikus060/workspace/PDSL/markdown.git/markdown/core.py", line 268, in convert
    root = self.parser.parseDocument(self.lines).getroot()
  File "/home/ikus060/workspace/PDSL/markdown.git/markdown/blockparser.py", line 92, in parseDocument
    self.parseChunk(self.root, '\n'.join(lines))
  File "/home/ikus060/workspace/PDSL/markdown.git/markdown/blockparser.py", line 107, in parseChunk
    self.parseBlocks(parent, text.split('\n\n'))
  File "/home/ikus060/workspace/PDSL/markdown.git/markdown/blockparser.py", line 125, in parseBlocks
    if processor.run(parent, blocks) is not False:
  File "/home/ikus060/workspace/PDSL/markdown.git/markdown/extensions/extra.py", line 127, in run
    block = self._process_nests(element, block)
  File "/home/ikus060/workspace/PDSL/markdown.git/markdown/extensions/extra.py", line 95, in _process_nests
    block[nest_index[-1][1]:], True)                      # nest
  File "/home/ikus060/workspace/PDSL/markdown.git/markdown/extensions/extra.py", line 101, in run
    tag = self._tag_data[self.parser.blockprocessors.tag_counter]
IndexError: list index out of range
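
For anyone who wants to check their own installation, here is a minimal reproduction sketch. It assumes the `extra` extension is enabled (the traceback goes through `markdown/extensions/extra.py`, but the exact keyword arguments used by the test are not shown above), and `reproduce` is just an illustrative helper name. On versions where the bug has been fixed, the document should simply convert.

```python
# The reporter's document, line by line. The two trailing spaces after
# "**blod text**" are part of the original input (a hard line break).
DOCUMENT = "\n".join([
    '<div class="row" markdown="1">',
    '<div class="col-md-6" markdown="1">',
    '**SomeText**',
    '</div>',
    '',
    '<div class="col-md-6" markdown="1">',
    '',
    '**blod text**  ',
    '<small>(<i class="fa fa-arrow-left"></i> small)</small>',
    '',
    '<div class="barchart" markdown="1">',
    '* item1',
    '* item2',
    '</div>',
    '',
    'more text',
    '',
    '</div>',
    '</div>',
]) + "\n"


def reproduce(text):
    """Convert `text` and describe what happened (hypothetical helper)."""
    try:
        import markdown  # third-party package: Python-Markdown
    except ImportError:
        return "Python-Markdown is not installed"
    try:
        # Assumption: the failing test enabled the "extra" extension.
        markdown.markdown(text, extensions=["extra"])
    except IndexError as exc:
        return f"bug reproduced: IndexError: {exc}"
    return "converted without error (this version may not be affected)"


print(reproduce(DOCUMENT))
```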

About this issue

State: closed
Created 5 years ago
Comments: 15 (6 by maintainers)

Most upvoted comments

Well, this gives me some motivation to try and track this down then. Maybe I’ll get to this over the weekend…

facelessuser on Feb 7, 2019

@ikus060 I believe in my experiments I got it working by removing all of the empty lines. Or at least there was some non-obvious combination which avoided the bug. I didn’t save it, as I assumed the provided example wasn’t real content anyway and probably wouldn’t help you work around the problem in your actual document. If you want to experiment with removing blank lines, you may find a workaround.

As far as getting this fixed, all I did was confirm the bug exists and find the minimum document which triggers the bug. I have no idea what is causing it, and that part of the code is less than ideal. There is a reason it is generally considered bad form to implement an HTML parser with regex. I suspect we are more likely to replace the entire raw HTML handling code than to fix this specific bug.

As a reminder, we work on this in our spare time as volunteers. Recently all I have had time for is managing the bug tracker. I haven’t worked on any code in months and don’t foresee that changing anytime soon. Of course, I can’t speak for the other devs. If someone provides a PR, I’ll do my best to review it. We should probably backport the fix to 3.0 as 3.1 is not quite ready, IIRC.

waylan on Jan 30, 2019
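
The blank-line experiment suggested in the comment above could be sketched roughly as follows. Both function names are hypothetical, and this is only a way to generate candidate documents to try; which combination of removed lines actually avoids the bug was not recorded in the thread. Note that removing blank lines can change how Markdown groups paragraphs, so the output of any variant that converts should be checked by eye.

```python
def strip_blank_lines(text):
    """Return `text` with every empty or whitespace-only line removed."""
    kept = [line for line in text.splitlines() if line.strip()]
    return "\n".join(kept) + ("\n" if kept else "")


def drop_one_blank_line(text):
    """Yield one variant of `text` per blank line, with only that line removed."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if not line.strip():
            remaining = lines[:index] + lines[index + 1:]
            yield "\n".join(remaining) + "\n"


sample = "<div>\n\n**text**\n\n</div>\n"
print(strip_blank_lines(sample))
for variant in drop_one_blank_line(sample):
    print(repr(variant))
```

Each candidate can then be passed to the converter to see whether the `IndexError` still occurs.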