mkdocs-material: New search plugin might break in some cases

Contribution guidelines

I’ve found a bug and checked that …

  • … the problem doesn’t occur with the mkdocs or readthedocs themes
  • … the problem persists when all overrides are removed, i.e. custom_dir, extra_javascript and extra_css
  • … the documentation does not mention anything about my problem
  • … there are no open or closed issues that are related to my problem

Description

The new search plugin in Insiders has trouble with custom HTML and with repeated level-1 (#) headers. Custom HTML with literal hx tags cannot be expected to be indexed as separate sections (there's no corresponding anchor link to jump to), but the index should still build correctly, with that content linked to the closest parent section.

Expected behaviour

The search index is not broken, i.e. does not contain duplicate entries.

Actual behaviour

The search index contains duplicate null-ish entries, which breaks search:

...
{
  "location": "",
  "text": "",
  "title": ""
},
{
  "location": "",
  "text": "",
  "title": ""
},
{
  "location": "",
  "text": "",
  "title": ""
},
...
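
To check a built site for this symptom, a small script along the following lines may help. It is only a sketch: the function names are made up for illustration, and it assumes that search_index.json is a JSON object with a "docs" list whose entries have the "location", "text" and "title" keys shown above.

```python
import json
from collections import Counter


def load_index(path):
    """Read a search index file, e.g. site/search/search_index.json (assumed path)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_suspect_entries(index):
    """Return (empty, duplicates) for a parsed search index.

    Assumption: `index` is a dict with a "docs" list of entries that carry
    "location", "text" and "title" keys, as in the excerpt above.
    """
    docs = index.get("docs", [])
    # Positions of entries where every field is empty (nothing to search).
    empty = [
        i for i, doc in enumerate(docs)
        if not (doc.get("location") or doc.get("text") or doc.get("title"))
    ]
    # Locations should be unique, so count how often each one occurs.
    counts = Counter(doc.get("location", "") for doc in docs)
    duplicates = {loc: n for loc, n in counts.items() if n > 1}
    return empty, duplicates


# Hypothetical index that mimics the broken output shown above.
sample = {"docs": [
    {"location": "", "text": "", "title": ""},
    {"location": "", "text": "", "title": ""},
    {"location": "#a", "text": "Some text", "title": "A"},
]}
empty, duplicates = find_suspect_entries(sample)
```

For a real build, pass the result of `load_index(...)` instead of `sample`. A non-empty `empty` list or `duplicates` dict would indicate the behaviour reported here.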

Steps to reproduce

Example 1

# A
# B
# C

Example 2

# A
<h2>B</h2>
<h3>C</h3>

Package versions

  • Python: 3.9
  • MkDocs: 1.2.2
  • Material: mkdocs-material-7.2.6+insiders-3.0.0

Configuration

site_name: My Docs
theme:
  name: material

System information

  • Operating system: macOS
  • Browser: Chrome

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 17 (13 by maintainers)

Most upvoted comments

… what’s also cool is that as of 098f2b9, literal headlines can also induce new sections if they define an id attribute. This can be mixed with regular Markdown as desired, which allows for better integration with auto-generated markup.

Input:

# Headline

<h2 id="my-custom-id">Subheadline 1</h2>

Some content

<p>More content</p>

## Subheadline 2

And even more

Output:

{
  "location": "",
  "text": "",
  "title": "Headline"
},
{
  "location": "#my-custom-id",
  "text": "<p>Some content</p> <p>More content</p>",
  "title": "Subheadline 1"
},
{
  "location": "#subheadline-2",
  "text": "<p>And even more</p>",
  "title": "Subheadline 2"
}
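
The rule described in this comment (a headline starts a new section only if it defines an id) can be sketched with Python's standard html.parser module. This is not the plugin's actual implementation: the class and function names are hypothetical, the sketch keeps plain text rather than the markup shown in the output above, and it does not give the page title the special treatment the plugin does.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SectionSplitter(HTMLParser):
    """Split rendered HTML into sections at headings that carry an id."""

    def __init__(self):
        super().__init__()
        # The first section collects everything before the first anchored heading.
        self.sections = [{"location": "", "title": [], "text": []}]
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        anchor = dict(attrs).get("id")
        if tag in HEADINGS and anchor:
            # Only a heading with an id has an anchor to link to.
            self.sections.append(
                {"location": "#" + anchor, "title": [], "text": []}
            )
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self._in_title = False

    def handle_data(self, data):
        # A heading without an id falls through to the parent section's text.
        key = "title" if self._in_title else "text"
        self.sections[-1][key].append(data)


def split_sections(html):
    parser = SectionSplitter()
    parser.feed(html)
    parser.close()
    # Join the collected chunks and collapse runs of whitespace.
    return [
        {
            "location": s["location"],
            "title": " ".join(" ".join(s["title"]).split()),
            "text": " ".join(" ".join(s["text"]).split()),
        }
        for s in parser.sections
    ]


html = (
    '<p>Intro</p>'
    '<h2 id="my-custom-id">Subheadline 1</h2>'
    '<p>Some content</p><p>More content</p>'
    '<h3>No id</h3><p>Still section 1</p>'
    '<h2 id="subheadline-2">Subheadline 2</h2>'
    '<p>And even more</p>'
)
sections = split_sections(html)
```

In this sketch the `<h3>` without an id does not open a section, so its text stays with "Subheadline 1", which matches the "linked to the closest parent section" behaviour requested in the description.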

@facelessuser I’m currently testing html5_parser, it’s pretty quick and looks promising.

@willingc I understand. Let’s see if the HTML5 parser fixes the issue 😊

While running through my documentation I’ve come across another instance of HTML tags breaking the search index. In some places, we used a crude ‘hack’ (<img width="900">) to force Markdown tables to take full width.

A minimal example of the case described above (nothing after the <img> tag is included correctly in search_index.json):

# Example

## Search will be working

and `displaying` basic **formatting**

| Color | State | Description |
|:-:|:-|:-|
| Color 1    | Solid | Descriptor 1 |
| Color 2 | Solid | Descriptor 2 <img width="900"> |

## Search is awesome

... but sadly not working anymore.
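
One way to confirm that a section was dropped is to compare the anchors you expect against the locations in the index. The sketch below is an illustration only: the function name is hypothetical, the "docs" layout of search_index.json is assumed, and the anchor slugs are assumed to follow the default heading-to-slug conversion.

```python
def missing_anchors(index, expected):
    """Return the expected anchors that no index entry points to.

    Assumption: `index` is a dict with a "docs" list whose entries have a
    "location" string ending in the section anchor (e.g. "page/#anchor").
    """
    locations = [doc.get("location", "") for doc in index.get("docs", [])]
    return [
        anchor for anchor in expected
        if not any(loc.endswith(anchor) for loc in locations)
    ]


# Hypothetical index resembling the broken build: the second section is absent.
sample = {"docs": [
    {"location": "example/", "text": "", "title": "Example"},
    {"location": "example/#search-will-be-working",
     "text": "and displaying basic formatting",
     "title": "Search will be working"},
]}
missing = missing_anchors(
    sample, ["#search-will-be-working", "#search-is-awesome"]
)
```

With the sample data, `missing` contains only the anchor of the section that follows the table.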