markdown: API docs are not being properly indexed for search

By default, search uses the heading of the relevant section as the title of each search result. However, code spans within headings of the API docs are being HTML-escaped in the results, and many of the obvious headings are not returned as results at all. I’m assuming the second issue is related to the first, in that the correct text is not getting indexed. In other words, if only the plain-text content of each heading were indexed, the search results would be better.

Consider the following example. ESCAPED_CHARS is an instance attribute of the Markdown class. One might expect its section (ESCAPED_CHARS) to be returned in a search for the string ESCAPED_CHARS, but it is not. The method markdown.inlinepatterns.EscapeInlineProcessor.handleMatch does appear in the results, because it mentions ESCAPED_CHARS in the body of its documentation. Yet the title of that search result is the text <code class="doc-symbol doc-symbol-toc doc-symbol-method"></code>&nbsp;handleMatch, not handleMatch or markdown.inlinepatterns.EscapeInlineProcessor.handleMatch as one might expect. Frustratingly, the search term handleMatch does not return that result at all. It does, however, appear in the results for the search term doc-symbol-method, which shouldn’t even be indexed, as it is an HTML class assigned to the code span, not text.
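To make "index only the plain text of a heading" concrete, here is a minimal sketch using only the Python standard library. It is not taken from the MkDocs search plugin or from mkdocstrings, and the names `_TextOnly` and `heading_plain_text` are hypothetical. It only illustrates the kind of transformation that would turn the heading above into handleMatch before indexing.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes only; tags and their attributes are discarded."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &nbsp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def heading_plain_text(fragment: str) -> str:
    """Return the text content of an HTML heading fragment.

    Hypothetical helper for illustration; not an existing MkDocs API.
    """
    parser = _TextOnly()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    # Collapse runs of whitespace (including non-breaking spaces) and trim.
    return " ".join("".join(parser.parts).split())


title = '<code class="doc-symbol doc-symbol-toc doc-symbol-method"></code>&nbsp;handleMatch'
print(heading_plain_text(title))
```

With a step like this applied to titles before indexing, the class name doc-symbol-method would never reach the index, and a search for handleMatch would have the bare method name to match against.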

@pawamoy do you have any insight into this? I have not yet looked at the code and am not sure how the mkdocstrings extension passes its generated pages to search for indexing.

About this issue

  • State: open
  • Created 5 months ago
  • Comments: 18 (18 by maintainers)

Most upvoted comments

No, actually, the nav option currently allows HTML, so that will have to stay.