lunr.js: Lunr throws in Safari sometimes when calling the query method

When calling the .query method with certain search terms in Safari, the following error is thrown:

TypeError: undefined is not an object (evaluating 'posting._index') 

The problem is here. Adding a check for posting solves the problem, but I am almost certain this is a sign of a larger problem that should be fixed upstream.
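As a rough illustration of the kind of check meant here, the sketch below skips expanded terms that have no posting instead of dereferencing them. The index shape, the `collectTermIndexes` helper, and the sample terms are all hypothetical stand-ins, not lunr's actual source.

```javascript
// Hypothetical stand-in for an inverted index: term -> posting.
// The shape is an assumption for illustration, not lunr's real internals.
const invertedIndex = new Map([
  ["apple", { _index: 0 }],
  ["apply", { _index: 1 }],
]);

// Collect the _index of every expanded term that has a posting,
// skipping terms whose posting is missing instead of throwing
// "undefined is not an object (evaluating 'posting._index')".
function collectTermIndexes(index, expandedTerms) {
  const termIndexes = [];
  for (const term of expandedTerms) {
    const posting = index.get(term);
    if (posting === undefined) {
      continue; // defensive guard: no posting for this expanded term
    }
    termIndexes.push(posting._index);
  }
  return termIndexes;
}

// "appl\u0100" has no posting, so it is skipped rather than crashing.
const result = collectTermIndexes(invertedIndex, ["apple", "appl\u0100", "apply"]);
console.log(result); // [ 0, 1 ]
```

A guard like this only hides the symptom: it does not explain why an expanded term without a posting was produced in the first place.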

We found that removing the wildcard option from all .term calls fixes the issue, but search results suffer.

We are aware of this being reported here: https://github.com/olivernn/lunr.js/issues/276#issuecomment-308388470. We are working to create a reduced test case.

About this issue

  • State: open
  • Created 7 years ago
  • Reactions: 2
  • Comments: 37 (23 by maintainers)

Most upvoted comments

With lunr 2.3.6 and the trimmer removed from the pipeline, I still encounter this issue sporadically.

The patch in #361 may have been insufficient or too localized. Maybe it should be guaranteed that an undefined posting doesn’t get past that line?

I see that the issue has already been investigated more thoroughly than I can meaningfully contribute to. Anecdotally, it often happens when the term is empty (or consists of stopwords), but I saw it happen with *m* several times too. I also use lunr-unicode-normalizer (monkey-patched for 2.x) for the rare document with Unicode text.

I’ve been doing some thinking about this and I think one way forward is to improve the quality of the implementation of lunr.tokenizer. Its job is to turn some text into individual words or tokens. The fact that lunr.trimmer even exists suggests some inadequacy in the implementation of lunr.tokenizer. More generally, Unicode is hard, and texts with non-Latin characters are not really well supported by the current approach.

I think an approach based on UAX#29 is probably more robust, though the implementation details are certainly more involved. I’m going to experiment with writing a tokeniser using the rules in the above document. I want to see how much better it deals with these cases, and to understand what impact, if any, there is on performance (both speed and library size).
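For a feel of what UAX#29 word segmentation produces, here is a minimal sketch using the built-in Intl.Segmenter, which follows the Unicode text segmentation rules. This is a swapped-in technique for illustration only: it is not the hand-written tokeniser proposed above, it is not part of lunr, and the `tokenize` helper name is hypothetical. It assumes a runtime that ships Intl.Segmenter.

```javascript
// Split text into lowercase word tokens using the word boundaries
// exposed by the built-in Intl.Segmenter (assumption: the runtime supports it).
function tokenize(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  const tokens = [];
  for (const segment of segmenter.segment(text)) {
    // isWordLike is false for whitespace and punctuation segments,
    // so no separate trimming step is needed afterwards.
    if (segment.isWordLike) {
      tokens.push(segment.segment.toLowerCase());
    }
  }
  return tokens;
}

console.log(tokenize("Hello, w\u00f6rld!")); // [ 'hello', 'wörld' ]
```

Because punctuation arrives as its own non-word segment, a tokeniser along these lines would not need a separate trimmer step. The performance and size questions raised above remain open.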

In the meantime I’m still interested in seeing if this bug can be isolated enough to show to the Safari developers, as its current behaviour is still weird to me.

In the past people have put together reproductions with jsfiddle. The ideal would be having an index with a single document and a query that triggers the bug.

An example fiddle for some inspiration - https://jsfiddle.net/of54k0uk/14/

Apparently the bug reported by @lucaong has been fixed (https://trac.webkit.org/changeset/255975/webkit), and the fix is now in Safari Technology Preview: https://webkit.org/blog/10031/release-notes-for-safari-technology-preview-101/

@chasenlehara thanks for the link, I filed this bug there: https://bugs.webkit.org/show_bug.cgi?id=187947

I took the liberty to file a bug report on Safari, as this is now clearly a browser bug and not a Lunr issue.

Here is the smallest script with which I can reproduce the bug:

https://jsfiddle.net/egLzL24L/156/

It seems to be a combination of the trimmer RegExp, a trailing non-word character, a Unicode character in a higher block than Latin-1 Supplement (that is, a code point above U+00FF, which no longer fits in a single byte), and string concatenation.
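The ingredients listed above can be put side by side in a few lines. The sketch below is not the fiddle's script and is not claimed to trigger the Safari bug. It only shows what a correctly behaving engine returns for such an input. The two regular expressions are believed to match those in lunr 2.x's trimmer, but treat that as an assumption; the `trim` helper is a standalone stand-in.

```javascript
// Standalone stand-in for the trimmer: strip leading and trailing
// non-word characters (assumption: same regexes as lunr 2.x's trimmer).
function trim(token) {
  return token.replace(/^\W+/, "").replace(/\W+$/, "");
}

// Build the token by string concatenation: ASCII text, then a character
// above U+00FF (U+0100), then a trailing non-word character.
const token = "foo" + "\u0100" + ".";

// Without the `u` flag, \W matches anything outside [A-Za-z0-9_],
// so both the U+0100 character and the "." are stripped from the end.
const trimmed = trim(token);
console.log(JSON.stringify(trimmed)); // "foo"
```

If an engine returns anything other than "foo" here, the fault lies in the engine's regular expression or string handling rather than in the search library.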

@chasenlehara and I were able to create a reduced test case. Open this jsfiddle in Safari and you will see the error. I hope this helps.