lunr.js: Lunr throws in Safari sometimes when calling the query method
When calling the .query method with certain search terms in Safari, the following error is thrown:
TypeError: undefined is not an object (evaluating 'posting._index')
The problem is here. Adding a check for posting solves the problem, but I am almost certain this is a sign of a larger problem that should be fixed upstream.
We found that removing the wildcard option from all .term calls fixes the issue, but search results suffer.
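For illustration only, here is a minimal sketch of the kind of guard described above. It does not use Lunr itself: the index object, the termIndexes helper, and the sample terms are all hypothetical, and only the _index field name is taken from the error message. It shows how a lookup with a term that is not in the index yields undefined, and how checking for that avoids the TypeError.

```javascript
// Hypothetical stand-in for an inverted index (term -> posting).
// Only the `_index` field name comes from the error message above;
// everything else here is made up for illustration.
const invertedIndex = Object.create(null);
invertedIndex["safari"] = { _index: 0 };
invertedIndex["search"] = { _index: 1 };

// Collect the term index of every expanded term that has a posting.
// Terms without a posting are skipped and reported instead of throwing.
function termIndexes(index, expandedTerms) {
  const found = [];
  const missing = [];
  for (const term of expandedTerms) {
    const posting = index[term];
    if (posting === undefined) {
      // Without this check, reading posting._index would throw a TypeError.
      missing.push(term);
      continue;
    }
    found.push(posting._index);
  }
  return { found, missing };
}

// The middle term simulates a corrupted expansion that is not in the index.
const result = termIndexes(invertedIndex, ["safari", "saf\u0000ri", "search"]);
console.log(result.found, result.missing.length);
```

A guard like this only hides the symptom: the corrupted term silently matches nothing, which is why a fix at the source is still preferable.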
We are aware of this being reported here: https://github.com/olivernn/lunr.js/issues/276#issuecomment-308388470. We are working to create a reduced test case.
About this issue
- State: open
- Created 7 years ago
- Reactions: 2
- Comments: 37 (23 by maintainers)
Commits related to this issue
- Add check to avoid uncaught exception in Safari Work around for https://github.com/olivernn/lunr.js/issues/279 — committed to bit-docs/lunr.js by chasenlehara 7 years ago
- PMT #114394: Fix search in Safari LunrJS has some undefined behavoir in Safari, documented here: https://github.com/olivernn/lunr.js/issues/279 This workaround console-logs out the terms which seems... — committed to PoLAR-Hub/polarhub by nbuonin 6 years ago
- Normalize unicode to avoid index corruption; fixes issue #279 — committed to coreyward/lunr.js by coreyward 6 years ago
- Fix issue #279 (bug with Safari) Calling any method defined on String.prototype on the expanded term seems to force the string to be properly represented, fixing an issue affecting Safari users. See... — committed to lucaong/lunr.js by lucaong 6 years ago
- fix issue #279 at the source, on TokenSet.prototype.toArray it turns out that a specific string concatenation in TokenSet.prototype.toArray sometimes results in a corrupted string in Safari. It is fi... — committed to lucaong/lunr.js by lucaong 6 years ago
- Fix issue #279 (bug with Safari) (#361) * Fix issue #279 (bug with Safari) Calling any method defined on String.prototype on the expanded term seems to force the string to be properly represented... — committed to olivernn/lunr.js by lucaong 6 years ago
With lunr 2.3.6 and the trimmer removed from the pipeline, I still encounter this issue sporadically.
The patch in #361 may have been insufficient or too localized. Maybe it should be guaranteed that an undefined posting doesn’t get past that line?
I see that the issue has already been investigated more thoroughly than I can meaningfully contribute to. Anecdotally, it often happens when the term is empty (or stopwords), but I saw it happen with *m* several times too. I also use lunr-unicode-normalizer (monkey-patched for 2.x) for the rare document with Unicode text.
I’ve been doing some thinking about this, and I think a way forward is to improve the quality of the implementation of lunr.tokenizer. Its job is to turn some text into individual words or tokens. The fact that lunr.trimmer even exists suggests some inadequacy in the implementation of lunr.tokenizer. More generally, Unicode is hard, and texts with non-Latin characters are not really well supported by the current approach.
I think an approach based on UAX#29 is probably more robust, though the implementation details are certainly more involved. I’m going to experiment with writing a tokeniser using the rules in the above document. I want to see how much better it deals with these cases, and to understand what impact, if any, there is on performance (both speed and library size).
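To give a rough idea of what UAX#29-style word segmentation looks like in practice, here is a small sketch using the built-in Intl.Segmenter, whose word granularity is based on the UAX#29 word boundary rules. This is a standard ECMAScript Intl API, not part of Lunr, and the tokenize function below is a hypothetical illustration rather than the tokeniser being proposed above.

```javascript
// Word segmentation with the built-in Intl.Segmenter (not a Lunr API).
// `tokenize` is a hypothetical helper written for this illustration.
function tokenize(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  const tokens = [];
  for (const { segment, isWordLike } of segmenter.segment(text)) {
    // isWordLike is false for whitespace and punctuation segments,
    // so no separate trimming step is needed afterwards.
    if (isWordLike) tokens.push(segment.toLowerCase());
  }
  return tokens;
}

console.log(tokenize("Hello, world!"));
console.log(tokenize("naïve café"));
```

Because punctuation never becomes part of a token here, accented letters stay attached to their words and a separate trimming stage has nothing left to strip.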
In the meantime I’m still interested in seeing if this bug can be isolated enough to show to the Safari developers, as its current behaviour is still weird to me.
In the past people have put together reproductions with jsfiddle. The ideal would be having an index with a single document and a query that triggers the bug.
An example fiddle for some inspiration - https://jsfiddle.net/of54k0uk/14/
Apparently the bug reported by @lucaong has been fixed in https://trac.webkit.org/changeset/255975/webkit, and the fix is now available in the Safari Technology Preview: https://webkit.org/blog/10031/release-notes-for-safari-technology-preview-101/
@chasenlehara thanks for the link, I filed this bug there: https://bugs.webkit.org/show_bug.cgi?id=187947
I took the liberty of filing a bug report with Safari, as this is now clearly a browser bug and not a Lunr issue.
and here is the smallest script where I can reproduce the bug.
https://jsfiddle.net/egLzL24L/156/
It seems like it’s a combination of the trimmer RegExp, a trailing non-word character, a Unicode character in a higher block than Latin-1 Supplement (that is, a code point above U+00FF, which needs at least 2 bytes), and string concatenation.
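As background on the trimmer RegExp part of that combination, the sketch below shows how a \W-based trim treats non-ASCII letters. It assumes the trimmer strips leading and trailing \W characters, which is my understanding of lunr.trimmer but should be checked against the source. The function names are hypothetical, and this does not reproduce the Safari string corruption itself.

```javascript
// Assumed trimmer behaviour: strip leading and trailing \W characters.
// Without the `u` flag, \W means "anything outside [A-Za-z0-9_]",
// so accented and non-Latin letters count as non-word characters.
function asciiTrim(token) {
  return token.replace(/^\W+/, "").replace(/\W+$/, "");
}

// Hypothetical Unicode-aware variant using property escapes:
// only strips characters that are not letters, numbers, or underscore.
function unicodeTrim(token) {
  return token
    .replace(/^[^\p{L}\p{N}_]+/u, "")
    .replace(/[^\p{L}\p{N}_]+$/u, "");
}

console.log(asciiTrim("café!"));   // "caf"  (the é is stripped along with the !)
console.log(unicodeTrim("café!")); // "café"
console.log(asciiTrim("über"));    // "ber"
```

This is one reason tokens containing non-ASCII characters are more likely to end up on unusual code paths once a trimmer like this has run.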
@chasenlehara and I were able to create a reduced test case. Open this jsfiddle in Safari and you will see the error. I hope this helps.