zulip: Overlapping alert words bug

When you have an alert word that is a substring of another alert word, you get a weird double highlight.

For example, if you have an alert word Joshua Pan and another Joshua, there is the following overlap: screen shot 2018-04-19 at 2 37 51 pm

About this issue

Original URL
State: closed
Created 6 years ago
Comments: 54 (26 by maintainers)

Commits related to this issue

markdown: fix overlapping highlight of alert words and mentions Added regex to the process_message function so that anything in a span tag (which represents a mention or an alert word that has alread... — committed to brickbite/zulip by deleted user 6 years ago
markdown: fix overlapping highlight of alert words and mentions Added regex to the process_message function so that anything in a span tag (which represents a mention or an alert word that has alread... — committed to brickbite/zulip by deleted user 6 years ago
Markdown: fix overlapping highlight of alert words and mentions. Added regex to the process_message function so that anything in a span tag (which represents a mention or an alert word that has alrea... — committed to brickbite/zulip by deleted user 6 years ago
Markdown: fix overlapping highlight of alert words and mentions. Added regex to the process_message function so that anything in a span tag (which represents a mention or an alert word that has alrea... — committed to brickbite/zulip by deleted user 6 years ago
Markdown: fix overlapping highlight of alert words and mentions. Added regex to the process_message function so that anything in a span tag (which represents a mention or an alert word that has alrea... — committed to brickbite/zulip by deleted user 6 years ago
Markdown: fix overlapping highlight of alert words and mentions. Added regex to the process_message function so that anything in a span tag (which represents a mention or an alert word that has alrea... — committed to brickbite/zulip by deleted user 6 years ago
Markdown: fix overlapping highlight of alert words and mentions. Added regex to the process_message function so that anything in a span tag (which represents a mention or an alert word that has alrea... — committed to brickbite/zulip by deleted user 6 years ago

Most upvoted comments

For the issue of overlapping alert words, in keeping consistent with the existing function, we can use regex to ignore anything in a  tag. Mentions also use a  tag to highlight the username, and the processing for that seems to happen before the message is processed. So this fix will cover the overlapping highlight for both alert words and mentions.

Secondary Issue: Substrings / overlapping strings and alert word order. Since alert words are stored as an array, and `_.each` is used to iterate through the array, the order of the words will determine what does/doesn’t get detected.

A Lookup Tree (Trie) data structure could possibly be used to store the alert words instead of an array. Then, instead of using _.each and string.replace(), we can iterate through the message content. This would be more performant if we need to handle a lot of alert words. This implementation would require reworking the .process_message function entirely.

If we wanted to keep consistent with the existing function (continue to use an array to store alert words), we could use a sorted array for the _.each so that any superstring would be detected before the substring is. Performance would not be as good if there were many alert words.

The sorted array would probably be best implemented and maintained where the alert words are currently stored. That way, the user would also see their alert words sorted in the UI, and alert words added would be inserted in the correct place in the array.

If we didn’t want that behavior, a function to sort by could look something like:

function compareLenThenLex (a, b) {
  if (a.length > b.length) {
    return -1;
  } else if (a.length < b.length) {
    return 1;
  } else {
    return a > b ? 1 : -1;
  }
}

In both fixes, we would eliminate Case 2 below, so that the superstring would always be detected over its substring. In the case of overlapping strings, the one that is detected first will be the one that’s highlighted.

Case 1 (string first, then its substring):

alert_words = ["quick fox jumps", "fox"]
message.content = <p>the quick fox jumps over</p>

returns: the quick fox jumps over // "fox" is ignored

Case 2 (substring first, then its superstring):

alert_words = ["fox", "quick fox jumps"]
message.content = <p>the quick fox jumps over</p>

returns: the quick fox jumps over // "quick fox jumps" is ignored

Case 3 (partially overlapping strings):

alert_words = ["quick fox", "fox jumps"]
message.content = <p>the quick fox jumps over</p>

returns: the quick fox jumps over // "fox jumps" is ignored

Secondary Issue: If `"span"` is an alert word (when html is pasted in a `<code>` tag for example), the function may not detect all instances in the code snippet. This might be an existing problem anyway since the highlighting does already ignore text within tags.

Secondary Issue: In the current function, alert words must be space or character separated, with padding at the beginning and end. Also, alert words cannot share padding.

Case 1 (3 instances, the middle one is skipped since it shares padding):

alert_words = ["fox jumps"]
message.content = <p>fox jumps fox jumps fox jumps</p>

returns: fox jumps fox jumps fox jumps // the middle "fox jumps" is ignored

Case 2 (a minimum of 2 spaces (or punctuation characters) between the alert word is needed to correctly capture all of them):

alert_words = ["fox jumps"]
message.content = <p>fox jumps  fox jumps  fox jumps</p>

returns: fox jumps fox jumps fox jumps

brickbite on May 29, 2018

@lonerz one more interesting bug. If you first add Joshua and then add Joshua Pan this is what you get -

If you reverse the order -

akashnimare on Apr 20, 2018