OpenRefine: Text filter causes exception when regex pattern is not yet complete

To solve this issue, we need to escape literal backslashes by doubling them. See https://stackoverflow.com/questions/4653831/regex-how-to-escape-backslashes-and-special-characters
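As a rough illustration of the idea above, the following Java sketch shows a pattern with a trailing backslash failing to compile, and the same input compiling once every backslash is doubled. The class and method names are hypothetical and not taken from OpenRefine. Doubling every backslash also turns intentional escapes such as \d into literal text, so this is only a sketch of the proposal, not a recommended fix.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper for illustration; not part of OpenRefine.
class BackslashEscapeSketch {

    // Returns true if java.util.regex accepts the pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Doubles every backslash so each one is matched as a literal character.
    // String.replace does plain (non-regex) substitution.
    static String escapeBackslashes(String input) {
        return input.replace("\\", "\\\\");
    }

    public static void main(String[] args) {
        String typed = "^tha\\"; // the five characters ^tha\ typed into the filter
        System.out.println("raw compiles:     " + compiles(typed));
        System.out.println("escaped compiles: " + compiles(escapeBackslashes(typed)));
    }
}
```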

Reproducible Steps: Using OpenRefine [trunk]

This can be reproduced with any text cell value, for instance:

  1. Add text filter on column.
  2. Type into the text filter column …

^tha

  3. Now type one more character, a backslash (\):

^tha\

  4. Notice that you get an "Unexpected internal error" above the dialog.
  5. This is not really an error from the user's point of view, since they are still typing. It is just an incomplete expression, and we should probably be nicer and give them a friendlier, more helpful message, as other data tools do.

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 17 (17 by maintainers)

Most upvoted comments

Thanks @thadguidry, thanks @jackyq2015. I definitely understand that the user needs better feedback on what is happening and how to fix the problem.

What I’m thinking is that I can write something to validate the regular expression on the server side and pass back a meaningful error message in JSON, which would be displayed in place of the current error generated by the Java exception. Does this sound like a good approach?
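A minimal sketch of that approach might look like the following. It compiles the pattern with `java.util.regex.Pattern` and, on failure, returns the description and index from `PatternSyntaxException` as JSON. The class name, method names, and JSON field names (`code`, `message`, `index`) are assumptions made for illustration, and the JSON is built by hand only to keep the example self-contained.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical sketch; class, method, and JSON field names are assumptions.
class RegexValidationSketch {

    // Escapes the characters that would break a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // Returns {"code":"ok"} for a valid pattern, otherwise a JSON error object
    // carrying the description and index reported by java.util.regex.
    static String validate(String regex) {
        try {
            Pattern.compile(regex);
            return "{\"code\":\"ok\"}";
        } catch (PatternSyntaxException e) {
            return "{\"code\":\"error\",\"message\":\""
                    + jsonEscape(e.getDescription())
                    + "\",\"index\":" + e.getIndex() + "}";
        }
    }

    public static void main(String[] args) {
        System.out.println(validate("^tha"));
        System.out.println(validate("^tha\\")); // incomplete pattern from the report
    }
}
```

The exact wording returned by `getDescription()` varies between Java versions, so a client would probably want to display it as-is or map known cases to friendlier text.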

Looking at the regexr example shared by @thadguidry, it catches and reports the following errors:

groupopen:"Unmatched opening parenthesis.",
groupclose:"Unmatched closing parenthesis.",
quanttarg:"Invalid target for quantifier.",
setopen:"Unmatched opening square bracket.",
esccharopen:"Dangling backslash.",
quantrev:"Quantifier minimum is greater than maximum.",
rangerev:"Range values reversed. Start char is greater than end char.",
lookbehind:"Lookbehind is not supported in JavaScript.",
fwdslash:"Unescaped forward slash.",
esccharbad:"Invalid escape sequence."

We won’t need all of these but it looks like a good starting point.
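For the specific case in this report, the "Dangling backslash." entry from the list above could be detected before compiling at all. The sketch below treats a pattern as incomplete when it ends in an odd number of backslashes. The class and method names are hypothetical, and the check deliberately ignores `\Q...\E` quoting, where a trailing backslash is literal.

```java
// Hypothetical sketch; class and method names are assumptions.
class DanglingBackslashSketch {

    // Backslashes escape each other in pairs, so an odd-length run at the
    // end of the pattern leaves one backslash with nothing to escape.
    static boolean hasDanglingBackslash(String regex) {
        int count = 0;
        for (int i = regex.length() - 1; i >= 0 && regex.charAt(i) == '\\'; i--) {
            count++;
        }
        return count % 2 == 1;
    }

    // Returns a friendly message for the incomplete-expression case,
    // or null when this particular check finds nothing to report.
    static String friendlyMessage(String regex) {
        return hasDanglingBackslash(regex) ? "Dangling backslash." : null;
    }

    public static void main(String[] args) {
        System.out.println(friendlyMessage("^tha\\"));   // one trailing backslash
        System.out.println(friendlyMessage("^tha\\\\")); // an escaped backslash
    }
}
```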

Isn’t this the same as #1203?

I see that the development version no longer behaves the same way when the regex box is checked and a bad regular expression is entered. Errors always appear as you type, but the gray spinning wheel no longer shows up. It’s certainly related to #1203.
