obsidian-tasks: malformed boolean query for valid regex search that includes ()

Please check that this issue hasn’t been reported before.

  • I searched previous Bug Reports and didn’t find any similar reports.

Expected Behavior

That the following search should work:

( description regex matches /(buy|order|voucher|lakeland|purchase|\spresent)/i ) OR ( path includes Home/Shopping )

Similar issue

This is similar to https://github.com/obsidian-tasks-group/obsidian-tasks/discussions/1068, but the workaround there was to ignore tasks blocks in template files, whereas in this case the search is meant to be valid.

Current behaviour

It gives:

Tasks query: malformed boolean query -- Invalid token (check the documentation for guidelines)

Steps to reproduce

Paste the following

( description regex matches /(buy|order)/ ) OR ( path includes Shopping )

And preview the results. The error will be seen.

Note: This will work, which is why I believe that the problem is in the boolean-parsing code:

description regex matches /(buy|order)/

Which Operating Systems are you using?

  • Android
  • iPhone/iPad
  • Linux
  • macOS
  • Windows

Obsidian Version

1.1.9

Tasks Plugin Version

1.22.0

Checks

  • I have tried it with all other plugins disabled and the error still occurs

Possible solution

It seems that the boolean-parsing code is treating brackets inside search strings as if they were grouping parentheses.

A workaround is to break down the regular expression to:

( description includes buy ) OR ( description includes order ) OR ( path includes Shopping )

But with my example above, with more query strings, that gets a bit onerous.

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 2
  • Comments: 25 (4 by maintainers)

Most upvoted comments

I just noticed a similar quirk when trying to filter for a filename that included parens: The filter filename includes (something) gives that error, while the same filter with path instead of filename works. I would expect them to behave the same.

Hi @aubreyz,

Oops my bad. I just realised that particular vault had not updated to 5.0.0… No beta I know of 😃 - just covering all bases in case there is a secret one… Sorry about the noise

Hi @aubreyz,

Re the following causing error messages because of this issue, when the file name contains ( or )

```tasks
(description includes [[{{query.file.filenameWithoutExtension}}]]) OR (description includes [[{{query.file.filenameWithoutExtension}}|)
```

Is there any way to recode this to avoid this problem (or is this a different problem given the report that it is fixed)?

Sure.

```tasks
filter by function \
    task.description.includes('[[{{query.file.filenameWithoutExtension}}]]') || \
    task.description.includes('[[{{query.file.filenameWithoutExtension}}|')
```

Yes, my proposal goes:

  1. From: every Boolean with any filter containing ( or ) is broken
    • and there is no workaround
  2. To: every Boolean with any filter containing ) AND (, ) OR ( etc. is broken
    • and the workaround is to rephrase your filter to prevent matching those phrases, such as by lower-casing them or splitting them up in some way
    • I don’t think the regex would be too complicated
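To make the proposal concrete, here is a minimal sketch in Python of the delimiter-based splitting idea. It is not the plugin's actual code: the function name `split_boolean_query`, the `DELIMITER` pattern, and the exact operator list are assumptions for illustration, and it only handles flat expressions with no nested grouping.

```python
import re

# Hypothetical delimiter pattern (an assumption, not the plugin's regex):
# a closing parenthesis, an upper-case Boolean operator, then an opening
# parenthesis. Parentheses anywhere else are left untouched.
DELIMITER = re.compile(r"\)\s*(AND NOT|OR NOT|AND|OR|XOR)\s*\(")


def split_boolean_query(line: str) -> list[str]:
    """Split a flat Boolean query line into alternating filters and operators."""
    text = line.strip()
    if not (text.startswith("(") and text.endswith(")")):
        # Not wrapped in parentheses: treat the whole line as a single filter.
        return [text]
    inner = text[1:-1]
    # re.split keeps the captured operator between the surrounding filters.
    return [part.strip() for part in DELIMITER.split(inner)]


print(split_boolean_query(
    "( description regex matches /(buy|order)/ ) OR ( path includes Shopping )"
))
```

Under these assumptions, parentheses inside the regex no longer split the filter. A filter whose text itself contains `) OR (` would still be split in the wrong place, and because the pattern is case-sensitive, lower-casing that text (`) or (`) sidesteps the split, which is the workaround described in step 2.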

If it’s not hard to implement, I think it goes more from 80% or 90% to > 99%… As in, I doubt many real-world search strings will contain those Boolean operators.

There is another option I am considering. I really want to provide syntax highlighting in Tasks code blocks, and so I have been looking at how that works. This has involved looking at code that others have written to parse programming languages for syntax highlighting.

There is a small chance I may eventually understand enough about the CodeMirror parsing mechanisms to come up with a better parsing solution for this issue too.

If I understand correctly, your suggestion is equivalent to mine here, in its basic notion of gluing the parentheses to their adjacent operators. I think it improves on the way I phrased it, in the sense that it gives a more structured form to the same basic idea. However, it also suffers from the same weakness, which is in step 1: since we do these splits and searches using very simplistic forms, and not actual syntax trees, it is difficult to differentiate between real parentheses and operators and parenthesis-like or operator-like text that appears inside expressions. It can be done, and I think you took it to a more practical level. However, I think that any idea that is based on simple textual preprocessing and not actual syntax trees/grammars will inevitably be a compromise, and as such may not be a worthwhile investment. In other words, if we say the current parsing code is at 95%, and there’s a complicated path to bring it to 97% and a completely different complicated path to bring it to 100%, we would rather aim for the 100% and not take the code deeper into cryptic-regex-land 😃

I am seeing the same issue with filenames containing parentheses: On a separate line, filename does not include 2023-01-06 (Friday).md throws an error, but path does not include 2023-01-06 (Friday).md does not.

But using path does not include 2023-01-06 (Friday).md inside a filter, such as (path does not include 2023-01-06 (Friday).md) AND (path does not include test.md) will also throw an error.

I was really confused as to why a working Tasks query from months ago (granted, I didn’t test it since February) suddenly didn’t work anymore…