OpenSearch: [BUG] Uppercase Regex in String queries search
Regex search on analyzed fields no longer works with capital letters. It behaves as if the analyzer stores the terms lowercased, or as if the standard analyzer is not applied to the query for some reason.
To Reproduce
Steps to reproduce the behavior:
- Create an index and add a doc containing uppercase letters to it (e.g. message: this is a TLS handshake)
- Perform this search request:
{"query": {"bool": {"must": [{"query_string": {"query": "message:/TLS/"}}]}}}
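The two reproduction steps could be scripted roughly as follows. This is only a sketch: the base URL (a local single-node cluster without security) and the index name `regex-repro` are assumptions, not details from the report. The requests are built but not sent.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: local single-node cluster, no auth
INDEX = "regex-repro"               # hypothetical index name


def build_request(method, path, body):
    """Build (but do not send) a JSON request for the cluster."""
    return Request(
        f"{BASE_URL}/{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


# Step 1: index one document containing an uppercase word.
# With dynamic mapping, `message` becomes an analyzed text field.
index_doc = build_request(
    "PUT",
    f"{INDEX}/_doc/1?refresh=true",
    {"message": "this is a TLS handshake"},
)

# Step 2: the regex search from the report.
search = build_request(
    "POST",
    f"{INDEX}/_search",
    {"query": {"bool": {"must": [
        {"query_string": {"query": "message:/TLS/"}}
    ]}}},
)
```

Each request can then be passed to `urllib.request.urlopen` against a running cluster. The number of hits returned by the second request is what the report is about.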
Expected behavior
We should find the doc we added above since it’s an exact match. However, we got 0 results.
Plugins
None
Host/Environment
I used an OpenSearch container with a single node (version 1.2.3), running on iOS 11.6.4
Additional context
If we specify the standard analyzer in the search request, we get the expected results:
{"query": {"bool": {"must": [{"query_string": {"query": "message:/TLS/", "analyzer":"standard"}}]}}}
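The difference between the two requests can be pictured with a small simulation. It is a sketch under stated assumptions, not OpenSearch's actual code path: `index_terms` is a rough stand-in for the standard analyzer (split into words, lowercase), `regex_hits` is a hypothetical helper that matches a pattern against whole indexed terms, and Python's `re` syntax is used in place of Lucene's regex syntax. The `normalize` flag models the assumption that specifying the analyzer causes the pattern to be lowercased before matching.

```python
import re


def index_terms(text):
    """Rough stand-in for the standard analyzer: split into words and lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def regex_hits(pattern, terms, normalize=False):
    """Return the indexed terms that the pattern matches in full.

    With normalize=True the pattern is lowercased first. This naive
    lowercasing is only safe for plain-letter patterns such as "TLS".
    """
    if normalize:
        pattern = pattern.lower()
    return [term for term in terms if re.fullmatch(pattern, term)]


terms = index_terms("this is a TLS handshake")

# The stored term is "tls", so the uppercase pattern finds nothing.
print(regex_hits("TLS", terms))

# Lowercasing the pattern the same way the terms were lowercased matches again.
print(regex_hits("TLS", terms, normalize=True))
```

Under these assumptions, the uppercase pattern fails because it is compared against terms that were already lowercased at index time.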
About this issue
- State: closed
- Created 2 years ago
- Comments: 24 (11 by maintainers)
@reta @nknize we believe that reverting the change (aligning the behavior with the way it worked before the fix) should be the way to go in this case because:
wildcard, which OpenSearch doesn’t support. There is no real reason for this code to exist in OpenSearch. Please let me know what you think about it.
We are, of course, happy to contribute to it if that’s the way to go here.
@AmiStrn FYI
I am totally with you @AmiStrn @alexgnatyuk (https://github.com/opensearch-project/OpenSearch/issues/3578#issuecomment-1162136277); we need @nknize's confirmation that this is the way to go.
@reta I agree with @alexgnatyuk here. This seems like a really easy fix if the code introduced in Elasticsearch 7.9 is reverted, since it is an x-pack feature that is using it. It is really a bug that has gone unnoticed for a while. Users should not have to be explicit about this for the regexes in their queries: many of them use OpenSearch Dashboards and will not know that they also need to specify the analyzer type somehow (that would be really bad UX).
The discussion is about the proposed solution. What do you think? Should we make this change as proposed?