jena: Lucene query with text:prop not working in some cases ?
Version
4.9.0
Question
This query works - searching all the fields:
select * where {
?s text:query ("beer" 10) .
}
However, this query, which should search only the rdfs:label and mt:altLabel fields, returns 0 hits:
select * where {
?s text:query (mt:defQuery "beer" 10) .
}
This query also returns 0 hits:
select * where {
?s text:query (mt:includeNotes "beer" 10) .
}
mytest.ttl excerpt:
# Text index description
<#indexLucene>
    a text:TextIndexLucene ;
    text:directory ".../indexes/mytest" ;
    text:entityMap <#entMap> ;
    text:storeValues true ;
    text:analyzer [
        a text:ConfigurableAnalyzer ;
        text:tokenizer text:StandardTokenizer ;
        text:filters (text:ASCIIFoldingFilter text:LowerCaseFilter)
    ] ;
    text:queryParser text:AnalyzingQueryParser ;
    text:multilingualSupport true ;
    text:propLists (
        [ text:propListProp mt:defQuery ;
          text:props (
              rdfs:label
              mt:altLabel
          ) ;
        ]
        [ text:propListProp mt:includeNotes ;
          text:props (
              rdfs:label
              mt:altLabel
              mt:note
          ) ;
        ]
    ) ;
    .

<#entMap>
    a text:EntityMap ;
    text:defaultField "ftext" ;
    text:entityField "uri" ;
    text:uidField "uid" ;
    text:langField "lang" ;
    text:graphField "graph" ;
    text:map (
        [ text:field "ftext" ; text:predicate rdfs:label ]
        [ text:field "ftext" ; text:predicate mt:altLabel ]
        [ text:field "ftext" ; text:predicate mt:note ]
    ) .
About this issue
- State: open
- Created 8 months ago
- Comments: 20 (11 by maintainers)
Commits related to this issue
- add failing test for GH-2094 — committed to OyvindLGjesdal/jena by OyvindLGjesdal 7 months ago
- GH-2094: Add Field grouping to query string in composeQField to use textField for full query string — committed to OyvindLGjesdal/jena by OyvindLGjesdal 7 months ago
- GH-2094: add failing tests for propList and single predicate/prop — committed to OyvindLGjesdal/jena by OyvindLGjesdal 7 months ago
- GH-2094: Add Field grouping to query string in composeQField — committed to OyvindLGjesdal/jena by OyvindLGjesdal 7 months ago
- GH-2094 add failing tests for propList and multi encoding searches — committed to OyvindLGjesdal/jena by OyvindLGjesdal 7 months ago
- GH-2094: surround composed query with parens to apply field for all words — committed to OyvindLGjesdal/jena by OyvindLGjesdal 7 months ago
- GH-2094: Add failing tests for propList and multi encoding searches — committed to OyvindLGjesdal/jena by OyvindLGjesdal 7 months ago
- GH-2094: Surround query string with field grouping to apply text field to all tokens — committed to OyvindLGjesdal/jena by OyvindLGjesdal 6 months ago
- GH-2094: Remove duplicate query expressions — committed to OyvindLGjesdal/jena by OyvindLGjesdal 6 months ago
- Revert "GH-2094: Remove duplicate query expressions" This reverts commit 321e95060ae80fd2e491cd130f58bdfb41b019be. — committed to OyvindLGjesdal/jena by OyvindLGjesdal 6 months ago
- GH-2094: Remove duplicate query expressions — committed to OyvindLGjesdal/jena by OyvindLGjesdal 6 months ago
- GH-2094: Remove duplicate query expressions — committed to OyvindLGjesdal/jena by OyvindLGjesdal 6 months ago
- GH-2094: Remove duplicate query expressions — committed to OyvindLGjesdal/jena by OyvindLGjesdal 6 months ago
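The commit messages above describe the change as wrapping the composed query string in a field group so that the text field applies to every token. In Lucene's classic query syntax, a field prefix binds only to the term that immediately follows it, and later terms fall back to the parser's default field; parentheses extend the prefix to the whole group. A minimal Python sketch of that idea, where `group_field` is a hypothetical helper for illustration and not Jena's `composeQField`:

```python
def group_field(field: str, query: str) -> str:
    """Wrap the whole query string so the field prefix covers every term.

    Hypothetical helper for illustration only; not Jena's composeQField.
    """
    return f"{field}:({query})"


# Without grouping, only "beer" is tied to ftext; "lager" goes to the default field.
ungrouped = "ftext:" + "beer lager"

# With grouping, both terms are searched in ftext.
grouped = group_field("ftext", "beer lager")

print(ungrouped)  # ftext:beer lager
print(grouped)    # ftext:(beer lager)
```

The field name `ftext` is taken from the entity map in the question; the two-word query is only there to make the difference visible.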
Hi @filak and thanks for the precise examples, and thanks for the ping.
I’m having trouble replicating the issues described.
One thing I notice in the test data is that the mx namespace isn’t mentioned. What is the prefix mx: in mx:alt_label? Is it just a typo in the example?
I copied one of the existing tests using propLists to recreate the errors, and got the 3 expected results back when using the test data, but no items back when I tried to replicate the other example.
I first got a warning message while running the test, and had to add the property in question to the text map and rerun the test, this time without a warning, to get the expected result back.
Was the "props are not indexed" step above silent when you ran it?
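One way to rule that condition out by hand is to compare every property named in text:propLists against the predicates in text:map. A minimal sketch in Python, assuming the prefixed names can be compared as plain strings; the data is copied from the configuration in the question, and `unmapped_props` is a hypothetical helper, not part of Jena:

```python
# Predicates mapped to index fields in the entity map (text:map) from the question.
entity_map = {
    "rdfs:label": "ftext",
    "mt:altLabel": "ftext",
    "mt:note": "ftext",
}

# Property lists (text:propLists) from the question.
prop_lists = {
    "mt:defQuery": ["rdfs:label", "mt:altLabel"],
    "mt:includeNotes": ["rdfs:label", "mt:altLabel", "mt:note"],
}


def unmapped_props(prop_lists, entity_map):
    """Return {list property: [props without a text:map entry]} for lists that have any."""
    missing = {}
    for list_prop, props in prop_lists.items():
        absent = [p for p in props if p not in entity_map]
        if absent:
            missing[list_prop] = absent
    return missing


print(unmapped_props(prop_lists, entity_map))  # {} for the configuration in the question
```

An empty result only says the lists and the map agree with each other; it says nothing about what is actually stored in the index.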
I’m not sure what happens in the second step, but one thing the example above made me think of is that there may have been leftover documents in the Lucene folder, if it wasn’t deleted during debugging.
I think that running the java command for reindexing doesn’t delete existing Lucene documents. My information might be outdated or wrong on this, but we still delete the Lucene folder before running indexing on an offline database during CI jobs.
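The "delete the Lucene folder first" step can be sketched as below. This is an illustration in Python rather than our actual CI script; `remove_index_dir` is a hypothetical helper, and the throwaway directory stands in for whatever text:directory points at:

```python
import shutil
import tempfile
from pathlib import Path


def remove_index_dir(index_dir: Path) -> None:
    """Delete the Lucene index directory and everything in it, if it exists."""
    if index_dir.exists():
        shutil.rmtree(index_dir)


# Demonstration against a temporary directory, not a real index.
with tempfile.TemporaryDirectory() as tmp:
    index_dir = Path(tmp) / "mytest"
    index_dir.mkdir()
    (index_dir / "leftover").write_text("stale")  # pretend leftover index file
    remove_index_dir(index_dir)
    print(index_dir.exists())  # False
```

Run something like this only while the database is offline, and before the reindexing command, so that the index is rebuilt from an empty directory.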
See the two tests which pass at https://github.com/apache/jena/compare/main...OyvindLGjesdal:jena:debug-text-prop-not-working-in-some-cases
I didn’t replicate your configuration in the test, so something else could also be breaking, but I hope this helps.