Posted to dev@lucene.apache.org by "Karl Wolf (JIRA)" <ji...@apache.org> on 2019/04/03 17:46:00 UTC
[jira] [Created] (SOLR-13367) Highlighting fails for Range queries on Multi-valued String fields
Karl Wolf created SOLR-13367:
--------------------------------
Summary: Highlighting fails for Range queries on Multi-valued String fields
Key: SOLR-13367
URL: https://issues.apache.org/jira/browse/SOLR-13367
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: highlighter
Affects Versions: 7.7.1, 7.5
Environment: RedHat Linux v7
Java 1.8.0_201
Reporter: Karl Wolf
Fix For: 5.1
Range queries against multi-valued string fields produce useless highlighting, even though "hl.highlightMultiTerm":"true" is set.
I have uncovered what I believe is a bug. At the very least it is a difference in behavior between Solr v5.1.0 and v7.5.0 (and v7.7.1).
I have a multi-valued string Field defined in my schema as:
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<field name="MyStringField" type="string" indexed="true" stored="true" multiValued="true" />
I am using a query containing a Range clause and I am using highlighting to get the list of values that actually matched the range query.
All examples below were run from the Query page of the appropriate core in the Solr Admin UI.
***************************************************************************
First, a correctly working example of a range query using Solr v5.1.0 which produces useful results:
{
"responseHeader": {
"status": 0,
"QTime": 366,
"params": {
"q": "MyStringField:[A TO B}",
"hl": "true",
"indent": "true",
"hl.preserveMulti": "true",
"fl": "MyStringField,MyUniqueID",
"hl.requireFieldMatch": "true",
"hl.usePhraseHighlighter": "true",
"hl.fl": "MyStringField",
"wt": "json",
"hl.highlightMultiTerm": "true",
"_": "1553275722025"
}
},
"response": {
"numFound": 999,
"start": 0,
"docs": [
{
"MyStringField": [
"Stanley, Wendell M.",
"Avery, Roy"
],
"MyUniqueID": "UniqueID1"
},
{
"MyStringField": [
"Avery, Roy"
],
"MyUniqueID": "UniqueID2"
},
*** lots more docs correctly found
]
},
*** we get to the highlighting portion of the response
*** this indicates which values of each MyStringField
*** actually matched the query
"highlighting": {
"UniqueID1": {
"MyStringField": [
"<em>Avery, Roy</em>"
]
},
"UniqueID2": {
"MyStringField": [
"<em>Avery, Roy</em>"
]
},
"UniqueID3": {
"MyStringField": [
"<em>American Institute of Biological Sciences</em>",
"<em>Albritton, Errett C.</em>"
]
},
... etc.
*** lots more useful highlight values. Note the two matching values
*** for document UniqueID3.
}
***************************************************************************
* THE PROBLEM
* Now using newer versions of Solr
***************************************************************************
Using the exact same parameters with Solr v7.5.0 or v7.7.1, the top portion of the
response is basically the same, including the number of documents found:
{
"responseHeader":{
"status":0,
"QTime":245,
"params":{
"q":"MyStringField:[A TO B}",
"hl":"on",
"hl.preserveMulti":"true",
"fl":"MyUniqueID, MyStringField",
"hl.requireFieldMatch":"true",
"hl.fl":"MyStringField",
"hightlightMultiTerm":"true",
"wt":"json",
"_":"1553105129887",
"usePhraseHighLighter":"true"}},
"response":{"numFound":999,"start":0,"docs":[
*** The problem is with the highlighting portion of the results, which is effectively empty.
*** There is no way to know which values in each document actually matched the query:
"highlighting":{
"UniqueID1":{},
"UniqueID2":{},
"UniqueID3":{},
... etc.
*** NOTE: The source data is the same for all of the tested Solr versions and the Solr indexes
*** were properly rebuilt for each Solr version.
***************************************************************************
Changing the request to use the "unified" highlighter ("hl.method=unified"), the highlighting looks like:
"highlighting":{
"UniqueID1":{
"MyStringField":[]},
"UniqueID2":{
"MyStringField":[]},
"UniqueID3":{
"MyStringField":[]},
... etc.
*** The highlighting now properly lists the matching field but still no useful values are listed.
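For anyone who wants to confirm the symptom programmatically, here is a minimal Python sketch. It assumes the "highlighting" section of the JSON response has already been parsed into a dict; the helper name docs_without_highlights is hypothetical and not part of Solr.

```python
def docs_without_highlights(highlighting):
    """Return the IDs whose highlighting entry contains no snippets.

    Covers both shapes shown above: an empty object ({}) and a field
    listed with an empty snippet array ({"MyStringField": []}).
    """
    return [
        doc_id
        for doc_id, fields in highlighting.items()
        if not any(snippets for snippets in fields.values())
    ]
```

With the 7.x responses above, every returned document ID would end up in this list; with the 5.1.0 response, none of the documents shown would.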
***************************************************************************
NOTE: if I change the query from using a Range clause to using a Wildcard query: q="MyStringField:A*"
the highlighting is correct in both Solr v7.5.0 and v7.7.1: These are GOOD results!
"highlighting":{
"UniqueID1": {
"MyStringField": ["<em>Avery, Roy</em>"]},
"UniqueID2": {
"MyStringField": ["<em>Avery, Roy</em>"]},
"UniqueID3": {
"MyStringField": [
"<em>American Institute of Biological Sciences</em>",
"<em>Albritton, Errett C.</em>"
]
},
... etc.
*** This makes me think there is some problem with the way a Range query
*** feeds the search results to the Solr Highlighter code.
***************************************************************************
Varying the hl.* parameters or any other query parameters did not solve the problem.
The wildcard query is my current workaround, but the problem with range queries remains.
In summary, there is some incompatibility among:
1) A multi-valued string field AND
2) A range query against that field AND
3) Highlighting of the results, which comes back effectively empty.
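Until the highlighter behaves again, the matching values could in principle be recomputed on the client from the stored field values. The Python sketch below is only an illustration of that idea, not something tested against Solr itself. It assumes that a StrField range compares the raw values in Unicode code point order (which is also how Python compares strings), and the function name values_in_range is hypothetical.

```python
def values_in_range(values, lower, upper,
                    include_lower=True, include_upper=False):
    """Return the values of a multi-valued field that fall inside a range.

    The defaults mirror the query MyStringField:[A TO B} used above:
    lower bound inclusive, upper bound exclusive.
    Assumption: plain string comparison matches the index ordering.
    """
    def in_range(value):
        above = value >= lower if include_lower else value > lower
        below = value <= upper if include_upper else value < upper
        return above and below

    return [value for value in values if in_range(value)]
```

For the first document in the 5.1.0 response, values_in_range(["Stanley, Wendell M.", "Avery, Roy"], "A", "B") keeps only "Avery, Roy", which is the value the old highlighter reported.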
I don't know when this issue was first introduced. I have recently been updating from 5.1.0
to 7.5.0 in one big leap. I have attempted to read through the change logs for the intervening
versions but I gave up to save my sanity.
You should be able to reproduce this issue using any multi-valued, indexed and stored string field.
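To make reproduction easier outside the Admin UI, here is a minimal Python sketch that builds the same request as a /select URL. The base URL, core name, and function name are placeholders I made up for illustration; the request parameters are the ones echoed in the v5.1.0 response above, plus an optional hl.method.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, field, lower, upper,
                     id_field="MyUniqueID", hl_method=None):
    """Build a /select URL for the range query with highlighting enabled.

    base_url and core are placeholders for your own installation.
    The range uses an inclusive lower and exclusive upper bound: [lower TO upper}
    """
    params = {
        "q": f"{field}:[{lower} TO {upper}}}",
        "fl": f"{field},{id_field}",
        "wt": "json",
        "hl": "true",
        "hl.fl": field,
        "hl.preserveMulti": "true",
        "hl.requireFieldMatch": "true",
        "hl.usePhraseHighlighter": "true",
        "hl.highlightMultiTerm": "true",
    }
    if hl_method is not None:
        # e.g. "unified", as tried above
        params["hl.method"] = hl_method
    return f"{base_url}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and core name; adjust to your setup.
    print(build_select_url("http://localhost:8983/solr", "mycore",
                           "MyStringField", "A", "B"))
```

Fetching the resulting URL against a 5.1.0 core and a 7.x core holding the same data should show the difference in the "highlighting" section.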