Posted to solr-user@lucene.apache.org by Johannes Siegert <jo...@marktjagd.de> on 2014/04/14 15:03:53 UTC
changed query behavior
Hi,
I have updated my Solr instance from 4.5.1 to 4.7.1.
Now my Solr query is failing some tests.
Query: q=*:*&fq=(title:((T&E)))&debug=true
Before the update:
<lst name="debug">
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>
<lst name="explain"/>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries">
<str>(title:((T&E)))</str>
</arr>
<arr name="parsed_filter_queries">
<str>+title:t&e +title:t +title:e</str>
</arr>
...
After the update:
<lst name="debug">
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>
<lst name="explain"/>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries">
<str>(title:((T&E)))</str>
</arr>
<arr name="parsed_filter_queries">
<str>+((title:t&e title:t)/no_coord) +title:e</str>
</arr>
...
Before the update, the query returned only one result. Now the query
returns three results.
Do you have any idea why the parsed_filter_queries is "+((title:t&e
title:t)/no_coord) +title:e" instead of "+title:t&e +title:t +title:e"?
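To make the difference between the two parsed filter queries concrete, here is a minimal Python sketch of their boolean semantics. It is not Solr code: the documents and their token sets are hypothetical, chosen only to show how a change from "all three terms required" to "(t&e OR t) AND e" can widen the result set.

```python
# Toy model of the two parsed filter queries (not Solr code).
# Each hypothetical document is represented by its set of indexed title terms.

def matches_old(terms):
    # +title:t&e +title:t +title:e  -> all three terms are required
    return {"t&e", "t", "e"} <= terms

def matches_new(terms):
    # +((title:t&e title:t)/no_coord) +title:e  -> (t&e OR t) AND e
    return bool({"t&e", "t"} & terms) and "e" in terms

# Hypothetical documents; these are NOT the data from the report.
docs = {
    "doc1": {"t&e", "t", "e", "te"},  # e.g. a title containing "T&E"
    "doc2": {"t", "e"},               # e.g. a title containing "T E"
    "doc3": {"e", "t", "shirt"},      # e.g. a title containing "E T Shirt"
    "doc4": {"t"},                    # has no "e" term at all
}

old_hits = sorted(name for name, terms in docs.items() if matches_old(terms))
new_hits = sorted(name for name, terms in docs.items() if matches_new(terms))

print(old_hits)
print(new_hits)
```

In this toy data set the old form matches only the document that contains the preserved original term "t&e", while the new form also matches documents that contain just "t" and "e".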
"title"-field definition:
<fieldType name="text_title" class="solr.TextField"
positionIncrementGap="100" omitNorms="true">
<analyzer type="index">
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
splitOnNumerics="1" preserveOriginal="1" stemEnglishPossessive="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory"
synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"
splitOnNumerics="0" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
The default query operator is AND.
Thanks!
Johannes
Bug within the solr query parser (version 4.7.1)
Posted by Johannes Siegert <jo...@marktjagd.de>.
Hi,
I have updated my Solr instance from 4.5.1 to 4.7.1. Now the parsed
query no longer seems to be correct.
Query: q=*:*&fq=title:T&E&debug=true
Before the update, the parsed filter query was "+title:t&e +title:t
+title:e". After the update, the parsed filter query is "+((title:t&e
title:t)/no_coord) +title:e". This looks like a bug in the query parser.
I have also validated the parsed filter query with the analysis
component. The result was "+title:t&e +title:t +title:e".
The behavior is the same for all special characters that split a word
into two parts.
I use the following WordDelimiterFilter on query side:
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="0" catenateNumbers="0"
catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0"
preserveOriginal="1"/>
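As a rough illustration of what this filter configuration does to "T&E", here is a simplified Python sketch. It is a toy model, not the Lucene implementation: the function names are hypothetical, and it assumes that the preserved original token shares its position with the first generated part. That assumption is consistent with the grouping "(title:t&e title:t)" seen in the new parsed filter query, but it is not confirmed here.

```python
import re

def split_with_original(token):
    """Toy model of a word-delimiter step with preserveOriginal=1,
    followed by lowercasing. Returns (term, position) pairs."""
    # Split on any run of non-alphanumeric characters, dropping empty parts.
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", token) if p]
    out = []
    if parts != [token]:
        # Assumption: the preserved original sits at position 0,
        # the same position as the first generated part.
        out.append((token.lower(), 0))
    for pos, part in enumerate(parts):
        out.append((part.lower(), pos))
    return out

def group_by_position(pairs):
    """Collect the terms that share a position, in position order."""
    groups = {}
    for term, pos in pairs:
        groups.setdefault(pos, []).append(term)
    return [groups[p] for p in sorted(groups)]

tokens = split_with_original("T&E")
print(tokens)
print(group_by_position(tokens))
```

Under this model, "t&e" and "t" land in one position group and "e" in the next. A parser that treats terms at the same position as alternatives would then produce a shape like "+(t&e t) +e", whereas one that requires every term would produce "+t&e +t +e".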
Thanks.
Johannes
Additional information:
Debug before the update:
<lst name="debug">
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>
<lst name="explain"/>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries">
<str>(title:((T&E)))</str>
</arr>
<arr name="parsed_filter_queries">
<str>+title:t&e +title:t +title:e</str>
</arr>
...
Debug after the update:
<lst name="debug">
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>
<lst name="explain"/>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries">
<str>(title:((T&E)))</str>
</arr>
<arr name="parsed_filter_queries">
<str>+((title:t&e title:t)/no_coord) +title:e</str>
</arr>
...
"title"-field definition:
<fieldType name="text_title" class="solr.TextField"
positionIncrementGap="100" omitNorms="true">
<analyzer type="index">
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
splitOnNumerics="1" preserveOriginal="1" stemEnglishPossessive="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory"
synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"
splitOnNumerics="0" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>