Posted to solr-user@lucene.apache.org by Johannes Siegert <jo...@marktjagd.de> on 2014/04/15 16:36:09 UTC
Bug within the solr query parser (version 4.7.1)
Hi,
I have updated my Solr instance from 4.5.1 to 4.7.1. Since the update, the
parsed query no longer seems to be correct.
Query: q=*:*&fq=title:T&E&debug=true
Before the update, the parsed filter query was "+title:t&e +title:t
+title:e". After the update, it is "+((title:t&e title:t)/no_coord)
+title:e". This looks like a bug in the query parser.
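A note for anyone reproducing the request: the "&" in "T&E" has to be percent-encoded when the query is sent over HTTP, otherwise it is read as a parameter separator. The following is a minimal sketch of building the query string. Only the three parameters come from the query above; using Python's standard library for this is an assumption made for illustration.

```python
from urllib.parse import urlencode, parse_qs

# Parameters taken from the query above.
params = {"q": "*:*", "fq": "title:T&E", "debug": "true"}

# urlencode percent-encodes reserved characters, so "&" becomes "%26"
# and is not mistaken for a parameter separator.
query_string = urlencode(params)
print(query_string)

# Decoding the string again yields the original filter query unchanged.
decoded = parse_qs(query_string)
print(decoded["fq"][0])
```

The resulting string can be appended to the select URL of whichever core is being tested.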
I have also checked the filter query with the analysis component. The
result was "+title:t&e +title:t +title:e", which matches the parsed
filter query from before the update.
The behavior is the same for all special characters that split a word
into two parts.
I use the following WordDelimiterFilter on the query side:
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="0" catenateNumbers="0"
catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0"
preserveOriginal="1"/>
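To illustrate which tokens these settings are expected to yield for "T&E", here is a rough Python sketch. It is a simplified stand-in and not the Lucene implementation: the function name `split_word_parts` is hypothetical, it handles ASCII letters and digits only, and it ignores token positions, which is exactly where the two parsed queries differ.

```python
import re

def split_word_parts(token):
    """Rough stand-in for the query-side filter settings shown above."""
    # Treat every run of non-alphanumeric characters as a delimiter.
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", token) if p]
    # preserveOriginal="1": the unsplit token is always kept.
    tokens = [token]
    # generateWordParts="1" / generateNumberParts="1": add the sub-parts,
    # but only if the token was actually split.
    if parts != [token]:
        tokens.extend(parts)
    # The LowerCaseFilterFactory runs after the delimiter filter.
    return [t.lower() for t in tokens]

print(split_word_parts("T&E"))
```

For "T&E" this gives the three terms t&e, t and e, the same terms that appear in the analysis result quoted in this message.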
Thanks.
Johannes
Additional information:
Debug before the update:
<lst name="debug">
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>
<lst name="explain"/>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries">
<str>(title:((T&E)))</str>
</arr>
<arr name="parsed_filter_queries">
<str>+title:t&e +title:t +title:e</str>
</arr>
...
Debug after the update:
<lst name="debug">
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>
<lst name="explain"/>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries">
<str>(title:((T&E)))</str>
</arr>
<arr name="parsed_filter_queries">
<str>+((title:t&e title:t)/no_coord) +title:e</str>
</arr>
...
"title"-field definition:
<fieldType name="text_title" class="solr.TextField"
positionIncrementGap="100" omitNorms="true">
<analyzer type="index">
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
splitOnNumerics="1" preserveOriginal="1" stemEnglishPossessive="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory"
synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"
splitOnNumerics="0" preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>