Posted to solr-user@lucene.apache.org by Fr...@LEW-verteilnetz.DE on 2016/10/31 08:24:10 UTC

Different results for comma and whitespace separated query string using eDisMax Query Parser

Hi,

I get different results for a query string whose terms are separated by a comma and one whose terms are separated by whitespace,

   "q":"foo,bar",
   "q":"foo bar",

although the field type uses solr.StandardTokenizerFactory. The queries are parsed by the eDisMax query parser,
and the fields to search are set with the 'qf' parameter:

   "defType":"edismax",
   "qf":"STREET_NAME COMMPART_NAME",

The difference also shows up in the parsedquery debug output:

Whitespace:
    "rawquerystring":"foo bar",
    "querystring":"foo bar",
    "parsedquery":"(+(DisjunctionMaxQuery((STREET_NAME:foo | COMMPART_NAME:foo)) DisjunctionMaxQuery((STREET_NAME:bar | COMMPART_NAME:bar))))/no_coord",
    "parsedquery_toString":"+((STREET_NAME:foo | COMMPART_NAME:foo) (STREET_NAME:bar | COMMPART_NAME:bar))",
    "explain":{},
    "QParser":"ExtendedDismaxQParser",

Comma:
    "rawquerystring":"foo,bar",
    "querystring":"foo,bar",
    "parsedquery":"(+DisjunctionMaxQuery(((STREET_NAME:foo STREET_NAME:bar) | (COMMPART_NAME:foo COMMPART_NAME:bar))))/no_coord",
    "parsedquery_toString":"+((STREET_NAME:foo STREET_NAME:bar) | (COMMPART_NAME:foo COMMPART_NAME:bar))",
    "explain":{},
    "QParser":"ExtendedDismaxQParser",
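
For anyone who wants to reproduce this, debug output of this kind can be requested by adding debugQuery=true to the select request. Below is a minimal Python sketch that only builds the two request URLs; the host, port and core name ("mycore") are placeholders and not my actual setup.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and core name are placeholders.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def debug_url(q):
    """Build a select URL that asks Solr for parsed-query debug output."""
    params = {
        "q": q,
        "defType": "edismax",
        "qf": "STREET_NAME COMMPART_NAME",
        "debugQuery": "true",  # adds the "debug" section to the response
        "wt": "json",
    }
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    # The two query strings differ only in the separator.
    for q in ("foo bar", "foo,bar"):
        print(debug_url(q))
```

Fetching both URLs and comparing the "parsedquery" entries should show the two structures quoted above.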

The way I understand the standard tokenizer, both query strings should be split in the same way,
treating whitespace and punctuation as delimiters.

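However, the two separators evidently produce different query structures.
In the first case (whitespace), each term gets its own DisjunctionMaxQuery across both fields, and the two resulting scores are added together.
In the second case (comma), there is a single DisjunctionMaxQuery over one clause per field, so only one (the maximum) of the per-field scores is returned.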
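
To make the scoring difference concrete, here is a small Python sketch of how the two parsed structures combine scores. The per-term, per-field scores are invented for illustration, and the model is simplified: every clause is assumed to match, and the tie-breaker defaults to 0.

```python
def dismax(clause_scores, tie=0.0):
    """DisjunctionMaxQuery-style score: best clause plus tie * the others."""
    best = max(clause_scores)
    return best + tie * (sum(clause_scores) - best)


# Invented scores per field and term, for illustration only.
FIELD_SCORES = {
    "STREET_NAME": {"foo": 2.0, "bar": 0.5},
    "COMMPART_NAME": {"foo": 1.0, "bar": 3.0},
}


def whitespace_score(scores, terms, tie=0.0):
    """'foo bar': one dismax per term across fields, then summed."""
    return sum(dismax([scores[f][t] for f in scores], tie) for t in terms)


def comma_score(scores, terms, tie=0.0):
    """'foo,bar': terms summed within each field, then one dismax across fields."""
    return dismax([sum(scores[f][t] for t in terms) for f in scores], tie)


if __name__ == "__main__":
    terms = ["foo", "bar"]
    print(whitespace_score(FIELD_SCORES, terms))  # 2.0 + 3.0 = 5.0
    print(comma_score(FIELD_SCORES, terms))       # max(2.5, 4.0) = 4.0
```

With these numbers the whitespace form rewards the best field for each term separately, while the comma form rewards only the single best field overall.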

Any ideas what I am missing here?

I am using Solr 6.2.0.
Configuration details:
   <field name="STREET_NAME" type="text_de" indexed="true" stored="true"/>
   <field name="COMMPART_NAME" type="text_de" indexed="true" stored="true"/>
and
   <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
     <analyzer>
       <tokenizer class="solr.StandardTokenizerFactory"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.StopFilterFactory" format="snowball" words="lang/stopwords_de.txt" ignoreCase="true"/>
       <filter class="solr.GermanNormalizationFilterFactory"/>
       <filter class="solr.GermanLightStemFilterFactory"/>
     </analyzer>
   </fieldType>


Thanks and all the best,

Frank

-- 

Frank Zirkelbach
LEW Verteilnetz GmbH (LVN), GIS/NIS
Schaezlerstraße 3, 86150 Augsburg

Tel. (internal): 71-1379
Tel. (external): +49-821-328-1379
Fax (external): +49-821-328-1360
mailto:Frank.Zirkelbach@LEW-verteilnetz.DE
www.lew-verteilnetz.de

Chairman of the Supervisory Board: Dr. Markus Litpher;
Managing Directors: Manfred Lux, Theo Schmidtner, Eugen Wiedemann
Registered office: Augsburg; VAT ID no. DE240432124
Commercial register HRB 20929, registry court: Amtsgericht Augsburg