Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2011/05/03 00:49:37 UTC

Re: Dismax Minimum Match/Stopwords Bug

: However, is there an actual fix in the 3.1 eDisMax parser which solves 
: the problem for real? Cannot find a JIRA issue for it.

edismax uses the same query structure as dismax, which means it's not 
possible to "fix" anything here ... it's how the query parsers work.

each "word" from the query string is run through the analyzer of each 
field in the "qf", and the result is used as a query for that "word" 
against that field.  The individual clauses for each word are aggregated 
into a DisjunctionMaxQuery, and the set of DisjunctionMaxQueries is then 
combined into a BooleanQuery (with the appropriate minNrShouldMatch set).

if a "word" from the input produces no output from the analyzers of *any* 
of the fields, then the resulting DisjunctionMaxQuery is empty and 
dropped from the final BooleanQuery ... so if a "word" in the query string 
is a stop word for *every* field in the qf, there is no clause.  but if 
*any* field in the qf produces a term for it, then there is a 
DisjunctionMaxQuery for that word added to the main BooleanQuery, and 
that clause counts toward minNrShouldMatch like any other.
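
a toy sketch of that structure in Python may make the dropped-clause 
behavior easier to see.  This is not Solr code: the two analyzers, the 
field names (title, body, sku), the stop word list and build_query are all 
made up for illustration, and a plain dict stands in for each 
DisjunctionMaxQuery.

```python
# Toy model of the dismax query structure; hypothetical names throughout.
STOP = {"the", "a", "of"}

def with_stopwords(word):
    # Hypothetical analyzer: lowercases and removes stop words.
    w = word.lower()
    return [] if w in STOP else [w]

def keep_everything(word):
    # Hypothetical analyzer: lowercases only, no stop word removal.
    return [word.lower()]

def build_query(query_string, qf):
    """qf maps field name -> analyzer function.

    Returns one dict per surviving word; each dict stands in for a
    DisjunctionMaxQuery and maps field -> terms produced for that word.
    The returned list stands in for the outer BooleanQuery.
    """
    clauses = []
    for word in query_string.split():
        dismax = {}
        for field, analyze in qf.items():
            terms = analyze(word)
            if terms:
                dismax[field] = terms
        if dismax:  # an empty DisjunctionMaxQuery is dropped
            clauses.append(dismax)
    return clauses

# "the" is a stop word in every field: one clause remains.
all_stopped = build_query("the matrix",
                          {"title": with_stopwords, "body": with_stopwords})

# "the" survives in one field: two clauses remain.
mixed = build_query("the matrix",
                    {"title": with_stopwords, "sku": keep_everything})

print(len(all_stopped), len(mixed))
```

in the second case the clause for "the" only contains the sku field, but 
it is still a clause of the outer query.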

As i've said many times: this isn't a bug, it's a fundamental point of the 
parser and the structure of the query.

The best "solution" for people who get bit by this (in my opinion) is not 
to give up on stop words -- if you want to use stop words, by all means 
use stop words.  BUT! You must use them in all the fields of your qf ... 
even fields where you think "why in god's name would i need stopwords on 
this field, those terms will never exist in this field!" ... you may know 
that, and it may be true, but it doesn't change the fact that people will 
be *querying* for stop words against those fields, and you want to ignore 
them when they do.
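
to see why this matters once mm comes into play, here is a rough 
simulation in Python.  It is only a sketch under assumptions: the 
document, the field names, the boolean "uses stop words" config and both 
functions are hypothetical, terms are matched by simple set membership, 
and mm is modeled as a fraction of the clause count rounded down.

```python
# Toy simulation of mm=100% against a single document; hypothetical names.
STOP = {"the", "a", "of"}

def clauses_for(query, field_uses_stopwords):
    """field_uses_stopwords maps field name -> bool (made-up config).

    Returns (word, fields) pairs; a word whose field list would be
    empty produces no clause at all.
    """
    out = []
    for word in query.lower().split():
        fields = [f for f, stops in field_uses_stopwords.items()
                  if not (stops and word in STOP)]
        if fields:
            out.append((word, fields))
    return out

def matches(doc, clauses, mm_fraction=1.0):
    """doc maps field -> set of indexed terms; mm_fraction=1.0 models mm=100%."""
    hits = sum(1 for word, fields in clauses
               if any(word in doc.get(f, set()) for f in fields))
    required = int(len(clauses) * mm_fraction)
    return bool(clauses) and hits >= required

doc = {"title": {"matrix"}, "sku": {"mx-1999"}}

# sku has no stop word filter, so "the" becomes a required clause.
mixed = clauses_for("the matrix", {"title": True, "sku": False})

# stop words on every field in qf: "the" produces no clause.
consistent = clauses_for("the matrix", {"title": True, "sku": True})

print(matches(doc, mixed), matches(doc, consistent))
```

with the mixed config the document is rejected because nothing in sku 
contains "the"; with stop words on both fields the same query matches.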



-Hoss