Posted to solr-user@lucene.apache.org by vinay kumar kaku <vk...@hotmail.com> on 2008/12/11 07:38:27 UTC

minimum match issue with dismax

Hi,
  Does anyone know how to verify that minimum match (mm) in dismax is working? I change the values and run solrCtl restart indexname, but I don't see the change taking effect. Does anybody have an idea about this?


thank you
vinay


RE: minimum match issue with dismax

Posted by Chris Hostetter <ho...@fucit.org>.
:      <str name="mm">2&lt;-25%</str>
:     </lst>
:   </requestHandler>
: 
: correct me if i am wrong, but doesn't the above mm config mean that if there 
: are 1 or 2 terms then all must match, but if there are more than 2 terms 
: then 25% can be missing? i get the below as the parsed query

This is exactly what you asked in another thread, which i answered in the 
other thread...

http://www.nabble.com/multiword-query-using-dismax-to20920925.html#a20920925

...by referring to this exact example in the dismax docs...

http://wiki.apache.org/solr/DisMaxRequestHandler

both of the examples below ultimately mean that for more than 2 clauses 
"25% can be missing" ... the difference is in how fractions are dealt 
with: the percentage of the total number of clauses is computed and then 
rounded down to get an integer.  If there is a minus sign in front of the 
percentage, that integer is subtracted from the total number of clauses...

   2<-25% ... 25% of 3 is 0.75, 0.75 rounded down is 0, 3-0 is 3
   2<75%  ... 75% of 3 is 2.25, 2.25 rounded down is 2.

this is also mentioned *explicitly* in the other URL i referred you to last 
time... 
http://lucene.apache.org/solr/api/org/apache/solr/util/doc-files/min-should-match.html

>> When dealing with percentages, negative values can be used to get 
>> different behavior in edge cases. 75% and -25% mean the same thing when 
>> dealing with 4 clauses, but when dealing with 5 clauses 75% means 3 are 
>> required, but -25% means 4 are required.

It sounds like you want 75% instead of -25%
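
As a rough illustration of the rounding rule described above, here is a minimal Python sketch. The function name min_should_match is hypothetical and this is not Solr's actual implementation; it assumes the spec is either a plain value or a single "N<value" condition, and that at or below the threshold N all clauses are required. It does no clamping or validation.

```python
def min_should_match(spec: str, num_clauses: int) -> int:
    """Sketch of the mm rule for a plain value or one 'N<value' condition."""
    value = spec
    if "<" in spec:
        threshold, value = spec.split("<", 1)
        if num_clauses <= int(threshold):
            # At or below the threshold, every clause is required.
            return num_clauses
    if value.endswith("%"):
        percent = int(value[:-1])
        # Percentage of the total, rounded down to an integer.
        count = (num_clauses * abs(percent)) // 100
        # A minus sign means "this many may be missing".
        return num_clauses - count if percent < 0 else count
    n = int(value)
    return num_clauses + n if n < 0 else n


for spec in ("2<-25%", "2<75%", "2<-50%"):
    print(spec, "with 3 clauses ->", min_should_match(spec, 3))
```

Under these assumptions, 2<-25% on a 3 term query requires all 3 clauses, 2<75% requires 2, and 2<-50% also requires 2.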


-Hoss


RE: minimum match issue with dismax

Posted by vinay kumar kaku <vk...@hotmail.com>.
Thank you Chris,
   I tried your suggestion and it worked, but one more problem for me is that I have defined my request handler in solrconfig.xml as below


<requestHandler name="dismax" class="solr.DisMaxRequestHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0</float>
    <str name="qf">
      txt1^1.1 txt2^1.2 txt3^1.3 txt4^1.4 txt5^1.5
    </str>
    <str name="pf">
      txt4^1.4 txt5^1.5
    </str>
    <str name="ps">1</str>
    <str name="bf">total_boost</str>
    <str name="mm">2&lt;-25%</str>
  </lst>
</requestHandler>

Correct me if I am wrong, but doesn't the above mm config mean that if there are 1 or 2 terms then all must match, but if there are more than 2 terms then 25% can be missing?
I get the below as the parsed query

<str name="parsedquery_toString">+(((txt2:monti^1.2 | txt3:monti^1.3 | txt1:monti^1.1 | txt4:monti^1.4 | txt5:monti^1.5) (txt2:python^1.2 | txt3:python^1.3 | txt1:python^1.1 | txt4:python^1.4 | txt5:python^1.5) (txt2:scandal^1.2 | txt3:scandal^1.3 | txt1:scandal^1.1 | txt4:scandal^1.4 | txt5:scandal^1.5))~3) (txt4:"monti python scandal"~1^1.4 | txt5:"monti python scandal"~1^1.5) sum(product(float(total_boost),const(0.7)),60.0/(1.0*float(rord(upload_time))+2.0))
</str>

if i change 
  <str name="mm">2&lt;-50%</str>

and run solrCtl restart indexname, I don't see any change in the parsed query. 

what am i doing wrong?

thanks,
vinay



> Date: Fri, 12 Dec 2008 22:53:13 -0800
> From: hossman_lucene@fucit.org
> To: solr-user@lucene.apache.org
> Subject: Re: minimum match issue with dismax
> 
> 
> :   does anyone know how to verify that minimum match in dismax is working? i 
> : change the values and run solrCtl restart indexname, but i don't 
> : see the change taking effect. does anybody have an idea about this?
> 
> use debugQuery=true, and then look at the parsedquery ... it can 
> be somewhat confusing if you aren't used to it, but for simple testing: 
> don't use a pf, bf, or bq, set qf to a single field, and set tie=0
> 
> using the example configs a url like this...
> 
> http://localhost:8983/solr/select/?tie=0&pf=&bq=&bf=&q=first+second+third&qt=dismax&qf=text&mm=50%25&debugQuery=true
> 
> produces...
> 
> +((DisjunctionMaxQuery((text:first)) DisjunctionMaxQuery((text:second)) DisjunctionMaxQuery((text:third)))~1) ()
> 
> ...that ~1 is the result of computing 50% of 3 rounded down.  if i change 
> it to 70%...
> 
> http://localhost:8983/solr/select/?tie=0&pf=&bq=&bf=&q=first+second+third&qt=dismax&qf=text&mm=70%25&debugQuery=true
> 
> ...i get...
> 
> +((DisjunctionMaxQuery((text:first)) DisjunctionMaxQuery((text:second)) DisjunctionMaxQuery((text:third)))~2) ()
> 
> ...etc.  One thing to watch out for is that the "~X" syntax only shows you 
> the minNrShouldMatch value for boolean queries.  for phrase queries it 
> shows you the slop value, and for the individual DisjunctionMaxQueries it 
> shows you the tie breaker value (hence blanking out all those params keeps 
> it simpler and easier to spot the mm value getting used)
> 
> 
> 
> -Hoss
> 


Re: minimum match issue with dismax

Posted by Chris Hostetter <ho...@fucit.org>.
:   does anyone know how to verify that minimum match in dismax is working? i 
: change the values and run solrCtl restart indexname, but i don't 
: see the change taking effect. does anybody have an idea about this?

use debugQuery=true, and then look at the parsedquery ... it can 
be somewhat confusing if you aren't used to it, but for simple testing: 
don't use a pf, bf, or bq, set qf to a single field, and set tie=0

using the example configs a url like this...

http://localhost:8983/solr/select/?tie=0&pf=&bq=&bf=&q=first+second+third&qt=dismax&qf=text&mm=50%25&debugQuery=true

produces...

+((DisjunctionMaxQuery((text:first)) DisjunctionMaxQuery((text:second)) DisjunctionMaxQuery((text:third)))~1) ()

...that ~1 is the result of computing 50% of 3 rounded down.  if i change 
it to 70%...

http://localhost:8983/solr/select/?tie=0&pf=&bq=&bf=&q=first+second+third&qt=dismax&qf=text&mm=70%25&debugQuery=true

...i get...

+((DisjunctionMaxQuery((text:first)) DisjunctionMaxQuery((text:second)) DisjunctionMaxQuery((text:third)))~2) ()

...etc.  One thing to watch out for is that the "~X" syntax only shows you 
the minNrShouldMatch value for boolean queries.  for phrase queries it 
shows you the slop value, and for the individual DisjunctionMaxQueries it 
shows you the tie breaker value (hence blanking out all those params keeps 
it simpler and easier to spot the mm value getting used)
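
If you want to try several mm values quickly, the test URLs above can be generated rather than typed by hand. This is only a sketch: the helper name debug_url is made up for illustration, and it assumes the example configs on localhost:8983 with a "text" field, exactly as in the URLs above.

```python
from urllib.parse import urlencode


def debug_url(mm: str, query: str = "first second third") -> str:
    """Build a simplified dismax test URL (hypothetical helper)."""
    params = {
        "tie": "0",           # no tie breaker, so no "~X" on the DisjunctionMaxQueries
        "pf": "",             # blank out phrase fields (no slop "~X")
        "bq": "",
        "bf": "",
        "q": query,
        "qt": "dismax",
        "qf": "text",         # a single query field keeps the output readable
        "mm": mm,             # urlencode escapes "%" as %25 and "<" as %3C
        "debugQuery": "true",
    }
    return "http://localhost:8983/solr/select/?" + urlencode(params)


for mm in ("50%", "70%", "2<-25%"):
    print(debug_url(mm))
```

Fetch each URL and compare the "~X" after the outer boolean query in the parsedquery to see which mm value is actually being used.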



-Hoss