Posted to solr-user@lucene.apache.org by "Nemani, Raj" <Ra...@turner.com> on 2011/04/11 22:12:52 UTC

Question on Dismax plugin

All,

I have a question on the Dismax plugin for the search handler.  I have
two test instances of Solr.  In one I am using the default search
handler.  In this case, the fields that I am working with (slug and
story) are indexed via the all_text field, and the searches are done on
the all_text field.

For the other one I have configured a search handler using the dismax
plugin as shown below.

 

<requestHandler name="mydismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">story^3.0 slug^0.2</str>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
  </lst>
</requestHandler>

 

To make testing easier, I have only 4 documents (the same ones) in both
indexes, with the word "Obama" appearing in them as described below.

 

File 1:: The word Obama appears zero times in "slug" field and four
times in "story" field

File 2:: The word Obama appears zero times in "slug" field and thrice in
"story" field

File 3:: The word Obama appears zero times in "slug" field and two times
in "story" field

File 4:: The word Obama appears one time in "slug" field and one time in
"story" field

 

 

Here are the documents in order of decreasing score in the search
results

 

Dismax Search Handler (steadily decreasing scores):

* File 1:: The word Obama appears zero times in "slug" field and
  four times in "story" field
* File 4:: The word Obama appears one time in "slug" field and
  one time in "story" field
* File 2:: The word Obama appears zero times in "slug" field and
  thrice in "story" field
* File 3:: The word Obama appears zero times in "slug" field and
  two times in "story" field

 

Standard Search handler:

* File 1:: The word Obama appears zero times in "slug" field and
  four times in "story" field
* File 2:: The word Obama appears zero times in "slug" field and
  thrice in "story" field (same score as File 4 below)
* File 4:: The word Obama appears one time in "slug" field and
  one time in "story" field (same score as File 2 above)
* File 3:: The word Obama appears zero times in "slug" field and
  two times in "story" field

 

 

My question: why is dismax ranking

"File 4:: The word Obama appears one time in "slug" field and one time
in "story" field"

ahead of

"File 2:: The word Obama appears zero times in "slug" field and thrice
in "story" field"

given that I have boosted these fields as shown below?


     

<str name="qf">story^3.0 slug^0.2</str>

 

I would have thought that "File 4:: The word Obama appears one time
in "slug" field and one time in "story" field" would have gone all the
way down in the result list.

 

Any help is appreciated

Thanks much in advance

Raj


Re: Question on Dismax plugin

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Raj,

I'm guessing your slug field is much shorter, and thus a match in that field
carries more weight than a match in a much longer story field.  If you omit
norms for those fields in the schema (and reindex), I believe you will see
File 4 drop to position #4.
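
As a rough sketch of what omitting norms could look like in schema.xml:
the field type name "text" and the indexed/stored settings below are
assumptions rather than anything taken from your schema, so keep whatever
your existing field definitions use and only add the omitNorms attribute.

```xml
<!-- schema.xml (sketch). Assumed: type="text", indexed/stored values.
     omitNorms="true" turns off length normalization for the field
     (it also drops index-time boosts). A full reindex is required. -->
<field name="slug"  type="text" indexed="true" stored="true" omitNorms="true"/>
<field name="story" type="text" indexed="true" stored="true" omitNorms="true"/>
```

Before changing the schema, adding debugQuery=on to the request should
print the score explanation for each document, which may show whether the
fieldNorm on slug is really what lifts File 4.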

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


