Posted to solr-user@lucene.apache.org by Marc Sturlese <ma...@gmail.com> on 2009/01/30 20:06:55 UTC

solr booosting

Hey there,
I am trying to tune the boost of the results obtained using
DisMaxQueryParser.
As I understand Lucene's scoring, if you search for "John Le Carre" it will
give a better score to results that contain just the searched string than to
results that have, for example, 50 words with the search string somewhere
among them.

In Solr, my goal is to give a higher score to docs that contain both words
but have more words in the field.

I have tried 2 options:
1.- At index time, I check the length of the fields, and if they are longer
than 'x' chars I give more boost to that doc (I am adding 3.0 extra boost
using addBoost).

2.- On the other hand, I have been playing with tie and pf, but I think they
are not helping with my issue.

Before using Solr (with my own Lucene searcher and indexer) the first option
used to work quite well; in Solr my extra boost seems to have much less
effect. Is this normal because I am using DismaxQueryParser, or should it be
the same?

Any advice is more than welcome!

Thanks in advance
 
-- 
View this message in context: http://www.nabble.com/solr-booosting-tp21753617p21753617.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr booosting

Posted by Marc Sturlese <ma...@gmail.com>.

Thanks Hoss, that was really useful information.


-- 
View this message in context: http://www.nabble.com/solr-booosting-tp21753617p21930040.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr booosting

Posted by Chris Hostetter <ho...@fucit.org>.

: As I understand Lucene's scoring, if you search for "John Le Carre" it will
: give a better score to results that contain just the searched string than to
: results that have, for example, 50 words with the search string somewhere
: among them.
: 
: In Solr, my goal is to give a higher score to docs that contain both words
: but have more words in the field.
: 
: I have tried 2 options:
: 1.- At index time, I check the length of the fields, and if they are longer
: than 'x' chars I give more boost to that doc (I am adding 3.0 extra boost
: using addBoost).

rather than explicitly setting an index time boost, i would use a custom
similarity class to do this -- the lengthNorm function is what you want to
change.
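
To make the lengthNorm idea concrete, here is a minimal sketch of the
arithmetic in Python (not Lucene's Java API). It assumes the classic default
of 1/sqrt(numTerms); the function names, the alternative formula, and the
saturation value of 50 are hypothetical, chosen only for illustration.

```python
import math


def default_length_norm(num_terms):
    """Classic default: shorter fields get a larger norm (1 / sqrt(numTerms))."""
    return 1.0 / math.sqrt(num_terms)


def longer_is_better_length_norm(num_terms, saturation=50):
    """Hypothetical alternative: the norm grows with field length and is
    capped at 1.0 once the field reaches `saturation` terms."""
    return min(num_terms, saturation) / saturation


# A 3-term field vs. a 50-term field that both match the query.
short_field, long_field = 3, 50

# Default behaviour favours the short field.
print(default_length_norm(short_field) > default_length_norm(long_field))

# The hypothetical alternative favours the long field instead.
print(longer_is_better_length_norm(short_field) < longer_is_better_length_norm(long_field))
```

In a real setup the equivalent logic would live in a Similarity subclass
referenced from schema.xml, and the documents would need to be reindexed,
since norms are computed at index time.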

: 2.- On the other hand, I have been playing with tie and pf, but I think they
: are not helping with my issue.

neither of those options will offset the penalty assigned to longer docs 
vs shorter docs if both match -- they will help you change the scores 
for docs that match on multiple fields however.
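
A rough illustration of why tie only matters across fields, written in
Python as a sketch rather than as Solr code. It assumes the usual
DisjunctionMaxQuery rule: the best per-field score plus tie times the sum of
the remaining field scores. The function name and the sample scores are made
up for the example.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores for one document the way a disjunction-max
    query is assumed to: best score + tie * (sum of the other scores)."""
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


# Hypothetical per-field scores for one document (e.g. title and body).
scores = [2.0, 1.0]

print(dismax_score(scores, tie=0.0))  # only the best field counts
print(dismax_score(scores, tie=0.5))  # other fields contribute partially
print(dismax_score(scores, tie=1.0))  # plain sum of all field scores
```

Whatever value tie takes, each per-field score going in has already had the
length norm applied, so tie cannot undo that penalty.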

: Before using Solr (with my own Lucene searcher and indexer) the first option
: used to work quite well; in Solr my extra boost seems to have much less
: effect. Is this normal because I am using DismaxQueryParser, or should it be
: the same?

try using the standard request handler to build the same query structures
you are used to and make sure you're getting the expected results that way,
then consider how dismax might change things.  one thing to watch out for
is that you really aren't doing things the same way ... it's really easy
to set omitNorms="true" in Solr, in which case your index time boost isn't
factoring in at all.
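
The omitNorms point can be sketched the same way. The Python below is only
an illustration of the arithmetic, assuming the index-time boost is
multiplied into the stored field norm together with the default
1/sqrt(numTerms) length norm, and that a field without norms behaves as if
its norm were 1.0. The function name and the numbers are hypothetical.

```python
import math


def field_norm(index_boost, num_terms, omit_norms=False):
    """Norm for one field of one document, as assumed above."""
    if omit_norms:
        # No norm is stored, so neither the boost nor the length counts.
        return 1.0
    return index_boost / math.sqrt(num_terms)


# The same 4-term field with and without the 3.0 index-time boost.
print(field_norm(1.0, 4))                   # 0.5
print(field_norm(3.0, 4))                   # 1.5
print(field_norm(3.0, 4, omit_norms=True))  # 1.0, boost has no effect
```

Under these assumptions, a field declared with omitNorms="true" scores the
same whether or not the 3.0 boost was added at index time.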




-Hoss