Posted to java-user@lucene.apache.org by Eugene Ezekiel <ec...@gmail.com> on 2006/02/05 13:36:04 UTC
Reducing Inflated Similarity Scores
Hi All,
I'm currently using the Default Similarity and building my query by
appending clauses with BooleanQuery.add(). The problem I face is this: given
a query <t1> <t2> <t3> .... <tn>, where <ti> = a term,
it returns a document that contains just ONE of the terms, say <t1>, and
nothing else. Surprisingly, the hit's score for this document is 1.0.
Ok, I'm quite new to Lucene, so I don't really know how the Default
Similarity works, but from what I gather it is a variation of the
cos-similarity. The cos-measure penalizes documents that miss query terms,
so how can the score be 1.0?
Can anyone tell me what I can tweak to bring it closer to the cos-measure?
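
To make the cos-measure expectation concrete, here is a minimal sketch in plain Java. It assumes binary term weights (1 if the term is present, 0 otherwise), which is a simplification and not Lucene's actual scoring formula; the class and method names are made up for illustration.

```java
public class CosineSketch {
    // Cosine of the angle between two equal-length term-weight vectors.
    // Returns 0 if either vector is all zeros.
    public static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Query with four terms <t1> <t2> <t3> <t4>, all weighted 1 (assumption).
        double[] query = {1, 1, 1, 1};
        // Document that contains only <t1>.
        double[] doc = {1, 0, 0, 0};
        System.out.println(cosine(query, doc)); // prints 0.5
    }
}
```

Under these assumptions a document matching one of n query terms gets 1/sqrt(n), which is why a score of 1.0 looks surprising.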
Thanks.
Regards,
Eugene
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Reducing Inflated Similarity Scores
Posted by Chris Hostetter <ho...@fucit.org>.
: Ok, I'm quite new to lucene so I don't really know how the Default
: Similarity works but from what I gather it is a variation of the
: cos-similarity. And the cos-measure penalizes extraneous terms
: therefore, how can the score be 1.0?
If you are using the Hits API, then the score you are seeing is normalized:
if the highest score in your results is greater than 1, then all
scores are divided by that highest score. If you want to see the "true"
score, you should look at the score from one of the more advanced search
methods (the ones that return TopDocs).
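
As a rough illustration of that normalization, here is a small sketch in plain Java. It is not Lucene's code; the class and method names are hypothetical, and it only mimics the rule described above (scale by the top score when the top score exceeds 1, otherwise leave the scores alone).

```java
import java.util.Arrays;

public class HitsNormalizationSketch {
    // Mimics the described rule: if the best raw score is above 1.0,
    // divide every score by it; otherwise return the scores unchanged.
    public static float[] normalize(float[] rawScores) {
        float max = 0.0f;
        for (float s : rawScores) {
            max = Math.max(max, s);
        }
        float scoreNorm = (rawScores.length > 0 && max > 1.0f) ? 1.0f / max : 1.0f;
        float[] out = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            out[i] = rawScores[i] * scoreNorm;
        }
        return out;
    }

    public static void main(String[] args) {
        // Top raw score is 4.0, so everything is scaled by 1/4.
        System.out.println(Arrays.toString(normalize(new float[]{4.0f, 2.0f, 1.0f})));
        // prints [1.0, 0.5, 0.25]

        // Top raw score is below 1.0, so nothing changes.
        System.out.println(Arrays.toString(normalize(new float[]{0.5f, 0.25f})));
        // prints [0.5, 0.25]
    }
}
```

This is why the best hit can show 1.0 even when it matches only one of the query terms: the displayed value says it is the top result, not that it is a perfect match.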
: Can anyone tell what I can tweak to bring it more to the cos-measure?
I would start by looking at the Searchable.explain() method to really
understand where your score is coming from. Then you can look at what
methods you might need to override to get the behavior you desire (if it's
not already working fine once you see the non-normalized score).
-Hoss