You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Yonik Seeley <ys...@gmail.com> on 2005/09/15 22:27:18 UTC
term scoring (idf) question
I'm trying to figure out why idf is multiplied twice into the score of a
term query.
It sort of makes sense if you have just one term... the original weight is
idf*boost, and
the normalization factor is 1/(idf*boost), so you multiply in the idf again
if you want the final score to contain an idf factor (instead of being
normalized to 1.0).
But when you have a boolean query of two term queries, the amount of score
each term contributes will vary with the square of their idfs, not their
idfs. Is this what's intended? If so, the scoring formula one normally sees
(tf*idf*boost*lengthNorm), doesn't seem to convey this fact.
-Yonik
Now hiring -- http://tinyurl.com/7m67g
Re: term scoring (idf) question
Posted by Otis Gospodnetic <ot...@yahoo.com>.
I think you are asking about this stuff:
http://www.google.com/search?q=chuck%20idf%20lucene%20square
Otis
--- Yonik Seeley <ys...@gmail.com> wrote:
> I'm trying to figure out why idf is multiplied twice into the score
> of a
> term query.
> It sort of makes sense if you have just one term... the original
> weight is
> idf*boost, and
> the normalization factor is 1/(idf*boost), so you multiply in the idf
> again
> if you want the final score to contain an idf factor (instead of
> being
> normalized to 1.0).
>
> But when you have a boolean query of two term queries, the amount of
> score
> each term contributes will vary with the square of their idfs, not
> their
> idfs. Is this what's intended? If so, the scoring formula one
> normally sees
> (tf*idf*boost*lengthNorm), doesn't seem to convey this fact.
>
> -Yonik
> Now hiring -- http://tinyurl.com/7m67g
>
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org