Posted to java-user@lucene.apache.org by Chris Hennen <ch...@hotmail.com> on 2003/09/23 09:12:38 UTC

scoring algorithm

Hi,

What is the purpose of "tf_q * idf_t / norm_q" in Lucene's scoring
algorithm?
score_d = sum_t( tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t)

I don't understand why the score has to be higher when the frequency of a
term in the query is higher. What is normalized by "norm_q"?

Thanks,
Chris



Re: scoring algorithm

Posted by Ype Kingma <yk...@xs4all.nl>.
On Tuesday 23 September 2003 00:12, Chris Hennen wrote:
> Hi,
>
> what is the purpose of "tf_q * idf_t / norm_q" in Lucene's scoring
> algorithm:
> score_d = sum_t( tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t)
>
> I don't understand why the score has to be higher when the frequency of a
> term in the query is higher. What is normalized by "norm_q"?

The query-side factor lets the user assign a higher weight to a term in a
query, either by giving the term an explicit weight or by repeating it.
norm_q then compensates the total score for those query weights, which
leaves the scores of two different queries somewhat comparable.
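
As a rough illustration of the two effects described above, here is a toy
sketch of the formula from the question. It is not Lucene's code. Two things
are assumptions made for the example: norm_q is taken to be the Euclidean
length of the query weights tf_q * idf_t, and norm_d_t is simplified to a
single per-document value (doc_norm). The function name, the idf values and
the sample documents are all hypothetical.

```python
import math


def score(query_tf, doc_tf, idf, doc_norm=1.0):
    """Toy version of sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t)."""
    # Query-side weight of each term: tf_q * idf_t.
    q_weights = {t: tf * idf[t] for t, tf in query_tf.items()}
    # ASSUMPTION: norm_q is the Euclidean length of the query weight vector.
    norm_q = math.sqrt(sum(w * w for w in q_weights.values()))
    total = 0.0
    for t, w_q in q_weights.items():
        # Document-side weight: tf_d * idf_t / norm_d_t.
        # ASSUMPTION: one norm per document instead of one per (doc, term).
        w_d = doc_tf.get(t, 0) * idf[t] / doc_norm
        total += (w_q / norm_q) * w_d
    return total


# Hypothetical statistics, chosen only to make the effect visible.
idf = {"apache": 1.0, "lucene": 2.0}
doc_a = {"apache": 4, "lucene": 1}  # mostly about "apache"
doc_b = {"apache": 1, "lucene": 2}  # mostly about "lucene"

plain = {"apache": 1, "lucene": 1}     # query: apache lucene
repeated = {"apache": 3, "lucene": 1}  # query: apache apache apache lucene
doubled = {"apache": 2, "lucene": 2}   # every term repeated twice

for name, q in (("plain", plain), ("repeated", repeated), ("doubled", doubled)):
    print(name, round(score(q, doc_a, idf), 3), round(score(q, doc_b, idf), 3))
```

With these numbers the plain query ranks doc_b above doc_a, while repeating
"apache" in the query moves doc_a to the top: that is the term weighting.
Doubling every term frequency in the query leaves the scores unchanged,
because under the assumed norm_q a uniform scaling of the query weights
cancels out: that is the normalization.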

Kind regards,
Ype



