Posted to java-user@lucene.apache.org by Chris Hennen <ch...@hotmail.com> on 2003/09/23 09:12:38 UTC
scoring algorithm
Hi,
What is the purpose of "tf_q * idf_t / norm_q" in Lucene's scoring
algorithm:
score_d = sum_t( tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t)
I don't understand why the score has to be higher when the frequency of a
term in the query is higher. What is normalized by "norm_q"?
Thanks,
Chris
Re: scoring algorithm
Posted by Ype Kingma <yk...@xs4all.nl>.
On Tuesday 23 September 2003 00:12, Chris Hennen wrote:
> Hi,
>
> what is the purpose of "tf_q * idf_t / norm_q" in Lucene's scoring
> algorithm:
> score_d = sum_t( tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t)
>
> I don't understand why the score has to be higher when the frequency of a
> term in the query is higher. What is normalized by "norm_q"?
It gives the user a way to assign a higher weight to a term in a query,
either by using a term weight or by repeating the term.
Dividing by norm_q compensates the total score for those query weights,
so that the scores of two different queries remain somewhat comparable.
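A rough sketch of the formula may make this more concrete. It is plain
Python, not Lucene code, and it rests on assumptions that are not stated
in this thread: norm_q is taken to be the Euclidean length of the
unnormalized query weights (tf_q * idf_t), norm_d_t is simplified to a
single number per document, and the names query_weights and score are
made up for illustration.

```python
import math


def query_weights(query_tf, idf):
    """Return tf_q * idf_t / norm_q for every term in the query.

    Assumption: norm_q is the Euclidean length of the raw query weights,
    so the normalized query weights always form a unit-length vector.
    """
    raw = {term: tf * idf[term] for term, tf in query_tf.items()}
    norm_q = math.sqrt(sum(w * w for w in raw.values()))
    return {term: w / norm_q for term, w in raw.items()}


def score(query_tf, doc_tf, idf, norm_d):
    """score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t).

    Assumption: norm_d is one number per document instead of one per
    document and term, to keep the sketch short.
    """
    weights = query_weights(query_tf, idf)
    return sum(
        weights[term] * doc_tf.get(term, 0) * idf[term] / norm_d
        for term in weights
    )


if __name__ == "__main__":
    # Hypothetical idf values and term frequencies, for illustration only.
    idf = {"apple": 2.0, "pie": 1.0}
    doc_a = {"apple": 3, "pie": 1}
    doc_b = {"apple": 1, "pie": 3}

    plain = {"apple": 1, "pie": 1}     # query: apple pie
    repeated = {"apple": 2, "pie": 1}  # query: apple apple pie

    for name, query in (("plain", plain), ("repeated", repeated)):
        print(name, query_weights(query, idf))
        print("  doc_a:", score(query, doc_a, idf, 1.0))
        print("  doc_b:", score(query, doc_b, idf, 1.0))
```

Under these assumptions, repeating "apple" in the query shifts weight
toward that term, so doc_a gains relative to doc_b, while the division
by norm_q keeps the query weight vector at the same length for both
queries.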
Kind regards,
Ype