Posted to java-user@lucene.apache.org by Koji Sekiguchi <ko...@m4.dion.ne.jp> on 2005/10/26 04:38:00 UTC

score formula in Similarity javadoc

Hello,

I apologize if this list is not appropriate for sending a patch.

It seems there is an error in the score formula in the Similarity javadoc:

score(q,d) = sigma( tf * idf^2 * ... )

should be

score(q,d) = sigma( tf * idf * ... )

If my understanding is correct, I would appreciate it if
someone could apply the attached patch to svn.

BTW, in the Java language, the ^ operator means bitwise XOR... :)
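
To illustrate the remark about the ^ operator, here is a minimal sketch. The class name CaretDemo and its helper methods are made up for this example and do not come from Lucene. It only shows that ^ on Java ints is bitwise XOR, so a square has to be written as a multiplication or with Math.pow.

```java
public class CaretDemo {

    // In Java, ^ on ints is bitwise XOR, not exponentiation.
    static int xorTwo(int x) {
        return x ^ 2;
    }

    // A square has to be spelled out explicitly.
    static double square(double x) {
        return x * x;
    }

    public static void main(String[] args) {
        // 3 is binary 11, 2 is binary 10, so 3 ^ 2 is binary 01.
        System.out.println(xorTwo(3));          // prints 1
        System.out.println(square(3.0));        // prints 9.0
        System.out.println(Math.pow(3.0, 2.0)); // prints 9.0
    }
}
```

So "idf^2" in the javadoc is mathematical notation for idf squared, not Java source.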

regards,

Koji



RE: score formula in Similarity javadoc

Posted by Koji Sekiguchi <ko...@m4.dion.ne.jp>.
Hi Yonik,

I had checked TermQuery, TermScorer and TermWeight before sending
the previous mail. After getting your reply I double-checked,
and I now understand that you are correct.

So, should the formula in LIA (Lucene in Action) be corrected again? :)

Scoring formula figure omission
http://www.lucenebook.com/blog/errata/scoring_formula_omission.html

Thank you very much,

Koji




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: score formula in Similarity javadoc

Posted by Yonik Seeley <ys...@gmail.com>.
With respect to different terms in a boolean query, they will contribute to
the total score proportional to idf^2, so I think the javadoc as it exists
now is probably more correct.

A single TermQuery will have a final score with a single idf factor in it,
but that's because of the queryweight factor... look at the implementation
of TermWeight for more details.
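
The point above can be sketched numerically. The code below is a simplified model, not Lucene's actual classes: the class IdfWeightSketch and the method termValues are hypothetical, and tf, all boosts, and lengthNorm are assumed to be 1. It assumes the query weight of each term is its idf, the query norm is 1/sqrt(sum of squared query weights), and each term contributes (idf * queryNorm) * idf.

```java
public class IdfWeightSketch {

    // Per-term contribution to the score under the assumptions stated above
    // (tf = 1, boost = 1, lengthNorm = 1). Hypothetical helper, not a Lucene API.
    static double[] termValues(double[] idf) {
        double sumOfSquaredWeights = 0.0;
        for (double v : idf) {
            sumOfSquaredWeights += v * v;   // query weight of a term is its idf
        }
        double queryNorm = 1.0 / Math.sqrt(sumOfSquaredWeights);

        double[] values = new double[idf.length];
        for (int i = 0; i < idf.length; i++) {
            double queryWeight = idf[i] * queryNorm; // normalized query weight
            values[i] = queryWeight * idf[i];        // second idf factor
        }
        return values;
    }

    public static void main(String[] args) {
        // Single term: queryNorm is 1/idf, so one idf factor cancels
        // and the contribution is approximately idf itself.
        double[] single = termValues(new double[] {3.0});
        System.out.println("single term: " + single[0]);

        // Two terms: both share the same queryNorm, so the contributions
        // are in the ratio idf^2 (here approximately 4 to 1).
        double[] two = termValues(new double[] {1.0, 2.0});
        System.out.println("ratio: " + (two[1] / two[0]));
    }
}
```

Under these assumptions the shared queryNorm does not change the relative contribution of the terms within one query, which is why the per-term weight is proportional to idf^2 even though a lone TermQuery ends up with a single idf factor.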

-Yonik


RE: score formula in Similarity javadoc

Posted by Koji Sekiguchi <ko...@m4.dion.ne.jp>.
The attached file was deleted by the mailing list server.
The patch was:

Index: src/java/org/apache/lucene/search/Similarity.java
===================================================================
--- src/java/org/apache/lucene/search/Similarity.java	(revision 328522)
+++ src/java/org/apache/lucene/search/Similarity.java	(working copy)
@@ -42,7 +42,7 @@
  *    <big><big><big><big><big>&Sigma;</big></big></big></big></big></td>
  *    <td valign="middle"><small>
  *    ( {@link #tf(int) tf}(t in d) *
- *    {@link #idf(Term,Searcher) idf}(t)^2 *
+ *    {@link #idf(Term,Searcher) idf}(t) *
  *    {@link Query#getBoost getBoost}(t in q) *
  *    {@link Field#getBoost getBoost}(t.field in d) *
  *    {@link #lengthNorm(String,int) lengthNorm}(t.field in d) )



