Posted to java-user@lucene.apache.org by Winton Davies <wd...@yahoo-inc.com> on 2006/08/29 20:50:19 UTC
Straight TF-IDF cosine similarity?
Hi All,
I'm scratching my head - can someone tell me which class implements
an efficient multi-term TF-IDF cosine similarity scoring mechanism?
There is clearly the single-term TermScorer, but I can't find the class
that would do a bucketed TF-IDF cosine, i.e. fill an accumulator
with tf * idf^2 for each of the term posting lists until the
accumulator is full, and then compute the final score.
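For reference, the accumulator scheme described above can be sketched roughly as follows. This is a standalone illustration and not a Lucene class: the name AccumulatorCosineSketch, the in-memory index layout (term -> docId -> tf), the smoothed idf formula, and the "stop admitting new documents once the accumulator table is full" policy are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: term-at-a-time TF-IDF cosine with a bounded accumulator table. */
class AccumulatorCosineSketch {

    /** Smoothed idf, 1 + ln(N / df). The exact formula is an assumption of this sketch. */
    static double idf(int numDocs, int docFreq) {
        return 1.0 + Math.log((double) numDocs / docFreq);
    }

    /** Euclidean length of every document's tf-idf vector, computed from the whole index. */
    static Map<Integer, Double> docNorms(Map<String, Map<Integer, Integer>> index, int numDocs) {
        Map<Integer, Double> norms = new HashMap<>();
        for (Map<Integer, Integer> postings : index.values()) {
            if (postings.isEmpty()) continue;
            double termIdf = idf(numDocs, postings.size());
            for (Map.Entry<Integer, Integer> p : postings.entrySet()) {
                double w = p.getValue() * termIdf;
                norms.merge(p.getKey(), w * w, Double::sum); // accumulate squared weights
            }
        }
        norms.replaceAll((doc, sumSquares) -> Math.sqrt(sumSquares));
        return norms;
    }

    /**
     * Scores documents against the query terms.
     * index: term -> (docId -> term frequency); maxAccumulators caps the number of scored docs.
     */
    static Map<Integer, Double> score(Map<String, Map<Integer, Integer>> index, int numDocs,
                                      List<String> queryTerms, int maxAccumulators) {
        Map<Integer, Double> norms = docNorms(index, numDocs);

        // Keep distinct terms that occur in the index; process rare terms (small df) first
        // so they claim accumulators before common terms do.
        List<String> terms = new ArrayList<>();
        for (String t : queryTerms) {
            Map<Integer, Integer> postings = index.get(t);
            if (postings != null && !postings.isEmpty() && !terms.contains(t)) terms.add(t);
        }
        terms.sort((a, b) -> Integer.compare(index.get(a).size(), index.get(b).size()));

        Map<Integer, Double> acc = new HashMap<>();
        double querySumSquares = 0.0;
        for (String term : terms) {
            Map<Integer, Integer> postings = index.get(term);
            double termIdf = idf(numDocs, postings.size());
            querySumSquares += termIdf * termIdf; // query weight is idf (query tf assumed 1)
            for (Map.Entry<Integer, Integer> p : postings.entrySet()) {
                int doc = p.getKey();
                // Table full: keep updating known docs, but admit no new ones.
                if (!acc.containsKey(doc) && acc.size() >= maxAccumulators) continue;
                // Partial dot product: (tf * idf) for the doc times idf for the query.
                acc.merge(doc, p.getValue() * termIdf * termIdf, Double::sum);
            }
        }

        // Final step: divide each dot product by the two vector lengths.
        double queryNorm = Math.sqrt(querySumSquares);
        acc.replaceAll((doc, dot) -> dot / (norms.get(doc) * queryNorm));
        return acc;
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, Integer>> index = new HashMap<>();
        Map<Integer, Integer> lucene = new HashMap<>();
        lucene.put(1, 2);
        lucene.put(2, 1);
        Map<Integer, Integer> cosine = new HashMap<>();
        cosine.put(2, 1);
        cosine.put(3, 3);
        index.put("lucene", lucene);
        index.put("cosine", cosine);

        List<String> query = new ArrayList<>();
        query.add("lucene");
        query.add("cosine");
        System.out.println(score(index, 3, query, 10));
    }
}
```

Capping the table trades some recall for bounded memory: documents that only match the later, more common terms may never receive an accumulator.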
I don't need a BooleanQuery - that seems like overkill for this.
Cheers,
Winton
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Straight TF-IDF cosine similarity?
Posted by Jason Polites <ja...@gmail.com>.
Have you looked at the MoreLikeThis class in the contrib similarity
package (org.apache.lucene.search.similar)?
On 8/30/06, Winton Davies <wd...@yahoo-inc.com> wrote:
>
> Hi All,
>
> I'm scratching my head - can someone tell me which class implements
> an efficient multiple term TF.IDF Cosine similarity scoring mechanism?
>
> There is clearly the single TermScorer - but I can't find the class
> that would do a bucketed TF.IDF cosine - i.e. fill an accumulator
> with the tf.idf^2 for each of the term posting lists, until
> accumulator is full, and then compute the final score.
>
> I don't need a Boolean Query - at least this seems like overkill.
>
> Cheers,
> Winton