Posted to java-user@lucene.apache.org by xing jiang <gi...@gmail.com> on 2006/02/05 04:27:15 UTC
two problems of using the lucene.
Hi,
I have two questions about using Lucene and would appreciate your help.
1. How does Lucene calculate the weight of each word? I only know that each
word in a document is weighted by its tf/idf value.
2. Can I modify Lucene so that it uses the term frequency instead of the
tf/idf value to calculate the similarity between documents and queries?
--
Regards
Jiang Xing
Re: two problems of using the lucene.
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 6, 2006, at 1:37 AM, jason wrote:
> The source code of the Queryparser.java is hard to read.
Look at QueryParser.jj instead. QueryParser.java is generated using
JavaCC and is thus not "source" code at all.
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: two problems of using the lucene.
Posted by jason <gi...@gmail.com>.
Hi,
I tried to read the Lucene source code, but the only place I could find where
the tf/idf measure is actually implemented is TermScorer.java. My guess is that
the QueryParser class first converts each word into a TermQuery, and that
composite queries such as BooleanQuery are then built from those.
The source code of QueryParser.java is hard to read.
....
regards
jiang xing
On 2/5/06, Klaus <kl...@vommond.de> wrote:
>
> Hi,
>
> You have to write your own Similarity implementation and set it on your
> IndexSearcher with setSimilarity().
>
>
> http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
>
> Cheers,
>
> Klaus
Re: two problems of using the lucene.
Posted by Klaus <kl...@vommond.de>.
Hi,
You have to write your own Similarity implementation and set it on your
IndexSearcher with setSimilarity().
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
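To make the idea concrete, here is a minimal sketch of what "term frequency
instead of tf/idf" amounts to. It is a toy model in plain Java, not Lucene code:
the class TfOnlySketch, the TermWeighting interface and termScore() are names
made up for illustration. The two methods mirror the tf(freq) and
idf(docFreq, numDocs) hooks of the Similarity class, and the TF_IDF variant
uses the formulas of Lucene's classic DefaultSimilarity as I understand them
(square root of the frequency, and log(numDocs / (docFreq + 1)) + 1). Boosts,
length norms, coord and queryNorm are deliberately left out.

```java
// Toy model of the tf and idf factors. No Lucene classes are used here;
// all names in this file are hypothetical and exist only for illustration.
final class TfOnlySketch {

    /** Mirrors the two scoring hooks that a Similarity subclass can override. */
    interface TermWeighting {
        double tf(int freq);
        double idf(int docFreq, int numDocs);
    }

    /** Assumed formulas of the classic default: sqrt(freq) and log(N/(df+1)) + 1. */
    static final TermWeighting TF_IDF = new TermWeighting() {
        public double tf(int freq) {
            return Math.sqrt(freq);
        }
        public double idf(int docFreq, int numDocs) {
            return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
        }
    };

    /** Term-frequency-only variant: idf is a constant, so rarity no longer matters. */
    static final TermWeighting TF_ONLY = new TermWeighting() {
        public double tf(int freq) {
            return Math.sqrt(freq);
        }
        public double idf(int docFreq, int numDocs) {
            return 1.0;
        }
    };

    /**
     * Contribution of one query term to a document's score. idf is squared
     * because it enters once through the query weight and once through the
     * document-side weight. Boosts and normalisation are ignored.
     */
    static double termScore(TermWeighting w, int freq, int docFreq, int numDocs) {
        double idf = w.idf(docFreq, numDocs);
        return w.tf(freq) * idf * idf;
    }

    public static void main(String[] args) {
        int numDocs = 1000;
        // Both terms occur 4 times in the document; one is rare in the
        // collection (9 documents), the other is common (499 documents).
        System.out.println("tf/idf, rare term:    " + termScore(TF_IDF, 4, 9, numDocs));
        System.out.println("tf/idf, common term:  " + termScore(TF_IDF, 4, 499, numDocs));
        System.out.println("tf only, rare term:   " + termScore(TF_ONLY, 4, 9, numDocs));
        System.out.println("tf only, common term: " + termScore(TF_ONLY, 4, 499, numDocs));
    }
}
```

With TF_IDF the rare term scores higher than the common one; with TF_ONLY both
get the same score because only the in-document frequency counts. In Lucene
itself the corresponding change would be a Similarity subclass whose idf()
returns a constant, installed on the searcher with setSimilarity(). Please
check the Javadoc linked above for the exact signatures in your version.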
Cheers,
Klaus