Posted to java-user@lucene.apache.org by xing jiang <gi...@gmail.com> on 2006/02/05 04:27:15 UTC

two problems of using the lucene.

Hi,

I have two questions about using Lucene and would appreciate your help.

1. How does Lucene calculate the weight of each word? All I know is that each
word in a document is weighted by its tf/idf value.

2. Can I modify Lucene so that it uses the term frequency alone, instead of the
tf/idf value, to calculate the similarity between documents and queries?

--
Regards

Jiang Xing

Re: two problems of using the lucene.

Posted by Erik Hatcher <er...@ehatchersolutions.com>.

On Feb 6, 2006, at 1:37 AM, jason wrote:
> The source code of QueryParser.java is hard to read.

Look at QueryParser.jj instead.  QueryParser.java is generated using  
JavaCC and is thus not "source" code at all.

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: two problems of using the lucene.

Posted by jason <gi...@gmail.com>.

Hi,

I tried to read the Lucene source code, but the only place I can find where the
tf/idf measure is actually implemented is TermScorer.java. My guess is that the
QueryParser class first converts each word into a TermQuery, and that queries
such as BooleanQuery are then built from those.
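
To picture that guess, here is a toy sketch in plain Java with no Lucene
dependency. Everything in it is an assumption made for illustration: the class
and method names are invented, whitespace splitting stands in for a real
analyzer, and each clause scores by the square root of the term count only. It
is not Lucene's QueryParser, TermQuery or BooleanQuery, just the same shape:
one clause per query word, with the clause scores added up.

```java
// Toy model only: these are not Lucene's QueryParser/TermQuery/BooleanQuery classes.
final class QuerySketch {

    // One "term query": scores a document by sqrt(number of occurrences of the term).
    static double termScore(String term, String[] docTokens) {
        int freq = 0;
        for (String token : docTokens) {
            if (token.equals(term)) {
                freq++;
            }
        }
        return Math.sqrt(freq);
    }

    // The "boolean query": one optional clause per query word, clause scores summed.
    static double booleanScore(String query, String doc) {
        String[] docTokens = doc.toLowerCase().split("\\s+");
        double sum = 0.0;
        for (String term : query.toLowerCase().split("\\s+")) {
            sum += termScore(term, docTokens);
        }
        return sum;
    }

    public static void main(String[] args) {
        String doc = "lucene computes a weight for each lucene term";
        System.out.println(booleanScore("lucene weight", doc));
    }
}
```

A document that matches more of the query words, or matches them more often,
ends up with a higher sum, which is the basic idea behind combining term
queries in a Boolean query.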

The source code of QueryParser.java is hard to read.
....

regards
jiang xing

On 2/5/06, Klaus <kl...@vommond.de> wrote:
>
> Hi,
>
> You have to write your own Similarity implementation and set it on your
> IndexSearcher (and on the IndexWriter too, since norms are computed at
> index time).
>
>
> http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
>
> Cheers,
>
> Klaus

Re: two problems of using the lucene.

Posted by Klaus <kl...@vommond.de>.

Hi, 

You have to write your own Similarity implementation and set it on your
IndexSearcher (and on the IndexWriter too, since norms are computed at index
time).

http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
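
As a rough illustration of the difference between the two weightings, here is
a minimal plain-Java sketch with no Lucene dependency. The tf and idf formulas
are the ones DefaultSimilarity is documented to use, sqrt(freq) and
log(numDocs / (docFreq + 1)) + 1; the class and method names are made up for
this example. The real score has more factors that are left out here: idf is
applied a second time through the query weight, and norms, boosts and coord
are multiplied in as well.

```java
// Toy model of term weighting; names are illustrative, not Lucene's API.
final class TermWeightSketch {

    // Term-frequency factor: square root of the raw count in the document.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: terms found in fewer documents get a larger factor.
    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    // Simplified tf/idf weight of one term in one document.
    static double tfIdfWeight(int freq, int docFreq, int numDocs) {
        return tf(freq) * idf(docFreq, numDocs);
    }

    // Term-frequency-only weight: the same product with idf fixed at 1.
    static double tfOnlyWeight(int freq) {
        return tf(freq) * 1.0;
    }

    public static void main(String[] args) {
        int numDocs = 1000;
        // A term occurring 4 times in the document, found in 9 vs. 499 documents overall.
        System.out.println("rare term,   tf/idf: " + tfIdfWeight(4, 9, numDocs));
        System.out.println("common term, tf/idf: " + tfIdfWeight(4, 499, numDocs));
        System.out.println("either term, tf only: " + tfOnlyWeight(4));
    }
}
```

In Lucene versions of this era, the equivalent change would probably be a
subclass of DefaultSimilarity whose idf(int docFreq, int numDocs) returns 1,
installed with setSimilarity on the searcher.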

Cheers,

Klaus