Posted to java-user@lucene.apache.org by Klaus <kl...@vommond.de> on 2006/01/09 17:19:20 UTC
Finding similar documents
Hi,
Is there a built-in method for finding documents similar to a given
document?
Thx,
Klaus
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Finding similar documents
Posted by Stefan Gusenbauer <st...@kbse.net>.
Grant Ingersoll wrote:
> I believe there is a MoreLikeThis class floating around somewhere (I
> think it is in the contrib/similarity package). The Lucene book also
> has a good example, and I have some examples at
> http://www.cnlp.org/apachecon2005 that demonstrate using term vectors
> to do this
>
> Klaus wrote:
>
>> Hi,
>>
>> Is there a built-in method for finding documents similar to a given
>> document?
>>
>> Thx,
>>
>> Klaus
>
I've implemented a simple relevance feedback algorithm that extracts
terms from all the interesting documents and builds a new query from
these terms. It is pretty simple, but it works in most cases.
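For illustration, here is a minimal sketch of that idea in plain Java. It
is not the code described above and does not use the Lucene API: the class
name, the field name "contents", and the raw-frequency term selection are
all assumptions made for the example. A real implementation would likely
drop stop words and weight terms by how rare they are in the index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
final class RelevanceFeedbackSketch {

    /** Counts how often each term occurs across the interesting documents. */
    static Map<String, Integer> termCounts(List<String> texts) {
        Map<String, Integer> counts = new HashMap<>();
        for (String text : texts) {
            // Naive tokenizer: lowercase, split on anything that is not a letter or digit.
            for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** Returns up to maxTerms terms, most frequent first; ties are broken alphabetically. */
    static List<String> topTerms(List<String> texts, int maxTerms) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(termCounts(texts).entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // higher counts first
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < maxTerms; i++) {
            terms.add(entries.get(i).getKey());
        }
        return terms;
    }

    /** Joins the terms into a query string of the form "field:a OR field:b". */
    static String buildQuery(String field, List<String> terms) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms) {
            clauses.add(field + ":" + term);
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        List<String> interesting = List.of("Lucene index search", "lucene index", "index");
        // "contents" is an assumed field name.
        System.out.println(buildQuery("contents", topTerms(interesting, 2)));
    }
}
```

The resulting string could then be handed to a query parser and run against
the index, so that documents sharing the most frequent terms rank highest.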
Re: Finding similar documents
Posted by Grant Ingersoll <gs...@syr.edu>.
I believe there is a MoreLikeThis class floating around somewhere (I
think it is in the contrib/similarity package). The Lucene book also
has a good example, and I have some examples at
http://www.cnlp.org/apachecon2005 that demonstrate using term vectors to
do this.
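
As a rough illustration of the term vector approach (this is not the code
from the examples linked above, and it does not use Lucene's term vector
API), the sketch below treats each document as a map from term to
frequency and compares two documents by cosine similarity. The class and
method names are hypothetical, and the tokenizer is deliberately naive.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
final class TermVectorSimilaritySketch {

    /** Builds a term -> frequency map for one document's text. */
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> vector = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                vector.merge(token, 1, Integer::sum);
            }
        }
        return vector;
    }

    /** Euclidean length of a term vector. */
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int frequency : vector.values()) {
            sumOfSquares += (double) frequency * frequency;
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Cosine similarity in [0, 1]; returns 0 if either vector is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += (double) entry.getValue() * other; // only shared terms contribute
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        Map<String, Integer> first = termVector("finding similar documents");
        Map<String, Integer> second = termVector("similar documents in an index");
        System.out.println(cosine(first, second));
    }
}
```

Scoring every candidate document against the given one this way and
sorting by the result gives a simple "similar documents" ranking.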
Klaus wrote:
>Hi,
>
>Is there a built-in method for finding documents similar to a given
>document?
>
>Thx,
>
>Klaus
>
--
-------------------------------------------------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
337 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886