Posted to java-user@lucene.apache.org by Klaus <kl...@vommond.de> on 2006/01/09 17:19:20 UTC

Finding similar documents

Hi,

Is there a built-in method for finding documents similar to a given
document?

Thx,

Klaus


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Finding similar documents

Posted by Stefan Gusenbauer <st...@kbse.net>.
Grant Ingersoll wrote:

> I believe there is a MoreLikeThis class floating around somewhere (I 
> think it is in the contrib/similarity package).  The Lucene book also 
> has a good example, and I have some examples at 
> http://www.cnlp.org/apachecon2005 that demonstrate using term vectors 
> to do this.
>
I've implemented a simple relevance feedback algorithm that extracts 
terms from all interesting documents and builds a new query from these 
terms. It is pretty simple, but it works in most cases.
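
For readers who want to see the idea in code, here is a minimal sketch of 
term-based relevance feedback. It is not the implementation described above 
and it does not use the Lucene API: the class name, the stop-word list, the 
tokenizer and the "OR" query syntax are all assumptions made for 
illustration. It counts the terms of the documents marked as interesting and 
joins the most frequent ones into a new query string.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, plain Java only; not part of Lucene.
final class RelevanceFeedbackSketch {

    // Assumed stop-word list; a real analyzer would supply its own.
    private static final Set<String> STOP_WORDS =
            Set.of("a", "an", "and", "the", "of", "to", "in", "is", "for");

    /** Counts how often each non-stop-word term occurs across the given documents. */
    static Map<String, Integer> termFrequencies(List<String> interestingDocs) {
        Map<String, Integer> freq = new HashMap<>();
        for (String doc : interestingDocs) {
            // Naive tokenizer: lowercase, split on anything that is not a-z or 0-9.
            for (String token : doc.toLowerCase().split("[^a-z0-9]+")) {
                if (token.isEmpty() || STOP_WORDS.contains(token)) {
                    continue;
                }
                freq.merge(token, 1, Integer::sum);
            }
        }
        return freq;
    }

    /** Joins the maxTerms most frequent terms with OR; ties are broken alphabetically. */
    static String buildQuery(List<String> interestingDocs, int maxTerms) {
        return termFrequencies(interestingDocs).entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                .limit(maxTerms)
                .map(Map.Entry::getKey)
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        List<String> docs = List.of("Lucene search index", "Lucene index terms");
        System.out.println(buildQuery(docs, 2));
    }
}
```

The resulting string could then be handed to whatever query parser the 
application uses. Weighting terms by raw frequency is the simplest choice; 
a tf-idf weighting would favour terms that are rare in the rest of the index.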




Re: Finding similar documents

Posted by Grant Ingersoll <gs...@syr.edu>.
I believe there is a MoreLikeThis class floating around somewhere (I 
think it is in the contrib/similarity package).  The Lucene book also 
has a good example, and I have some examples at 
http://www.cnlp.org/apachecon2005 that demonstrate using term vectors to 
do this.
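
As a rough illustration of the term-vector idea (this is not the code from 
the examples linked above, and it does not call the Lucene API), two 
documents can be compared by treating their term counts as vectors and 
computing the cosine of the angle between them. The class name and the 
naive tokenizer below are assumptions for the sketch; in Lucene the counts 
would come from stored term vectors instead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, plain Java only; not part of Lucene.
final class TermVectorSimilaritySketch {

    /** Builds a term -> count map with a naive lowercase tokenizer. */
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> vector = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                vector.merge(token, 1, Integer::sum);
            }
        }
        return vector;
    }

    /** Euclidean length of a term vector. */
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Cosine similarity in [0, 1]; returns 0 if either vector is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += (double) entry.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        Map<String, Integer> d1 = termVector("lucene index search");
        Map<String, Integer> d2 = termVector("lucene index terms");
        System.out.println(cosine(d1, d2));
    }
}
```

Ranking every other document by this score against the given document 
yields a "similar documents" list; for large indexes one would normally 
restrict the comparison to candidates returned by a query first.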

Klaus wrote:

>Hi,
>
>Is there a built-in method for finding documents similar to a given
>document?
>
>Thx,
>
>Klaus
>

-- 
------------------------------------------------------------------- 
Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
337 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 

