Posted to java-user@lucene.apache.org by Igor Chudov <ic...@gmail.com> on 2010/07/09 00:12:37 UTC

Personal Intro and a question on "find top 10 similar items" functionality

Hello,

My name is Igor and I own a website algebra.com. I just joined.

I have a database of answered algebra questions (208,000 and growing).

A typical question is here (original spelling):

``who long does it take 2 people to finish painting a house if the
first one takes 6 days and the second one takes 9 days''

What I would like to do is, for anyone viewing an archived problem, to
find the "top 10 similar problems" that are most "similar" to the
currently viewed one. Note that the meaning of "similar" is not defined in
my question.

Is Lucene even capable of this sort of thing?

Could I expect reasonable performance (under 1-2 seconds) from it?

Thanks a bunch, guys.

i

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Personal Intro and a question on "find top 10 similar items" functionality

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Igor,

You can treat that question as the query and use it to search the index
where you've indexed the other questions.
MoreLikeThis is another option.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
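
The "treat the question as the query" suggestion can be illustrated with a minimal, stdlib-only sketch. This is not Lucene code and not how Lucene scores documents; it only shows the general idea of ranking stored questions by the terms they share with the viewed question, weighting rare terms more heavily. The class name SimilarQuestions, the method names, and the scoring formula are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustration only (hypothetical class): ranks stored questions by shared terms. */
final class SimilarQuestions {

    /** Lowercases the text and splits on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** Returns the indexes of up to topN stored questions that share terms with the query. */
    static List<Integer> topSimilar(List<String> questions, String query, int topN) {
        // Document frequency: in how many stored questions does each term occur?
        Map<String, Integer> docFreq = new HashMap<>();
        List<Set<String>> termSets = new ArrayList<>();
        for (String q : questions) {
            Set<String> terms = new HashSet<>(tokenize(q));
            termSets.add(terms);
            for (String t : terms) docFreq.merge(t, 1, Integer::sum);
        }

        Set<String> queryTerms = new HashSet<>(tokenize(query));
        double[] scores = new double[questions.size()];
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < questions.size(); i++) {
            for (String t : queryTerms) {
                if (termSets.get(i).contains(t)) {
                    // Assumed weighting: terms found in few questions count more.
                    scores[i] += Math.log(1.0 + (double) questions.size() / docFreq.get(t));
                }
            }
            if (scores[i] > 0) hits.add(i); // keep only questions sharing at least one term
        }

        // Highest score first; the sort is stable, so ties keep their original order.
        hits.sort(Comparator.comparingDouble((Integer i) -> -scores[i]));
        return new ArrayList<>(hits.subList(0, Math.min(topN, hits.size())));
    }
}
```

In a real index the viewed question would itself be stored, so it would come back as the best match and should be skipped when showing results. A linear scan like this one is only for illustration; an inverted index is what lets a search library answer such a query without touching every document.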






Re: Personal Intro and a question on "find top 10 similar items" functionality

Posted by Naveen Kumar <id...@gmail.com>.
Hi,
Check out a class called MoreLikeThis in Lucene. It should solve your
problem.

Naveen Kumar
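
As a rough picture of the idea behind a "more like this" query, the sketch below picks the most distinctive terms of one question (frequent in that question, rare across the collection) so that only those terms are used to search for similar questions. It is a stdlib-only illustration, not Lucene's MoreLikeThis implementation or API; the class name InterestingTerms, the parameters maxTerms and minDocFreq, and the tf * log(N/df) weighting are assumptions chosen for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

/** Illustration only (hypothetical class): selects the most distinctive terms of a text. */
final class InterestingTerms {

    /** Lowercases the text and splits on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** Returns up to maxTerms terms of source, ordered from most to least distinctive. */
    static List<String> select(List<String> corpus, String source, int maxTerms, int minDocFreq) {
        // Document frequency of every term across the collection.
        Map<String, Integer> docFreq = new HashMap<>();
        for (String doc : corpus) {
            for (String t : new HashSet<>(tokenize(doc))) docFreq.merge(t, 1, Integer::sum);
        }

        // Term frequency inside the source text.
        Map<String, Integer> termFreq = new HashMap<>();
        for (String t : tokenize(source)) termFreq.merge(t, 1, Integer::sum);

        Map<String, Double> weight = new HashMap<>();
        for (Map.Entry<String, Integer> e : termFreq.entrySet()) {
            int df = docFreq.getOrDefault(e.getKey(), 0);
            // Skip terms that occur in too few documents to match anything (typos, etc.).
            if (df < Math.max(1, minDocFreq)) continue;
            // Assumed weighting: tf * log(N / df), so terms present everywhere weigh 0.
            weight.put(e.getKey(), e.getValue() * Math.log((double) corpus.size() / df));
        }

        List<String> terms = new ArrayList<>(weight.keySet());
        terms.sort((a, b) -> {
            int byWeight = Double.compare(weight.get(b), weight.get(a));
            return byWeight != 0 ? byWeight : a.compareTo(b); // alphabetical on ties
        });
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }
}
```

The selected terms would then be combined into an ordinary OR query against the index. Restricting the query to a handful of distinctive terms keeps common words such as "how" and "take" from dominating the match.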
