Posted to java-user@lucene.apache.org by Ganesh <em...@yahoo.co.in> on 2010/02/08 11:14:13 UTC

Scale Out

Our indexes are growing and the sort cache is taking a huge amount of RAM. We want to add multiple nodes and scale out the search.

Currently my application supports an RMI interface and returns application-specific result set objects as hits. I could host multiple search instances on different nodes, but then I would need to sort and combine the results.

Any thoughts on scaling / clustering? Do I need to use Hadoop, Carrot, etc.?

Regards
Ganesh



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Scale Out

Posted by Ian Lea <ia...@gmail.com>.
http://katta.sourceforge.net/ sounds well worth a look.


--
Ian.





Re: Scale Out

Posted by Jeff Zhang <zj...@gmail.com>.
Solr offers more scalability features than Lucene alone; maybe you can try that.




-- 
Best Regards

Jeff Zhang

Re: Scale Out

Posted by Jake Mannix <ja...@gmail.com>.
On Mon, Feb 8, 2010 at 9:33 AM, Chris Lu <ch...@gmail.com> wrote:

> Since you already have RMI interface, maybe you can parallel search on
> several nodes, collect the data, pick top ones, and send back results via
> RMI.
>

One thing to be careful about here, which you might already be aware of:
Query (and its subclasses) implement Serializable but don't declare a
serialVersionUID, so when you upgrade from Lucene 2.4 to 2.9, or even from
3.0 to 3.0.1, you can get serialization incompatibilities between your broker
and your leaf nodes if you pass serialized Query objects over RMI (and try
to do a rolling upgrade, one node at a time).  If you pass domain-specific
objects which you control, this doesn't happen, of course.

Not the end of the world, but good to keep in mind.
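
A minimal sketch of the "domain-specific objects which you control" option,
assuming the broker sends the raw query text and each leaf node parses it
with its own Lucene version. SearchRequest and WireCheck are hypothetical
names invented for this illustration; they are not part of Lucene. The point
is only that a class you own can pin its serialVersionUID explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical request object owned by the application, not by Lucene. */
class SearchRequest implements Serializable {
    // Declared explicitly, so the stream identity does not depend on the
    // compiler-computed default that changes when the class is modified.
    private static final long serialVersionUID = 1L;

    final String queryText; // parsed into a Query on the leaf node
    final String sortField; // assumed: name of the field to sort on
    final int topN;         // number of hits requested from each node

    SearchRequest(String queryText, String sortField, int topN) {
        this.queryText = queryText;
        this.sortField = sortField;
        this.topN = topN;
    }
}

/** Serializes and deserializes a request, as an RMI call would do. */
class WireCheck {
    static SearchRequest roundTrip(SearchRequest request) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(request);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SearchRequest) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this shape, only plain strings and ints cross the wire, so the broker
and the leaf nodes can run different Lucene versions during a rolling
upgrade.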

  -jake

Re: Scale Out

Posted by Chris Lu <ch...@gmail.com>.
Since you already have an RMI interface, maybe you can search several
nodes in parallel, collect the results, pick the top ones, and send them
back via RMI.
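
The approach above could be sketched roughly as follows. This is only an
illustration under stated assumptions: Hit, SearchNode and SearchBroker are
hypothetical application classes, not Lucene or RMI library types, and the
merge assumes that scores from different nodes are comparable. For a sort
on a field, the same merge would compare the sort key instead of the score.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical application-specific hit returned by a search node. */
class Hit implements Serializable {
    private static final long serialVersionUID = 1L;
    final String docId;
    final float score;

    Hit(String docId, float score) {
        this.docId = docId;
        this.score = score;
    }
}

/** Hypothetical remote interface that each search node would expose. */
interface SearchNode extends Remote {
    List<Hit> search(String queryText, int topN) throws RemoteException;
}

/** Broker: asks every node for its top N, then keeps the overall top N. */
class SearchBroker {
    private final List<SearchNode> nodes;

    SearchBroker(List<SearchNode> nodes) {
        this.nodes = nodes;
    }

    List<Hit> search(final String queryText, final int topN) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, nodes.size()));
        try {
            List<Callable<List<Hit>>> tasks = new ArrayList<>();
            for (final SearchNode node : nodes) {
                tasks.add(() -> node.search(queryText, topN));
            }
            // invokeAll blocks until every node has answered or failed.
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> result : pool.invokeAll(tasks)) {
                merged.addAll(result.get());
            }
            // Highest score first, then cut to the requested size.
            merged.sort(
                    Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(
                    merged.subList(0, Math.min(topN, merged.size())));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while searching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a search node failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Each node only needs to return its own top N, because no hit outside a
node's top N can appear in the merged top N. A real broker would probably
reuse one thread pool across requests and add per-node timeouts.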

--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding!





Re: Scale Out

Posted by Stanislaw Osinski <st...@gmail.com>.
> Any thoughts on scaling / clustering? Whether i need to use Hadoop / Carrot
> etc...
>

Carrot2 does search results clustering (by content), while what you probably
need is server/index clustering. See the other responses in this thread for
suggestions.

S.