Posted to java-user@lucene.apache.org by lu...@nitwit.de on 2004/02/21 10:42:09 UTC

Lucene scalability/clustering

Hi!

How well does Lucene scale? Is it able to handle 100,000 (more or less
complex) queries a day (i.e. 9 to 5) on an index with half a million docs?

What hardware is recommended for that demand? What to do if it cannot handle 
it quickly enough?
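
For scale, those figures average out to roughly 3.5 queries per second. A quick sketch of the arithmetic, assuming the queries are spread evenly over the eight hours; the 3x peak factor is a made-up illustration, not a measurement:

```java
// Back-of-the-envelope load estimate for the figures in the question.
// The even spread over 8 hours and the peak factor are assumptions.
final class LoadEstimate {

    /** Average queries per second when queriesPerDay arrive evenly over the given hours. */
    static double averageQps(long queriesPerDay, double hours) {
        return queriesPerDay / (hours * 3600.0);
    }

    /** Time budget per query in milliseconds if queries were served strictly one at a time. */
    static double millisPerQuery(double qps) {
        return 1000.0 / qps;
    }

    public static void main(String[] args) {
        double avg = averageQps(100_000, 8.0);  // 100,000 queries between 9 and 5
        double peak = avg * 3.0;                // hypothetical 3x peak-to-average ratio
        System.out.printf("average: %.2f queries/s%n", avg);
        System.out.printf("assumed peak: %.2f queries/s%n", peak);
        System.out.printf("serial budget at assumed peak: %.0f ms/query%n", millisPerQuery(peak));
    }
}
```

Whether a single machine can meet that budget depends on the queries and the hardware, so the numbers above only frame the question.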

Regards,
Timo

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: RE : Lucene scalability/clustering

Posted by Doug Cutting <cu...@apache.org>.
Anson Lau wrote:
> I'm trying to see what are some common ways to scale lucene onto
> multiple boxes.  Is RMI based search and using a MultiSearcher the
> general approach?

Yes, although you probably want to use ParallelMultiSearcher.

Doug


RE : Lucene scalability/clustering

Posted by Anson Lau <al...@fulfil-net.com>.
RBP,

I'm implementing a search engine for a project at work.  It's going to
index approx 1.5 rows in a database.

I am trying to get a feel for what my options are when scalability
becomes an issue.  I also want to know if those options require me to
implement my app in a different way right from the start.

Anson


RE : RE : Lucene scalability/clustering

Posted by Rasik Pandey <ra...@ajlsm.com>.
> I'm trying to see what are some common ways to scale lucene onto
> multiple boxes.  Is RMI based search and using a MultiSearcher the
> general approach?

More details about what you are attempting would be helpful.


RBP



RE: RE : Lucene scalability/clustering

Posted by Anson Lau <al...@fulfil-net.com>.
I'm trying to see what are some common ways to scale lucene onto
multiple boxes.  Is RMI based search and using a MultiSearcher the
general approach?

There don't seem to be many articles on the web on how to implement a
Lucene search cluster.  If anyone knows of a good article, could you please
post it here?

Thanks,

Anson


RE : Lucene scalability/clustering

Posted by Rasik Pandey <ra...@ajlsm.com>.
> Further on this topic - has anyone tried implementing a distributed
> search with Lucene?  How does it work and does it work well?

I assume you are referring to RMI-based search? It works well, as does MultiSearcher.

RBP



RE: Lucene scalability/clustering

Posted by Jochen Frey <lu...@quontis.com>.
Anson,

	One way of doing it is to keep subsets of your indexes / data on
different machines. Each machine indexes its own data. You implement a
system that distributes queries to the various machines and merges the
results back.
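
A minimal sketch of that idea, not a definitive implementation. The Shard and Hit types are hypothetical stand-ins (they are not Lucene classes), each shard is assumed to return its own best hits already scored, and scores are assumed to be comparable across machines:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical types for illustration only; none of these are Lucene classes.
final class ShardedSearch {

    /** One scored hit, tagged with the shard it came from. */
    static final class Hit {
        final String shard;
        final String docId;
        final double score;

        Hit(String shard, String docId, double score) {
            this.shard = shard;
            this.docId = docId;
            this.score = score;
        }
    }

    /** Stand-in for one machine that holds and searches a subset of the data. */
    interface Shard {
        List<Hit> search(String query, int topN);
    }

    /** Sends the query to every shard in parallel, then merges the hits by descending score. */
    static List<Hit> search(List<Shard> shards, String query, int topN) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Future<List<Hit>>> pending = new ArrayList<>();
            for (Shard shard : shards) {
                pending.add(pool.submit(() -> shard.search(query, topN)));
            }
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> f : pending) {
                merged.addAll(f.get());  // blocks until that shard has answered
            }
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("shard search failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Two in-memory shards with canned results, standing in for remote machines.
        Shard a = (q, n) -> Arrays.asList(new Hit("a", "doc1", 0.9), new Hit("a", "doc2", 0.4));
        Shard b = (q, n) -> Arrays.asList(new Hit("b", "doc7", 0.7));
        for (Hit h : search(Arrays.asList(a, b), "lucene", 2)) {
            System.out.println(h.shard + "/" + h.docId + " " + h.score);
        }
    }
}
```

In a real deployment each Shard would wrap a network call, and the merge is only meaningful if every machine scores documents in a consistent way.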

	How well it works depends entirely on your implementation of the
distributed search.

	I believe there was a discussion about implementing this using a
MultiSearcher somewhere as well.

	Cheers!
		Jochen



RE: Lucene scalability/clustering

Posted by Anson Lau <al...@fulfil-net.com>.
Further on this topic - has anyone tried implementing a distributed
search with Lucene?  How does it work and does it work well?


Anson



Re: Lucene scalability/clustering

Posted by Hamish Carpenter <ha...@catalyst.net.nz>.
Hi All,

I'm Hamish Carpenter, who contributed the benchmarks with the comment
about the IndexSearcherCache.  Using it solved our problem with too
many open files under Linux.
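
The IndexSearcherCache source itself is not reproduced here. As a rough illustration of the general idea only (reuse one searcher per index rather than opening a new one, and presumably a new set of file handles, for every request), here is a sketch built around a hypothetical generic SearcherCache class:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Generic sketch; S stands in for whatever searcher type is expensive to open.
// This is not the IndexSearcherCache source, only an illustration of the reuse idea.
final class SearcherCache<S> {
    private final ConcurrentMap<String, S> byIndexPath = new ConcurrentHashMap<>();
    private final Function<String, S> opener;

    SearcherCache(Function<String, S> opener) {
        this.opener = opener;
    }

    /** Returns the shared searcher for this index, opening it on first use only. */
    S get(String indexPath) {
        return byIndexPath.computeIfAbsent(indexPath, opener);
    }

    /** Drops the cached searcher (for example after the index changed) so the caller can close it. */
    S evict(String indexPath) {
        return byIndexPath.remove(indexPath);
    }

    public static void main(String[] args) {
        AtomicInteger opens = new AtomicInteger();
        SearcherCache<Object> cache = new SearcherCache<>(path -> {
            opens.incrementAndGet();  // counts how often a "searcher" is really opened
            return new Object();
        });
        Object first = cache.get("/data/index");
        Object second = cache.get("/data/index");
        System.out.println(first == second);  // true: the same instance is reused
        System.out.println(opens.get());      // 1: opened only once
    }
}
```

The index path used as the key and the evict step are assumptions; a real cache also has to decide when a searcher is stale and when it is safe to close it.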

The original IndexSearcherCache email is here:
http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg01967.html

See here for a copy of the above message and a download link:
http://www.geocities.com/haytona/lucene/
The mailing list doesn't like attachments.  The source is 10K in size.

HTH

Hamish Carpenter.

lucene@nitwit.de wrote:
 > BTW, where can I get Peter Halacsy's IndexSearcherCache?


Re: Lucene scalability/clustering

Posted by lu...@nitwit.de.
On Saturday 21 February 2004 20:24, Otis Gospodnetic wrote:
> http://jakarta.apache.org/lucene/docs/benchmarks.html

BTW, where can I get Peter Halacsy's IndexSearcherCache?


Re: Lucene scalability/clustering

Posted by Otis Gospodnetic <ot...@yahoo.com>.
http://jakarta.apache.org/lucene/docs/benchmarks.html

--- lucene@nitwit.de wrote:
> Hi!
>
> How well does Lucene scale? Is it able to handle 100,000 (more or less
> complex) queries a day (i.e. 9 to 5) on an index with half a million docs?
>
> What hardware is recommended for that demand? What to do if it cannot handle
> it quickly enough?
>
> Regards,
> Timo

