Posted to java-user@lucene.apache.org by jt oob <jt...@yahoo.co.uk> on 2004/05/14 16:19:45 UTC

(Distributed) Search system designs

Hi,

I currently have a working search system based on Lucene 1.2, set up as
follows:

14 indexes, average size just over 1G, min size 36M, max size 3.3G,
total size 15G.

Search times currently range from 20 seconds to 4 minutes depending on
the query. The system uses a MultiSearcher to search all of the indexes,
which are currently all stored on an internal RAID array.

There are lots of things wrong with the index, for example many words
which should be in stop lists but aren't.

The search runs on a Linux system with 8G of RAM and 2G of swap.

- - - -
I am looking at writing a replacement system, and this time trying to
do everything properly: writing document parsers, etc.

Any pointers would be well received!

The questions:

1) The documentation on how to get a basic Lucene search going is
great. Is there any similar documentation, or a HOWTO, on how to design
and implement distributed searches?

2) For distributed searches, what are the best options for building in
redundancy? Is a large shared storage solution such as a SAN required,
or will duplicating indexes on several machines suffice?

3) I have been told that using RAMDirectory on a Linux system is
pointless because the kernel caches files in spare RAM anyway. Is this
true?
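
To make question 2 concrete, here is a rough sketch of the "duplicate
indexes on several machines" option. It is plain Java and makes no
Lucene calls: the Replica interface, searchWithFailover and searchAll
are hypothetical names invented for illustration, and each replica just
returns a list of strings in place of real hits. It tries the copies of
each index one after another (sequentially, not in parallel) and simply
concatenates results rather than merging them by score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: every index is copied to several machines. */
class ReplicatedSearchSketch {

    /** Stand-in for one copy of one index on one machine (not a Lucene type). */
    interface Replica {
        List<String> search(String query) throws Exception;
    }

    /** Try each copy of a single index in turn; return the first answer that succeeds. */
    static List<String> searchWithFailover(List<Replica> replicas, String query) {
        for (Replica replica : replicas) {
            try {
                return replica.search(query);
            } catch (Exception e) {
                // This copy is unavailable; fall through to the next one.
            }
        }
        throw new IllegalStateException("all replicas failed for query: " + query);
    }

    /** Query every index (each with its own list of copies) and concatenate the hits. */
    static List<String> searchAll(List<List<Replica>> indexes, String query) {
        List<String> hits = new ArrayList<>();
        for (List<Replica> replicas : indexes) {
            hits.addAll(searchWithFailover(replicas, query));
        }
        return hits;
    }

    public static void main(String[] args) {
        // Simulated machines: one is down, two answer with a fake hit.
        Replica down = q -> { throw new Exception("machine down"); };
        Replica up1 = q -> Arrays.asList("index1:" + q);
        Replica up2 = q -> Arrays.asList("index2:" + q);

        List<List<Replica>> indexes = Arrays.asList(
                Arrays.asList(down, up1),   // index 1: first copy fails, second answers
                Arrays.asList(up2));        // index 2: single healthy copy

        System.out.println(searchAll(indexes, "lucene"));
    }
}
```

With this layout a query only fails when every copy of some index is
down, which is the property I would want from either a SAN or from
duplicated indexes.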

Thanks!

jt


	
	
		
