Posted to java-user@lucene.apache.org by Anand Kishore <an...@gmail.com> on 2005/10/04 19:05:37 UTC

Indexing-Searching Design

Hi all,

Having read the mail in the mailing list archive about Best
Indexing-Searching Practices I have come up with the following architecture
for my application. Kindly evaluate and comment regarding the same.

Figure:

http://www.flickr.com/photos/28219682@N00/49301053/

Explanation:

The primary indexer (daemon) receives the documents to be indexed. It
dispatches each document to one of the secondary indexer nodes (via load
balancing). These indexing nodes index the documents in a RAMDirectory,
periodically writing it to a local index on the filesystem.
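
For illustration only, the dispatch step could be sketched as below. The
post only says "load balancing"; round-robin is an assumption here, and the
class name RoundRobinDispatcher and the node names are hypothetical. It uses
plain Java with no Lucene calls, since it only decides which node gets the
next document.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: hands out secondary indexer nodes in round-robin order. */
class RoundRobinDispatcher {
    private final List<String> nodes;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinDispatcher(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("need at least one indexer node");
        }
        this.nodes = List.copyOf(nodes);
    }

    /** Returns the node that should receive the next document (thread-safe). */
    String nextNode() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int i = Math.floorMod(counter.getAndIncrement(), nodes.size());
        return nodes.get(i);
    }

    public static void main(String[] args) {
        RoundRobinDispatcher dispatcher =
                new RoundRobinDispatcher(List.of("indexer-1", "indexer-2", "indexer-3"));
        for (int doc = 0; doc < 5; doc++) {
            System.out.println("doc-" + doc + " -> " + dispatcher.nextNode());
        }
    }
}
```

A real dispatcher would probably also want to skip nodes that are down or
badly backlogged; that is left out of this sketch.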

A cron process running on the central server (which contains the main index)
periodically checks for any new/updated indexes (small in size) on the
secondary nodes. It copies these new (small) indexes to the central server
(based on 'push changes onto main index'). An optimizer process running on
the central server periodically merges/optimizes the main index with the
smaller newer indexes. It also creates a checkpoint of the consistent index
every time it performs optimization (the index.DATE approach).
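
As a rough sketch of the index.DATE naming, the helper below builds a
checkpoint name from a timestamp and picks the newest name from a list. The
exact date format is not specified above; "yyyyMMddHHmmss" is an assumption,
chosen because fixed-width digits sort chronologically as plain strings. The
class name CheckpointNames is hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Collection;
import java.util.Optional;

/** Hypothetical sketch of the index.DATE checkpoint naming scheme. */
class CheckpointNames {
    // Assumed format: fixed-width, so string order equals time order.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    /** Builds a checkpoint directory name such as index.20051004190537. */
    static String nameFor(LocalDateTime when) {
        return "index." + STAMP.format(when);
    }

    /** Returns the most recent checkpoint name, ignoring anything that is not index.DATE. */
    static Optional<String> latest(Collection<String> names) {
        return names.stream()
                .filter(n -> n.matches("index\\.\\d{14}"))
                .max(String::compareTo);
    }

    public static void main(String[] args) {
        System.out.println(nameFor(LocalDateTime.now()));
    }
}
```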

The 'updaters' (cron processes on searcher nodes) periodically copy new
checkpoints via rsync onto their local system and create symbolic links to
them (same as proposed and used by Doug for Technorati).
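
A minimal sketch of the symbolic-link switch on a searcher node, assuming a
POSIX filesystem where renaming over an existing link replaces it in one
step. The rsync copy itself is not shown. The class name CheckpointSwitcher,
the link name "index" and the checkpoint directory names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch: repoints a symlink at a freshly copied checkpoint. */
class CheckpointSwitcher {
    /**
     * Makes `link` point at `checkpoint` by creating a temporary symlink next
     * to it and renaming that over `link`, so readers never see a missing link.
     */
    static void pointTo(Path link, Path checkpoint) throws IOException {
        Path tmp = link.resolveSibling(link.getFileName() + ".tmp");
        Files.deleteIfExists(tmp);                 // clear leftovers from a failed run
        Files.createSymbolicLink(tmp, checkpoint);
        // On POSIX this maps to rename(), which replaces an existing link.
        Files.move(tmp, link, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("searcher");
        Path older = Files.createDirectory(base.resolve("index.20051003"));
        Path newer = Files.createDirectory(base.resolve("index.20051004"));
        Path link = base.resolve("index");

        pointTo(link, older);
        pointTo(link, newer);
        System.out.println(link + " -> " + Files.readSymbolicLink(link));
    }
}
```

Searchers that already have the old checkpoint open keep reading it; they
only see the new one when they reopen the path behind the link.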


--
- Andy

Re: Indexing-Searching Design

Posted by Chris Hostetter <ho...@fucit.org>.
: The primary indexer (daemon) receives the documents to be indexed. It
: dispatches the documents to one of the secondary indexer nodes (via load
: balancing). These indexing nodes index the documents in the RAMDirectory,
: periodically writing it to a local index in the filesystem.

I'm not certain, but unless your analysis is extremely intensive, I can't
think of any advantage in having the secondary indexer nodes.  I believe
your central server is going to do roughly the same amount of work
merging those smaller indexes in as it would if the documents were added
directly.

I could be wrong, however; I haven't really looked that closely at the
internals of addIndexes.

-Hoss

