You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Lee Turner <Le...@oyster.com> on 2005/04/05 09:38:04 UTC

Strategies for updating indexes.

Hi

 

I was wondering whether anyone has any experience of multithreaded
updates to indexes.  I the web app I am working on there are additions,
updates and deletes that need to happen to the index throughout the
runtime of the application.  Also, the application is run in a cluster
with each app server having its own index.  This means that periodically
each app server is going to have to go through a re-indexing process to
make sure that its index has all the changes from the other app servers
in it.  This process can take a few seconds so if another update to the
index occurs at this time it will need to be queued in some way to make
sure it happens after the re-indexing.

 

I was just wondering if anyone had any pointers for doing this kind of
thing.  Any help would be gratefully appreciated.

 

Many thanks

Lee

 

 

Lee Turner | Java Developer | Oyster Partners 

D. +44 (0)20 74461229 
T. +44 (0)20 7446 7500 

www.oyster.com

 


_________________________________________________________________________________________________________________________
Internet communications are not secure and therefore Oyster Partners Ltd does not accept legal responsibility for the contents of this message. Any views or opinions presented are solely those of the author and do not necessarily represent those of Oyster Partners Ltd.

Re: Strategies for updating indexes.

Posted by Jens Kraemer <kr...@webit.de>.
Hi,
please see comments below.

On Tue, Apr 05, 2005 at 08:38:04AM +0100, Lee Turner wrote:
> Hi
> 
> I was wondering whether anyone has any experience of multithreaded
> updates to indexes.  I the web app I am working on there are additions,
> updates and deletes that need to happen to the index throughout the
> runtime of the application.  Also, the application is run in a cluster
> with each app server having its own index.  This means that periodically
> each app server is going to have to go through a re-indexing process to
> make sure that its index has all the changes from the other app servers
> in it.  This process can take a few seconds so if another update to the
> index occurs at this time it will need to be queued in some way to make
> sure it happens after the re-indexing.
> 
> I was just wondering if anyone had any pointers for doing this kind of
> thing.  Any help would be gratefully appreciated.

I usually have a service class wrapping all access to the lucene index,
which has a queue where my Servlets or Actions put the documents to be
updated or added in.  There is a single instance of this class for the
whole web app, and a thread regularly waking up and processing the
elements of the queue.

Note the queue has to be threadsafe or has to be synchronized
externally. 

Since there is only one instance of this service class, it is the only
one who will ever write to the index (if the same index is not used by
other applications).

During re-indexing the thread regularly processing the queue will
be paused. After re-indexing it is started again, processing all pending 
changes from the queue. The re-indexing itself takes place in another
thread, which is started by quartz in my case.


hope this helps you somehow,

Jens


-- 
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer@webit.de
Schnorrstraße 76                      Telefon +49 351 46766 0
D-01069 Dresden                      Telefax +49 351 46766 66

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org