Posted to java-user@lucene.apache.org by mitu2009 <mu...@gmail.com> on 2009/06/19 06:10:42 UTC
Synchronizing Lucene indexes across 2 application servers
I have a web application that uses Lucene for search. Search requests are
served by web services running on 2 application servers (IIS 7). The 2
application servers are load balanced using NetScaler.
Both servers run a nightly batch job that updates the search index on the
respective server.
I need to synchronize the search indexes on these 2 servers so that both
have up-to-date indexes at any point in time. What would be the best
architecture/design strategy for this, given that either of the 2
application servers could be serving a search request depending on its
availability?
Any input, please?
Thanks for reading!
--
View this message in context: http://www.nabble.com/Synchronizing-Lucene-indexes-across-2-application-servers-tp24105223p24105223.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Synchronizing Lucene indexes across 2 application servers
Posted by Ian Lea <ia...@gmail.com>.
Or have a third master index, as Joel suggests, apply all updates to that
index, only, then at the end of each batch index update run, use rsync or
equivalent to push the master index out to the 2 search servers and then
tell them to reopen their indexes.
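A rough sketch of the push-then-switch step, in Java. It is only a local
filesystem stand-in for rsync, not a Lucene or rsync API: the class
IndexPublisher, the "index-<version>" directory naming and the "current"
marker file are all assumptions made up for illustration. It also assumes the
index directory is flat (files only). Each search server would read the
marker to find the directory to reopen.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

/** Hypothetical helper: publishes a finished master index to one search server's directory. */
final class IndexPublisher {

    /**
     * Copies every file of the master index into a fresh "index-<version>"
     * directory under serverRoot, then repoints the "current" marker file at it.
     * Returns the directory the search server should open next.
     */
    static Path publish(Path masterIndex, Path serverRoot, long version) {
        try {
            Path target = serverRoot.resolve("index-" + version);
            Files.createDirectories(target);
            try (Stream<Path> files = Files.list(masterIndex)) {
                for (Path file : (Iterable<Path>) files::iterator) {
                    Files.copy(file, target.resolve(file.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                }
            }
            // Write the pointer to a temp file, then rename it into place so a
            // reader never sees a half-written marker. Whether ATOMIC_MOVE
            // replaces an existing target is implementation specific.
            Path tmp = serverRoot.resolve("current.tmp");
            Files.write(tmp, target.getFileName().toString().getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, serverRoot.resolve("current"),
                    StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Resolves the index directory a search server should (re)open. */
    static Path currentIndex(Path serverRoot) {
        try {
            String name = new String(
                    Files.readAllBytes(serverRoot.resolve("current")), StandardCharsets.UTF_8);
            return serverRoot.resolve(name.trim());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Copying into a new directory per version means the old index stays readable
until the server has switched, so searches are never served from a partially
copied index.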
--
Ian.
On Fri, Jun 19, 2009 at 9:23 AM, Joel Halbert <jo...@su3analytics.com> wrote:
> do they have to be kept in synch in real time?
> does each server handle writes to its own index which then need to be
> propagated to the other server's index?
>
> From a simplicity point of view, to minimise the amount of self consistency
> checking that needs to happen I would suggest even having a third, master
> index, to which all writes happen. As writes are applied to the master they
> are then propagated to the 2 servers. You then just need to keep track of
> the latest document written to each of the two "slave" servers, and in
> case of failure/recovery on either you just request all deltas since the
> last known record on each.
>
Re: Synchronizing Lucene indexes across 2 application servers
Posted by Joel Halbert <jo...@su3analytics.com>.
Do they have to be kept in sync in real time?
Does each server handle writes to its own index, which then need to be
propagated to the other server's index?
For simplicity, and to minimise the amount of self-consistency checking
that needs to happen, I would suggest having a third, master index to which
all writes happen. As writes are applied to the master, they are then
propagated to the 2 servers. You then just need to keep track of the latest
document written to each of the two "slave" servers, and in case of
failure/recovery on either, you just request all deltas since the last
known record on each.
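To make the "deltas since the last known record" idea concrete, here is a
minimal in-memory sketch in Java. MasterUpdateLog and ReplicaIndex are
hypothetical names, updates are plain strings, and nothing here touches a
real Lucene index. It only illustrates the bookkeeping, under the assumption
that the master assigns each update a sequence number.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical update log kept next to the master index. */
final class MasterUpdateLog {
    private final List<String> updates = new ArrayList<>();

    /** Records one update and returns its sequence number (1, 2, 3, ...). */
    synchronized long append(String update) {
        updates.add(update);
        return updates.size();
    }

    /** Returns every update with a sequence number greater than lastApplied. */
    synchronized List<String> deltasSince(long lastApplied) {
        return new ArrayList<>(updates.subList((int) lastApplied, updates.size()));
    }
}

/** Hypothetical search server that remembers the last update it applied. */
final class ReplicaIndex {
    private final List<String> applied = new ArrayList<>();
    private long lastApplied = 0;

    /** Pulls and applies whatever this replica has not seen yet. */
    void catchUp(MasterUpdateLog log) {
        for (String update : log.deltasSince(lastApplied)) {
            applied.add(update); // a real replica would update its own index here
            lastApplied++;
        }
    }

    long lastApplied() {
        return lastApplied;
    }

    List<String> applied() {
        return Collections.unmodifiableList(applied);
    }
}
```

A server that was down simply calls catchUp when it comes back. Because it
only asks for entries after its own lastApplied, it never needs to compare
its index contents against the other server.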
--
Joel Halbert
020 3051 8637
075 2501 0825
joel@su3analytics.com
www.su3analytics.com
www.storequery.com
SU3 Analytics Ltd, The Print House, 18 Ashwin St, London E8 3DL.
Re: Synchronizing Lucene indexes across 2 application servers
Posted by Ken Krugler <kk...@transpac.com>.
As another option, you could use Katta for this. It's an open source
distributed Lucene search system.
Under the hood Katta uses ZooKeeper to handle distribution of data to
multiple servers. Once Katta has added an index to both systems, then
you can switch to it (and eventually remove the old index).
The fact that you'd need two Katta "masters" makes things a bit more
interesting, as you'd have to coordinate when they both decide to
switch to using the new index(es).
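One way to think about coordinating that switch is sketched below in Java.
This is not Katta's or ZooKeeper's API. IndexSwitchCoordinator is a
hypothetical in-process stand-in, and in a real deployment this state would
have to live somewhere both servers can reach. The idea is simply that the
live version only changes once every server has acknowledged loading the new
index.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical coordinator: flips the live index version only when every server is ready. */
final class IndexSwitchCoordinator {
    private final Set<String> servers;
    private final Set<String> ready = new HashSet<>();
    private long liveVersion;
    private long pendingVersion;

    IndexSwitchCoordinator(Set<String> servers, long initialVersion) {
        this.servers = new HashSet<>(servers);
        this.liveVersion = initialVersion;
        this.pendingVersion = initialVersion;
    }

    /** Announces a new index version that servers should start loading. */
    synchronized void announce(long version) {
        pendingVersion = version;
        ready.clear();
    }

    /**
     * Called by a server once it has the pending version fully loaded.
     * Returns true if this acknowledgement completed the switch.
     */
    synchronized boolean acknowledge(String server, long version) {
        if (version != pendingVersion || !servers.contains(server)) {
            return false; // stale or unknown acknowledgement
        }
        ready.add(server);
        if (ready.size() == servers.size() && liveVersion != pendingVersion) {
            liveVersion = pendingVersion; // every server has it: switch together
            return true;
        }
        return false;
    }

    /** The version all servers should be answering queries from. */
    synchronized long liveVersion() {
        return liveVersion;
    }
}
```

Until the last acknowledgement arrives, both servers keep answering from the
old version, so the load balancer never mixes results from two different
indexes.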
-- Ken
--
Ken Krugler
+1 530-210-6378
Re: Synchronizing Lucene indexes across 2 application servers
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hello,
You may want to look at Lucene's younger brother named Solr: http://lucene.apache.org/solr/
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch