Posted to solr-user@lucene.apache.org by Dominik Niziński <mi...@gmail.com> on 2017/05/10 15:26:21 UTC

Replicating the master node for a fail over scenario

Hello,

we've been using Solr successfully in our application for a few months now.
Recently we were asked whether it is possible to have multiple master nodes
(in case a disaster strikes one of our server locations).
Basically, what we want is to have a few master nodes running at the same
time, so that another can automatically pick up the work if one stops
responding. For now we have decided to go with a complete replica of the
master node (basically at the machine level) running the whole time, ready
to be plugged into the live system via a DNS change if something goes wrong
with the "real" master.

The latest Stack Overflow post about this is nearly 6 years old (
http://stackoverflow.com/questions/6362484/apache-solr-failover-support-in-master-slave-setup),
so I was wondering whether anything has changed since then or whether there
are any new solutions to this problem.

Kind Regards,
Dominik

Re: Replicating the master node for a fail over scenario

Posted by Erick Erickson <er...@gmail.com>.
This is really what SolrCloud was built for, particularly CDCR (Cross
Data Center Replication) for remote DCs.

For the master/slave setup there's nothing automatic; it's a
roll-your-own type of thing. People have done things like:

1> Any replica can be "promoted" to master with configuration changes.
So for the disaster case, have a mechanism whereby you can re-index from
"some time ago". Say your poll interval is X and at time T your master
dies. Promote one of your slaves to master (simple config changes, see the
sketch after this list) and re-index anything that has changed since, say,
T-(X + some margin just to be sure). Say the poll interval is 1 hour: if I
can re-index everything from 2 hours before the master went south, I have
all my data. True, you will be serving stale data for "a while", but that
is sometimes acceptable.

2> Have your client index to two nodes (see the client sketch after this
list). The trick is that the "backup" isn't doing anything interesting,
i.e. no slave is polling it. If the master fails, reconfigure the slaves
to point to the machine that's still live.

3> Just consider the two data centers to be completely independent as
far as Solr is concerned and replicate your system-of-record to the
second DC. Each DC indexes and (perhaps) serves searches
independently.
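
To make 1> concrete, here is a minimal sketch of the classic solrconfig.xml
ReplicationHandler setup the promotion relies on. The host name, port, core
name, confFiles list and poll interval below are placeholders; the standby
runs with the "slave" block, and promoting it means redeploying it with the
"master" block and repointing the remaining slaves' masterUrl at it:

  <!-- solrconfig.xml on the node acting as master -->
  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
  </requestHandler>

  <!-- solrconfig.xml on a slave; masterUrl is what gets flipped at failover -->
  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">http://master-host:8983/solr/mycore/replication</str>
      <str name="pollInterval">01:00:00</str>
    </lst>
  </requestHandler>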

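And a rough sketch of the dual-write idea in 2>, here in Python with the
third-party "requests" library. The host names, core name and field names
are made up; the same batch goes to the live master and the standby, and a
failed write to the standby is tolerated because nothing is serving from it:

  import requests  # assumes the 'requests' HTTP library is installed

  # Hypothetical endpoints: the live master and the warm standby from 2>.
  # commitWithin keeps Solr from hard-committing on every request.
  MASTERS = [
      "http://master-a:8983/solr/mycore/update?commitWithin=10000",
      "http://master-b:8983/solr/mycore/update?commitWithin=10000",
  ]

  def index(docs):
      """Send the same batch of documents to both masters."""
      for url in MASTERS:
          try:
              # Solr's /update handler accepts a JSON array of documents.
              resp = requests.post(url, json=docs, timeout=10)
              resp.raise_for_status()
          except requests.RequestException as exc:
              # A missed write can be replayed later from the system of
              # record; don't fail the whole batch because of it.
              print("indexing to %s failed: %s" % (url, exc))

  index([{"id": "1", "title_s": "hello"}])
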
But really, SolrCloud is built for HA/DR (admittedly with some added
complexity). If the simple approaches I've outlined don't work and
HA/DR is that important, you might want to consider it.
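
For reference, the SolrCloud equivalent of a replicated setup is mostly a
matter of creating the collection with a replicationFactor greater than one;
a sketch using the Collections API (node address, collection and configset
names are made up):

  import requests

  # CREATE with replicationFactor > 1 gives you replicas that fail over
  # automatically; the request can go to any live node.
  requests.get(
      "http://any-node:8983/solr/admin/collections",
      params={
          "action": "CREATE",
          "name": "mycollection",
          "numShards": 1,
          "replicationFactor": 3,
          "collection.configName": "myconfigset",
      },
      timeout=30,
  )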

Best,
Erick
