
Posted to solr-user@lucene.apache.org by jaime spicciati <ja...@gmail.com> on 2015/01/02 22:52:27 UTC

SolrCloud multi-datacenter failover?

All,

At my current customer, we have developed a custom federator that federates
queries between Endeca and Solr to ease the transition from an extremely
large (TBs of data) Endeca index to Solr. (Endeca is similar to Solr in
terms of search, faceted navigation, etc.)

During this transition we need to support multi-datacenter failover, which
we have historically handled via load balancers with the appropriate
failover configurations (think F5). We currently play our data loads into
multiple datacenters to ensure data consistency. (Each datacenter has a
stand-alone instance of SolrCloud with its own redundancy/failover.)

I am curious how the community handles multi-datacenter failover at the
presentation layer (datacenter A goes down and we want to fail over to B).
SolrCloud handles failures within a single datacenter's instance, but I
haven't seen a definitive 'answer' for how to support failover across
datacenters.

At this point the only two options I can come up with are:

1) Fail the entire datacenter if SolrCloud goes offline (GUI/index/etc. go
offline).

 - This is problematic because some portion of user activity will fail:
queries that are in transit will not complete.

2) Implement failover at the custom federator level. We would need to detect
a failure at datacenter A within our federator, query datacenter B to
fulfill the user request, and then potentially fail the entire datacenter A
once all in-flight transactions against A have been fulfilled.
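
To make option 2 concrete, here is a minimal sketch in Python of what
federator-level failover might look like. It is not our actual federator:
the class name FailoverFederator, the max_failures threshold, and the
per-datacenter query callables are all hypothetical, and each callable is
assumed to wrap whatever client talks to that datacenter's SolrCloud.

```python
from typing import Any, Callable, Dict, List, Tuple

QueryFn = Callable[[str], Any]  # hypothetical: sends a query to one datacenter


class AllDatacentersFailed(Exception):
    """Raised when no datacenter could answer the query."""


class FailoverFederator:
    """Tries datacenters in preference order and fails over on errors."""

    def __init__(self, backends: List[Tuple[str, QueryFn]], max_failures: int = 3):
        # backends is ordered: the first entry is the preferred datacenter.
        self.backends = list(backends)
        # Assumption: N consecutive failures mark a datacenter as unhealthy.
        self.max_failures = max_failures
        self.failures: Dict[str, int] = {name: 0 for name, _ in self.backends}

    def is_healthy(self, name: str) -> bool:
        return self.failures[name] < self.max_failures

    def query(self, q: str) -> Tuple[str, Any]:
        errors = []
        # Healthy datacenters come first (stable sort keeps preference order);
        # unhealthy ones are only tried as a last resort.
        ordered = sorted(self.backends, key=lambda b: not self.is_healthy(b[0]))
        for name, query_fn in ordered:
            try:
                result = query_fn(q)
            except Exception as exc:  # a real federator would catch narrower errors
                self.failures[name] += 1
                errors.append((name, exc))
                continue
            self.failures[name] = 0  # a success resets the failure count
            return name, result
        raise AllDatacentersFailed(errors)
```

A caller would build it as FailoverFederator([("A", query_a), ("B", query_b)])
and call query(...), which returns the name of the datacenter that answered
along with its result. Note that this sketch never probes an unhealthy
datacenter again while a healthy one is answering, so recovery detection and
draining in-flight requests against A would still have to be handled
separately.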



Since we look up the active Solr instance via ZooKeeper (SolrCloud) per
datacenter, I don't see any reasonable means of failing over to another
datacenter if a given SolrCloud instance goes down.

Any thoughts are welcome at this point.

Thanks

Jaime

Re: SolrCloud multi-datacenter failover?

Posted by Otis Gospodnetic <ot...@gmail.com>.

Hi,

Check http://search-lucene.com/?q=%22Cross+Data+Center+Replicaton%22 ->
http://issues.apache.org/jira/browse/SOLR-6273

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: SolrCloud multi-datacenter failover?

Posted by Erick Erickson <er...@gmail.com>.

bq: This is problematic because some portion of user activity will fail,
queries that are in transit will not complete

This is always interesting to think about, but is it a serious enough
problem to spend resources trying to anticipate? I can imagine situations
where even losing the queries in transit once per year is unacceptable,
but those are outliers; is yours _that_ critical?

I mean if you have data centers failing often enough that it impacts
users noticeably, you have waaaaay bigger problems than losing the results
for the current queries routed to that data center.

When was the last time one of your data centers went completely offline
anyway? I guess my point is that anticipating this kind of thing would be
way down on my priority list; personally, I'd ignore it.

That said, you know your situation best and maybe it's worth the effort,
but if I were the project manager I'd push back at the requirements people
pretty hard before spending engineering effort to try to anticipate such a
thing, engineering effort I'd be taking away from addressing the problems
that impact users all the time.

FWIW,
Erick
