Posted to user@zookeeper.apache.org by chen dongming <ca...@hotmail.com> on 2016/06/30 05:04:47 UTC

how is zookeeper deploy at multi datacenter?

What are the ways to deploy ZooKeeper across multiple datacenters for backup?

From my point of view:

1. Use observers

     Use only 1 ensemble.

     One datacenter is the main datacenter, with the leader and followers.

     The other datacenters have only 3 observers.

     When the main datacenter crashes, select one datacenter as the new main
datacenter, and manually convert its observers to leader/follower.

2. Sync data at the application level

     Use a separate ensemble for each datacenter.

     Sync data at the application level; the application makes sure there are
no data conflicts between ensembles.

Is there any other way to deploy across multiple datacenters for backup?

Finally, I noticed that issue ZOOKEEPER-892 has been discontinued. Why? And is
zoorepl suitable for multi-datacenter backup?




Re: how is zookeeper deploy at multi datacenter?

Posted by Flavio Junqueira <fp...@apache.org>.
> On 30 Jun 2016, at 06:04, chen dongming <ca...@hotmail.com> wrote:
> 
> What are the ways to deploy ZooKeeper across multiple datacenters for backup?
> 

It depends a lot on what you want to do. Is it active-passive or active-active? How many locations do you have?


> From my point of view:
> 
> 1. Use observers
> 
>     Use only 1 ensemble.
> 
>     One datacenter is the main datacenter, with the leader and followers.
> 
>     The other datacenters have only 3 observers.
> 
>     When the main datacenter crashes, select one datacenter as the new main
> datacenter, and manually convert its observers to leader/follower.
> 

That's an option. In this case, replication to the observers is asynchronous, which gives you lower latency in the primary data center, but you may lose some data if the primary data center goes down. Such a loss is acceptable in some cases. If you want synchronous replication, then you need to configure the ensemble so that every quorum write places at least one copy in a different data center. You can use groups (hierarchical quorums) for this.
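To make the observer layout concrete, here is a rough Python sketch that prints the server list such an ensemble might share in zoo.cfg. The host names are made up for illustration, and 2888/3888 are just the conventional quorum and election ports; the only ZooKeeper-specific part is the ":observer" suffix on the server lines. Each observer would also need peerType=observer in its own configuration.

```python
def render_server_lines(voters, observers, quorum_port=2888, election_port=3888):
    """Return zoo.cfg 'server.N=...' lines; observers get the ':observer' suffix."""
    lines = []
    sid = 1
    for host in voters:
        lines.append(f"server.{sid}={host}:{quorum_port}:{election_port}")
        sid += 1
    for host in observers:
        lines.append(f"server.{sid}={host}:{quorum_port}:{election_port}:observer")
        sid += 1
    return lines


# Hypothetical hosts: voting members in the primary data center,
# observers in the backup data center.
voters = ["dc1-zk1", "dc1-zk2", "dc1-zk3"]
observers = ["dc2-zk1", "dc2-zk2", "dc2-zk3"]

server_lines = render_server_lines(voters, observers)
print("\n".join(server_lines))
```

Only the three voters take part in the write quorum here, which is why the observers can lag behind and why some recent writes may be missing after a failover.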

> 2. Sync data at the application level
> 
>     Use a separate ensemble for each datacenter.
> 
>     Sync data at the application level; the application makes sure there are
> no data conflicts between ensembles.
> 

Is this trying to achieve synchronous replication? For synchronous replication, there are a few options:

- If you have at least three locations, then put, say, three servers in each. With 9 servers a majority is 5, which means that every write spans more than one data center. In general, the quorum size needs to be larger than the number of servers in any single location.
- If you have three locations, but you're really interested in two, then you can have your primary replicas in two locations, say 3 and 3, and a single "witness" replica in a third location. The witness replica simply helps to form a quorum if one of the primary locations goes down.
- One option for two locations that I personally like is to create two groups of, say, 3 servers each. A setup with two groups forces every write to go to a majority in each of the two groups. If one location goes down, manually change the configuration to make it a single group. The single group will have all committed changes.
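The arithmetic behind these three layouts can be checked with a small Python sketch. The function names and server ids below are made up for illustration; the group rule assumes equal weights, i.e. a quorum needs a majority of servers in a majority of groups.

```python
def majority(n):
    """Smallest number of members that is more than half of n."""
    return n // 2 + 1


def every_quorum_spans_sites(site_sizes):
    """True if a majority quorum cannot fit inside any single location."""
    return majority(sum(site_sizes)) > max(site_sizes)


def is_group_quorum(acks, groups):
    """True if 'acks' holds a majority of servers in a majority of groups."""
    groups_with_majority = sum(
        1 for g in groups if len(acks & g) >= majority(len(g))
    )
    return groups_with_majority >= majority(len(groups))


# Three locations with three servers each: 9 servers, majority of 5.
print(majority(9), every_quorum_spans_sites([3, 3, 3]))

# 3 + 3 + 1 witness: majority of 7 is 4, so losing one primary
# location still leaves 3 + 1 = 4 servers, enough for a quorum.
print(majority(7), every_quorum_spans_sites([3, 3, 1]))

# Two groups of three (hypothetical server ids 1-3 and 4-6).
groups = [{1, 2, 3}, {4, 5, 6}]
print(is_group_quorum({1, 2, 4, 5}, groups))  # majority in both groups
print(is_group_quorum({1, 2, 3, 4}, groups))  # second group has only 1 of 3
print(is_group_quorum({1, 2, 3}, groups))     # one location on its own
```

The last case shows why the two-group setup needs the manual reconfiguration step: a single surviving location cannot form a quorum until it is made the only group.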

> Is there any other way to deploy across multiple datacenters for backup?
> 
> Finally, I noticed that issue ZOOKEEPER-892 has been discontinued. Why? And is
> zoorepl suitable for multi-datacenter backup?
> 
> 

ZK-892 has stalled because no one has been pushing for it. If anyone wants it in, then we will need to complete the work. Also, as has been pointed out in the JIRA, there might be better alternatives to the replication of subtrees, so if we resume that line of work, it is possible that the approach we end up with isn't the one proposed there.

-Flavio



Re: how is zookeeper deploy at multi datacenter?

Posted by Alexander Shraer <sh...@gmail.com>.
Our recent paper may be relevant:
https://www.usenix.org/conference/atc16/technical-sessions/presentation/lev-ari
