Posted to user@zookeeper.apache.org by Christian Schuhegger <Ch...@gmx.de> on 2013/02/02 04:39:50 UTC

Re: Linking two sites via two Zookeeper instances

Hi Alexander,

Alexander Shraer wrote:
> I don't think this is currently possible. I believe there has been some
> work on building a hierarchy of ZooKeeper clusters @ Facebook, but I don't
> know the details. I don't believe that this would mean less management
> overhead though, since you'd still need several voting servers in each
> datacenter.

OK, I understand.

> But I actually wanted to ask you about your usecase. Do you have
> consistency requirements among data items mastered in different datacenters
> ? For example - do you require that all clients (no matter where they are)
> see changes to /A/* and /B/* in the same order ? could you share some more
> details ? or, lets say you have 3 datacenters, one mastering /A/* another
> /B/* and the third /C/*. Suppose that the first datacenter sees a change to
> /C/x and afterwards /A/y is updated. Is it possible that someone in
> datacenter B sees the new /A/y  before the new /C/x  ?

The two things that you might need in a distributed set-up are agreement 
and/or order. Agreement would mean that all participants in the 
distributed set-up get ALL updates and order would mean that they get 
all updates in the same sequential order.

ZooKeeper implements both.
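
To make the distinction concrete, here is a minimal sketch in Python. It
is only an illustration of the two definitions above, not ZooKeeper code;
the function names and the example update logs are made up for this
purpose.

```python
from collections import Counter

def has_agreement(*logs):
    """True if every replica received the same collection of updates."""
    first = Counter(logs[0])
    return all(Counter(log) == first for log in logs[1:])

def has_order(*logs):
    """True if every replica saw the updates in the same sequence."""
    first = list(logs[0])
    return all(list(log) == first for log in logs[1:])

# Hypothetical update logs (path, version) as seen at two sites.
site_a = [("/C/x", 1), ("/A/y", 1)]
site_b = [("/A/y", 1), ("/C/x", 1)]

print(has_agreement(site_a, site_b))  # True: both sites got all updates
print(has_order(site_a, site_b))      # False: the sequences differ
```

The two sites in this example have agreement without order, which is the
situation I describe below for the cross data center case.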

For several of my use cases, agreement and order are required within
one data center, because we structure (shard) our services and user
groups so that the users who need both agreement and order access
services within a single data center. Across data centers I would only
need agreement. It would be nice to have agreement and order across
data centers, but given our latency requirements I guess this would be
prohibitively expensive.

Now to your question: yes, it would be fine if clients saw /C/x and
/A/y in a different order.

> The reason I'm asking is that some time in the past me and others made this
> initial proposal:
> http://wiki.apache.org/hadoop/ZooKeeper/MountRemoteZookeeper
> which didn't get enough support for lack of a compelling use-case (among
> other things).

It would be nice if ZooKeeper offered agreement and order in a more
granular fashion. I could imagine that write throughput would benefit
even within one data center if you have a use case that only needs
agreement but still pays for order, e.g. because all writes are
threaded through a single writer.
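
As a rough sketch of what I mean by "more granular", the toy model below
compares a single sequencer that stamps every write with one sequencer
per top-level subtree. This is not how ZooKeeper is implemented; the
class names and the idea of splitting by top-level subtree are
assumptions made purely for illustration.

```python
import itertools
from collections import defaultdict

class GlobalSequencer:
    """Toy model: one counter stamps every write, giving a total order."""
    def __init__(self):
        self._counter = itertools.count(1)

    def stamp(self, path):
        return ("global", next(self._counter))

class PerSubtreeSequencer:
    """Toy model: one counter per top-level subtree, so writes are only
    ordered relative to other writes in the same subtree."""
    def __init__(self):
        self._counters = defaultdict(lambda: itertools.count(1))

    def stamp(self, path):
        subtree = path.split("/")[1]  # "/A/y" -> "A"
        return (subtree, next(self._counters[subtree]))

writes = ["/C/x", "/A/y", "/A/z"]

global_seq = GlobalSequencer()
subtree_seq = PerSubtreeSequencer()

global_stamps = [global_seq.stamp(w) for w in writes]
subtree_stamps = [subtree_seq.stamp(w) for w in writes]

print(global_stamps)   # [('global', 1), ('global', 2), ('global', 3)]
print(subtree_stamps)  # [('C', 1), ('A', 1), ('A', 2)]
```

With the per-subtree variant, writes to /A/* and /C/* never wait on a
shared counter, at the price of having no defined order between them.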

Thanks for your thoughts!
-- 
Christian Schuhegger