Posted to user@zookeeper.apache.org by cheetah <xu...@gmail.com> on 2011/08/31 00:36:06 UTC

Re: How zab avoid split-brain problem?

Hi Alex,

Thanks for the explanation.

Then I have another question:

If there are 7 machines in my current ZooKeeper cluster and two of them
fail, how can I reconfigure ZooKeeper to make it work with 5 machines,
i.e. so that the master can commit a transaction once it gets replies
from 3 machines?

On the other hand, if I add 2 machines to make a 9-node ZooKeeper cluster,
how can I configure it to take advantage of all 9 machines?

This is more of a question for the user mailing list, so I am cc'ing it.

Thanks,
Peter

On Tue, Aug 30, 2011 at 12:21 PM, Alexander Shraer <sh...@yahoo-inc.com> wrote:

> Hi Peter,
>
> It's the second option. The servers don't know if the leader failed or
> was partitioned from them. So each group of 3 servers in your scenario
> can't distinguish the situation from another scenario where none of the
> servers failed but these 3 servers are partitioned from the other 4. To
> prevent a split brain in an asynchronous network, a leader must have the
> support of a quorum.
>
> Alex
>
> > -----Original Message-----
> > From: cheetah [mailto:xuwh06@gmail.com]
> > Sent: Tuesday, August 30, 2011 12:23 AM
> > To: dev@zookeeper.apache.org
> > Subject: How zab avoid split-brain problem?
> >
> > Hi folks,
> >     I am reading the Zab paper, but I am a bit confused about how Zab
> > handles the split-brain problem.
> >     Suppose there are seven servers, A, B, C, D, E, F and G, and A is
> > the leader. A dies and, at the same time, B, C, D are isolated from
> > E, F and G.
> >     In this case, will Zab continue working like this: if B>C>D and
> > E>F>G, the two groups both vote and elect B and E as their leaders
> > separately? Then there is a split-brain problem.
> >     Or does ZooKeeper just stop working, because there were originally
> > 7 servers and, after 1 failure, a new leader still expects to have a
> > quorum of 3 servers voting for it as the leader? Because the two groups
> > are separated from each other, no leader can be elected.
> >
> >     If it is the first case, ZooKeeper will have a split-brain problem,
> > which is probably not the case. But in the second case, a 7-node
> > ZooKeeper service can only handle a node failure and a network partition
> > failure.
> >
> >     Am I misunderstanding something? Looking forward to your insights.
> >
> > Thanks,
> > Peter
>
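
The quorum argument quoted above can be checked with a little arithmetic. The following is only a minimal sketch, assuming plain majority quorums; the function names are made up for illustration and are not part of ZooKeeper. It shows that in a 7-server ensemble neither 3-server group can elect a leader, while 5 surviving servers that can still reach each other can.

```python
def majority_quorum(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def can_elect_leader(reachable: int, ensemble_size: int) -> bool:
    """A group can elect a leader only if it holds a majority of the ensemble."""
    return reachable >= majority_quorum(ensemble_size)


ensemble = ["A", "B", "C", "D", "E", "F", "G"]
n = len(ensemble)

# A is down; {B, C, D} and {E, F, G} cannot talk to each other.
left, right = ["B", "C", "D"], ["E", "F", "G"]

print(majority_quorum(n))               # 4
print(can_elect_leader(len(left), n))   # False
print(can_elect_leader(len(right), n))  # False

# Two failed servers and no partition: the 5 survivors still form a quorum.
print(can_elect_leader(5, n))           # True
```

Because any two majorities of the same ensemble share at least one server, two leaders can never both be supported by a quorum at the same time.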

Re: How zab avoid split-brain problem?

Posted by cheetah <xu...@gmail.com>.
I see. This makes sense to me now. Thanks.

Looking forward to this feature.

Regards,
Peter

On Tue, Aug 30, 2011 at 4:04 PM, Alexander Shraer <sh...@yahoo-inc.com> wrote:

> Hi Peter,
>
> We're currently working on adding dynamic reconfiguration functionality to
> ZooKeeper. I hope that it will get into the next release of ZK (after 3.4).
> With this you'll just run a new ZK command to add or remove servers, change
> ports, change roles (followers/observers), etc.
>
> Currently, membership is determined by the config file, so the only way of
> doing this is a "rolling restart": you change the configuration files and
> bounce the servers. You should do it in a way that guarantees that, at any
> time, any quorum of the servers that are up intersects with any quorum of
> the old configuration (otherwise you might lose data). For example, if
> you're going from (A, B, C) to (A, B, C, D, E), it is possible that A and B
> have the latest data whereas C is falling behind (ZK stores data on a
> quorum). If you just change the config files of A, B and C to say that they
> are part of the larger configuration, then C might be elected with the
> support of D and E, and you might lose data. So in this case you'll have to
> first add D and later add E; this way the quorums intersect. The same
> applies when removing servers.
>
> Alex

RE: How zab avoid split-brain problem?

Posted by Alexander Shraer <sh...@yahoo-inc.com>.
Hi Peter,

We're currently working on adding dynamic reconfiguration functionality to ZooKeeper. I hope that it will get into the next release of ZK (after 3.4). With this you'll just run a new ZK command to add or remove servers, change ports, change roles (followers/observers), etc.

Currently, membership is determined by the config file, so the only way of doing this is a "rolling restart": you change the configuration files and bounce the servers. You should do it in a way that guarantees that, at any time, any quorum of the servers that are up intersects with any quorum of the old configuration (otherwise you might lose data). For example, if you're going from (A, B, C) to (A, B, C, D, E), it is possible that A and B have the latest data whereas C is falling behind (ZK stores data on a quorum). If you just change the config files of A, B and C to say that they are part of the larger configuration, then C might be elected with the support of D and E, and you might lose data. So in this case you'll have to first add D and later add E; this way the quorums intersect. The same applies when removing servers.
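
To make the intersection requirement concrete, here is a minimal sketch that enumerates majority quorums of an old and a new configuration and checks whether every pair overlaps. It is a simplified model, assuming plain majority quorums and ignoring which servers happen to be up during the restart; the helper names are hypothetical and are not ZooKeeper APIs.

```python
from itertools import combinations


def majority_quorums(servers):
    """All minimal majority quorums (as sets) of a configuration."""
    size = len(servers) // 2 + 1
    return [set(q) for q in combinations(sorted(servers), size)]


def quorums_intersect(old, new):
    """True if every majority of `new` shares a server with every majority of `old`.

    Checking minimal quorums is enough, since any larger quorum contains one.
    """
    return all(q_old & q_new
               for q_old in majority_quorums(old)
               for q_new in majority_quorums(new))


abc = {"A", "B", "C"}
abcd = abc | {"D"}
abcde = abcd | {"E"}

# Jumping straight to five servers is unsafe: {C, D, E} misses {A, B}.
print(quorums_intersect(abc, abcde))   # False

# Adding one server at a time keeps every pair of quorums overlapping.
print(quorums_intersect(abc, abcd))    # True
print(quorums_intersect(abcd, abcde))  # True
```

The failing case is exactly the one described above: C, D and E form a majority of the new configuration without containing either of the servers that hold the latest data.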

Alex
