Posted to dev@zookeeper.apache.org by jiangwen w <wj...@gmail.com> on 2011/03/28 10:24:14 UTC

send UPTODATE to follower until a quorum of servers synced with leader

1. Current process
When the leader fails, a new leader is elected and the followers sync with
the new leader.
Once a follower has synced, the leader sends UPTODATE to that follower.

2. A corner case
There is a corner case in which things go wrong.
Suppose message M exists only on the new leader. After a follower syncs with
the leader, a client connected to that follower will see M.
But M then exists on only two servers, not on a quorum of servers. If the new
leader and that follower both fail, M is lost, even though a client has
already seen it.

3. One solution
So I think UPTODATE should be sent to a follower only once a quorum of
servers has synced with the leader.

Sincerely
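
The scenario and the proposed fix can be sketched with a small toy model. This is only an illustration, not ZooKeeper code: it assumes simple majority quorums and a five-server ensemble, and every name in it (ToyLeader, follower_synced, gate_on_quorum, the server labels) is hypothetical.

```python
# Toy model of the corner case; not ZooKeeper code. All names are hypothetical.

def is_quorum(count, ensemble_size):
    """Majority quorum (an assumption): strictly more than half of the ensemble."""
    return count > ensemble_size // 2


class ToyLeader:
    def __init__(self, ensemble_size, gate_on_quorum):
        self.ensemble_size = ensemble_size
        self.gate_on_quorum = gate_on_quorum
        self.synced = {"leader"}      # servers holding the leader's history, including M
        self.uptodate_sent = set()    # followers that may start serving clients

    def follower_synced(self, follower):
        """Record that a follower finished syncing, then send any UPTODATE that is due."""
        self.synced.add(follower)
        if self.gate_on_quorum and not is_quorum(len(self.synced), self.ensemble_size):
            return  # proposed behaviour: hold UPTODATE until a quorum has synced
        # Ungated: send at once. Gated: release every follower that was waiting.
        self.uptodate_sent |= self.synced - {"leader"}


eager = ToyLeader(5, gate_on_quorum=False)
eager.follower_synced("f1")   # f1 may now expose M, yet only 2 of 5 servers hold it

gated = ToyLeader(5, gate_on_quorum=True)
gated.follower_synced("f1")   # UPTODATE withheld: 2 of 5 is not a majority
gated.follower_synced("f2")   # 3 of 5 is a majority, so f1 and f2 are released
```

With three servers the gate opens as soon as one follower has synced, because two of three is already a majority. That is consistent with the remark later in the thread that correctness for n=3 does not carry over to every ensemble size.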

Re: send UPTODATE to follower until a quorum of servers synced with leader

Posted by Benjamin Reed <br...@apache.org>.
yes, camille is correct. right now a leader will validate a client session
even though it has not assumed leadership. we are incorrectly counting on
UPTODATE to prevent validates before leadership is assumed. we need to fix
the UPTODATE problem, but we should also fix the leader so that it doesn't
validate any sessions until it has assumed leadership.

ben

On Mon, Mar 28, 2011 at 12:33 PM, Fournier, Camille F. [Tech] <
Camille.Fournier@gs.com> wrote:

> Looking at the code it looks like we don’t need a synched quorum to accept
> a new client session, just a quorum in the process of synching, so I don’t
> think the session handling will solve this. I suppose it’s a warning that
> correctness for n=3 doesn’t extend to all possible cluster sizes of N.
>
> Definitely worth opening a JIRA.
>
>
>
> C

RE: send UPTODATE to follower until a quorum of servers synced with leader

Posted by "Fournier, Camille F. [Tech]" <Ca...@gs.com>.
Looking at the code it looks like we don't need a synched quorum to accept a new client session, just a quorum in the process of synching, so I don't think the session handling will solve this. I suppose it's a warning that correctness for n=3 doesn't extend to all possible cluster sizes of N.
Definitely worth opening a JIRA.

C

From: Flavio Junqueira [mailto:fpj@yahoo-inc.com]
Sent: Monday, March 28, 2011 11:49 AM
To: dev@zookeeper.apache.org
Subject: Re: send UPTODATE to follower until a quorum of servers synced with leader

Hi Jiangwen, Good catch. I followed the code and it does sound like this scenario can happen, ignoring how sessions are handled. I checked that a follower takes a snapshot and starts a zookeeper server right after receiving an UPTODATE message. I'm not clear, though, if it is possible for a client to revalidate a session while the leader hasn't started. I was discussing with Ben offline and it sounds like we do not necessarily wait for a leader to come up to revalidate sessions. I'm not so familiar with the session handling part of the code, so I'll let perhaps Ben or someone else add to this discussion.

In any case, you might want to open a jira to track our comments so that we don't miss important comments. I also wanted to point out that we have been observing a few corner cases like the one you raised, and we have been designing changes to the implementation that take care of such problems. If I'm not mistaken, the scenario you point out wouldn't happen under our changes because followers would wait for a commit message (wait for a quorum to ack) before starting a server, as you point out. The latest notes on the design are under Zab1.0 in the ZooKeeper wiki.

Thanks,
-Flavio


Re: send UPTODATE to follower until a quorum of servers synced with leader

Posted by Benjamin Reed <br...@apache.org>.
sorry, i'm behind on my email. you are correct :)

ben

On Mon, Mar 28, 2011 at 1:03 PM, Fournier, Camille F. [Tech] <
Camille.Fournier@gs.com> wrote:

> I take that back. Right after the UPTODATE send in LearnerHandler, we wait
> for the final ACK from that follower and call processAck on that packet. We
> need that ack set to reach a quorum set before we start up the Leader
> ZooKeeperServer. Until that is started, we won’t process REVALIDATE requests
> and we won’t accept connections ourselves (so clients can’t connect to us to
> revalidate their session). So I think we are ok.
>
>
>
> C

RE: send UPTODATE to follower until a quorum of servers synced with leader

Posted by "Fournier, Camille F. [Tech]" <Ca...@gs.com>.
I take that back. Right after the UPTODATE send in LearnerHandler, we wait for the final ACK from that follower and call processAck on that packet. We need that ack set to reach a quorum set before we start up the Leader ZooKeeperServer. Until that is started, we won't process REVALIDATE requests and we won't accept connections ourselves (so clients can't connect to us to revalidate their session). So I think we are ok.

C
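
The startup gate described in this message can be sketched as a toy model. This is not the actual Leader or LearnerHandler code: the class and method names are hypothetical, majority quorums are assumed, and counting the leader's own ack toward the quorum is also an assumption made for the illustration.

```python
# Toy model of the leader startup gate; not the real Leader/LearnerHandler code.
# All names here are hypothetical.

class ToyLeaderStartup:
    def __init__(self, ensemble_size, leader_id):
        self.ensemble_size = ensemble_size
        self.acks = {leader_id}   # assumption: the leader counts its own ack
        self.serving = False      # stands in for "Leader ZooKeeperServer started"

    def process_ack(self, follower_id):
        """Record a follower's final ACK; start serving once the ack set is a majority."""
        self.acks.add(follower_id)
        if len(self.acks) > self.ensemble_size // 2:
            self.serving = True

    def revalidate(self, session_id, known_sessions):
        """Answer a REVALIDATE only after startup; None means it was not processed."""
        if not self.serving:
            return None
        return session_id in known_sessions


leader = ToyLeaderStartup(5, "s1")
leader.process_ack("s2")
early = leader.revalidate(42, {42})   # not processed: only 2 of 5 acks so far
leader.process_ack("s3")
late = leader.revalidate(42, {42})    # processed: 3 of 5 acks, leader is serving
```

In this sketch no session can be revalidated while fewer than a majority of servers have acknowledged the sync, which is the property the message relies on to conclude that the scenario is safe.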

From: Fournier, Camille F. [Tech]
Sent: Monday, March 28, 2011 3:34 PM
To: 'dev@zookeeper.apache.org'
Subject: RE: send UPTODATE to follower until a quorum of servers synced with leader

Looking at the code it looks like we don't need a synched quorum to accept a new client session, just a quorum in the process of synching, so I don't think the session handling will solve this. I suppose it's a warning that correctness for n=3 doesn't extend to all possible cluster sizes of N.
Definitely worth opening a JIRA.

C


Re: send UPTODATE to follower until a quorum of servers synced with leader

Posted by Flavio Junqueira <fp...@yahoo-inc.com>.
Hi Jiangwen, Good catch. I followed the code and it does sound like  
this scenario can happen, ignoring how sessions are handled. I checked  
that a follower takes a snapshot and starts a zookeeper server right  
after receiving an UPTODATE message. I'm not clear, though, if it is  
possible for a client to revalidate a session while the leader hasn't  
started. I was discussing with Ben offline and it sounds like we do  
not necessarily wait for a leader to come up to revalidate sessions.  
I'm not so familiar with the session handling part of the code, so  
I'll let perhaps Ben or someone else add to this discussion.

In any case, you might want to open a jira to track our comments so  
that we don't miss important comments. I also wanted to point out that  
we have been observing a few corner cases like the one you raised, and  
we have been designing changes to the implementation that take care of  
such problems. If I'm not mistaken, the scenario you point out  
wouldn't happen under our changes because followers would wait for a  
commit message (wait for a quorum to ack) before starting a server, as  
you point out. The latest notes on the design are under Zab1.0 in the  
ZooKeeper wiki.

Thanks,
-Flavio


flavio
junqueira

research scientist

fpj@yahoo-inc.com
direct +34 93-183-8828

avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301