You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@zookeeper.apache.org by Jeremy Stribling <st...@nicira.com> on 2011/03/22 03:15:51 UTC

new leader accepting create requests too early?

Hi all,

I ran into a weird case where Zookeeper seems to have elected a new 
leader (a node that just restarted and missed several operations), and a 
client is able to connect to it and create a new sequential node that 
has a number earlier than the last node it created.  I don't have full 
logs, or a live system in this state, or any data directories, just some 
partial server logs and the evidence as seen by the client.  Haven't 
tried reproducing it yet, just wanted to see if anyone here had any 
ideas.  Here's the scenario (probably more info than necessary, but 
trying to be complete)

Version: Zookeeper 3.3.3

1) Initially (5:37:20): 3 nodes up, with ids 215, 126, and 37 (called 
nodes #1, #2, and #3 below):
2) Nodes periodically (and throughout this whole timeline) create 
sequential, non-ephemeral nodes under the /zkrsm parent node.
3) 5:46:57: Node #1 gets notified of 
/zkrsm/0000000000000000_record0000002116
4) 5:47:06: Node #1 restarts and rejoins
5) 5:49:26: Node #2 gets notified of 
/zkrsm/0000000000000000_record0000002708
6) 5:49:29: Node #2 restarts and rejoins
7) 5:52:01: Node #3 gets notified of 
/zkrsm/0000000000000000_record0000003291
8) 5:52:02: Node #3 restarts and begins the rejoining process
9) 5:52:08: Node #1 successfully creates 
/zkrsm/0000000000000000_record0000003348
10) 5:52:08: Node #2 dies after getting notified of 
/zkrsm/0000000000000000_record0000003348
11) 5:52:10ish: Node #3 is elected leader (the ZK server log doesn't 
have wallclock timestamps, so not exactly sure on the ordering of this step)
12) 5:52:15: Node #1 successfully creates 
/zkrsm/0000000000000000_record0000003292

Note that the node created in step #12 is lower than the one created in 
step #9, and is exactly one greater than the last node seen by node #3 
before it restarted.

The leader election bit from node #3's log after restarting might be of 
interest:

>
> 2644 [QuorumPeer:/0.0.0.0:2888] INFO 
> org.apache.zookeeper.server.quorum.QuorumPeer  - LOOKING
> 2644 [QuorumPeer:/0.0.0.0:2888] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumPeer  - Initializing leader 
> election protocol...
> 2647 [QuorumPeer:/0.0.0.0:2888] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - New election. 
> My id =  37, Proposed zxid = 17179869831
> 2650 [WorkerSender Thread] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Opening channel 
> to server 126
> 2653 [WorkerReceiver Thread] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new 
> notification message. My id = 37
> 2653 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 1 (n.round), LOOKING (n.state), 
> 37 (n.sid), LOOKING (my state)
> 2654 [QuorumPeer:/0.0.0.0:2888] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 37, 
> proposed id: 37, zxid: 17179869831, proposed zxid: 17179869831
> 2655 [QuorumPeer:/0.0.0.0:2888] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote: 
> From = 37, Proposed leader = 37, Porposed zxid = 17179869831, Proposed 
> epoch = 1
> 2657 [WorkerSender Thread] INFO 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Have smaller 
> server identifier, so dropping the connection: (126, 37)
> 2657 [WorkerSender Thread] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Opening channel 
> to server 215
> 2661 [Thread-3] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection 
> request /13.0.0.12:60623
> 2661 [Thread-3] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection 
> request: 37
> 2661 [WorkerSender Thread] INFO 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Have smaller 
> server identifier, so dropping the connection: (215, 37)
> 2662 [Thread-3] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Address of 
> remote peer: 126
> 2668 [Thread-3] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection 
> request /13.0.0.11:60190
> 2668 [Thread-3] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection 
> request: 37
> 2668 [Thread-3] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Address of 
> remote peer: 215
> 2669 [WorkerReceiver Thread] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new 
> notification message. My id = 37
> 2670 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LOOKING (n.state), 
> 126 (n.sid), LOOKING (my state)
> 2670 [QuorumPeer:/0.0.0.0:2888] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 215, 
> proposed id: 37, zxid: 12884902548, proposed zxid: 17179869831
> 2670 [WorkerReceiver Thread] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new 
> notification message. My id = 37
> 2670 [QuorumPeer:/0.0.0.0:2888] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote: 
> From = 126, Proposed leader = 215, Porposed zxid = 12884902548, 
> Proposed epoch = 3
> 2670 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING 
> (n.state), 126 (n.sid), LOOKING (my state)
> 2671 [WorkerSender Thread] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - There is a 
> connection already for server 126
> 2671 [WorkerSender Thread] DEBUG 
> org.apache.zookeeper.server.quorum.QuorumCnxManager  - There is a 
> connection already for server 215
> 2671 [WorkerReceiver Thread] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new 
> notification message. My id = 37
> 2671 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LOOKING (n.state), 
> 37 (n.sid), LOOKING (my state)
> 2671 [WorkerReceiver Thread] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new 
> notification message. My id = 37
> 2671 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LOOKING (n.state), 
> 215 (n.sid), LOOKING (my state)
> 2671 [QuorumPeer:/0.0.0.0:2888] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 37, 
> proposed id: 37, zxid: 17179869831, proposed zxid: 17179869831
> 2672 [WorkerReceiver Thread] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new 
> notification message. My id = 37
> 2672 [QuorumPeer:/0.0.0.0:2888] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote: 
> From = 37, Proposed leader = 37, Porposed zxid = 17179869831, Proposed 
> epoch = 3
> 2672 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LEADING (n.state), 
> 215 (n.sid), LOOKING (my state)
> 2672 [QuorumPeer:/0.0.0.0:2888] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 215, 
> proposed id: 37, zxid: 12884902548, proposed zxid: 17179869831
> 2672 [QuorumPeer:/0.0.0.0:2888] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote: 
> From = 215, Proposed leader = 215, Porposed zxid = 12884902548, 
> Proposed epoch = 3
> 2672 [WorkerReceiver Thread] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new 
> notification message. My id = 37
> 2672 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING 
> (n.state), 126 (n.sid), LEADING (my state)
> 2673 [QuorumPeer:/0.0.0.0:2888] INFO 
> org.apache.zookeeper.server.quorum.QuorumPeer  - LEADING
> 2675 [QuorumPeer:/0.0.0.0:2888] INFO 
> org.apache.zookeeper.server.quorum.Leader  - TCP NoDelay set to: true
> 2693 [WorkerReceiver Thread] DEBUG 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new 
> notification message. My id = 37

I don't know much about the internal workings of Zookeeper, but what I 
find weird here is that node #3 proposes zxid 17179869831 (0x400000287), 
which seems to indicate that it wants to start well into epoch 4 before 
it even opens channels to other nodes, finding out that their zxids are 
only 12884902548 (0x300000294).  Is it possible that node #3 thought its 
zxid was not less than the other nodes' zxids, and so did not bother 
transferring the operations it missed?

Any other ideas on what could have gone wrong?  I dont see any other 
obvious exceptions in node #3's log, except:

> 2487 [pool-1-thread-2] DEBUG 
> org.apache.zookeeper.server.persistence.FileTxnLog  - EOF excepton 
> java.io.EOFException: Failed to read

but that seems to happen all the time, and it's at DEBUG level, so I'm 
guessing it's expected.

Thanks,

Jeremy



Re: new leader accepting create requests too early?

Posted by Jeremy Stribling <st...@nicira.com>.
I created https://issues.apache.org/jira/browse/ZOOKEEPER-1026 to track 
this, and attached logs to it.  Thanks.

Jeremy

On 03/22/2011 12:26 AM, Flavio Junqueira wrote:
> You're right in that node #3 declared itself leader given the sequence 
> of notifications it received: it received one notification from node 
> #1 saying that it was looking in the first few lines of your log 
> excerpt and at least one from itself. However, node #3 does not have 
> enough support (followers), so it won't exercise leadership. From the 
> notifications, it sounds like node #1 is the one exercising leadership.
>
> -Flavio
>
> On Mar 22, 2011, at 8:11 AM, Jeremy Stribling wrote:
>
>> Thanks for the response.  I thought that the "my state" in this line,
>> printed in node #3's log:
>>
>>>
>>> 2672 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
>>> (n.state), 126 (n.sid), LEADING (my state)
>>
>> indicated that node #3 was the leader, but I'm probably misinterpreting
>> it (I haven't had a chance to look through the source yet to figure it
>> out for sure).  In any case, what I think are the relevant notifications
>> of node #1's logs look like this:
>>
>>
>>>
>>> 307122 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 1 (n.round), LOOKING (n.state),
>>> 37 (n.sid), LEADING (my state)
>>> 307142 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LOOKING (n.state),
>>> 37 (n.sid), LEADING (my state)
>>> 310850 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>>> 215 (n.sid), LOOKING (my state)
>>> 310850 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>>> 37 (n.sid), LOOKING (my state)
>>> 311051 [QuorumPeer:/0.0.0.0:2888] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>>> time out: 400
>>> 311053 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>>> 37 (n.sid), LOOKING (my state)
>>> 311054 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>>> 215 (n.sid), LOOKING (my state)
>>> 311454 [QuorumPeer:/0.0.0.0:2888] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>>> time out: 800
>>> 311456 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>>> 37 (n.sid), LOOKING (my state)
>>> 311457 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>>> 215 (n.sid), LOOKING (my state)
>>> 312257 [QuorumPeer:/0.0.0.0:2888] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>>> time out: 1600
>>> 312260 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>>> 215 (n.sid), LOOKING (my state)
>>> 312263 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>>> 37 (n.sid), LOOKING (my state)
>>
>> which, according to my earlier logic, seems to indicate that node #1
>> never even thought it was following node #3.
>>
>> Anyway, I will put the logs together and make a JIRA tomorrow if I get
>> some time, and will follow up here with a link.  Thanks again,
>>
>> Jeremy
>>
>>
>
> *flavio*
> *junqueira*
>
> research scientist
>
> fpj@yahoo-inc.com <ma...@yahoo-inc.com>
> direct +34 93-183-8828
>
> avinguda diagonal 177, 8th floor, barcelona, 08018, es
> phone (408) 349 3300 fax (408) 349 3301
>
>

Re: new leader accepting create requests too early?

Posted by Jeremy Stribling <st...@nicira.com>.
I created https://issues.apache.org/jira/browse/ZOOKEEPER-1026 to track 
this, and attached logs to it.  Thanks.

Jeremy

On 03/22/2011 12:26 AM, Flavio Junqueira wrote:
> You're right in that node #3 declared itself leader given the sequence 
> of notifications it received: it received one notification from node 
> #1 saying that it was looking in the first few lines of your log 
> excerpt and at least one from itself. However, node #3 does not have 
> enough support (followers), so it won't exercise leadership. From the 
> notifications, it sounds like node #1 is the one exercising leadership.
>
> -Flavio
>
> On Mar 22, 2011, at 8:11 AM, Jeremy Stribling wrote:
>
>> Thanks for the response.  I thought that the "my state" in this line,
>> printed in node #3's log:
>>
>>>
>>> 2672 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
>>> (n.state), 126 (n.sid), LEADING (my state)
>>
>> indicated that node #3 was the leader, but I'm probably misinterpreting
>> it (I haven't had a chance to look through the source yet to figure it
>> out for sure).  In any case, what I think are the relevant notifications
>> of node #1's logs look like this:
>>
>>
>>>
>>> 307122 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 1 (n.round), LOOKING (n.state),
>>> 37 (n.sid), LEADING (my state)
>>> 307142 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LOOKING (n.state),
>>> 37 (n.sid), LEADING (my state)
>>> 310850 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>>> 215 (n.sid), LOOKING (my state)
>>> 310850 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>>> 37 (n.sid), LOOKING (my state)
>>> 311051 [QuorumPeer:/0.0.0.0:2888] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>>> time out: 400
>>> 311053 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>>> 37 (n.sid), LOOKING (my state)
>>> 311054 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>>> 215 (n.sid), LOOKING (my state)
>>> 311454 [QuorumPeer:/0.0.0.0:2888] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>>> time out: 800
>>> 311456 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>>> 37 (n.sid), LOOKING (my state)
>>> 311457 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>>> 215 (n.sid), LOOKING (my state)
>>> 312257 [QuorumPeer:/0.0.0.0:2888] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>>> time out: 1600
>>> 312260 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>>> 215 (n.sid), LOOKING (my state)
>>> 312263 [WorkerReceiver Thread] INFO
>>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
>>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>>> 37 (n.sid), LOOKING (my state)
>>
>> which, according to my earlier logic, seems to indicate that node #1
>> never even thought it was following node #3.
>>
>> Anyway, I will put the logs together and make a JIRA tomorrow if I get
>> some time, and will follow up here with a link.  Thanks again,
>>
>> Jeremy
>>
>>
>
> *flavio*
> *junqueira*
>
> research scientist
>
> fpj@yahoo-inc.com <ma...@yahoo-inc.com>
> direct +34 93-183-8828
>
> avinguda diagonal 177, 8th floor, barcelona, 08018, es
> phone (408) 349 3300 fax (408) 349 3301
>
>

Re: new leader accepting create requests too early?

Posted by Flavio Junqueira <fp...@yahoo-inc.com>.
You're right in that node #3 declared itself leader given the sequence  
of notifications it received: it received one notification from node  
#1 saying that it was looking in the first few lines of your log  
excerpt and at least one from itself. However, node #3 does not have  
enough support (followers), so it won't exercise leadership. From the  
notifications, it sounds like node #1 is the one exercising leadership.

-Flavio

On Mar 22, 2011, at 8:11 AM, Jeremy Stribling wrote:

> Thanks for the response.  I thought that the "my state" in this line,
> printed in node #3's log:
>
>>
>> 2672 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
>> (n.state), 126 (n.sid), LEADING (my state)
>
> indicated that node #3 was the leader, but I'm probably  
> misinterpreting
> it (I haven't had a chance to look through the source yet to figure it
> out for sure).  In any case, what I think are the relevant  
> notifications
> of node #1's logs look like this:
>
>
>>
>> 307122 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 1 (n.round), LOOKING (n.state),
>> 37 (n.sid), LEADING (my state)
>> 307142 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LOOKING (n.state),
>> 37 (n.sid), LEADING (my state)
>> 310850 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 310850 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>> 37 (n.sid), LOOKING (my state)
>> 311051 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>> time out: 400
>> 311053 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>> 37 (n.sid), LOOKING (my state)
>> 311054 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 311454 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>> time out: 800
>> 311456 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>> 37 (n.sid), LOOKING (my state)
>> 311457 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 312257 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>> time out: 1600
>> 312260 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 312263 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>> 37 (n.sid), LOOKING (my state)
>
> which, according to my earlier logic, seems to indicate that node #1
> never even thought it was following node #3.
>
> Anyway, I will put the logs together and make a JIRA tomorrow if I get
> some time, and will follow up here with a link.  Thanks again,
>
> Jeremy
>
>

flavio
junqueira

research scientist

fpj@yahoo-inc.com
direct +34 93-183-8828

avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301




Re: new leader accepting create requests too early?

Posted by Flavio Junqueira <fp...@yahoo-inc.com>.
You're right in that node #3 declared itself leader given the sequence  
of notifications it received: it received one notification from node  
#1 saying that it was looking in the first few lines of your log  
excerpt and at least one from itself. However, node #3 does not have  
enough support (followers), so it won't exercise leadership. From the  
notifications, it sounds like node #1 is the one exercising leadership.

-Flavio

On Mar 22, 2011, at 8:11 AM, Jeremy Stribling wrote:

> Thanks for the response.  I thought that the "my state" in this line,
> printed in node #3's log:
>
>>
>> 2672 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
>> (n.state), 126 (n.sid), LEADING (my state)
>
> indicated that node #3 was the leader, but I'm probably  
> misinterpreting
> it (I haven't had a chance to look through the source yet to figure it
> out for sure).  In any case, what I think are the relevant  
> notifications
> of node #1's logs look like this:
>
>
>>
>> 307122 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 1 (n.round), LOOKING (n.state),
>> 37 (n.sid), LEADING (my state)
>> 307142 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LOOKING (n.state),
>> 37 (n.sid), LEADING (my state)
>> 310850 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 310850 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>> 37 (n.sid), LOOKING (my state)
>> 311051 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>> time out: 400
>> 311053 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>> 37 (n.sid), LOOKING (my state)
>> 311054 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 311454 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>> time out: 800
>> 311456 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>> 37 (n.sid), LOOKING (my state)
>> 311457 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 312257 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification
>> time out: 1600
>> 312260 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 312263 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
>> 37 (n.sid), LOOKING (my state)
>
> which, according to my earlier logic, seems to indicate that node #1
> never even thought it was following node #3.
>
> Anyway, I will put the logs together and make a JIRA tomorrow if I get
> some time, and will follow up here with a link.  Thanks again,
>
> Jeremy
>
>

flavio
junqueira

research scientist

fpj@yahoo-inc.com
direct +34 93-183-8828

avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301




Re: new leader accepting create requests too early?

Posted by Jeremy Stribling <st...@nicira.com>.
Thanks for the response.  I thought that the "my state" in this line, 
printed in node #3's log:

>
> 2672 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING 
> (n.state), 126 (n.sid), LEADING (my state)

indicated that node #3 was the leader, but I'm probably misinterpreting 
it (I haven't had a chance to look through the source yet to figure it 
out for sure).  In any case, what I think are the relevant notifications 
of node #1's logs look like this:


>
> 307122 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 1 (n.round), LOOKING (n.state), 
> 37 (n.sid), LEADING (my state)
> 307142 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LOOKING (n.state), 
> 37 (n.sid), LEADING (my state)
> 310850 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state), 
> 215 (n.sid), LOOKING (my state)
> 310850 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state), 
> 37 (n.sid), LOOKING (my state)
> 311051 [QuorumPeer:/0.0.0.0:2888] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification 
> time out: 400
> 311053 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state), 
> 37 (n.sid), LOOKING (my state)
> 311054 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state), 
> 215 (n.sid), LOOKING (my state)
> 311454 [QuorumPeer:/0.0.0.0:2888] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification 
> time out: 800
> 311456 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state), 
> 37 (n.sid), LOOKING (my state)
> 311457 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state), 
> 215 (n.sid), LOOKING (my state)
> 312257 [QuorumPeer:/0.0.0.0:2888] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification 
> time out: 1600
> 312260 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state), 
> 215 (n.sid), LOOKING (my state)
> 312263 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state), 
> 37 (n.sid), LOOKING (my state)

which, according to my earlier logic, seems to indicate that node #1 
never even thought it was following node #3.

Anyway, I will put the logs together and make a JIRA tomorrow if I get 
some time, and will follow up here with a link.  Thanks again,

Jeremy



Re: new leader accepting create requests too early?

Posted by Jeremy Stribling <st...@nicira.com>.
Thanks for the response.  I thought that the "my state" in this line, 
printed in node #3's log:

>
> 2672 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING 
> (n.state), 126 (n.sid), LEADING (my state)

indicated that node #3 was the leader, but I'm probably misinterpreting 
it (I haven't had a chance to look through the source yet to figure it 
out for sure).  In any case, what I think are the relevant notifications 
of node #1's logs look like this:


>
> 307122 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 1 (n.round), LOOKING (n.state), 
> 37 (n.sid), LEADING (my state)
> 307142 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LOOKING (n.state), 
> 37 (n.sid), LEADING (my state)
> 310850 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state), 
> 215 (n.sid), LOOKING (my state)
> 310850 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state), 
> 37 (n.sid), LOOKING (my state)
> 311051 [QuorumPeer:/0.0.0.0:2888] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification 
> time out: 400
> 311053 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state), 
> 37 (n.sid), LOOKING (my state)
> 311054 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state), 
> 215 (n.sid), LOOKING (my state)
> 311454 [QuorumPeer:/0.0.0.0:2888] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification 
> time out: 800
> 311456 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state), 
> 37 (n.sid), LOOKING (my state)
> 311457 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state), 
> 215 (n.sid), LOOKING (my state)
> 312257 [QuorumPeer:/0.0.0.0:2888] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification 
> time out: 1600
> 312260 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state), 
> 215 (n.sid), LOOKING (my state)
> 312263 [WorkerReceiver Thread] INFO 
> org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification: 
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state), 
> 37 (n.sid), LOOKING (my state)

which, according to my earlier logic, seems to indicate that node #1 
never even thought it was following node #3.

Anyway, I will put the logs together and make a JIRA tomorrow if I get 
some time, and will follow up here with a link.  Thanks again,

Jeremy



Re: new leader accepting create requests too early?

Posted by Flavio Junqueira <fp...@yahoo-inc.com>.
Hi Jeremy, Thanks for reporting. Without logs from the other nodes, it  
is somewaht difficult to determine more precisely what happened. But,  
from the logs, it sounds like node #1 (id 215) is the only leading  
from these two notifications:

2672 [WorkerReceiver Thread] INFO
org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LEADING (n.state),
215 (n.sid), LOOKING (my state)

2672 [WorkerReceiver Thread] INFO
org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
(n.state), 126 (n.sid), LEADING (my state)

 From the information you have given, I can't explain right now why  
the sequence number is smaller in step 12. If you happen to have more  
information, it would be great if you could report in a jira.

-Flavio

PS: We should perhaps discuss this issue on dev, so I'm copying it to  
dev.

On Mar 22, 2011, at 3:15 AM, Jeremy Stribling wrote:

> Hi all,
>
> I ran into a weird case where Zookeeper seems to have elected a new
> leader (a node that just restarted and missed several operations),  
> and a
> client is able to connect to it and create a new sequential node that
> has a number earlier than the last node it created.  I don't have full
> logs, or a live system in this state, or any data directories, just  
> some
> partial server logs and the evidence as seen by the client.  Haven't
> tried reproducing it yet, just wanted to see if anyone here had any
> ideas.  Here's the scenario (probably more info than necessary, but
> trying to be complete)
>
> Version: Zookeeper 3.3.3
>
> 1) Initially (5:37:20): 3 nodes up, with ids 215, 126, and 37 (called
> nodes #1, #2, and #3 below):
> 2) Nodes periodically (and throughout this whole timeline) create
> sequential, non-ephemeral nodes under the /zkrsm parent node.
> 3) 5:46:57: Node #1 gets notified of
> /zkrsm/0000000000000000_record0000002116
> 4) 5:47:06: Node #1 restarts and rejoins
> 5) 5:49:26: Node #2 gets notified of
> /zkrsm/0000000000000000_record0000002708
> 6) 5:49:29: Node #2 restarts and rejoins
> 7) 5:52:01: Node #3 gets notified of
> /zkrsm/0000000000000000_record0000003291
> 8) 5:52:02: Node #3 restarts and begins the rejoining process
> 9) 5:52:08: Node #1 successfully creates
> /zkrsm/0000000000000000_record0000003348
> 10) 5:52:08: Node #2 dies after getting notified of
> /zkrsm/0000000000000000_record0000003348
> 11) 5:52:10ish: Node #3 is elected leader (the ZK server log doesn't
> have wallclock timestamps, so not exactly sure on the ordering of  
> this step)
> 12) 5:52:15: Node #1 successfully creates
> /zkrsm/0000000000000000_record0000003292
>
> Note that the node created in step #12 is lower than the one created  
> in
> step #9, and is exactly one greater than the last node seen by node #3
> before it restarted.
>
> The leader election bit from node #3's log after restarting might be  
> of
> interest:
>
>>
>> 2644 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.QuorumPeer  - LOOKING
>> 2644 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumPeer  - Initializing leader
>> election protocol...
>> 2647 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - New  
>> election.
>> My id =  37, Proposed zxid = 17179869831
>> 2650 [WorkerSender Thread] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Opening  
>> channel
>> to server 126
>> 2653 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2653 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 1 (n.round), LOOKING (n.state),
>> 37 (n.sid), LOOKING (my state)
>> 2654 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 37,
>> proposed id: 37, zxid: 17179869831, proposed zxid: 17179869831
>> 2655 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote:
>> From = 37, Proposed leader = 37, Porposed zxid = 17179869831,  
>> Proposed
>> epoch = 1
>> 2657 [WorkerSender Thread] INFO
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Have smaller
>> server identifier, so dropping the connection: (126, 37)
>> 2657 [WorkerSender Thread] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Opening  
>> channel
>> to server 215
>> 2661 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection
>> request /13.0.0.12:60623
>> 2661 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection
>> request: 37
>> 2661 [WorkerSender Thread] INFO
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Have smaller
>> server identifier, so dropping the connection: (215, 37)
>> 2662 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Address of
>> remote peer: 126
>> 2668 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection
>> request /13.0.0.11:60190
>> 2668 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection
>> request: 37
>> 2668 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Address of
>> remote peer: 215
>> 2669 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2670 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LOOKING (n.state),
>> 126 (n.sid), LOOKING (my state)
>> 2670 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 215,
>> proposed id: 37, zxid: 12884902548, proposed zxid: 17179869831
>> 2670 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2670 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote:
>> From = 126, Proposed leader = 215, Porposed zxid = 12884902548,
>> Proposed epoch = 3
>> 2670 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
>> (n.state), 126 (n.sid), LOOKING (my state)
>> 2671 [WorkerSender Thread] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - There is a
>> connection already for server 126
>> 2671 [WorkerSender Thread] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - There is a
>> connection already for server 215
>> 2671 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2671 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LOOKING (n.state),
>> 37 (n.sid), LOOKING (my state)
>> 2671 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2671 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LOOKING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 2671 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 37,
>> proposed id: 37, zxid: 17179869831, proposed zxid: 17179869831
>> 2672 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2672 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote:
>> From = 37, Proposed leader = 37, Porposed zxid = 17179869831,  
>> Proposed
>> epoch = 3
>> 2672 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LEADING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 2672 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 215,
>> proposed id: 37, zxid: 12884902548, proposed zxid: 17179869831
>> 2672 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote:
>> From = 215, Proposed leader = 215, Porposed zxid = 12884902548,
>> Proposed epoch = 3
>> 2672 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2672 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
>> (n.state), 126 (n.sid), LEADING (my state)
>> 2673 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.QuorumPeer  - LEADING
>> 2675 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.Leader  - TCP NoDelay set to: true
>> 2693 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>
> I don't know much about the internal workings of Zookeeper, but what I
> find weird here is that node #3 proposes zxid 17179869831  
> (0x400000287),
> which seems to indicate that it wants to start well into epoch 4  
> before
> it even opens channels to other nodes, finding out that their zxids  
> are
> only 12884902548 (0x300000294).  Is it possible that node #3 thought  
> its
> zxid was not less than the other nodes' zxids, and so did not bother
> transferring the operations it missed?
>
> Any other ideas on what could have gone wrong?  I dont see any other
> obvious exceptions in node #3's log, except:
>
>> 2487 [pool-1-thread-2] DEBUG
>> org.apache.zookeeper.server.persistence.FileTxnLog  - EOF excepton
>> java.io.EOFException: Failed to read
>
> but that seems to happen all the time, and it's at DEBUG level, so I'm
> guessing it's expected.
>
> Thanks,
>
> Jeremy
>
>

flavio
junqueira

research scientist

fpj@yahoo-inc.com
direct +34 93-183-8828

avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301




Re: new leader accepting create requests too early?

Posted by Flavio Junqueira <fp...@yahoo-inc.com>.
Hi Jeremy, Thanks for reporting. Without logs from the other nodes, it  
is somewaht difficult to determine more precisely what happened. But,  
from the logs, it sounds like node #1 (id 215) is the only leading  
from these two notifications:

2672 [WorkerReceiver Thread] INFO
org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LEADING (n.state),
215 (n.sid), LOOKING (my state)

2672 [WorkerReceiver Thread] INFO
org.apache.zookeeper.server.quorum.FastLeaderElection  - Notification:
215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
(n.state), 126 (n.sid), LEADING (my state)

 From the information you have given, I can't explain right now why  
the sequence number is smaller in step 12. If you happen to have more  
information, it would be great if you could report in a jira.

-Flavio

PS: We should perhaps discuss this issue on dev, so I'm copying it to  
dev.

On Mar 22, 2011, at 3:15 AM, Jeremy Stribling wrote:

> Hi all,
>
> I ran into a weird case where Zookeeper seems to have elected a new
> leader (a node that just restarted and missed several operations),  
> and a
> client is able to connect to it and create a new sequential node that
> has a number earlier than the last node it created.  I don't have full
> logs, or a live system in this state, or any data directories, just  
> some
> partial server logs and the evidence as seen by the client.  Haven't
> tried reproducing it yet, just wanted to see if anyone here had any
> ideas.  Here's the scenario (probably more info than necessary, but
> trying to be complete)
>
> Version: Zookeeper 3.3.3
>
> 1) Initially (5:37:20): 3 nodes up, with ids 215, 126, and 37 (called
> nodes #1, #2, and #3 below):
> 2) Nodes periodically (and throughout this whole timeline) create
> sequential, non-ephemeral nodes under the /zkrsm parent node.
> 3) 5:46:57: Node #1 gets notified of
> /zkrsm/0000000000000000_record0000002116
> 4) 5:47:06: Node #1 restarts and rejoins
> 5) 5:49:26: Node #2 gets notified of
> /zkrsm/0000000000000000_record0000002708
> 6) 5:49:29: Node #2 restarts and rejoins
> 7) 5:52:01: Node #3 gets notified of
> /zkrsm/0000000000000000_record0000003291
> 8) 5:52:02: Node #3 restarts and begins the rejoining process
> 9) 5:52:08: Node #1 successfully creates
> /zkrsm/0000000000000000_record0000003348
> 10) 5:52:08: Node #2 dies after getting notified of
> /zkrsm/0000000000000000_record0000003348
> 11) 5:52:10ish: Node #3 is elected leader (the ZK server log doesn't
> have wallclock timestamps, so not exactly sure on the ordering of  
> this step)
> 12) 5:52:15: Node #1 successfully creates
> /zkrsm/0000000000000000_record0000003292
>
> Note that the node created in step #12 is lower than the one created  
> in
> step #9, and is exactly one greater than the last node seen by node #3
> before it restarted.
>
> The leader election bit from node #3's log after restarting might be  
> of
> interest:
>
>>
>> 2644 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.QuorumPeer  - LOOKING
>> 2644 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumPeer  - Initializing leader
>> election protocol...
>> 2647 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - New  
>> election.
>> My id =  37, Proposed zxid = 17179869831
>> 2650 [WorkerSender Thread] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Opening  
>> channel
>> to server 126
>> 2653 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2653 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 1 (n.round), LOOKING (n.state),
>> 37 (n.sid), LOOKING (my state)
>> 2654 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 37,
>> proposed id: 37, zxid: 17179869831, proposed zxid: 17179869831
>> 2655 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote:
>> From = 37, Proposed leader = 37, Porposed zxid = 17179869831,  
>> Proposed
>> epoch = 1
>> 2657 [WorkerSender Thread] INFO
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Have smaller
>> server identifier, so dropping the connection: (126, 37)
>> 2657 [WorkerSender Thread] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Opening  
>> channel
>> to server 215
>> 2661 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection
>> request /13.0.0.12:60623
>> 2661 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection
>> request: 37
>> 2661 [WorkerSender Thread] INFO
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Have smaller
>> server identifier, so dropping the connection: (215, 37)
>> 2662 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Address of
>> remote peer: 126
>> 2668 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection
>> request /13.0.0.11:60190
>> 2668 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Connection
>> request: 37
>> 2668 [Thread-3] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - Address of
>> remote peer: 215
>> 2669 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2670 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LOOKING (n.state),
>> 126 (n.sid), LOOKING (my state)
>> 2670 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 215,
>> proposed id: 37, zxid: 12884902548, proposed zxid: 17179869831
>> 2670 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2670 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote:
>> From = 126, Proposed leader = 215, Porposed zxid = 12884902548,
>> Proposed epoch = 3
>> 2670 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
>> (n.state), 126 (n.sid), LOOKING (my state)
>> 2671 [WorkerSender Thread] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - There is a
>> connection already for server 126
>> 2671 [WorkerSender Thread] DEBUG
>> org.apache.zookeeper.server.quorum.QuorumCnxManager  - There is a
>> connection already for server 215
>> 2671 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2671 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LOOKING (n.state),
>> 37 (n.sid), LOOKING (my state)
>> 2671 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2671 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LOOKING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 2671 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 37,
>> proposed id: 37, zxid: 17179869831, proposed zxid: 17179869831
>> 2672 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2672 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote:
>> From = 37, Proposed leader = 37, Porposed zxid = 17179869831,  
>> Proposed
>> epoch = 3
>> 2672 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), LEADING (n.state),
>> 215 (n.sid), LOOKING (my state)
>> 2672 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - id: 215,
>> proposed id: 37, zxid: 12884902548, proposed zxid: 17179869831
>> 2672 [QuorumPeer:/0.0.0.0:2888] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Adding vote:
>> From = 215, Proposed leader = 215, Porposed zxid = 12884902548,
>> Proposed epoch = 3
>> 2672 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>> 2672 [WorkerReceiver Thread] INFO
>> org.apache.zookeeper.server.quorum.FastLeaderElection  -  
>> Notification:
>> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
>> (n.state), 126 (n.sid), LEADING (my state)
>> 2673 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.QuorumPeer  - LEADING
>> 2675 [QuorumPeer:/0.0.0.0:2888] INFO
>> org.apache.zookeeper.server.quorum.Leader  - TCP NoDelay set to: true
>> 2693 [WorkerReceiver Thread] DEBUG
>> org.apache.zookeeper.server.quorum.FastLeaderElection  - Receive new
>> notification message. My id = 37
>
> I don't know much about the internal workings of Zookeeper, but what I
> find weird here is that node #3 proposes zxid 17179869831  
> (0x400000287),
> which seems to indicate that it wants to start well into epoch 4  
> before
> it even opens channels to other nodes, finding out that their zxids  
> are
> only 12884902548 (0x300000294).  Is it possible that node #3 thought  
> its
> zxid was not less than the other nodes' zxids, and so did not bother
> transferring the operations it missed?
>
> Any other ideas on what could have gone wrong?  I dont see any other
> obvious exceptions in node #3's log, except:
>
>> 2487 [pool-1-thread-2] DEBUG
>> org.apache.zookeeper.server.persistence.FileTxnLog  - EOF excepton
>> java.io.EOFException: Failed to read
>
> but that seems to happen all the time, and it's at DEBUG level, so I'm
> guessing it's expected.
>
> Thanks,
>
> Jeremy
>
>

flavio
junqueira

research scientist

fpj@yahoo-inc.com
direct +34 93-183-8828

avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301