You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by "Flavio Junqueira (JIRA)" <ji...@apache.org> on 2013/09/18 00:31:56 UTC

[jira] [Resolved] (ZOOKEEPER-1599) 3.3 server cannot join 3.4 quorum

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Junqueira resolved ZOOKEEPER-1599.
-----------------------------------------

      Resolution: Not A Problem
    Release Note: During a rolling upgrade from the 3.3 branch to the 3.4 branch, a 3.3 server won't be able to follow a 3.4, so if there is an election during the upgrade and the new leader is a 3.4 server, then the 3.3 server will be unavailable until it is upgraded. If a 3.3 server leads during the upgrade process and it is the last one to be upgraded, then no problem should be observed.
    
> 3.3 server cannot join 3.4 quorum
> ---------------------------------
>
>                 Key: ZOOKEEPER-1599
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1599
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.6, 3.4.5
>            Reporter: Skye Wanderman-Milne
>            Assignee: Skye Wanderman-Milne
>            Priority: Blocker
>             Fix For: 3.4.6
>
>         Attachments: ZOOKEEPER-1599.patch
>
>
> When a 3.3 server attempts to join an existing quorum lead by a 3.4 server, the 3.3 server is disconnected while trying to download the leader's snapshot. The 3.3 server restarts and starts the process over again, but is never able to join the quorum.
> 3.3 server log:
> {code}
> 2012-12-07 10:44:34,582 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Learner@294] - Getting a snapshot from leader
> 2012-12-07 10:44:34,582 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Learner@325] - Setting leader epoch 12
> 2012-12-07 10:44:54,604 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Follower@82] - Exception when following the leader
> java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:148)
>         at org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:332)
>         at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:75)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:645)
> 2012-12-07 10:44:54,605 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Follower@165] - shutdown called
> java.lang.Exception: shutdown Follower
>         at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:165)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:649)
> {code}
> 3.4 leader log:
> {code}
> 2012-12-07 10:51:35,178 [myid:2] - INFO  [WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@273] - Backward compatibility mode, server id=3
> 2012-12-07 10:51:35,178 [myid:2] - INFO  [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 3 (n.leader), 0x1100000000 (n.zxid), 0x2 (n.round), LOOKING (n.state), 3 (n.sid), 0x11 (n.peerEPoch), LEADING (my state)
> 2012-12-07 10:51:35,182 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:37654:LearnerHandler@263] - Follower sid: 3 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@262f4873
> 2012-12-07 10:51:35,182 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:37654:LearnerHandler@318] - Synchronizing with Follower sid: 3 maxCommittedLog=0x0 minCommittedLog=0x0 peerLastZxid=0x1100000000
> 2012-12-07 10:51:35,182 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:37654:LearnerHandler@395] - Sending SNAP
> 2012-12-07 10:51:35,183 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:37654:LearnerHandler@419] - Sending snapshot last zxid of peer is 0x1100000000  zxid of leader is 0x1200000000sent zxid of db as 0x1200000000
> 2012-12-07 10:51:55,204 [myid:2] - ERROR [LearnerHandler-/127.0.0.1:37654:LearnerHandler@562] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:150)
>         at java.net.SocketInputStream.read(SocketInputStream.java:121)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
>         at java.io.DataInputStream.readInt(DataInputStream.java:387)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:450)
> 2012-12-07 10:51:55,205 [myid:2] - WARN  [LearnerHandler-/127.0.0.1:37654:LearnerHandler@575] - ******* GOODBYE /127.0.0.1:37654 ********
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira