You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by "helei (JIRA)" <ji...@apache.org> on 2011/07/01 09:20:28 UTC

[jira] [Created] (ZOOKEEPER-1115) follower can not sync with leader

follower can not sync with leader
---------------------------------

                 Key: ZOOKEEPER-1115
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1115
             Project: ZooKeeper
          Issue Type: Bug
          Components: quorum
    Affects Versions: 3.3.3, 3.3.0
         Environment: linux rhel 4, x64, java version 1.4.2
            Reporter: helei
            Priority: Critical


there are 5 members in the quorum. one follower can not sync with leader after restart. it seems leader has close the data connection with this follower because of read timeout. here is the key log in follower:
2011-06-30 22:14:45,069 - WARN  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken: 
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING, LOOKING, 3
2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message. Shutting down thread.
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] - Getting a diff from the leader 0x4011bd462
2011-06-30 22:14:45,083 - WARN  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got zxid 0x4011bd405 expected 0x1
2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208] - Snapshotting: 4011bd462
2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
        at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] - Exception when following the leader
java.net.SocketException: Socket closed
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
        at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
        at org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
java.net.SocketException: Socket closed
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
        at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
        at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
and these are the leader's:
2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
        at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] - Unexpected Exception: 
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
        at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
        at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] [Commented] (ZOOKEEPER-1115) follower can not sync with leader

Posted by lei he <hl...@gmail.com>.
Hmm, of cause it works. But I really don't think that's the right way... It
obviously harm the performance. We have to limit the node number, or the
leader will be the single point. But that's OK, it surely works this time.
Thanks again.

2011/7/5 Flavio Junqueira (JIRA) <ji...@apache.org>

>
>    [
> https://issues.apache.org/jira/browse/ZOOKEEPER-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059817#comment-13059817]
>
> Flavio Junqueira commented on ZOOKEEPER-1115:
> ---------------------------------------------
>
> What about increasing syncLimit?
>
> > follower can not sync with leader
> > ---------------------------------
> >
> >                 Key: ZOOKEEPER-1115
> >                 URL:
> https://issues.apache.org/jira/browse/ZOOKEEPER-1115
> >             Project: ZooKeeper
> >          Issue Type: Bug
> >          Components: quorum
> >    Affects Versions: 3.3.0, 3.3.3
> >         Environment: linux rhel 4, x64, java version 1.4.2
> >            Reporter: helei
> >            Priority: Critical
> >
> > there are 5 members in the quorum. one follower can not sync with leader
> after restart. it seems leader has close the data connection with this
> follower because of read timeout. here is the key log in follower:
> > 2011-06-30 22:14:45,069 - WARN
>  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken:
> > java.nio.channels.ClosedChannelException
> >         at
> sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
> >         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
> >         at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
> > 2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181
> :FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING,
> LOOKING, 3
> > 2011-06-30 22:14:45,070 - ERROR
> [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message.
> Shutting down thread.
> > java.nio.channels.ClosedChannelException
> >         at
> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
> >         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
> >         at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
> >         at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
> > 2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] -
> Getting a diff from the leader 0x4011bd462
> > 2011-06-30 22:14:45,083 - WARN
>  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
> > 2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116]
> - Got zxid 0x4011bd405 expected 0x1
> > 2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181
> :FileTxnSnapLog@208] - Snapshotting: 4011bd462
> > 2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63]
> - Closing connection to leader, exception during packet send
> > java.net.SocketException: Broken pipe
> >         at java.net.SocketOutputStream.socketWrite0(Native Method)
> >         at
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> >         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >         at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> >         at
> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> >         at
> org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
> >         at
> org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
> >         at
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
> >         at
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> > 2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] -
> Exception when following the leader
> > java.net.SocketException: Socket closed
> >         at
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
> >         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >         at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> >         at
> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> >         at
> org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
> >         at
> org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
> >         at
> org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
> >         at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
> >         at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
> > 2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63]
> - Closing connection to leader, exception during packet send
> > java.net.SocketException: Socket closed
> >         at
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
> >         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >         at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> >         at
> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> >         at
> org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
> >         at
> org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
> >         at
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
> >         at
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> > 2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166]
> - shutdown called
> > java.lang.Exception: shutdown Follower
> >         at
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> >         at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
> > and these are the leader's:
> > 2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975
> :LearnerHandler@444] - Unexpected exception causing shutdown while sock
> still open
> > java.net.SocketTimeoutException: Read timed out
> >         at java.net.SocketInputStream.socketRead0(Native Method)
> >         at java.net.SocketInputStream.read(SocketInputStream.java:129)
> >         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> >         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
> >         at java.io.DataInputStream.readInt(DataInputStream.java:370)
> >         at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >         at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
> >         at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >         at
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> > 2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975
> :LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
> > 2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] -
> Unexpected Exception:
> > java.nio.channels.CancelledKeyException
> >         at
> sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
> >         at
> sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
> >         at
> org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
> >         at
> org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
> >         at
> org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
> >         at
> org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
> >         at
> org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> > 2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998
> :LearnerHandler@444] - Unexpected exception causing shutdown while sock
> still open
> > java.net.SocketTimeoutException: Read timed out
> >         at java.net.SocketInputStream.socketRead0(Native Method)
> >         at java.net.SocketInputStream.read(SocketInputStream.java:129)
> >         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> >         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
> >         at java.io.DataInputStream.readInt(DataInputStream.java:370)
> >         at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >         at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
> >         at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >         at
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> > 2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998
> :LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********
>
> --
> This message is automatically generated by JIRA.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>
>


-- 
Best Regards
Lei, He

Re: [jira] [Commented] (ZOOKEEPER-1115) follower can not sync with leader

Posted by lei he <hl...@gmail.com>.
Hmm, of cause it works. But I really don't think that's the right way... It
obviously harm the performance. We have to limit the node number, or the
leader will be the single point. But that's OK, it surely works this time.
Thanks again.

2011/7/5 Flavio Junqueira (JIRA) <ji...@apache.org>

>
>    [
> https://issues.apache.org/jira/browse/ZOOKEEPER-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059817#comment-13059817]
>
> Flavio Junqueira commented on ZOOKEEPER-1115:
> ---------------------------------------------
>
> What about increasing syncLimit?
>
> > follower can not sync with leader
> > ---------------------------------
> >
> >                 Key: ZOOKEEPER-1115
> >                 URL:
> https://issues.apache.org/jira/browse/ZOOKEEPER-1115
> >             Project: ZooKeeper
> >          Issue Type: Bug
> >          Components: quorum
> >    Affects Versions: 3.3.0, 3.3.3
> >         Environment: linux rhel 4, x64, java version 1.4.2
> >            Reporter: helei
> >            Priority: Critical
> >
> > there are 5 members in the quorum. one follower can not sync with leader
> after restart. it seems leader has close the data connection with this
> follower because of read timeout. here is the key log in follower:
> > 2011-06-30 22:14:45,069 - WARN
>  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken:
> > java.nio.channels.ClosedChannelException
> >         at
> sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
> >         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
> >         at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
> > 2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181
> :FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING,
> LOOKING, 3
> > 2011-06-30 22:14:45,070 - ERROR
> [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message.
> Shutting down thread.
> > java.nio.channels.ClosedChannelException
> >         at
> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
> >         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
> >         at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
> >         at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
> > 2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] -
> Getting a diff from the leader 0x4011bd462
> > 2011-06-30 22:14:45,083 - WARN
>  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
> > 2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116]
> - Got zxid 0x4011bd405 expected 0x1
> > 2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181
> :FileTxnSnapLog@208] - Snapshotting: 4011bd462
> > 2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63]
> - Closing connection to leader, exception during packet send
> > java.net.SocketException: Broken pipe
> >         at java.net.SocketOutputStream.socketWrite0(Native Method)
> >         at
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> >         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >         at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> >         at
> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> >         at
> org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
> >         at
> org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
> >         at
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
> >         at
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> > 2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] -
> Exception when following the leader
> > java.net.SocketException: Socket closed
> >         at
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
> >         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >         at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> >         at
> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> >         at
> org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
> >         at
> org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
> >         at
> org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
> >         at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
> >         at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
> > 2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63]
> - Closing connection to leader, exception during packet send
> > java.net.SocketException: Socket closed
> >         at
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
> >         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >         at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> >         at
> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> >         at
> org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
> >         at
> org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
> >         at
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
> >         at
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> > 2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166]
> - shutdown called
> > java.lang.Exception: shutdown Follower
> >         at
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> >         at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
> > and these are the leader's:
> > 2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975
> :LearnerHandler@444] - Unexpected exception causing shutdown while sock
> still open
> > java.net.SocketTimeoutException: Read timed out
> >         at java.net.SocketInputStream.socketRead0(Native Method)
> >         at java.net.SocketInputStream.read(SocketInputStream.java:129)
> >         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> >         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
> >         at java.io.DataInputStream.readInt(DataInputStream.java:370)
> >         at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >         at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
> >         at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >         at
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> > 2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975
> :LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
> > 2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] -
> Unexpected Exception:
> > java.nio.channels.CancelledKeyException
> >         at
> sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
> >         at
> sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
> >         at
> org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
> >         at
> org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
> >         at
> org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
> >         at
> org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
> >         at
> org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> > 2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998
> :LearnerHandler@444] - Unexpected exception causing shutdown while sock
> still open
> > java.net.SocketTimeoutException: Read timed out
> >         at java.net.SocketInputStream.socketRead0(Native Method)
> >         at java.net.SocketInputStream.read(SocketInputStream.java:129)
> >         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> >         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
> >         at java.io.DataInputStream.readInt(DataInputStream.java:370)
> >         at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >         at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
> >         at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >         at
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> > 2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998
> :LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********
>
> --
> This message is automatically generated by JIRA.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>
>


-- 
Best Regards
Lei, He

[jira] [Updated] (ZOOKEEPER-1115) follower can not sync with leader

Posted by "Dave Latham (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Latham updated ZOOKEEPER-1115:
-----------------------------------

    Description: 
there are 5 members in the quorum. one follower can not sync with leader after restart. it seems leader has close the data connection with this follower because of read timeout. here is the key log in follower:
{noformat}
2011-06-30 22:14:45,069 - WARN  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken: 
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING, LOOKING, 3
2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message. Shutting down thread.
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] - Getting a diff from the leader 0x4011bd462
2011-06-30 22:14:45,083 - WARN  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got zxid 0x4011bd405 expected 0x1
2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208] - Snapshotting: 4011bd462
2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
        at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] - Exception when following the leader
java.net.SocketException: Socket closed
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
        at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
        at org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
java.net.SocketException: Socket closed
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
        at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
        at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
and these are the leader's:
2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
        at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] - Unexpected Exception: 
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
        at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
        at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********
{noformat}

  was:
there are 5 members in the quorum. one follower can not sync with leader after restart. it seems leader has close the data connection with this follower because of read timeout. here is the key log in follower:
2011-06-30 22:14:45,069 - WARN  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken: 
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING, LOOKING, 3
2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message. Shutting down thread.
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] - Getting a diff from the leader 0x4011bd462
2011-06-30 22:14:45,083 - WARN  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got zxid 0x4011bd405 expected 0x1
2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208] - Snapshotting: 4011bd462
2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
        at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] - Exception when following the leader
java.net.SocketException: Socket closed
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
        at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
        at org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
java.net.SocketException: Socket closed
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
        at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
        at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
and these are the leader's:
2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
        at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] - Unexpected Exception: 
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
        at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
        at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********

    
> follower can not sync with leader
> ---------------------------------
>
>                 Key: ZOOKEEPER-1115
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1115
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.0, 3.3.3
>         Environment: linux rhel 4, x64, java version 1.4.2
>            Reporter: helei
>            Priority: Critical
>
> there are 5 members in the quorum. one follower can not sync with leader after restart. it seems leader has close the data connection with this follower because of read timeout. here is the key log in follower:
> {noformat}
> 2011-06-30 22:14:45,069 - WARN  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken: 
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
> 2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING, LOOKING, 3
> 2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message. Shutting down thread.
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
> 2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] - Getting a diff from the leader 0x4011bd462
> 2011-06-30 22:14:45,083 - WARN  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
> 2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got zxid 0x4011bd405 expected 0x1
> 2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208] - Snapshotting: 4011bd462
> 2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Broken pipe
>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] - Exception when following the leader
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
>         at org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
>         at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
> 2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
>         at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
> and these are the leader's:
> 2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
> 2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] - Unexpected Exception: 
> java.nio.channels.CancelledKeyException
>         at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>         at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
>         at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
>         at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
>         at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> 2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1115) follower can not sync with leader

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059817#comment-13059817 ] 

Flavio Junqueira commented on ZOOKEEPER-1115:
---------------------------------------------

What about increasing syncLimit?

> follower can not sync with leader
> ---------------------------------
>
>                 Key: ZOOKEEPER-1115
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1115
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.0, 3.3.3
>         Environment: linux rhel 4, x64, java version 1.4.2
>            Reporter: helei
>            Priority: Critical
>
> there are 5 members in the quorum. one follower can not sync with leader after restart. it seems leader has close the data connection with this follower because of read timeout. here is the key log in follower:
> 2011-06-30 22:14:45,069 - WARN  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken: 
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
> 2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING, LOOKING, 3
> 2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message. Shutting down thread.
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
> 2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] - Getting a diff from the leader 0x4011bd462
> 2011-06-30 22:14:45,083 - WARN  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
> 2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got zxid 0x4011bd405 expected 0x1
> 2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208] - Snapshotting: 4011bd462
> 2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Broken pipe
>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] - Exception when following the leader
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
>         at org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
>         at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
> 2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
>         at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
> and these are the leader's:
> 2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
> 2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] - Unexpected Exception: 
> java.nio.channels.CancelledKeyException
>         at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>         at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
>         at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
>         at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
>         at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> 2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1115) follower can not sync with leader

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059350#comment-13059350 ] 

Flavio Junqueira commented on ZOOKEEPER-1115:
---------------------------------------------

Hi Helei, Have you tried increasing the value of initLimit? Check options here:

http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html

> follower can not sync with leader
> ---------------------------------
>
>                 Key: ZOOKEEPER-1115
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1115
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.0, 3.3.3
>         Environment: linux rhel 4, x64, java version 1.4.2
>            Reporter: helei
>            Priority: Critical
>
> there are 5 members in the quorum. one follower can not sync with leader after restart. it seems leader has close the data connection with this follower because of read timeout. here is the key log in follower:
> 2011-06-30 22:14:45,069 - WARN  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken: 
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
> 2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING, LOOKING, 3
> 2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message. Shutting down thread.
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
> 2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] - Getting a diff from the leader 0x4011bd462
> 2011-06-30 22:14:45,083 - WARN  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
> 2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got zxid 0x4011bd405 expected 0x1
> 2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208] - Snapshotting: 4011bd462
> 2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Broken pipe
>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] - Exception when following the leader
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
>         at org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
>         at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
> 2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
>         at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
> and these are the leader's:
> 2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
> 2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] - Unexpected Exception: 
> java.nio.channels.CancelledKeyException
>         at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>         at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
>         at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
>         at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
>         at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> 2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1115) follower can not sync with leader

Posted by "helei (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059326#comment-13059326 ] 

helei commented on ZOOKEEPER-1115:
----------------------------------

Hi, guys. After follower get synced with leader, we will call zk.takeSnapshot in communication thread in Learner.java. And this is a function may block. In my case, it takes about 4 seconds to finish the snap. During this period, follower would not ack leader, and leader would thought that something happend to this follower, and close the connection with this follower. After that the follower will receive a broken pipe exception, and it will retry after that. Of cause it will fail time after time.
I think it is a problem.
thanks

> follower can not sync with leader
> ---------------------------------
>
>                 Key: ZOOKEEPER-1115
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1115
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.0, 3.3.3
>         Environment: linux rhel 4, x64, java version 1.4.2
>            Reporter: helei
>            Priority: Critical
>
> there are 5 members in the quorum. one follower can not sync with leader after restart. it seems leader has close the data connection with this follower because of read timeout. here is the key log in follower:
> 2011-06-30 22:14:45,069 - WARN  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken: 
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
> 2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING, LOOKING, 3
> 2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message. Shutting down thread.
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
> 2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] - Getting a diff from the leader 0x4011bd462
> 2011-06-30 22:14:45,083 - WARN  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
> 2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got zxid 0x4011bd405 expected 0x1
> 2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208] - Snapshotting: 4011bd462
> 2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Broken pipe
>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] - Exception when following the leader
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
>         at org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
>         at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
> 2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
>         at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
> and these are the leader's:
> 2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
> 2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] - Unexpected Exception: 
> java.nio.channels.CancelledKeyException
>         at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>         at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
>         at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
>         at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
>         at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> 2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1115) follower can not sync with leader

Posted by "helei (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060253#comment-13060253 ] 

helei commented on ZOOKEEPER-1115:
----------------------------------

Hmm, of cause it works. But I really don't think that's the right way... It obviously harm the performance. We have to limit the node number, or the leader will be the single point. But that's OK, it surely works this time. Thanks again.

> follower can not sync with leader
> ---------------------------------
>
>                 Key: ZOOKEEPER-1115
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1115
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.0, 3.3.3
>         Environment: linux rhel 4, x64, java version 1.4.2
>            Reporter: helei
>            Priority: Critical
>
> there are 5 members in the quorum. one follower can not sync with leader after restart. it seems leader has close the data connection with this follower because of read timeout. here is the key log in follower:
> 2011-06-30 22:14:45,069 - WARN  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken: 
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
> 2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING, LOOKING, 3
> 2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message. Shutting down thread.
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
> 2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] - Getting a diff from the leader 0x4011bd462
> 2011-06-30 22:14:45,083 - WARN  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
> 2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got zxid 0x4011bd405 expected 0x1
> 2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208] - Snapshotting: 4011bd462
> 2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Broken pipe
>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] - Exception when following the leader
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
>         at org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
>         at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
> 2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
>         at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
> and these are the leader's:
> 2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
> 2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] - Unexpected Exception: 
> java.nio.channels.CancelledKeyException
>         at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>         at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
>         at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
>         at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
>         at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> 2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1115) follower can not sync with leader

Posted by "helei (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059782#comment-13059782 ] 

helei commented on ZOOKEEPER-1115:
----------------------------------

Hi Flavio, this does not help. When the follower took a long time to do a snap, the leader would timeout and then close the connection. I think the problem is here:
     Leader.java: 237   s.setSoTimeout(self.tickTime * self.syncLimit);
After this time, Leader would close the connection. And follower would receive a broken pipe error.

Thanks

> follower can not sync with leader
> ---------------------------------
>
>                 Key: ZOOKEEPER-1115
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1115
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.0, 3.3.3
>         Environment: linux rhel 4, x64, java version 1.4.2
>            Reporter: helei
>            Priority: Critical
>
> there are 5 members in the quorum. one follower can not sync with leader after restart. it seems leader has close the data connection with this follower because of read timeout. here is the key log in follower:
> 2011-06-30 22:14:45,069 - WARN  [Thread-17:QuorumCnxManager$RecvWorker@658] - Connection broken: 
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
> 2011-06-30 22:14:45,069 - INFO  [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3, 17198470148, 3, 3, LOOKING, LOOKING, 3
> 2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] - Failed to send last message. Shutting down thread.
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
> 2011-06-30 22:14:45,082 - INFO  [QuorumPeer:/0.0.0.0:2181:Learner@282] - Getting a diff from the leader 0x4011bd462
> 2011-06-30 22:14:45,083 - WARN  [Thread-18:QuorumCnxManager$SendWorker@589] - Send worker leaving thread
> 2011-06-30 22:14:45,085 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got zxid 0x4011bd405 expected 0x1
> 2011-06-30 22:14:45,090 - INFO  [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208] - Snapshotting: 4011bd462
> 2011-06-30 22:14:53,397 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Broken pipe
>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,398 - WARN  [QuorumPeer:/0.0.0.0:2181:Follower@82] - Exception when following the leader
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
>         at org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
>         at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
> 2011-06-30 22:14:53,398 - WARN  [SyncThread:3:SendAckRequestProcessor@63] - Closing connection to leader, exception during packet send
> java.net.SocketException: Socket closed
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
>         at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
>         at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
>         at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,399 - INFO  [QuorumPeer:/0.0.0.0:2181:Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
>         at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
> and these are the leader's:
> 2011-06-30 22:14:35,943 - ERROR [LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:35,943 - WARN  [LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14975 ********
> 2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] - Unexpected Exception: 
> java.nio.channels.CancelledKeyException
>         at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>         at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
>         at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
>         at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
>         at org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
>         at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> 2011-06-30 22:14:49,084 - ERROR [LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readInt(DataInputStream.java:370)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:49,084 - WARN  [LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE /10.23.247.163:14998 ********

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira