You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@zookeeper.apache.org by Anand Parthasarathy <an...@avinetworks.com> on 2017/10/04 18:17:37 UTC

Zookeeper quorum goes down for no apparent reason in 3.4.5

Hi,

We have an issue with a 3-node zookeeper ensemble where the quorum goes
down due to no apparent reason every once in a while. Here is what I see in
the ZK leader:

2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:Leader@493] - Shutting down
2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:Leader@499] - Shutdown called
java.lang.Exception: shutdown Leader! reason: Not sufficient followers
synced, only synced with sids: [ 3 ]
    at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:499)
    at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:474)
    at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:799)

I have attached the logs from the 3 nodes around this time. Could you pls.
help understand what the issue could be here. The only thing I see a little
bit ahead of this timestamp is that all of them did a PurgeTask pretty much
at the same time.

Thanks,
Anand.

Re: Zookeeper quorum goes down for no apparent reason in 3.4.5

Posted by Patrick Hunt <ph...@apache.org>.
According to the logs the reads are timing out:

java.net.SocketTimeoutException: Read timed out

this could be any number of things, networking/cpu/.... - resource
starvation. For example I mentioned GC previously, did you turn on GC
logging and rule that out?

The "leaderServes" option might also be useful, hard to say with such
limited insight into your environment. See
http://zookeeper.apache.org/doc/r3.4.10/zookeeperAdmin.html#sc_clusterOptions

Regards,

Patrick

On Thu, Oct 5, 2017 at 10:23 AM, Anand Parthasarathy <
anpartha@avinetworks.com> wrote:

> I will attach the logs inline this time. I looked at the metrics and can't
> find any issue wrt network connectivity or cpu/disk/memory issues. We also
> moved the data directory to /run memfs to avoid fsync latencies as we don't
> need persistency. Could you pls. take a look at the logs and see if you can
> spot something that gives an insight to the underlying issue.
>
> Thanks,
> Anand.
>
> *From node1:*
>
> 2017-09-21 02:57:13,164 [myid:1] - INFO
> [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
> Removing file: Sep 21, 2017 1:17:24 AM
> /run/zookeeper/version-2/log.50006f026
> Removing file: Sep 21, 2017 12:47:17 AM
> /run/zookeeper/version-2/log.50006ecf6
> Removing file: Sep 21, 2017 1:17:24 AM
> /run/zookeeper/version-2/snapshot.50006f375
> Removing file: Sep 21, 2017 12:47:17 AM
> /run/zookeeper/version-2/snapshot.50006f024
> 2017-09-21 02:57:13,164 [myid:1] - INFO
> [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
> 2017-09-21 02:59:32,298 [myid:1] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
> client: 0x15e5304d29b0046
> 2017-09-21 02:59:34,271 [myid:1] - WARN  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :Follower@89] - Exception when following the leader
> java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:153)
> at java.net.SocketInputStream.read(SocketInputStream.java:122)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(
> QuorumPacket.java:83)
> at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
> at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
> 2017-09-21 02:59:34,271 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
> at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:790)
> 2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :FollowerZooKeeperServer@139] - Shutting down
> 2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :ZooKeeperServer@441] - shutting down
> 2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :FollowerRequestProcessor@105] - Shutting down
> 2017-09-21 02:59:34,272 [myid:1] - INFO
> [FollowerRequestProcessor:1:FollowerRequestProcessor@95] -
> FollowerRequestProcessor exited loop!
> 2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :CommitProcessor@181] - Shutting down
> 2017-09-21 02:59:34,272 [myid:1] - INFO
> [CommitProcessor:1:CommitProcessor@150] - CommitProcessor exited loop!
> 2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :FinalRequestProcessor@415] - shutdown of request processor complete
> 2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :SyncRequestProcessor@209] - Shutting down
> 2017-09-21 02:59:34,272 [myid:1] - INFO
> [SyncThread:1:SyncRequestProcessor@187] - SyncRequestProcessor exited!
> 2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :QuorumPeer@714] - LOOKING
> 2017-09-21 02:59:34,273 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snaps
> hot.50006fba3
> 2017-09-21 02:59:34,275 [myid:1] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] -
> Exception causing close of session 0x0 due to java.io.IOException:
> ZooKeeperServer not running
> 2017-09-21 02:59:34,275 [myid:1] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] -
> Exception causing close of session 0x0 due to java.io.IOException:
> ZooKeeperServer not running
> 2017-09-21 02:59:34,277 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :FastLeaderElection@815] - New election. My id =  1, proposed
> zxid=0x50006fec3
> 2017-09-21 02:59:34,277 [myid:1] - INFO
> [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
> format version), 1 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
> 2017-09-21 02:59:34,278 [myid:1] - INFO
> [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
> format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 2 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
> 2017-09-21 02:59:34,278 [myid:1] - INFO
> [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
> format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
> 2017-09-21 02:59:34,478 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :QuorumPeer@784] - FOLLOWING
> 2017-09-21 02:59:34,478 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :ZooKeeperServer@162] - Created server with tickTime 2000
> minSessionTimeout
> 4000 maxSessionTimeout 40000 datadir /run/zookeeper/version-2 snapdir
> /run/zookeeper/version-2
> 2017-09-21 02:59:34,478 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 205
> 2017-09-21 02:59:34,481 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :Learner@323] - Getting a diff from the leader 0x50006fec3
> 2017-09-21 02:59:34,520 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :FileTxnSnapLog@240] - Snapshotting: 0x50006fec3 to
> /run/zookeeper/version-2/snapshot.50006fec3
> 2017-09-21 02:59:34,876 [myid:1] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
> client: 0x15e5d10a4b50020
> 2017-09-21 02:59:34,944 [myid:1] - WARN  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :Follower@118] - Got zxid 0x600000001 expected 0x1
> 2017-09-21 02:59:35,495 [myid:1] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
> client: 0x15e901612194a6f
> 2017-09-21 03:00:02,649 [myid:1] - INFO
> [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
> format version), 3 (n.leader), 0x40000e237 (n.zxid), 0xc (n.round), LEADING
> (n.state), 3 (n.sid), 0x5 (n.peerEpoch) FOLLOWING (my state)
> 2017-09-21 03:00:02,650 [myid:1] - INFO
> [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
> format version), 3 (n.leader), 0x40000e237 (n.zxid), 0xc (n.round), LEADING
> (n.state), 3 (n.sid), 0x5 (n.peerEpoch) FOLLOWING (my state)
> 2017-09-21 03:00:03,651 [myid:1] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
> client: 0x15e5304d29b0046
> 2017-09-21 03:00:03,651 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:
> 5002
> :ZooKeeperServer@610] - Invalid session 0x15e5304d29b0046 for client /
> 127.0.0.1:57028, probably expired
> 2017-09-21 03:00:03,660 [myid:1] - INFO
> [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
> format version), 3 (n.leader), 0x50006fed5 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 3 (n.sid), 0x5 (n.peerEpoch) FOLLOWING (my state)
> 2017-09-21 03:00:04,748 [myid:1] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15ea2604fd1000e, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:11,624 [myid:1] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15ea2604fd10001, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:11,847 [myid:1] - INFO  [SyncThread:1:FileTxnLog@199] -
> Creating new log file: log.6000003ac
> 2017-09-21 03:00:11,847 [myid:1] - INFO  [Snapshot
> Thread:FileTxnSnapLog@240]
> - Snapshotting: 0x6000003ab to /run/zookeeper/version-2/snapshot.6000003ab
> 2017-09-21 03:00:34,482 [myid:1] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15ea2604fd1006a, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:34,536 [myid:1] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15ea2604fd1006c, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:34,645 [myid:1] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15ea2604fd1006e, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:36,104 [myid:1] - INFO  [Snapshot
> Thread:FileTxnSnapLog@240]
> - Snapshotting: 0x60000072a to /run/zookeeper/version-2/snapshot.60000072a
> 2017-09-21 03:00:36,138 [myid:1] - INFO  [SyncThread:1:FileTxnLog@199] -
> Creating new log file: log.60000072c
> 2017-09-21 03:00:36,558 [myid:1] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15ea2604fd1007c, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:37,770 [myid:1] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15ea2604fd10089, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
>
> *From node2:*
>
> 2017-09-21 02:56:34,210 [myid:2] - INFO  [Snapshot
> Thread:FileTxnSnapLog@240]
> - Snapshotting: 0x50006fe73 to /run/zookeeper/version-2/snapshot.50006fe73
> 2017-09-21 02:56:34,212 [myid:2] - INFO  [SyncThread:2:FileTxnLog@199] -
> Creating new log file: log.50006fe75
> 2017-09-21 02:57:15,736 [myid:2] - INFO
> [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
> Removing file: Sep 21, 2017 1:39:07 AM
> /run/zookeeper/version-2/log.50006f22d
> Removing file: Sep 21, 2017 1:05:52 AM
> /run/zookeeper/version-2/log.50006f01d
> Removing file: Sep 21, 2017 12:47:06 AM
> /run/zookeeper/version-2/log.50006ec55
> Removing file: Sep 21, 2017 1:39:07 AM
> /run/zookeeper/version-2/snapshot.50006f5df
> Removing file: Sep 21, 2017 1:05:52 AM
> /run/zookeeper/version-2/snapshot.50006f22d
> Removing file: Sep 21, 2017 12:47:06 AM
> /run/zookeeper/version-2/snapshot.50006f01b
> 2017-09-21 02:57:15,737 [myid:2] - INFO
> [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
> 2017-09-21 02:59:31,513 [myid:2] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
> client: 0x15e5d10a4b50020
> 2017-09-21 02:59:31,748 [myid:2] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x25e8d4776a2557c, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 02:59:32,350 [myid:2] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
> client: 0x25e8d4776a2557c
> 2017-09-21 02:59:33,221 [myid:2] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
> client: 0x25e5304d29d003c
> 2017-09-21 02:59:33,393 [myid:2] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
> client: 0x25e5304d29d003b
> 2017-09-21 02:59:34,272 [myid:2] - WARN  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :Follower@89] - Exception when following the leader
> java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:153)
> at java.net.SocketInputStream.read(SocketInputStream.java:122)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(
> QuorumPacket.java:83)
> at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
> at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
> 2017-09-21 02:59:34,272 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
> at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:790)
> 2017-09-21 02:59:34,273 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :FollowerZooKeeperServer@139] - Shutting down
> 2017-09-21 02:59:34,273 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :ZooKeeperServer@441] - shutting down
> 2017-09-21 02:59:34,273 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :FollowerRequestProcessor@105] - Shutting down
> 2017-09-21 02:59:34,273 [myid:2] - INFO
> [FollowerRequestProcessor:2:FollowerRequestProcessor@95] -
> FollowerRequestProcessor exited loop!
> 2017-09-21 02:59:34,273 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :CommitProcessor@181] - Shutting down
> 2017-09-21 02:59:34,273 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :FinalRequestProcessor@415] - shutdown of request processor complete
> 2017-09-21 02:59:34,274 [myid:2] - INFO
> [CommitProcessor:2:CommitProcessor@150] - CommitProcessor exited loop!
> 2017-09-21 02:59:34,274 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :SyncRequestProcessor@209] - Shutting down
> 2017-09-21 02:59:34,274 [myid:2] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@841] -
> Refusing session request for client /127.0.0.1:59500 as it has seen zxid
> 0x50006febd our last zxid is 0x0 client must try another server
> 2017-09-21 02:59:34,274 [myid:2] - INFO
> [SyncThread:2:SyncRequestProcessor@187] - SyncRequestProcessor exited!
> 2017-09-21 02:59:34,274 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :QuorumPeer@714] - LOOKING
> 2017-09-21 02:59:34,274 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snaps
> hot.50006fe73
> 2017-09-21 02:59:34,276 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :FastLeaderElection@815] - New election. My id =  2, proposed
> zxid=0x50006fec3
> 2017-09-21 02:59:34,277 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message
> format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 2 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
> 2017-09-21 02:59:34,277 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message
> format version), 1 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
> 2017-09-21 02:59:34,277 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message
> format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
> 2017-09-21 02:59:34,404 [myid:2] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] -
> Exception causing close of session 0x0 due to java.io.IOException:
> ZooKeeperServer not running
> 2017-09-21 02:59:34,478 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :QuorumPeer@796] - LEADING
> 2017-09-21 02:59:34,478 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :ZooKeeperServer@162] - Created server with tickTime 2000
> minSessionTimeout
> 4000 maxSessionTimeout 40000 datadir /run/zookeeper/version-2 snapdir
> /run/zookeeper/version-2
> 2017-09-21 02:59:34,478 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :Leader@358] - LEADING - LEADER ELECTION TOOK - 204
> 2017-09-21 02:59:34,479 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39016
> :LearnerHandler@330] - Follower sid: 1 : info :
> org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@988c127
> 2017-09-21 02:59:34,479 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39016
> :LearnerHandler@385] - Synchronizing with Follower sid: 1
> maxCommittedLog=0x50006fec3 minCommittedLog=0x50006fe74
> peerLastZxid=0x50006fec3
> 2017-09-21 02:59:34,480 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39016
> :LearnerHandler@462] - Sending DIFF
> 2017-09-21 02:59:34,520 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39016
> :LearnerHandler@522] - Received NEWLEADER-ACK message from 1
> 2017-09-21 02:59:34,520 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:
> 5002
> :Leader@943] - Have quorum of supporters, sids: [ 1,2 ]; starting up and
> setting last processed zxid: 0x600000000
> 2017-09-21 02:59:40,000 [myid:2] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x35e68954661a833, timeout of 5000ms exceeded
> 2017-09-21 02:59:40,000 [myid:2] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x25e8d4776a2557c, timeout of 5000ms exceeded
> 2017-09-21 02:59:56,001 [myid:2] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x15e5304d29b0046, timeout of 20000ms exceeded
> 2017-09-21 02:59:56,001 [myid:2] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x25e5304d29d003b, timeout of 20000ms exceeded
> 2017-09-21 02:59:56,001 [myid:2] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x25e5304d29d003c, timeout of 20000ms exceeded
> 2017-09-21 02:59:56,001 [myid:2] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x15e5d10a4b50046, timeout of 20000ms exceeded
> 2017-09-21 03:00:02,649 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message
> format version), 3 (n.leader), 0x40000e237 (n.zxid), 0xc (n.round), LEADING
> (n.state), 3 (n.sid), 0x5 (n.peerEpoch) LEADING (my state)
> 2017-09-21 03:00:03,659 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message
> format version), 3 (n.leader), 0x50006fed5 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 3 (n.sid), 0x5 (n.peerEpoch) LEADING (my state)
> 2017-09-21 03:00:03,698 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39060
> :LearnerHandler@330] - Follower sid: 3 : info :
> org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@3753377a
> 2017-09-21 03:00:03,699 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39060
> :LearnerHandler@385] - Synchronizing with Follower sid: 3
> maxCommittedLog=0x600000086 minCommittedLog=0x50006fe74
> peerLastZxid=0x50006fed5
> 2017-09-21 03:00:03,699 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39060
> :LearnerHandler@462] - Sending TRUNC
> 2017-09-21 03:00:03,709 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39060
> :LearnerHandler@522] - Received NEWLEADER-ACK message from 3
> 2017-09-21 03:00:03,978 [myid:2] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x25ea2604ff80006, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:07,354 [myid:2] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x25ea2604ff80025, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:08,928 [myid:2] - INFO  [Snapshot
> Thread:FileTxnSnapLog@240]
> - Snapshotting: 0x60000026c to /run/zookeeper/version-2/snapshot.60000026c
> 2017-09-21 03:00:08,929 [myid:2] - INFO  [SyncThread:2:FileTxnLog@199] -
> Creating new log file: log.60000026e
> 2017-09-21 03:00:22,000 [myid:2] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x35ea260c1fb002f, timeout of 10000ms exceeded
> 2017-09-21 03:00:23,900 [myid:2] - INFO  [Snapshot
> Thread:FileTxnSnapLog@240]
> - Snapshotting: 0x600000489 to /run/zookeeper/version-2/snapshot.600000489
> 2017-09-21 03:00:23,914 [myid:2] - INFO  [SyncThread:2:FileTxnLog@199] -
> Creating new log file: log.60000048b
>
>
> *From node3:*
>
> 2017-09-21 02:52:22,913 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x35e68954661a822, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 02:55:23,499 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x35e68954661a836, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 02:57:17,081 [myid:3] - INFO
> [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
> Removing file: Sep 21, 2017 1:34:26 AM
> /run/zookeeper/version-2/log.50006f20a
> Removing file: Sep 21, 2017 1:04:15 AM
> /run/zookeeper/version-2/log.50006effd
> Removing file: Sep 21, 2017 12:46:00 AM
> /run/zookeeper/version-2/log.50006ec6c
> Removing file: Sep 21, 2017 1:34:26 AM
> /run/zookeeper/version-2/snapshot.50006f55b
> Removing file: Sep 21, 2017 1:04:15 AM
> /run/zookeeper/version-2/snapshot.50006f208
> Removing file: Sep 21, 2017 12:46:00 AM
> /run/zookeeper/version-2/snapshot.50006effb
> 2017-09-21 02:57:17,082 [myid:3] - INFO
> [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
> 2017-09-21 02:59:24,260 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x35e68954661a852, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:02,644 [myid:3] - ERROR [LearnerHandler-/127.0.0.1:40958
> :LearnerHandler@633] - Unexpected exception causing shutdown while sock
> still open
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(
> QuorumPacket.java:83)
> at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> at
> org.apache.zookeeper.server.quorum.LearnerHandler.run(Learne
> rHandler.java:546)
> 2017-09-21 03:00:02,645 [myid:3] - WARN  [LearnerHandler-/127.0.0.1:40958
> :LearnerHandler@646] - ******* GOODBYE /127.0.0.1:40958 ********
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x15e901612194a6f, timeout of 5000ms exceeded
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x25e8d4776a2557c, timeout of 5000ms exceeded
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x35e68954661a833, timeout of 5000ms exceeded
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x25e5304d29d003c, timeout of 20000ms exceeded
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x15e5d10a4b50020, timeout of 20000ms exceeded
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x25e5304d29d003b, timeout of 20000ms exceeded
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x15e5d10a4b50046, timeout of 20000ms exceeded
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x15e5d10a4b50022, timeout of 20000ms exceeded
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x35e5d10a4de0029, timeout of 20000ms exceeded
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x15e5d10a4b50021, timeout of 20000ms exceeded
> 2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServe
> r@347]
> - Expiring session 0x35e5d10a4de002a, timeout of 20000ms exceeded
> 2017-09-21 03:00:02,648 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
> format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 2 (n.sid), 0x5 (n.peerEpoch) LEADING (my state)
> 2017-09-21 03:00:02,648 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
> format version), 1 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LEADING (my state)
> 2017-09-21 03:00:02,648 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
> format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LEADING (my state)
> 2017-09-21 03:00:02,649 [myid:3] - ERROR [LearnerHandler-/127.0.0.1:58432
> :LearnerHandler@633] - Unexpected exception causing shutdown while sock
> still open
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(
> QuorumPacket.java:83)
> at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
> at
> org.apache.zookeeper.server.quorum.LearnerHandler.run(Learne
> rHandler.java:546)
> 2017-09-21 03:00:02,649 [myid:3] - WARN  [LearnerHandler-/127.0.0.1:58432
> :LearnerHandler@646] - ******* GOODBYE /127.0.0.1:58432 ********
> 2017-09-21 03:00:02,651 [myid:3] - INFO  [ProcessThread(sid:3
> cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when
> processing sessionid:0x15e5304d29b0046 type:ping cxid:0xfffffffffffffffe
> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:null
> Error:KeeperErrorCode = Session moved
> 2017-09-21 03:00:02,651 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15e5d10a4b50020, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:02,651 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x25e5304d29d003b, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:02,651 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x35e68954661a833, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:02,652 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x25e5304d29d003c, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:02,652 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15e5304d29b0046, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:02,652 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x35e5d10a4de0029, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:02,653 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15e5d10a4b50022, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:02,653 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x35e5d10a4de002a, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:02,653 [myid:3] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] -
> Invalid session 0x35e68954661a833 for client /127.0.0.1:46968, probably
> expired
> 2017-09-21 03:00:02,653 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15e5d10a4b50021, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:02,654 [myid:3] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] -
> Invalid session 0x25e5304d29d003c for client /127.0.0.1:46982, probably
> expired
> 2017-09-21 03:00:02,654 [myid:3] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] -
> Invalid session 0x15e5d10a4b50046 for client /127.0.0.1:46984, probably
> expired
> 2017-09-21 03:00:02,660 [myid:3] - INFO
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] -
> Invalid session 0x25e5304d29d003b for client /127.0.0.1:46994, probably
> expired
> 2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :Leader@493] - Shutting down
> 2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :Leader@499] - Shutdown called
> java.lang.Exception: shutdown Leader! reason: Not sufficient followers
> synced, only synced with sids: [ 3 ]
> at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:499)
> at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:474)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:799)
> 2017-09-21 03:00:03,648 [myid:3] - INFO
> [Thread-17:Leader$LearnerCnxAcceptor@322] - exception while shutting down
> acceptor: java.net.SocketException: Socket closed
> 2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :ZooKeeperServer@441] - shutting down
> 2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :SessionTrackerImpl@225] - Shutting down
> 2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :PrepRequestProcessor@761] - Shutting down
> 2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :ProposalRequestProcessor@88] - Shutting down
> 2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :CommitProcessor@181] - Shutting down
> 2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :Leader$ToBeAppliedRequestProcessor@658] - Shutting down
> 2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :FinalRequestProcessor@415] - shutdown of request processor complete
> 2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :SyncRequestProcessor@209] - Shutting down
> 2017-09-21 03:00:03,649 [myid:3] - INFO  [ProcessThread(sid:3
> cport:-1)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
> 2017-09-21 03:00:03,649 [myid:3] - INFO
> [CommitProcessor:3:CommitProcessor@150] - CommitProcessor exited loop!
> 2017-09-21 03:00:03,649 [myid:3] - INFO
> [SyncThread:3:SyncRequestProcessor@187] - SyncRequestProcessor exited!
> 2017-09-21 03:00:03,652 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :QuorumPeer@714] - LOOKING
> 2017-09-21 03:00:03,653 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snaps
> hot.50006fcf2
> 2017-09-21 03:00:03,659 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :FastLeaderElection@815] - New election. My id =  3, proposed
> zxid=0x50006fed5
> 2017-09-21 03:00:03,659 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
> format version), 3 (n.leader), 0x50006fed5 (n.zxid), 0xd (n.round), LOOKING
> (n.state), 3 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
> 2017-09-21 03:00:03,660 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
> format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LEADING
> (n.state), 2 (n.sid), 0x6 (n.peerEpoch) LOOKING (my state)
> 2017-09-21 03:00:03,697 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
> format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round),
> FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEpoch) LOOKING (my state)
> 2017-09-21 03:00:03,697 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :QuorumPeer@784] - FOLLOWING
> 2017-09-21 03:00:03,697 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :ZooKeeperServer@162] - Created server with tickTime 2000
> minSessionTimeout
> 4000 maxSessionTimeout 40000 datadir /run/zookeeper/version-2 snapdir
> /run/zookeeper/version-2
> 2017-09-21 03:00:03,697 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 44
> 2017-09-21 03:00:03,700 [myid:3] - WARN  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :Learner@338] - Truncating log to get in sync with the leader 0x50006fec3
> 2017-09-21 03:00:03,702 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snaps
> hot.50006fcf2
> 2017-09-21 03:00:03,707 [myid:3] - WARN  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :Learner@374] - Got zxid 0x600000001 expected 0x1
> 2017-09-21 03:00:03,708 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :FileTxnSnapLog@240] - Snapshotting: 0x600000086 to
> /run/zookeeper/version-2/snapshot.600000086
> 2017-09-21 03:00:03,773 [myid:3] - INFO  [SyncThread:3:FileTxnLog@199] -
> Creating new log file: log.600000087
> 2017-09-21 03:00:03,776 [myid:3] - WARN  [QuorumPeer[myid=3]/127.0.0.1:
> 5002
> :Follower@118] - Got zxid 0x600000088 expected 0x1
> 2017-09-21 03:00:03,895 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x35ea260c1fb0002, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:03,978 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] -
> Exception causing close of session 0x35ea260c1fb0003 due to
> java.io.IOException: Connection reset by peer
> 2017-09-21 03:00:04,000 [myid:3] - INFO
> [SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited loop!
> 2017-09-21 03:00:05,806 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x35ea260c1fb0011, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
> 2017-09-21 03:00:09,600 [myid:3] - INFO  [Snapshot
> Thread:FileTxnSnapLog@240]
> - Snapshotting: 0x60000029c to /run/zookeeper/version-2/snapshot.60000029c
> 2017-09-21 03:00:09,606 [myid:3] - INFO  [SyncThread:3:FileTxnLog@199] -
> Creating new log file: log.60000029e
> 2017-09-21 03:00:10,399 [myid:3] - WARN
> [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x35ea260c1fb002f, likely client has closed socket
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
> at java.lang.Thread.run(Thread.java:745)
>
> Thanks,
> Anand.
>
> On Thu, Oct 5, 2017 at 9:50 AM, Patrick Hunt <ph...@apache.org> wrote:
>
> > Unfortunately I don't see any attached logs, which makes it difficult to
> > provide you with insight. "Not sufficient followers synced" indicates
> that
> > you're losing followers, likely they are falling behind - what is your
> > metric tracking telling your wrt load on the compute and load on the
> > disk/memory/network/etc... also metrics at the ZK level (e.g. are zk
> > latencies increasing?) Check the logs to see if you're seeing "fsync"
> > slowness issues (it's a warning in the server logs). This is a pretty
> > common issue. GC might also be an issue, although that's more rare these
> > days (hard to say w/o knowing your use case, etc...) Again, look to your
> > metrics collection for insight where to start.
> >
> > Patrick
> >
> > On Wed, Oct 4, 2017 at 11:17 AM, Anand Parthasarathy <
> > anpartha@avinetworks.com> wrote:
> >
> > > Hi,
> > >
> > > We have an issue with a 3-node zookeeper ensemble where the quorum goes
> > > down due to no apparent reason every once in a while. Here is what I
> see
> > in
> > > the ZK leader:
> > >
> > > 2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1
> :
> > > 5002:Leader@493] - Shutting down
> > > 2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1
> :
> > > 5002:Leader@499] - Shutdown called
> > > java.lang.Exception: shutdown Leader! reason: Not sufficient followers
> > > synced, only synced with sids: [ 3 ]
> > >     at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.
> > java:499)
> > >     at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:474)
> > >     at org.apache.zookeeper.server.quorum.QuorumPeer.run(
> > > QuorumPeer.java:799)
> > >
> > > I have attached the logs from the 3 nodes around this time. Could you
> > pls.
> > > help understand what the issue could be here. The only thing I see a
> > little
> > > bit ahead of this timestamp is that all of them did a PurgeTask pretty
> > much
> > > at the same time.
> > >
> > > Thanks,
> > > Anand.
> > >
> >
>

Re: Zookeeper quorum goes down for no apparent reason in 3.4.5

Posted by Anand Parthasarathy <an...@avinetworks.com>.
I will attach the logs inline this time. I looked at the metrics and can't
find any issue wrt network connectivity or cpu/disk/memory issues. We also
moved the data directory to /run memfs to avoid fsync latencies as we don't
need persistency. Could you pls. take a look at the logs and see if you can
spot something that gives an insight to the underlying issue.

Thanks,
Anand.

*From node1:*

2017-09-21 02:57:13,164 [myid:1] - INFO
[PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
Removing file: Sep 21, 2017 1:17:24 AM
/run/zookeeper/version-2/log.50006f026
Removing file: Sep 21, 2017 12:47:17 AM
/run/zookeeper/version-2/log.50006ecf6
Removing file: Sep 21, 2017 1:17:24 AM
/run/zookeeper/version-2/snapshot.50006f375
Removing file: Sep 21, 2017 12:47:17 AM
/run/zookeeper/version-2/snapshot.50006f024
2017-09-21 02:57:13,164 [myid:1] - INFO
[PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
2017-09-21 02:59:32,298 [myid:1] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
client: 0x15e5304d29b0046
2017-09-21 02:59:34,271 [myid:1] - WARN  [QuorumPeer[myid=1]/127.0.0.1:5002
:Follower@89] - Exception when following the leader
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:153)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
2017-09-21 02:59:34,271 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:790)
2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:FollowerZooKeeperServer@139] - Shutting down
2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:ZooKeeperServer@441] - shutting down
2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:FollowerRequestProcessor@105] - Shutting down
2017-09-21 02:59:34,272 [myid:1] - INFO
[FollowerRequestProcessor:1:FollowerRequestProcessor@95] -
FollowerRequestProcessor exited loop!
2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:CommitProcessor@181] - Shutting down
2017-09-21 02:59:34,272 [myid:1] - INFO
[CommitProcessor:1:CommitProcessor@150] - CommitProcessor exited loop!
2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:FinalRequestProcessor@415] - shutdown of request processor complete
2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:SyncRequestProcessor@209] - Shutting down
2017-09-21 02:59:34,272 [myid:1] - INFO
[SyncThread:1:SyncRequestProcessor@187] - SyncRequestProcessor exited!
2017-09-21 02:59:34,272 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:QuorumPeer@714] - LOOKING
2017-09-21 02:59:34,273 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snapshot.50006fba3
2017-09-21 02:59:34,275 [myid:1] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] -
Exception causing close of session 0x0 due to java.io.IOException:
ZooKeeperServer not running
2017-09-21 02:59:34,275 [myid:1] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] -
Exception causing close of session 0x0 due to java.io.IOException:
ZooKeeperServer not running
2017-09-21 02:59:34,277 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:FastLeaderElection@815] - New election. My id =  1, proposed
zxid=0x50006fec3
2017-09-21 02:59:34,277 [myid:1] - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
format version), 1 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
(n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
2017-09-21 02:59:34,278 [myid:1] - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
(n.state), 2 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
2017-09-21 02:59:34,278 [myid:1] - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
(n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
2017-09-21 02:59:34,478 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:QuorumPeer@784] - FOLLOWING
2017-09-21 02:59:34,478 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:ZooKeeperServer@162] - Created server with tickTime 2000 minSessionTimeout
4000 maxSessionTimeout 40000 datadir /run/zookeeper/version-2 snapdir
/run/zookeeper/version-2
2017-09-21 02:59:34,478 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 205
2017-09-21 02:59:34,481 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:Learner@323] - Getting a diff from the leader 0x50006fec3
2017-09-21 02:59:34,520 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:FileTxnSnapLog@240] - Snapshotting: 0x50006fec3 to
/run/zookeeper/version-2/snapshot.50006fec3
2017-09-21 02:59:34,876 [myid:1] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
client: 0x15e5d10a4b50020
2017-09-21 02:59:34,944 [myid:1] - WARN  [QuorumPeer[myid=1]/127.0.0.1:5002
:Follower@118] - Got zxid 0x600000001 expected 0x1
2017-09-21 02:59:35,495 [myid:1] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
client: 0x15e901612194a6f
2017-09-21 03:00:02,649 [myid:1] - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
format version), 3 (n.leader), 0x40000e237 (n.zxid), 0xc (n.round), LEADING
(n.state), 3 (n.sid), 0x5 (n.peerEpoch) FOLLOWING (my state)
2017-09-21 03:00:02,650 [myid:1] - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
format version), 3 (n.leader), 0x40000e237 (n.zxid), 0xc (n.round), LEADING
(n.state), 3 (n.sid), 0x5 (n.peerEpoch) FOLLOWING (my state)
2017-09-21 03:00:03,651 [myid:1] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
client: 0x15e5304d29b0046
2017-09-21 03:00:03,651 [myid:1] - INFO  [QuorumPeer[myid=1]/127.0.0.1:5002
:ZooKeeperServer@610] - Invalid session 0x15e5304d29b0046 for client /
127.0.0.1:57028, probably expired
2017-09-21 03:00:03,660 [myid:1] - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message
format version), 3 (n.leader), 0x50006fed5 (n.zxid), 0xd (n.round), LOOKING
(n.state), 3 (n.sid), 0x5 (n.peerEpoch) FOLLOWING (my state)
2017-09-21 03:00:04,748 [myid:1] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15ea2604fd1000e, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:11,624 [myid:1] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15ea2604fd10001, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:11,847 [myid:1] - INFO  [SyncThread:1:FileTxnLog@199] -
Creating new log file: log.6000003ac
2017-09-21 03:00:11,847 [myid:1] - INFO  [Snapshot Thread:FileTxnSnapLog@240]
- Snapshotting: 0x6000003ab to /run/zookeeper/version-2/snapshot.6000003ab
2017-09-21 03:00:34,482 [myid:1] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15ea2604fd1006a, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:34,536 [myid:1] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15ea2604fd1006c, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:34,645 [myid:1] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15ea2604fd1006e, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:36,104 [myid:1] - INFO  [Snapshot Thread:FileTxnSnapLog@240]
- Snapshotting: 0x60000072a to /run/zookeeper/version-2/snapshot.60000072a
2017-09-21 03:00:36,138 [myid:1] - INFO  [SyncThread:1:FileTxnLog@199] -
Creating new log file: log.60000072c
2017-09-21 03:00:36,558 [myid:1] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15ea2604fd1007c, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:37,770 [myid:1] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15ea2604fd10089, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)

*From node2:*

2017-09-21 02:56:34,210 [myid:2] - INFO  [Snapshot Thread:FileTxnSnapLog@240]
- Snapshotting: 0x50006fe73 to /run/zookeeper/version-2/snapshot.50006fe73
2017-09-21 02:56:34,212 [myid:2] - INFO  [SyncThread:2:FileTxnLog@199] -
Creating new log file: log.50006fe75
2017-09-21 02:57:15,736 [myid:2] - INFO
[PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
Removing file: Sep 21, 2017 1:39:07 AM
/run/zookeeper/version-2/log.50006f22d
Removing file: Sep 21, 2017 1:05:52 AM
/run/zookeeper/version-2/log.50006f01d
Removing file: Sep 21, 2017 12:47:06 AM
/run/zookeeper/version-2/log.50006ec55
Removing file: Sep 21, 2017 1:39:07 AM
/run/zookeeper/version-2/snapshot.50006f5df
Removing file: Sep 21, 2017 1:05:52 AM
/run/zookeeper/version-2/snapshot.50006f22d
Removing file: Sep 21, 2017 12:47:06 AM
/run/zookeeper/version-2/snapshot.50006f01b
2017-09-21 02:57:15,737 [myid:2] - INFO
[PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
2017-09-21 02:59:31,513 [myid:2] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
client: 0x15e5d10a4b50020
2017-09-21 02:59:31,748 [myid:2] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x25e8d4776a2557c, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 02:59:32,350 [myid:2] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
client: 0x25e8d4776a2557c
2017-09-21 02:59:33,221 [myid:2] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
client: 0x25e5304d29d003c
2017-09-21 02:59:33,393 [myid:2] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating
client: 0x25e5304d29d003b
2017-09-21 02:59:34,272 [myid:2] - WARN  [QuorumPeer[myid=2]/127.0.0.1:5002
:Follower@89] - Exception when following the leader
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:153)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
2017-09-21 02:59:34,272 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:790)
2017-09-21 02:59:34,273 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:FollowerZooKeeperServer@139] - Shutting down
2017-09-21 02:59:34,273 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:ZooKeeperServer@441] - shutting down
2017-09-21 02:59:34,273 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:FollowerRequestProcessor@105] - Shutting down
2017-09-21 02:59:34,273 [myid:2] - INFO
[FollowerRequestProcessor:2:FollowerRequestProcessor@95] -
FollowerRequestProcessor exited loop!
2017-09-21 02:59:34,273 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:CommitProcessor@181] - Shutting down
2017-09-21 02:59:34,273 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:FinalRequestProcessor@415] - shutdown of request processor complete
2017-09-21 02:59:34,274 [myid:2] - INFO
[CommitProcessor:2:CommitProcessor@150] - CommitProcessor exited loop!
2017-09-21 02:59:34,274 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:SyncRequestProcessor@209] - Shutting down
2017-09-21 02:59:34,274 [myid:2] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@841] -
Refusing session request for client /127.0.0.1:59500 as it has seen zxid
0x50006febd our last zxid is 0x0 client must try another server
2017-09-21 02:59:34,274 [myid:2] - INFO
[SyncThread:2:SyncRequestProcessor@187] - SyncRequestProcessor exited!
2017-09-21 02:59:34,274 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:QuorumPeer@714] - LOOKING
2017-09-21 02:59:34,274 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snapshot.50006fe73
2017-09-21 02:59:34,276 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:FastLeaderElection@815] - New election. My id =  2, proposed
zxid=0x50006fec3
2017-09-21 02:59:34,277 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message
format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
(n.state), 2 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
2017-09-21 02:59:34,277 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message
format version), 1 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
(n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
2017-09-21 02:59:34,277 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message
format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
(n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
2017-09-21 02:59:34,404 [myid:2] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] -
Exception causing close of session 0x0 due to java.io.IOException:
ZooKeeperServer not running
2017-09-21 02:59:34,478 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:QuorumPeer@796] - LEADING
2017-09-21 02:59:34,478 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:ZooKeeperServer@162] - Created server with tickTime 2000 minSessionTimeout
4000 maxSessionTimeout 40000 datadir /run/zookeeper/version-2 snapdir
/run/zookeeper/version-2
2017-09-21 02:59:34,478 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:Leader@358] - LEADING - LEADER ELECTION TOOK - 204
2017-09-21 02:59:34,479 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39016
:LearnerHandler@330] - Follower sid: 1 : info :
org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@988c127
2017-09-21 02:59:34,479 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39016
:LearnerHandler@385] - Synchronizing with Follower sid: 1
maxCommittedLog=0x50006fec3 minCommittedLog=0x50006fe74
peerLastZxid=0x50006fec3
2017-09-21 02:59:34,480 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39016
:LearnerHandler@462] - Sending DIFF
2017-09-21 02:59:34,520 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39016
:LearnerHandler@522] - Received NEWLEADER-ACK message from 1
2017-09-21 02:59:34,520 [myid:2] - INFO  [QuorumPeer[myid=2]/127.0.0.1:5002
:Leader@943] - Have quorum of supporters, sids: [ 1,2 ]; starting up and
setting last processed zxid: 0x600000000
2017-09-21 02:59:40,000 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x35e68954661a833, timeout of 5000ms exceeded
2017-09-21 02:59:40,000 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x25e8d4776a2557c, timeout of 5000ms exceeded
2017-09-21 02:59:56,001 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x15e5304d29b0046, timeout of 20000ms exceeded
2017-09-21 02:59:56,001 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x25e5304d29d003b, timeout of 20000ms exceeded
2017-09-21 02:59:56,001 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x25e5304d29d003c, timeout of 20000ms exceeded
2017-09-21 02:59:56,001 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x15e5d10a4b50046, timeout of 20000ms exceeded
2017-09-21 03:00:02,649 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message
format version), 3 (n.leader), 0x40000e237 (n.zxid), 0xc (n.round), LEADING
(n.state), 3 (n.sid), 0x5 (n.peerEpoch) LEADING (my state)
2017-09-21 03:00:03,659 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message
format version), 3 (n.leader), 0x50006fed5 (n.zxid), 0xd (n.round), LOOKING
(n.state), 3 (n.sid), 0x5 (n.peerEpoch) LEADING (my state)
2017-09-21 03:00:03,698 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39060
:LearnerHandler@330] - Follower sid: 3 : info :
org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@3753377a
2017-09-21 03:00:03,699 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39060
:LearnerHandler@385] - Synchronizing with Follower sid: 3
maxCommittedLog=0x600000086 minCommittedLog=0x50006fe74
peerLastZxid=0x50006fed5
2017-09-21 03:00:03,699 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39060
:LearnerHandler@462] - Sending TRUNC
2017-09-21 03:00:03,709 [myid:2] - INFO  [LearnerHandler-/127.0.0.1:39060
:LearnerHandler@522] - Received NEWLEADER-ACK message from 3
2017-09-21 03:00:03,978 [myid:2] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x25ea2604ff80006, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:07,354 [myid:2] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x25ea2604ff80025, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:08,928 [myid:2] - INFO  [Snapshot Thread:FileTxnSnapLog@240]
- Snapshotting: 0x60000026c to /run/zookeeper/version-2/snapshot.60000026c
2017-09-21 03:00:08,929 [myid:2] - INFO  [SyncThread:2:FileTxnLog@199] -
Creating new log file: log.60000026e
2017-09-21 03:00:22,000 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x35ea260c1fb002f, timeout of 10000ms exceeded
2017-09-21 03:00:23,900 [myid:2] - INFO  [Snapshot Thread:FileTxnSnapLog@240]
- Snapshotting: 0x600000489 to /run/zookeeper/version-2/snapshot.600000489
2017-09-21 03:00:23,914 [myid:2] - INFO  [SyncThread:2:FileTxnLog@199] -
Creating new log file: log.60000048b


*From node3:*

2017-09-21 02:52:22,913 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x35e68954661a822, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 02:55:23,499 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x35e68954661a836, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 02:57:17,081 [myid:3] - INFO
[PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
Removing file: Sep 21, 2017 1:34:26 AM
/run/zookeeper/version-2/log.50006f20a
Removing file: Sep 21, 2017 1:04:15 AM
/run/zookeeper/version-2/log.50006effd
Removing file: Sep 21, 2017 12:46:00 AM
/run/zookeeper/version-2/log.50006ec6c
Removing file: Sep 21, 2017 1:34:26 AM
/run/zookeeper/version-2/snapshot.50006f55b
Removing file: Sep 21, 2017 1:04:15 AM
/run/zookeeper/version-2/snapshot.50006f208
Removing file: Sep 21, 2017 12:46:00 AM
/run/zookeeper/version-2/snapshot.50006effb
2017-09-21 02:57:17,082 [myid:3] - INFO
[PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
2017-09-21 02:59:24,260 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x35e68954661a852, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:02,644 [myid:3] - ERROR [LearnerHandler-/127.0.0.1:40958
:LearnerHandler@633] - Unexpected exception causing shutdown while sock
still open
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
at
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:546)
2017-09-21 03:00:02,645 [myid:3] - WARN  [LearnerHandler-/127.0.0.1:40958
:LearnerHandler@646] - ******* GOODBYE /127.0.0.1:40958 ********
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x15e901612194a6f, timeout of 5000ms exceeded
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x25e8d4776a2557c, timeout of 5000ms exceeded
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x35e68954661a833, timeout of 5000ms exceeded
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x25e5304d29d003c, timeout of 20000ms exceeded
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x15e5d10a4b50020, timeout of 20000ms exceeded
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x25e5304d29d003b, timeout of 20000ms exceeded
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x15e5d10a4b50046, timeout of 20000ms exceeded
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x15e5d10a4b50022, timeout of 20000ms exceeded
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x35e5d10a4de0029, timeout of 20000ms exceeded
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x15e5d10a4b50021, timeout of 20000ms exceeded
2017-09-21 03:00:02,645 [myid:3] - INFO  [SessionTracker:ZooKeeperServer@347]
- Expiring session 0x35e5d10a4de002a, timeout of 20000ms exceeded
2017-09-21 03:00:02,648 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
(n.state), 2 (n.sid), 0x5 (n.peerEpoch) LEADING (my state)
2017-09-21 03:00:02,648 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
format version), 1 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
(n.state), 1 (n.sid), 0x5 (n.peerEpoch) LEADING (my state)
2017-09-21 03:00:02,648 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING
(n.state), 1 (n.sid), 0x5 (n.peerEpoch) LEADING (my state)
2017-09-21 03:00:02,649 [myid:3] - ERROR [LearnerHandler-/127.0.0.1:58432
:LearnerHandler@633] - Unexpected exception causing shutdown while sock
still open
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
at
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:546)
2017-09-21 03:00:02,649 [myid:3] - WARN  [LearnerHandler-/127.0.0.1:58432
:LearnerHandler@646] - ******* GOODBYE /127.0.0.1:58432 ********
2017-09-21 03:00:02,651 [myid:3] - INFO  [ProcessThread(sid:3
cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when
processing sessionid:0x15e5304d29b0046 type:ping cxid:0xfffffffffffffffe
zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:null
Error:KeeperErrorCode = Session moved
2017-09-21 03:00:02,651 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15e5d10a4b50020, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:02,651 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x25e5304d29d003b, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:02,651 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x35e68954661a833, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:02,652 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x25e5304d29d003c, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:02,652 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15e5304d29b0046, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:02,652 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x35e5d10a4de0029, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:02,653 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15e5d10a4b50022, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:02,653 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x35e5d10a4de002a, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:02,653 [myid:3] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] -
Invalid session 0x35e68954661a833 for client /127.0.0.1:46968, probably
expired
2017-09-21 03:00:02,653 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x15e5d10a4b50021, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:02,654 [myid:3] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] -
Invalid session 0x25e5304d29d003c for client /127.0.0.1:46982, probably
expired
2017-09-21 03:00:02,654 [myid:3] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] -
Invalid session 0x15e5d10a4b50046 for client /127.0.0.1:46984, probably
expired
2017-09-21 03:00:02,660 [myid:3] - INFO
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] -
Invalid session 0x25e5304d29d003b for client /127.0.0.1:46994, probably
expired
2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:Leader@493] - Shutting down
2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:Leader@499] - Shutdown called
java.lang.Exception: shutdown Leader! reason: Not sufficient followers
synced, only synced with sids: [ 3 ]
at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:499)
at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:474)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:799)
2017-09-21 03:00:03,648 [myid:3] - INFO
[Thread-17:Leader$LearnerCnxAcceptor@322] - exception while shutting down
acceptor: java.net.SocketException: Socket closed
2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:ZooKeeperServer@441] - shutting down
2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:SessionTrackerImpl@225] - Shutting down
2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:PrepRequestProcessor@761] - Shutting down
2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:ProposalRequestProcessor@88] - Shutting down
2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:CommitProcessor@181] - Shutting down
2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:Leader$ToBeAppliedRequestProcessor@658] - Shutting down
2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:FinalRequestProcessor@415] - shutdown of request processor complete
2017-09-21 03:00:03,649 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:SyncRequestProcessor@209] - Shutting down
2017-09-21 03:00:03,649 [myid:3] - INFO  [ProcessThread(sid:3
cport:-1)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
2017-09-21 03:00:03,649 [myid:3] - INFO
[CommitProcessor:3:CommitProcessor@150] - CommitProcessor exited loop!
2017-09-21 03:00:03,649 [myid:3] - INFO
[SyncThread:3:SyncRequestProcessor@187] - SyncRequestProcessor exited!
2017-09-21 03:00:03,652 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:QuorumPeer@714] - LOOKING
2017-09-21 03:00:03,653 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snapshot.50006fcf2
2017-09-21 03:00:03,659 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:FastLeaderElection@815] - New election. My id =  3, proposed
zxid=0x50006fed5
2017-09-21 03:00:03,659 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
format version), 3 (n.leader), 0x50006fed5 (n.zxid), 0xd (n.round), LOOKING
(n.state), 3 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state)
2017-09-21 03:00:03,660 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LEADING
(n.state), 2 (n.sid), 0x6 (n.peerEpoch) LOOKING (my state)
2017-09-21 03:00:03,697 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message
format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round),
FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEpoch) LOOKING (my state)
2017-09-21 03:00:03,697 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:QuorumPeer@784] - FOLLOWING
2017-09-21 03:00:03,697 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:ZooKeeperServer@162] - Created server with tickTime 2000 minSessionTimeout
4000 maxSessionTimeout 40000 datadir /run/zookeeper/version-2 snapdir
/run/zookeeper/version-2
2017-09-21 03:00:03,697 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 44
2017-09-21 03:00:03,700 [myid:3] - WARN  [QuorumPeer[myid=3]/127.0.0.1:5002
:Learner@338] - Truncating log to get in sync with the leader 0x50006fec3
2017-09-21 03:00:03,702 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snapshot.50006fcf2
2017-09-21 03:00:03,707 [myid:3] - WARN  [QuorumPeer[myid=3]/127.0.0.1:5002
:Learner@374] - Got zxid 0x600000001 expected 0x1
2017-09-21 03:00:03,708 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:5002
:FileTxnSnapLog@240] - Snapshotting: 0x600000086 to
/run/zookeeper/version-2/snapshot.600000086
2017-09-21 03:00:03,773 [myid:3] - INFO  [SyncThread:3:FileTxnLog@199] -
Creating new log file: log.600000087
2017-09-21 03:00:03,776 [myid:3] - WARN  [QuorumPeer[myid=3]/127.0.0.1:5002
:Follower@118] - Got zxid 0x600000088 expected 0x1
2017-09-21 03:00:03,895 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x35ea260c1fb0002, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:03,978 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] -
Exception causing close of session 0x35ea260c1fb0003 due to
java.io.IOException: Connection reset by peer
2017-09-21 03:00:04,000 [myid:3] - INFO
[SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited loop!
2017-09-21 03:00:05,806 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x35ea260c1fb0011, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-09-21 03:00:09,600 [myid:3] - INFO  [Snapshot Thread:FileTxnSnapLog@240]
- Snapshotting: 0x60000029c to /run/zookeeper/version-2/snapshot.60000029c
2017-09-21 03:00:09,606 [myid:3] - INFO  [SyncThread:3:FileTxnLog@199] -
Creating new log file: log.60000029e
2017-09-21 03:00:10,399 [myid:3] - WARN
[NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught
end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x35ea260c1fb002f, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)

Thanks,
Anand.

On Thu, Oct 5, 2017 at 9:50 AM, Patrick Hunt <ph...@apache.org> wrote:

> Unfortunately I don't see any attached logs, which makes it difficult to
> provide you with insight. "Not sufficient followers synced" indicates that
> you're losing followers, likely they are falling behind - what is your
> metric tracking telling your wrt load on the compute and load on the
> disk/memory/network/etc... also metrics at the ZK level (e.g. are zk
> latencies increasing?) Check the logs to see if you're seeing "fsync"
> slowness issues (it's a warning in the server logs). This is a pretty
> common issue. GC might also be an issue, although that's more rare these
> days (hard to say w/o knowing your use case, etc...) Again, look to your
> metrics collection for insight where to start.
>
> Patrick
>
> On Wed, Oct 4, 2017 at 11:17 AM, Anand Parthasarathy <
> anpartha@avinetworks.com> wrote:
>
> > Hi,
> >
> > We have an issue with a 3-node zookeeper ensemble where the quorum goes
> > down due to no apparent reason every once in a while. Here is what I see
> in
> > the ZK leader:
> >
> > 2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> > 5002:Leader@493] - Shutting down
> > 2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> > 5002:Leader@499] - Shutdown called
> > java.lang.Exception: shutdown Leader! reason: Not sufficient followers
> > synced, only synced with sids: [ 3 ]
> >     at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.
> java:499)
> >     at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:474)
> >     at org.apache.zookeeper.server.quorum.QuorumPeer.run(
> > QuorumPeer.java:799)
> >
> > I have attached the logs from the 3 nodes around this time. Could you
> pls.
> > help understand what the issue could be here. The only thing I see a
> little
> > bit ahead of this timestamp is that all of them did a PurgeTask pretty
> much
> > at the same time.
> >
> > Thanks,
> > Anand.
> >
>

Re: Zookeeper quorum goes down for no apparent reason in 3.4.5

Posted by Patrick Hunt <ph...@apache.org>.
Unfortunately I don't see any attached logs, which makes it difficult to
provide you with insight. "Not sufficient followers synced" indicates that
you're losing followers, likely they are falling behind - what is your
metric tracking telling your wrt load on the compute and load on the
disk/memory/network/etc... also metrics at the ZK level (e.g. are zk
latencies increasing?) Check the logs to see if you're seeing "fsync"
slowness issues (it's a warning in the server logs). This is a pretty
common issue. GC might also be an issue, although that's more rare these
days (hard to say w/o knowing your use case, etc...) Again, look to your
metrics collection for insight where to start.

Patrick

On Wed, Oct 4, 2017 at 11:17 AM, Anand Parthasarathy <
anpartha@avinetworks.com> wrote:

> Hi,
>
> We have an issue with a 3-node zookeeper ensemble where the quorum goes
> down due to no apparent reason every once in a while. Here is what I see in
> the ZK leader:
>
> 2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002:Leader@493] - Shutting down
> 2017-09-21 03:00:03,648 [myid:3] - INFO  [QuorumPeer[myid=3]/127.0.0.1:
> 5002:Leader@499] - Shutdown called
> java.lang.Exception: shutdown Leader! reason: Not sufficient followers
> synced, only synced with sids: [ 3 ]
>     at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:499)
>     at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:474)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(
> QuorumPeer.java:799)
>
> I have attached the logs from the 3 nodes around this time. Could you pls.
> help understand what the issue could be here. The only thing I see a little
> bit ahead of this timestamp is that all of them did a PurgeTask pretty much
> at the same time.
>
> Thanks,
> Anand.
>