Posted to user@zookeeper.apache.org by Aaron Zimmerman <az...@sproutsocial.com> on 2014/07/04 15:29:34 UTC

entire cluster dies with EOFException

Hi all,

We have a 5 node zookeeper cluster that has been operating normally for
several months.  Starting a few days ago, the entire cluster crashes a few
times per day, all nodes at the exact same time.  We can't track down the
exact issue, but deleting the snapshots and logs and restarting resolves.

We are running exhibitor to monitor the cluster.

It appears that something bad gets into the logs, causing an EOFException
and this cascades through the entire cluster:

2014-07-04 12:55:26,328 [myid:1] - WARN
 [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
following the leader
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
        at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
        at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
2014-07-04 12:55:26,328 [myid:1] - INFO
 [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
        at
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
        at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)


Then the server dies, exhibitor tries to restart each node, and they all
get stuck trying to replay the bad transaction, logging things like:


2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
2014-07-04 12:58:52,896 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@575] - Created new input stream
/var/lib/zookeeper/version-2/log.300000021
2014-07-04 12:58:52,915 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@578] - Created new input archive
/var/lib/zookeeper/version-2/log.300000021
2014-07-04 12:59:25,870 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:
Failed to read /var/lib/zookeeper/version-2/log.300000021
2014-07-04 12:59:25,871 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@575] - Created new input stream
/var/lib/zookeeper/version-2/log.300011fc2
2014-07-04 12:59:25,872 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@578] - Created new input archive
/var/lib/zookeeper/version-2/log.300011fc2
2014-07-04 12:59:48,722 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:
Failed to read /var/lib/zookeeper/version-2/log.300011fc2

And the cluster is dead.  The only way we have found to recover is to
delete all of the data and restart.

Anyone seen this before?  Any ideas how I can track down what is causing
the EOFException, or insulate zookeeper from completely crashing?

Thanks,

Aaron Zimmerman
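
One way to track down the point of failure is to dump the suspect transaction
log with the LogFormatter class that ships with ZooKeeper and look at the last
record that still parses. A minimal sketch, assuming a 3.4.x install where the
jar and its lib/ directory sit in the current directory (paths and the version
number are illustrative only):

    $ java -cp 'zookeeper-3.4.6.jar:lib/*' \
        org.apache.zookeeper.server.LogFormatter \
        /var/lib/zookeeper/version-2/log.300000021 | tail -5

The last lines printed are the final transactions that could be read before the
truncated or corrupt region; SnapshotFormatter does the same for snapshot files.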

Re: entire cluster dies with EOFException

Posted by Camille Fournier <ca...@apache.org>.
Do you have copies of the logs and snapshots from the time this happened? I
think we'll need that to debug this. If you do can you open a ticket and
attach those and tag me?

C
On Jul 4, 2014 9:30 AM, "Aaron Zimmerman" <az...@sproutsocial.com>
wrote:

> Hi all,
>
> We have a 5 node zookeeper cluster that has been operating normally for
> several months.  Starting a few days ago, the entire cluster crashes a few
> times per day, all nodes at the exact same time.  We can't track down the
> exact issue, but deleting the snapshots and logs and restarting resolves.
>
> We are running exhibitor to monitor the cluster.
>
> It appears that something bad gets into the logs, causing an EOFException
> and this cascades through the entire cluster:
>
> 2014-07-04 12:55:26,328 [myid:1] - WARN
>  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
> following the leader
> java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:375)
>         at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at
>
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>         at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>         at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>         at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> 2014-07-04 12:55:26,328 [myid:1] - INFO
>  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
>         at
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>         at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>
>
> Then the server dies, exhibitor tries to restart each node, and they all
> get stuck trying to replay the bad transaction, logging things like:
>
>
> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:
> Failed to read /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> /var/lib/zookeeper/version-2/log.300011fc2
> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> /var/lib/zookeeper/version-2/log.300011fc2
> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:
> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
>
> And the cluster is dead.  The only way we have found to recover is to
> delete all of the data and restart.
>
> Anyone seen this before?  Any ideas how I can track down what is causing
> the EOFException, or insulate zookeeper from completely crashing?
>
> Thanks,
>
> Aaron Zimmerman
>

Re: entire cluster dies with EOFException

Posted by Benjamin Reed <br...@apache.org>.
any chance you are running out of disk space?


On Sun, Jul 6, 2014 at 6:49 AM, Benjamin Reed <br...@apache.org> wrote:

>
>
>
> On Fri, Jul 4, 2014 at 10:35 PM, Aaron Zimmerman <
> azimmerman@sproutsocial.com> wrote:
>
>> Thanks for getting back to me.
>>
>> Jordan,
>>
>> I don't think we are doing any large nodes or thousands of children.  We
>> are using zookeeper for storm and service discovery, so things are pretty
>> modest.
>>
>> Camille,
>>
>> I've created https://issues.apache.org/jira/browse/ZOOKEEPER-1955, and
>> attached the snapshot causing the EOF exception (I think..?), let me know
>> if you can discover anything from the snapshot.
>>
>> Thanks,
>>
>> Aaron Zimmerman
>>
>>
>>
>> On Fri, Jul 4, 2014 at 4:02 PM, Jordan Zimmerman <
>> jordan@jordanzimmerman.com
>> > wrote:
>>
>> > I’ve seen EOF errors when the 1MB limit has been reached. Check to see
>> if
>> > any ZNodes have thousands of children and/or big payloads.
>> >
>> > -JZ
>> >
>> >
>> > From: Aaron Zimmerman azimmerman@sproutsocial.com
>> > Reply: user@zookeeper.apache.org user@zookeeper.apache.org
>> > Date: July 4, 2014 at 8:30:09 AM
>> > To: user@zookeeper.apache.org user@zookeeper.apache.org
>> > Subject:  entire cluster dies with EOFException
>> >
>> > Hi all,
>> >
>> > We have a 5 node zookeeper cluster that has been operating normally for
>> > several months. Starting a few days ago, the entire cluster crashes a
>> few
>> > times per day, all nodes at the exact same time. We can't track down the
>> > exact issue, but deleting the snapshots and logs and restarting
>> resolves.
>> >
>> > We are running exhibitor to monitor the cluster.
>> >
>> > It appears that something bad gets into the logs, causing an
>> EOFException
>> > and this cascades through the entire cluster:
>> >
>> > 2014-07-04 12:55:26,328 [myid:1] - WARN
>> > [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
>> > following the leader
>> > java.io.EOFException
>> > at java.io.DataInputStream.readInt(DataInputStream.java:375)
>> > at
>> > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>> > at
>> >
>> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>> >
>> > at
>> >
>> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>> > at
>> > org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>> > at
>> >
>> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>> > at
>> > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
>> > 2014-07-04 12:55:26,328 [myid:1] - INFO
>> > [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
>> called
>> > java.lang.Exception: shutdown Follower
>> > at
>> > org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>> > at
>> > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>> >
>> >
>> > Then the server dies, exhibitor tries to restart each node, and they all
>> > get stuck trying to replay the bad transaction, logging things like:
>> >
>> >
>> > 2014-07-04 12:58:52,734 [myid:1] - INFO [main:FileSnap@83] - Reading
>> > snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
>> > 2014-07-04 12:58:52,896 [myid:1] - DEBUG
>> > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>> > /var/lib/zookeeper/version-2/log.300000021
>> > 2014-07-04 12:58:52,915 [myid:1] - DEBUG
>> > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>> > /var/lib/zookeeper/version-2/log.300000021
>> > 2014-07-04 12:59:25,870 [myid:1] - DEBUG
>> > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>> > java.io.EOFException:
>> > Failed to read /var/lib/zookeeper/version-2/log.300000021
>> > 2014-07-04 12:59:25,871 [myid:1] - DEBUG
>> > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>> > /var/lib/zookeeper/version-2/log.300011fc2
>> > 2014-07-04 12:59:25,872 [myid:1] - DEBUG
>> > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>> > /var/lib/zookeeper/version-2/log.300011fc2
>> > 2014-07-04 12:59:48,722 [myid:1] - DEBUG
>> > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>> > java.io.EOFException:
>> > Failed to read /var/lib/zookeeper/version-2/log.300011fc2
>> >
>> > And the cluster is dead. The only way we have found to recover is to
>> > delete all of the data and restart.
>> >
>> > Anyone seen this before? Any ideas how I can track down what is causing
>> > the EOFException, or insulate zookeeper from completely crashing?
>> >
>> > Thanks,
>> >
>> > Aaron Zimmerman
>> >
>> >
>>
>
>

Re: entire cluster dies with EOFException

Posted by Benjamin Reed <br...@apache.org>.
On Fri, Jul 4, 2014 at 10:35 PM, Aaron Zimmerman <
azimmerman@sproutsocial.com> wrote:

> Thanks for getting back to me.
>
> Jordan,
>
> I don't think we are doing any large nodes or thousands of children.  We
> are using zookeeper for storm and service discovery, so things are pretty
> modest.
>
> Camille,
>
> I've created https://issues.apache.org/jira/browse/ZOOKEEPER-1955, and
> attached the snapshot causing the EOF exception (I think..?), let me know
> if you can discover anything from the snapshot.
>
> Thanks,
>
> Aaron Zimmerman
>
>
>
> On Fri, Jul 4, 2014 at 4:02 PM, Jordan Zimmerman <
> jordan@jordanzimmerman.com
> > wrote:
>
> > I’ve seen EOF errors when the 1MB limit has been reached. Check to see if
> > any ZNodes have thousands of children and/or big payloads.
> >
> > -JZ
> >
> >
> > From: Aaron Zimmerman azimmerman@sproutsocial.com
> > Reply: user@zookeeper.apache.org user@zookeeper.apache.org
> > Date: July 4, 2014 at 8:30:09 AM
> > To: user@zookeeper.apache.org user@zookeeper.apache.org
> > Subject:  entire cluster dies with EOFException
> >
> > Hi all,
> >
> > We have a 5 node zookeeper cluster that has been operating normally for
> > several months. Starting a few days ago, the entire cluster crashes a few
> > times per day, all nodes at the exact same time. We can't track down the
> > exact issue, but deleting the snapshots and logs and restarting resolves.
> >
> > We are running exhibitor to monitor the cluster.
> >
> > It appears that something bad gets into the logs, causing an EOFException
> > and this cascades through the entire cluster:
> >
> > 2014-07-04 12:55:26,328 [myid:1] - WARN
> > [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
> > following the leader
> > java.io.EOFException
> > at java.io.DataInputStream.readInt(DataInputStream.java:375)
> > at
> > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> > at
> >
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> >
> > at
> >
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> > at
> > org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> > at
> >
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> > at
> > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> > 2014-07-04 12:55:26,328 [myid:1] - INFO
> > [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
> > java.lang.Exception: shutdown Follower
> > at
> > org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> > at
> > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
> >
> >
> > Then the server dies, exhibitor tries to restart each node, and they all
> > get stuck trying to replay the bad transaction, logging things like:
> >
> >
> > 2014-07-04 12:58:52,734 [myid:1] - INFO [main:FileSnap@83] - Reading
> > snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> > 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> > /var/lib/zookeeper/version-2/log.300000021
> > 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> > /var/lib/zookeeper/version-2/log.300000021
> > 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> > java.io.EOFException:
> > Failed to read /var/lib/zookeeper/version-2/log.300000021
> > 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> > /var/lib/zookeeper/version-2/log.300011fc2
> > 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> > /var/lib/zookeeper/version-2/log.300011fc2
> > 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> > java.io.EOFException:
> > Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> >
> > And the cluster is dead. The only way we have found to recover is to
> > delete all of the data and restart.
> >
> > Anyone seen this before? Any ideas how I can track down what is causing
> > the EOFException, or insulate zookeeper from completely crashing?
> >
> > Thanks,
> >
> > Aaron Zimmerman
> >
> >
>

Re: entire cluster dies with EOFException

Posted by Aaron Zimmerman <az...@sproutsocial.com>.
Thanks for getting back to me.

Jordan,

I don't think we are doing any large nodes or thousands of children.  We
are using zookeeper for storm and service discovery, so things are pretty
modest.

Camille,

I've created https://issues.apache.org/jira/browse/ZOOKEEPER-1955, and
attached the snapshot causing the EOF exception (I think..?), let me know
if you can discover anything from the snapshot.

Thanks,

Aaron Zimmerman



On Fri, Jul 4, 2014 at 4:02 PM, Jordan Zimmerman <jordan@jordanzimmerman.com
> wrote:

> I’ve seen EOF errors when the 1MB limit has been reached. Check to see if
> any ZNodes have thousands of children and/or big payloads.
>
> -JZ
>
>
> From: Aaron Zimmerman azimmerman@sproutsocial.com
> Reply: user@zookeeper.apache.org user@zookeeper.apache.org
> Date: July 4, 2014 at 8:30:09 AM
> To: user@zookeeper.apache.org user@zookeeper.apache.org
> Subject:  entire cluster dies with EOFException
>
> Hi all,
>
> We have a 5 node zookeeper cluster that has been operating normally for
> several months. Starting a few days ago, the entire cluster crashes a few
> times per day, all nodes at the exact same time. We can't track down the
> exact issue, but deleting the snapshots and logs and restarting resolves.
>
> We are running exhibitor to monitor the cluster.
>
> It appears that something bad gets into the logs, causing an EOFException
> and this cascades through the entire cluster:
>
> 2014-07-04 12:55:26,328 [myid:1] - WARN
> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
> following the leader
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>
> at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> at
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> 2014-07-04 12:55:26,328 [myid:1] - INFO
> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
> at
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>
>
> Then the server dies, exhibitor tries to restart each node, and they all
> get stuck trying to replay the bad transaction, logging things like:
>
>
> 2014-07-04 12:58:52,734 [myid:1] - INFO [main:FileSnap@83] - Reading
> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> java.io.EOFException:
> Failed to read /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> /var/lib/zookeeper/version-2/log.300011fc2
> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> /var/lib/zookeeper/version-2/log.300011fc2
> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> java.io.EOFException:
> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
>
> And the cluster is dead. The only way we have found to recover is to
> delete all of the data and restart.
>
> Anyone seen this before? Any ideas how I can track down what is causing
> the EOFException, or insulate zookeeper from completely crashing?
>
> Thanks,
>
> Aaron Zimmerman
>
>

Re: entire cluster dies with EOFException

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
I’ve seen EOF errors when the 1MB limit has been reached. Check to see if any ZNodes have thousands of children and/or big payloads.

-JZ
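
A quick way to check for that is to stat the busiest parent znodes with the
stock zkCli.sh and look at dataLength and numChildren; anything approaching the
1 MB jute.maxbuffer default is suspect. A rough sketch only (the /storm path is
just an example):

    $ echo "stat /storm" | zkCli.sh -server localhost:2181 2>/dev/null \
        | egrep 'dataLength|numChildren'

If a znode really has outgrown the limit, every server and client would need
-Djute.maxbuffer raised consistently; trimming the payload or fanout is usually
the safer fix.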


From: Aaron Zimmerman azimmerman@sproutsocial.com
Reply: user@zookeeper.apache.org user@zookeeper.apache.org
Date: July 4, 2014 at 8:30:09 AM
To: user@zookeeper.apache.org user@zookeeper.apache.org
Subject:  entire cluster dies with EOFException  

Hi all,  

We have a 5 node zookeeper cluster that has been operating normally for  
several months. Starting a few days ago, the entire cluster crashes a few  
times per day, all nodes at the exact same time. We can't track down the  
exact issue, but deleting the snapshots and logs and restarting resolves.  

We are running exhibitor to monitor the cluster.  

It appears that something bad gets into the logs, causing an EOFException  
and this cascades through the entire cluster:  

2014-07-04 12:55:26,328 [myid:1] - WARN  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when  
following the leader  
java.io.EOFException  
at java.io.DataInputStream.readInt(DataInputStream.java:375)  
at  
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)  
at  
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)  
at  
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)  
at  
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)  
at  
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)  
at  
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)  
2014-07-04 12:55:26,328 [myid:1] - INFO  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called  
java.lang.Exception: shutdown Follower  
at  
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)  
at  
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)  


Then the server dies, exhibitor tries to restart each node, and they all  
get stuck trying to replay the bad transaction, logging things like:  


2014-07-04 12:58:52,734 [myid:1] - INFO [main:FileSnap@83] - Reading  
snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0  
2014-07-04 12:58:52,896 [myid:1] - DEBUG  
[main:FileTxnLog$FileTxnIterator@575] - Created new input stream  
/var/lib/zookeeper/version-2/log.300000021  
2014-07-04 12:58:52,915 [myid:1] - DEBUG  
[main:FileTxnLog$FileTxnIterator@578] - Created new input archive  
/var/lib/zookeeper/version-2/log.300000021  
2014-07-04 12:59:25,870 [myid:1] - DEBUG  
[main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:  
Failed to read /var/lib/zookeeper/version-2/log.300000021  
2014-07-04 12:59:25,871 [myid:1] - DEBUG  
[main:FileTxnLog$FileTxnIterator@575] - Created new input stream  
/var/lib/zookeeper/version-2/log.300011fc2  
2014-07-04 12:59:25,872 [myid:1] - DEBUG  
[main:FileTxnLog$FileTxnIterator@578] - Created new input archive  
/var/lib/zookeeper/version-2/log.300011fc2  
2014-07-04 12:59:48,722 [myid:1] - DEBUG  
[main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:  
Failed to read /var/lib/zookeeper/version-2/log.300011fc2  

And the cluster is dead. The only way we have found to recover is to  
delete all of the data and restart.  

Anyone seen this before? Any ideas how I can track down what is causing  
the EOFException, or insulate zookeeper from completely crashing?  

Thanks,  

Aaron Zimmerman  

Re: entire cluster dies with EOFException

Posted by Flavio Junqueira <fp...@yahoo.com.INVALID>.
That's interesting, I came across a similar case last week, but the snapshot size reported was 4Gb, not 4Mb like here (we report the approximate data size in bytes, right?). I thought it was the snapshot size causing problems and recommended increasing the initLimit value, but maybe the snapshot size reported was wrong and the problem is similar to this one. The use case was storm as well...

-Flavio

On 06 Jul 2014, at 12:48, Aaron Zimmerman <az...@sproutsocial.com> wrote:

> Raúl,
> 
> zk_approximate_data_size 4899392
> 
> That is about the size of the snapshots also.
> 
> Benjamin,
> 
> We are not running out of disk space.
> But the log.XXXX files are quite large, is this normal?  In less than 3
> hours, the log file since the last snapshot is 8.2G, and the older log
> files are as large as 12G.
> 
> We are using Storm Trident, this uses zookeeper pretty heavily for tracking
> transactional state, but i'm not sure if that could account for this much
> storage.  Is there an easy way to track which znodes are being updated most
> frequently?
> 
> Thanks,
> 
> Aaron
> 
> 
> 
> 
> 
> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <rg...@itevenworks.net>
> wrote:
> 
>> What's the total size of the data in your ZK cluster? i.e.:
>> 
>> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
>> 
>> And/or the size of the snapshot?
>> 
>> 
>> -rgs
>> 
>> 
>> On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com> wrote:
>> 
>>> Hi all,
>>> 
>>> We have a 5 node zookeeper cluster that has been operating normally for
>>> several months.  Starting a few days ago, the entire cluster crashes a
>> few
>>> times per day, all nodes at the exact same time.  We can't track down the
>>> exact issue, but deleting the snapshots and logs and restarting resolves.
>>> 
>>> We are running exhibitor to monitor the cluster.
>>> 
>>> It appears that something bad gets into the logs, causing an EOFException
>>> and this cascades through the entire cluster:
>>> 
>>> 2014-07-04 12:55:26,328 [myid:1] - WARN
>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
>>> following the leader
>>> java.io.EOFException
>>>        at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>        at
>>> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>>>        at
>>> 
>>> 
>> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>>>        at
>>> 
>> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>>>        at
>>> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>>>        at
>>> 
>> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>>>        at
>>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
>>> 2014-07-04 12:55:26,328 [myid:1] - INFO
>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
>> called
>>> java.lang.Exception: shutdown Follower
>>>        at
>>> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>>>        at
>>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>>> 
>>> 
>>> Then the server dies, exhibitor tries to restart each node, and they all
>>> get stuck trying to replay the bad transaction, logging things like:
>>> 
>>> 
>>> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
>>> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
>>> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>>> /var/lib/zookeeper/version-2/log.300000021
>>> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>>> /var/lib/zookeeper/version-2/log.300000021
>>> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>> java.io.EOFException:
>>> Failed to read /var/lib/zookeeper/version-2/log.300000021
>>> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>>> /var/lib/zookeeper/version-2/log.300011fc2
>>> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>>> /var/lib/zookeeper/version-2/log.300011fc2
>>> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>> java.io.EOFException:
>>> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
>>> 
>>> And the cluster is dead.  The only way we have found to recover is to
>>> delete all of the data and restart.
>>> 
>>> Anyone seen this before?  Any ideas how I can track down what is causing
>>> the EOFException, or insulate zookeeper from completely crashing?
>>> 
>>> Thanks,
>>> 
>>> Aaron Zimmerman
>>> 
>> 
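
On the question quoted above about which znodes are being updated most
frequently: a LogFormatter dump of a transaction log can be post-processed to
count paths. A rough sketch only; it assumes the formatted output quotes the
znode path in each write record, so the grep may need adjusting for the exact
ZooKeeper version:

    $ java -cp 'zookeeper-3.4.6.jar:lib/*' \
        org.apache.zookeeper.server.LogFormatter \
        /var/lib/zookeeper/version-2/log.300011fc2 \
        | grep -o "'/[^,']*" | sort | uniq -c | sort -rn | head -20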


Re: entire cluster dies with EOFException

Posted by Raúl Gutiérrez Segalés <rg...@itevenworks.net>.
On 6 July 2014 14:26, Flavio Junqueira <fp...@yahoo.com.invalid>
wrote:

> But what is it that was causing problems in your scenario, Raul? Is it
> reading the log? In any case, it sounds like initLimit is the parameter you
> want to change, no?
>

Yeah, I think so. It was just that it took too long to walk through all the
txns (too many of them). So finding the sweet spot of snapshots vs
transactions is a bit tricky in this case I think.


-rgs
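
The knob that controls that trade-off is snapCount: the number of transactions
a server writes before it snapshots and rolls its log. Lowering it keeps
individual log files smaller at the cost of more frequent snapshots. An
illustrative zoo.cfg fragment only; the value would have to be tuned against
the actual write rate:

    # default is 100000; a smaller value means smaller txn log files, more snapshots
    snapCount=50000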




>
> -Flavio
>
> On 06 Jul 2014, at 19:09, Raúl Gutiérrez Segalés <rg...@itevenworks.net>
> wrote:
>
> > Oh, storm right. Yeah I've seen this. The transaction rate is so huge that
> > the initial sync fails.. perhaps you could try bigger tickTime, initLimit
> > and syncLimit params...
> >
> >
> > -rgs
> >
> >
> > On 6 July 2014 04:48, Aaron Zimmerman <az...@sproutsocial.com>
> wrote:
> >
> >> Raúl,
> >>
> >> zk_approximate_data_size 4899392
> >>
> >> That is about the size of the snapshots also.
> >>
> >> Benjamin,
> >>
> >> We are not running out of disk space.
> >> But the log.XXXX files are quite large, is this normal?  In less than 3
> >> hours, the log file since the last snapshot is 8.2G, and the older log
> >> files are as large as 12G.
> >>
> >> We are using Storm Trident, this uses zookeeper pretty heavily for
> tracking
> >> transactional state, but i'm not sure if that could account for this
> much
> >> storage.  Is there an easy way to track which znodes are being updated
> most
> >> frequently?
> >>
> >> Thanks,
> >>
> >> Aaron
> >>
> >>
> >>
> >>
> >>
> >> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <
> >> rgs@itevenworks.net>
> >> wrote:
> >>
> >>> What's the total size of the data in your ZK cluster? i.e.:
> >>>
> >>> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
> >>>
> >>> And/or the size of the snapshot?
> >>>
> >>>
> >>> -rgs
> >>>
> >>>
> >>> On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com>
> >> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> We have a 5 node zookeeper cluster that has been operating normally
> for
> >>>> several months.  Starting a few days ago, the entire cluster crashes a
> >>> few
> >>>> times per day, all nodes at the exact same time.  We can't track down
> >> the
> >>>> exact issue, but deleting the snapshots and logs and restarting
> >> resolves.
> >>>>
> >>>> We are running exhibitor to monitor the cluster.
> >>>>
> >>>> It appears that something bad gets into the logs, causing an
> >> EOFException
> >>>> and this cascades through the entire cluster:
> >>>>
> >>>> 2014-07-04 12:55:26,328 [myid:1] - WARN
> >>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
> >> when
> >>>> following the leader
> >>>> java.io.EOFException
> >>>>        at java.io.DataInputStream.readInt(DataInputStream.java:375)
> >>>>        at
> >>>> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >>>>        at
> >>>>
> >>>>
> >>>
> >>
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> >>>>        at
> >>>>
> >>>
> >>
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >>>>        at
> >>>>
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> >>>>        at
> >>>>
> >>>
> >>
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> >>>>        at
> >>>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> >>>> 2014-07-04 12:55:26,328 [myid:1] - INFO
> >>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
> >>> called
> >>>> java.lang.Exception: shutdown Follower
> >>>>        at
> >>>>
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> >>>>        at
> >>>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
> >>>>
> >>>>
> >>>> Then the server dies, exhibitor tries to restart each node, and they
> >> all
> >>>> get stuck trying to replay the bad transaction, logging things like:
> >>>>
> >>>>
> >>>> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
> >>>> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> >>>> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> >>>> /var/lib/zookeeper/version-2/log.300000021
> >>>> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> >>>> /var/lib/zookeeper/version-2/log.300000021
> >>>> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> >>> java.io.EOFException:
> >>>> Failed to read /var/lib/zookeeper/version-2/log.300000021
> >>>> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> >>>> /var/lib/zookeeper/version-2/log.300011fc2
> >>>> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> >>>> /var/lib/zookeeper/version-2/log.300011fc2
> >>>> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> >>> java.io.EOFException:
> >>>> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> >>>>
> >>>> And the cluster is dead.  The only way we have found to recover is to
> >>>> delete all of the data and restart.
> >>>>
> >>>> Anyone seen this before?  Any ideas how I can track down what is
> >> causing
> >>>> the EOFException, or insulate zookeeper from completely crashing?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Aaron Zimmerman
> >>>>
> >>>
> >>
>
>

Re: entire cluster dies with EOFException

Posted by Aaron Zimmerman <az...@sproutsocial.com>.
I should also mention that these servers only have 1G of ram.  Zookeeper
usually takes around 300m.


On Sun, Jul 6, 2014 at 6:32 PM, Aaron Zimmerman <azimmerman@sproutsocial.com
> wrote:

> This happened again this afternoon.  I captured the log file as well as
> the preceding snapshot, but I can't upload to jira obviously, as its 10G or
> so.  Is there an easy way to step through it and see what the last,
> presumably corrupted line is?  I'll try to get zookeeper running in debug
> mode or something.
>
> I think reducing the snap count seems like a good idea, having just read
> about this setting.  What has me most worried is that this issue somehow
> propagates through the entire cluster.  Is there some way to gauge these
> concerns?  Is there a rule of thumb like "log files should be less than 5g"
> or something that we can watch for to head off this problem before it
> becomes a production issue?
>
> But snap count seems like memory defense, would that cause EOF exceptions
> when reading a DataInputStream?
>
> This is the config file, it looks a bit odd since its auto-generated by
> exhibitor, but here it is verbatim.  I was just reading about having
> separate devices for dataLogDir and dataDir.  That is definitely something
> I plan to put in place.
>
> server.3=zookeeper02-n3\:2888\:3888
> server.2=zookeeper02-n2\:2888\:3888
> server.1=zookeeper02-n1\:2888\:3888
> initLimit=10
> syncLimit=5
> clientPort=2181
> tickTime=2000
> dataDir=/var/lib/zookeeper
> dataLogDir=/var/lib/zookeeper
> server.5=zookeeper02-n5\:2888\:3888
> server.4=zookeeper02-n4\:2888\:3888
>
>
> On Sun, Jul 6, 2014 at 5:30 PM, Flavio Junqueira <
> fpjunqueira@yahoo.com.invalid> wrote:
>
>> That's a good point, some znodes in the snapshot Aaron shared have over
>> 400k bytes. There are some znodes with tens of kbytes too.
>>
>> Given that we are giving hints about the config parameters, it'd be nice
>> if we could see the config file.
>>
>> -Flavio
>>
>> On 06 Jul 2014, at 22:34, Benjamin Reed <br...@apache.org> wrote:
>>
>> > the log files seem very large to me. is the snap count really large? or
>> do
>> > you have a lot of data in each znode? you might try reducing the snap
>> count
>> > to make the log files smaller.
>> > On Jul 6, 2014 2:27 PM, "Flavio Junqueira"
>> <fp...@yahoo.com.invalid>
>> > wrote:
>> >
>> >> But what is it that was causing problems in your scenario, Raul? Is it
>> >> reading the log? In any case, it sounds like initLimit is the
>> parameter you
>> >> want to change, no?
>> >>
>> >> -Flavio
>> >>
>> >> On 06 Jul 2014, at 19:09, Raúl Gutiérrez Segalés <rg...@itevenworks.net>
>> >> wrote:
>> >>
>> >>> Oh, storm right. Yeah I've seen this. The transaction rate is so huge that
>> >>> the initial sync fails.. perhaps you could try bigger tickTime, initLimit
>> >>> and syncLimit params...
>> >>>
>> >>>
>> >>> -rgs
>> >>>
>> >>>
>> >>> On 6 July 2014 04:48, Aaron Zimmerman <az...@sproutsocial.com>
>> >> wrote:
>> >>>
>> >>>> Raúl,
>> >>>>
>> >>>> zk_approximate_data_size 4899392
>> >>>>
>> >>>> That is about the size of the snapshots also.
>> >>>>
>> >>>> Benjamin,
>> >>>>
>> >>>> We are not running out of disk space.
>> >>>> But the log.XXXX files are quite large, is this normal?  In less
>> than 3
>> >>>> hours, the log file since the last snapshot is 8.2G, and the older
>> log
>> >>>> files are as large as 12G.
>> >>>>
>> >>>> We are using Storm Trident, this uses zookeeper pretty heavily for
>> >> tracking
>> >>>> transactional state, but i'm not sure if that could account for this
>> >> much
>> >>>> storage.  Is there an easy way to track which znodes are being
>> updated
>> >> most
>> >>>> frequently?
>> >>>>
>> >>>> Thanks,
>> >>>>
>> >>>> Aaron
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <
>> >>>> rgs@itevenworks.net>
>> >>>> wrote:
>> >>>>
>> >>>>> What's the total size of the data in your ZK cluster? i.e.:
>> >>>>>
>> >>>>> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
>> >>>>>
>> >>>>> And/or the size of the snapshot?
>> >>>>>
>> >>>>>
>> >>>>> -rgs
>> >>>>>
>> >>>>>
>> >>>>> On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com>
>> >>>> wrote:
>> >>>>>
>> >>>>>> Hi all,
>> >>>>>>
>> >>>>>> We have a 5 node zookeeper cluster that has been operating normally
>> >> for
>> >>>>>> several months.  Starting a few days ago, the entire cluster
>> crashes a
>> >>>>> few
>> >>>>>> times per day, all nodes at the exact same time.  We can't track
>> down
>> >>>> the
>> >>>>>> exact issue, but deleting the snapshots and logs and restarting
>> >>>> resolves.
>> >>>>>>
>> >>>>>> We are running exhibitor to monitor the cluster.
>> >>>>>>
>> >>>>>> It appears that something bad gets into the logs, causing an
>> >>>> EOFException
>> >>>>>> and this cascades through the entire cluster:
>> >>>>>>
>> >>>>>> 2014-07-04 12:55:26,328 [myid:1] - WARN
>> >>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
>> >>>> when
>> >>>>>> following the leader
>> >>>>>> java.io.EOFException
>> >>>>>>       at java.io.DataInputStream.readInt(DataInputStream.java:375)
>> >>>>>>       at
>> >>>>>>
>> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>> >>>>>>       at
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>
>> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>> >>>>>>       at
>> >>>>>>
>> >>>>>
>> >>>>
>> >>
>> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>> >>>>>>       at
>> >>>>>>
>> >> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>> >>>>>>       at
>> >>>>>>
>> >>>>>
>> >>>>
>> >>
>> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>> >>>>>>       at
>> >>>>>>
>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
>> >>>>>> 2014-07-04 12:55:26,328 [myid:1] - INFO
>> >>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
>> >>>>> called
>> >>>>>> java.lang.Exception: shutdown Follower
>> >>>>>>       at
>> >>>>>>
>> >> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>> >>>>>>       at
>> >>>>>>
>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>> >>>>>>
>> >>>>>>
>> >>>>>> Then the server dies, exhibitor tries to restart each node, and
>> they
>> >>>> all
>> >>>>>> get stuck trying to replay the bad transaction, logging things
>> like:
>> >>>>>>
>> >>>>>>
>> >>>>>> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] -
>> Reading
>> >>>>>> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
>> >>>>>> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
>> >>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>> >>>>>> /var/lib/zookeeper/version-2/log.300000021
>> >>>>>> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
>> >>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>> >>>>>> /var/lib/zookeeper/version-2/log.300000021
>> >>>>>> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
>> >>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>> >>>>> java.io.EOFException:
>> >>>>>> Failed to read /var/lib/zookeeper/version-2/log.300000021
>> >>>>>> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
>> >>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>> >>>>>> /var/lib/zookeeper/version-2/log.300011fc2
>> >>>>>> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
>> >>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>> >>>>>> /var/lib/zookeeper/version-2/log.300011fc2
>> >>>>>> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
>> >>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>> >>>>> java.io.EOFException:
>> >>>>>> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
>> >>>>>>
>> >>>>>> And the cluster is dead.  The only way we have found to recover is
>> to
>> >>>>>> delete all of the data and restart.
>> >>>>>>
>> >>>>>> Anyone seen this before?  Any ideas how I can track down what is
>> >>>> causing
>> >>>>>> the EOFException, or insulate zookeeper from completely crashing?
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>>
>> >>>>>> Aaron Zimmerman
>> >>>>>>
>> >>>>>
>> >>>>
>> >>
>> >>
>>
>>
>

Re: entire cluster dies with EOFException

Posted by Raúl Gutiérrez Segalés <rg...@itevenworks.net>.
well, not transaction rate but transaction count.. you can get the rate out
of that :-D


-rgs
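
Without that patch, a crude approximation is to sample the current zxid twice
via the srvr four-letter word and diff the low 32 bits, which count
transactions within the current epoch. A bash sketch; it assumes no leader
election happens during the interval (that would reset the counter):

    $ Z1=$(echo srvr | nc localhost 2181 | awk '/Zxid/ {print $2}'); sleep 60; \
      Z2=$(echo srvr | nc localhost 2181 | awk '/Zxid/ {print $2}'); \
      echo "approx txns/sec: $(( ((Z2 & 0xffffffff) - (Z1 & 0xffffffff)) / 60 ))"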


On 7 July 2014 09:50, Raúl Gutiérrez Segalés <rg...@itevenworks.net> wrote:

> On 7 July 2014 09:39, Aaron Zimmerman <az...@sproutsocial.com> wrote:
>
>> What I don't understand is how the entire cluster could die in such a
>> situation.  I was able to load zookeeper locally using the snapshot and
>> 10g
>> log file without apparent issue.
>
>
> Sure, but it's syncing up with other learners that becomes challenging
> when having either big snapshots or too many txnlogs, right?
>
>
>>  I can see how large amounts of data could
>> cause latency issues in syncing causing a single worker to die, but how
>> would that explain the node's inability to restart?  When the server
>> replays the log file, does it have to sync the transactions to other nodes
>> while it does so?
>>
>
> Given that your txn churn is so big, by the time it finished up reading
> from disc it'll need
> to catch up with the quorum.. how many txns have happened by that point?
> By the way, we use
> this patch:
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-1804
>
> to measure transaction rate, do you have any approximation of what your
> transaction rate might be?
>
>
>>
>> I can alter the settings as has been discussed, but I worry that I'm just
>> delaying the same thing from happening again, if I deploy another storm
>> topology or something.  How can I get the cluster in a state where I can
>> be
>> confident that it won't crash in a similar way as load increases, or at
>> least set up some kind of monitoring that will let me know something is
>> unhealthy?
>>
>
> I think it depends on what your txn rate is, lets measure that first I
> guess.
>
>
> -rgs
>

RE: entire cluster dies with EOFException

Posted by FPJ <fp...@yahoo.com.INVALID>.
Thanks for reporting back, Aaron. Shall we close the jira you created?

-Flavio

> -----Original Message-----
> From: Aaron Zimmerman [mailto:azimmerman@sproutsocial.com]
> Sent: 14 July 2014 16:21
> To: user@zookeeper.apache.org
> Subject: Re: entire cluster dies with EOFException
> 
> Closing the loop on this, It appears that upping the initLimit did resolve the
> issue.  Thanks all for the help.
> 
> Thanks,
> 
> Aaron Zimmerman
> 
> 
> On Tue, Jul 8, 2014 at 4:40 PM, Flavio Junqueira <
> fpjunqueira@yahoo.com.invalid> wrote:
> 
> > Agreed, but we need that check because we expect bytes for the
> > checksum computation right underneath. The bit that's odd is that we
> > make the same check again below:
> >
> >         try {
> >                 long crcValue = ia.readLong("crcvalue");
> >                 byte[] bytes = Util.readTxnBytes(ia);
> >                 // Since we preallocate, we define EOF to be an
> >                 if (bytes == null || bytes.length==0) {
> >                     throw new EOFException("Failed to read " + logFile);
> >                 }
> >                 // EOF or corrupted record
> >                 // validate CRC
> >                 Checksum crc = makeChecksumAlgorithm();
> >                 crc.update(bytes, 0, bytes.length);
> >                 if (crcValue != crc.getValue())
> >                     throw new IOException(CRC_ERROR);
> >                 if (bytes == null || bytes.length == 0)
> >                     return false;
> >                 hdr = new TxnHeader();
> >                 record = SerializeUtils.deserializeTxn(bytes, hdr);
> >             } catch (EOFException e) {
> >
> > I'm moving this discussion, to the jira, btw.
> >
> > -Flavio
> >
> > On 07 Jul 2014, at 22:03, Aaron Zimmerman
> > <az...@sproutsocial.com>
> > wrote:
> >
> > > Flavio,
> > >
> > > Yes that is the initial error, and then the nodes in the cluster are
> > > restarted but fail to restart with
> > >
> > > 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] -
> > > Reading snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> > > 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> > > /var/lib/zookeeper/version-2/log.300000021
> > > 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> > > /var/lib/zookeeper/version-2/log.300000021
> > > 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> > java.io.EOFException:
> > > Failed to read /var/lib/zookeeper/version-2/log.300000021
> > > 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> > > /var/lib/zookeeper/version-2/log.300011fc2
> > > 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> > > /var/lib/zookeeper/version-2/log.300011fc2
> > > 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> > java.io.EOFException:
> > > Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> > >
> > > Thanks,
> > >
> > > AZ
> > >
> > >
> > > On Mon, Jul 7, 2014 at 3:33 PM, Flavio Junqueira <
> > > fpjunqueira@yahoo.com.invalid> wrote:
> > >
> > >> I'm a bit confused, the stack trace you reported was this one:
> > >>
> > >> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
> > >> when following the leader java.io.EOFException
> > >>       at java.io.DataInputStream.readInt(DataInputStream.java:375)
> > >>       at
> > >>
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> > >>       at
> > >>
> >
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPa
> ck
> > et.java:83)
> > >>       at
> > >>
> > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:
> > 108)
> > >>       at
> > >>
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> > >>       at
> > >>
> >
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java
> > :85)
> > >>       at
> > >>
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:7
> > >> 40)
> > >>
> > >>
> > >> That's in a different part of the code.
> > >>
> > >> -Flavio
> > >>
> > >> On 07 Jul 2014, at 18:50, Aaron Zimmerman
> > >> <az...@sproutsocial.com>
> > >> wrote:
> > >>
> > >>> Util.readTxnBytes reads from the buffer and if the length is 0, it returns
> > >>> the zero length array, seemingly indicating the end of the file.
> > >>>
> > >>> Then this is detected in FileTxnLog.java:671:
> > >>>
> > >>>               byte[] bytes = Util.readTxnBytes(ia);
> > >>>               // Since we preallocate, we define EOF to be an
> > >>>               if (bytes == null || bytes.length==0) {
> > >>>                   throw new EOFException("Failed to read " + logFile);
> > >>>               }
> > >>>
> > >>>
> > >>> This exception is caught a few lines later, and the streams closed etc.
> > >>>
> > >>> So this seems to be not really an error condition, but a signal
> > >>> that
> > the
> > >>> entire file has been read? Is this exception a red herring?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Mon, Jul 7, 2014 at 11:50 AM, Raúl Gutiérrez Segalés <
> > >> rgs@itevenworks.net
> > >>>> wrote:
> > >>>
> > >>>> On 7 July 2014 09:39, Aaron Zimmerman
> > >>>> <az...@sproutsocial.com>
> > >> wrote:
> > >>>>
> > >>>>> What I don't understand is how the entire cluster could die in
> > >>>>> such a situation.  I was able to load zookeeper locally using
> > >>>>> the snapshot
> > and
> > >>>> 10g
> > >>>>> log file without apparent issue.
> > >>>>
> > >>>>
> > >>>> Sure, but it's syncing up with other learners that becomes
> > >>>> challenging
> > >> when
> > >>>> having either big snapshots or too many txnlogs, right?
> > >>>>
> > >>>>
> > >>>>> I can see how large amounts of data could cause latency issues
> > >>>>> in syncing causing a single worker to die, but
> > how
> > >>>>> would that explain the node's inability to restart?  When the
> > >>>>> server replays the log file, does it have to sync the
> > >>>>> transactions to other
> > >>>> nodes
> > >>>>> while it does so?
> > >>>>>
> > >>>>
> > >>>> Given that your txn churn is so big, by the time it finished up
> > reading
> > >>>> from disc it'll need
> > >>>> to catch up with the quorum.. how many txns have happened by that
> > >> point? By
> > >>>> the way, we use
> > >>>> this patch:
> > >>>>
> > >>>> https://issues.apache.org/jira/browse/ZOOKEEPER-1804
> > >>>>
> > >>>> to measure transaction rate, do you have any approximation of
> > >>>> what
> > your
> > >>>> transaction rate might be?
> > >>>>
> > >>>>
> > >>>>>
> > >>>>> I can alter the settings as has been discussed, but I worry that
> > >>>>> I'm
> > >> just
> > >>>>> delaying the same thing from happening again, if I deploy
> > >>>>> another
> > storm
> > >>>>> topology or something.  How can I get the cluster in a state
> > >>>>> where I
> > >> can
> > >>>> be
> > >>>>> confident that it won't crash in a similar way as load
> > >>>>> increases, or
> > at
> > >>>>> least set up some kind of monitoring that will let me know
> > >>>>> something
> > is
> > >>>>> unhealthy?
> > >>>>>
> > >>>>
> > >>>> I think it depends on what your txn rate is, lets measure that
> > >>>> first I guess.
> > >>>>
> > >>>>
> > >>>> -rgs
> > >>>>
> > >>
> > >>
> >
> >


Re: entire cluster dies with EOFException

Posted by Aaron Zimmerman <az...@sproutsocial.com>.
Closing the loop on this, It appears that upping the initLimit did resolve
the issue.  Thanks all for the help.

Thanks,

Aaron Zimmerman
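
For reference against the config posted earlier in the thread (tickTime=2000,
initLimit=10, i.e. only a 20 second window for a follower's initial sync),
"upping the initLimit" amounts to a fragment like the one below; the numbers
are illustrative and need to be sized to how long the snapshot transfer and
log replay actually take:

    tickTime=2000
    # 300 ticks * 2000 ms = 10 minutes for a follower's initial sync
    initLimit=300
    # 15 ticks * 2000 ms = 30 seconds for followers to stay in sync afterwards
    syncLimit=15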


On Tue, Jul 8, 2014 at 4:40 PM, Flavio Junqueira <
fpjunqueira@yahoo.com.invalid> wrote:

> Agreed, but we need that check because we expect bytes for the checksum
> computation right underneath. The bit that's odd is that we make the same
> check again below:
>
>         try {
>                 long crcValue = ia.readLong("crcvalue");
>                 byte[] bytes = Util.readTxnBytes(ia);
>                 // Since we preallocate, we define EOF to be an
>                 if (bytes == null || bytes.length==0) {
>                     throw new EOFException("Failed to read " + logFile);
>                 }
>                 // EOF or corrupted record
>                 // validate CRC
>                 Checksum crc = makeChecksumAlgorithm();
>                 crc.update(bytes, 0, bytes.length);
>                 if (crcValue != crc.getValue())
>                     throw new IOException(CRC_ERROR);
>                 if (bytes == null || bytes.length == 0)
>                     return false;
>                 hdr = new TxnHeader();
>                 record = SerializeUtils.deserializeTxn(bytes, hdr);
>             } catch (EOFException e) {
>
> I'm moving this discussion, to the jira, btw.
>
> -Flavio
>
> On 07 Jul 2014, at 22:03, Aaron Zimmerman <az...@sproutsocial.com>
> wrote:
>
> > Flavio,
> >
> > Yes that is the initial error, and then the nodes in the cluster are
> > restarted but fail to restart with
> >
> > 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
> > snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> > 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> > /var/lib/zookeeper/version-2/log.300000021
> > 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> > /var/lib/zookeeper/version-2/log.300000021
> > 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> java.io.EOFException:
> > Failed to read /var/lib/zookeeper/version-2/log.300000021
> > 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> > /var/lib/zookeeper/version-2/log.300011fc2
> > 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> > /var/lib/zookeeper/version-2/log.300011fc2
> > 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> java.io.EOFException:
> > Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> >
> > Thanks,
> >
> > AZ
> >
> >
> > On Mon, Jul 7, 2014 at 3:33 PM, Flavio Junqueira <
> > fpjunqueira@yahoo.com.invalid> wrote:
> >
> >> I'm a bit confused, the stack trace you reported was this one:
> >>
> >> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
> >> following the leader
> >> java.io.EOFException
> >>       at java.io.DataInputStream.readInt(DataInputStream.java:375)
> >>       at
> >> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >>       at
> >>
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> >>       at
> >>
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >>       at
> >> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> >>       at
> >>
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> >>       at
> >> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> >>
> >>
> >> That's in a different part of the code.
> >>
> >> -Flavio
> >>
> >> On 07 Jul 2014, at 18:50, Aaron Zimmerman <az...@sproutsocial.com>
> >> wrote:
> >>
> >>> Util.readTxnBytes reads from the buffer and if the length is 0, it returns
> >>> the zero length array, seemingly indicating the end of the file.
> >>>
> >>> Then this is detected in FileTxnLog.java:671:
> >>>
> >>>               byte[] bytes = Util.readTxnBytes(ia);
> >>>               // Since we preallocate, we define EOF to be an
> >>>               if (bytes == null || bytes.length==0) {
> >>>                   throw new EOFException("Failed to read " + logFile);
> >>>               }
> >>>
> >>>
> >>> This exception is caught a few lines later, and the streams closed etc.
> >>>
> >>> So this seems to be not really an error condition, but a signal that
> the
> >>> entire file has been read? Is this exception a red herring?
> >>>
> >>>
> >>>
> >>>
> >>> On Mon, Jul 7, 2014 at 11:50 AM, Raúl Gutiérrez Segalés <
> >> rgs@itevenworks.net
> >>>> wrote:
> >>>
> >>>> On 7 July 2014 09:39, Aaron Zimmerman <az...@sproutsocial.com>
> >> wrote:
> >>>>
> >>>>> What I don't understand is how the entire cluster could die in such a
> >>>>> situation.  I was able to load zookeeper locally using the snapshot
> and
> >>>> 10g
> >>>>> log file without apparent issue.
> >>>>
> >>>>
> >>>> Sure, but it's syncing up with other learners that becomes challenging
> >> when
> >>>> having either big snapshots or too many txnlogs, right?
> >>>>
> >>>>
> >>>>> I can see how large amounts of data could
> >>>>> cause latency issues in syncing causing a single worker to die, but
> how
> >>>>> would that explain the node's inability to restart?  When the server
> >>>>> replays the log file, does it have to sync the transactions to other
> >>>> nodes
> >>>>> while it does so?
> >>>>>
> >>>>
> >>>> Given that your txn churn is so big, by the time it finished up
> reading
> >>>> from disc it'll need
> >>>> to catch up with the quorum.. how many txns have happened by that
> >> point? By
> >>>> the way, we use
> >>>> this patch:
> >>>>
> >>>> https://issues.apache.org/jira/browse/ZOOKEEPER-1804
> >>>>
> >>>> to measure transaction rate, do you have any approximation of what
> your
> >>>> transaction rate might be?
> >>>>
> >>>>
> >>>>>
> >>>>> I can alter the settings as has been discussed, but I worry that I'm
> >> just
> >>>>> delaying the same thing from happening again, if I deploy another
> storm
> >>>>> topology or something.  How can I get the cluster in a state where I
> >> can
> >>>> be
> >>>>> confident that it won't crash in a similar way as load increases, or
> at
> >>>>> least set up some kind of monitoring that will let me know something
> is
> >>>>> unhealthy?
> >>>>>
> >>>>
> >>>> I think it depends on what your txn rate is, lets measure that first I
> >>>> guess.
> >>>>
> >>>>
> >>>> -rgs
> >>>>
> >>
> >>
>
>

Re: entire cluster dies with EOFException

Posted by Flavio Junqueira <fp...@yahoo.com.INVALID>.
Agreed, but we need that check because we expect bytes for the checksum computation right underneath. The bit that's odd is that we make the same check again below:

	try {
                long crcValue = ia.readLong("crcvalue");
                byte[] bytes = Util.readTxnBytes(ia);
                // Since we preallocate, we define EOF to be an
                if (bytes == null || bytes.length==0) {
                    throw new EOFException("Failed to read " + logFile);
                }
                // EOF or corrupted record
                // validate CRC
                Checksum crc = makeChecksumAlgorithm();
                crc.update(bytes, 0, bytes.length);
                if (crcValue != crc.getValue())
                    throw new IOException(CRC_ERROR);
                if (bytes == null || bytes.length == 0)
                    return false;
                hdr = new TxnHeader();
                record = SerializeUtils.deserializeTxn(bytes, hdr);
            } catch (EOFException e) { 

I'm moving this discussion to the jira, btw.
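
For what it's worth, here is a minimal sketch of the same block with the duplicate length check folded into one place (just an illustration, not the actual patch; names are as in the excerpt above):

                long crcValue = ia.readLong("crcvalue");
                byte[] bytes = Util.readTxnBytes(ia);
                // Since we preallocate, a zero-length read marks the logical EOF
                if (bytes == null || bytes.length == 0) {
                    throw new EOFException("Failed to read " + logFile);
                }
                // validate CRC; a mismatch means a corrupted record
                Checksum crc = makeChecksumAlgorithm();
                crc.update(bytes, 0, bytes.length);
                if (crcValue != crc.getValue())
                    throw new IOException(CRC_ERROR);
                // the second bytes==null/length==0 test can never fire here,
                // because the first one already threw EOFException
                hdr = new TxnHeader();
                record = SerializeUtils.deserializeTxn(bytes, hdr);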

-Flavio

On 07 Jul 2014, at 22:03, Aaron Zimmerman <az...@sproutsocial.com> wrote:

> Flavio,
> 
> Yes that is the initial error, and then the nodes in the cluster are
> restarted but fail to restart with
> 
> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:
> Failed to read /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> /var/lib/zookeeper/version-2/log.300011fc2
> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> /var/lib/zookeeper/version-2/log.300011fc2
> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:
> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> 
> Thanks,
> 
> AZ
> 
> 
> On Mon, Jul 7, 2014 at 3:33 PM, Flavio Junqueira <
> fpjunqueira@yahoo.com.invalid> wrote:
> 
>> I'm a bit confused, the stack trace you reported was this one:
>> 
>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
>> following the leader
>> java.io.EOFException
>>       at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>       at
>> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>>       at
>> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>>       at
>> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>>       at
>> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>>       at
>> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>>       at
>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
>> 
>> 
>> That's in a different part of the code.
>> 
>> -Flavio
>> 
>> On 07 Jul 2014, at 18:50, Aaron Zimmerman <az...@sproutsocial.com>
>> wrote:
>> 
>>> Util.readTxnBytes reads from the buffer and if the length is 0, it return
>>> the zero length array, seemingly indicating the end of the file.
>>> 
>>> Then this is detected in FileTxnLog.java:671:
>>> 
>>>               byte[] bytes = Util.readTxnBytes(ia);
>>>               // Since we preallocate, we define EOF to be an
>>>               if (bytes == null || bytes.length==0) {
>>>                   throw new EOFException("Failed to read " + logFile);
>>>               }
>>> 
>>> 
>>> This exception is caught a few lines later, and the streams closed etc.
>>> 
>>> So this seems to be not really an error condition, but a signal that the
>>> entire file has been read? Is this exception a red herring?
>>> 
>>> 
>>> 
>>> 
>>> On Mon, Jul 7, 2014 at 11:50 AM, Raúl Gutiérrez Segalés <
>> rgs@itevenworks.net
>>>> wrote:
>>> 
>>>> On 7 July 2014 09:39, Aaron Zimmerman <az...@sproutsocial.com>
>> wrote:
>>>> 
>>>>> What I don't understand is how the entire cluster could die in such a
>>>>> situation.  I was able to load zookeeper locally using the snapshot and
>>>> 10g
>>>>> log file without apparent issue.
>>>> 
>>>> 
>>>> Sure, but it's syncing up with other learners that becomes challenging
>> when
>>>> having either big snapshots or too many txnlogs, right?
>>>> 
>>>> 
>>>>> I can see how large amounts of data could
>>>>> cause latency issues in syncing causing a single worker to die, but how
>>>>> would that explain the node's inability to restart?  When the server
>>>>> replays the log file, does it have to sync the transactions to other
>>>> nodes
>>>>> while it does so?
>>>>> 
>>>> 
>>>> Given that your txn churn is so big, by the time it finished up reading
>>>> from disc it'll need
>>>> to catch up with the quorum.. how many txns have happened by that
>> point? By
>>>> the way, we use
>>>> this patch:
>>>> 
>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-1804
>>>> 
>>>> to measure transaction rate, do you have any approximation of what your
>>>> transaction rate might be?
>>>> 
>>>> 
>>>>> 
>>>>> I can alter the settings as has been discussed, but I worry that I'm
>> just
>>>>> delaying the same thing from happening again, if I deploy another storm
>>>>> topology or something.  How can I get the cluster in a state where I
>> can
>>>> be
>>>>> confident that it won't crash in a similar way as load increases, or at
>>>>> least set up some kind of monitoring that will let me know something is
>>>>> unhealthy?
>>>>> 
>>>> 
>>>> I think it depends on what your txn rate is, lets measure that first I
>>>> guess.
>>>> 
>>>> 
>>>> -rgs
>>>> 
>> 
>> 


Re: entire cluster dies with EOFException

Posted by Aaron Zimmerman <az...@sproutsocial.com>.
Flavio,

Yes, that is the initial error, and then the nodes in the cluster are
restarted but fail to come back up with:

2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
2014-07-04 12:58:52,896 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@575] - Created new input stream
/var/lib/zookeeper/version-2/log.300000021
2014-07-04 12:58:52,915 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@578] - Created new input archive
/var/lib/zookeeper/version-2/log.300000021
2014-07-04 12:59:25,870 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:
Failed to read /var/lib/zookeeper/version-2/log.300000021
2014-07-04 12:59:25,871 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@575] - Created new input stream
/var/lib/zookeeper/version-2/log.300011fc2
2014-07-04 12:59:25,872 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@578] - Created new input archive
/var/lib/zookeeper/version-2/log.300011fc2
2014-07-04 12:59:48,722 [myid:1] - DEBUG
[main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:
Failed to read /var/lib/zookeeper/version-2/log.300011fc2

Thanks,

AZ


On Mon, Jul 7, 2014 at 3:33 PM, Flavio Junqueira <
fpjunqueira@yahoo.com.invalid> wrote:

> I'm a bit confused, the stack trace you reported was this one:
>
> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
> following the leader
> java.io.EOFException
>        at java.io.DataInputStream.readInt(DataInputStream.java:375)
>        at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>        at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>        at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>        at
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>        at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>        at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
>
>
> That's in a different part of the code.
>
> -Flavio
>
> On 07 Jul 2014, at 18:50, Aaron Zimmerman <az...@sproutsocial.com>
> wrote:
>
> > Util.readTxnBytes reads from the buffer and if the length is 0, it return
> > the zero length array, seemingly indicating the end of the file.
> >
> > Then this is detected in FileTxnLog.java:671:
> >
> >                byte[] bytes = Util.readTxnBytes(ia);
> >                // Since we preallocate, we define EOF to be an
> >                if (bytes == null || bytes.length==0) {
> >                    throw new EOFException("Failed to read " + logFile);
> >                }
> >
> >
> > This exception is caught a few lines later, and the streams closed etc.
> >
> > So this seems to be not really an error condition, but a signal that the
> > entire file has been read? Is this exception a red herring?
> >
> >
> >
> >
> > On Mon, Jul 7, 2014 at 11:50 AM, Raúl Gutiérrez Segalés <
> rgs@itevenworks.net
> >> wrote:
> >
> >> On 7 July 2014 09:39, Aaron Zimmerman <az...@sproutsocial.com>
> wrote:
> >>
> >>> What I don't understand is how the entire cluster could die in such a
> >>> situation.  I was able to load zookeeper locally using the snapshot and
> >> 10g
> >>> log file without apparent issue.
> >>
> >>
> >> Sure, but it's syncing up with other learners that becomes challenging
> when
> >> having either big snapshots or too many txnlogs, right?
> >>
> >>
> >>> I can see how large amounts of data could
> >>> cause latency issues in syncing causing a single worker to die, but how
> >>> would that explain the node's inability to restart?  When the server
> >>> replays the log file, does it have to sync the transactions to other
> >> nodes
> >>> while it does so?
> >>>
> >>
> >> Given that your txn churn is so big, by the time it finished up reading
> >> from disc it'll need
> >> to catch up with the quorum.. how many txns have happened by that
> point? By
> >> the way, we use
> >> this patch:
> >>
> >> https://issues.apache.org/jira/browse/ZOOKEEPER-1804
> >>
> >> to measure transaction rate, do you have any approximation of what your
> >> transaction rate might be?
> >>
> >>
> >>>
> >>> I can alter the settings as has been discussed, but I worry that I'm
> just
> >>> delaying the same thing from happening again, if I deploy another storm
> >>> topology or something.  How can I get the cluster in a state where I
> can
> >> be
> >>> confident that it won't crash in a similar way as load increases, or at
> >>> least set up some kind of monitoring that will let me know something is
> >>> unhealthy?
> >>>
> >>
> >> I think it depends on what your txn rate is, lets measure that first I
> >> guess.
> >>
> >>
> >> -rgs
> >>
>
>

Re: entire cluster dies with EOFException

Posted by Flavio Junqueira <fp...@yahoo.com.INVALID>.
I'm a bit confused; the stack trace you reported was this one:

[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
following the leader
java.io.EOFException
       at java.io.DataInputStream.readInt(DataInputStream.java:375)
       at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
       at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
       at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
       at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
       at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
       at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)


That's in a different part of the code.

-Flavio

On 07 Jul 2014, at 18:50, Aaron Zimmerman <az...@sproutsocial.com> wrote:

> Util.readTxnBytes reads from the buffer and if the length is 0, it return
> the zero length array, seemingly indicating the end of the file.
> 
> Then this is detected in FileTxnLog.java:671:
> 
>                byte[] bytes = Util.readTxnBytes(ia);
>                // Since we preallocate, we define EOF to be an
>                if (bytes == null || bytes.length==0) {
>                    throw new EOFException("Failed to read " + logFile);
>                }
> 
> 
> This exception is caught a few lines later, and the streams closed etc.
> 
> So this seems to be not really an error condition, but a signal that the
> entire file has been read? Is this exception a red herring?
> 
> 
> 
> 
> On Mon, Jul 7, 2014 at 11:50 AM, Raúl Gutiérrez Segalés <rgs@itevenworks.net
>> wrote:
> 
>> On 7 July 2014 09:39, Aaron Zimmerman <az...@sproutsocial.com> wrote:
>> 
>>> What I don't understand is how the entire cluster could die in such a
>>> situation.  I was able to load zookeeper locally using the snapshot and
>> 10g
>>> log file without apparent issue.
>> 
>> 
>> Sure, but it's syncing up with other learners that becomes challenging when
>> having either big snapshots or too many txnlogs, right?
>> 
>> 
>>> I can see how large amounts of data could
>>> cause latency issues in syncing causing a single worker to die, but how
>>> would that explain the node's inability to restart?  When the server
>>> replays the log file, does it have to sync the transactions to other
>> nodes
>>> while it does so?
>>> 
>> 
>> Given that your txn churn is so big, by the time it finished up reading
>> from disc it'll need
>> to catch up with the quorum.. how many txns have happened by that point? By
>> the way, we use
>> this patch:
>> 
>> https://issues.apache.org/jira/browse/ZOOKEEPER-1804
>> 
>> to measure transaction rate, do you have any approximation of what your
>> transaction rate might be?
>> 
>> 
>>> 
>>> I can alter the settings as has been discussed, but I worry that I'm just
>>> delaying the same thing from happening again, if I deploy another storm
>>> topology or something.  How can I get the cluster in a state where I can
>> be
>>> confident that it won't crash in a similar way as load increases, or at
>>> least set up some kind of monitoring that will let me know something is
>>> unhealthy?
>>> 
>> 
>> I think it depends on what your txn rate is, lets measure that first I
>> guess.
>> 
>> 
>> -rgs
>> 


Re: entire cluster dies with EOFException

Posted by Aaron Zimmerman <az...@sproutsocial.com>.
Util.readTxnBytes reads from the buffer and, if the length is 0, it returns a
zero-length array, seemingly indicating the end of the file.

Then this is detected in FileTxnLog.java:671:

                byte[] bytes = Util.readTxnBytes(ia);
                // Since we preallocate, we define EOF to be an
                if (bytes == null || bytes.length==0) {
                    throw new EOFException("Failed to read " + logFile);
                }


This exception is caught a few lines later, and the streams are closed, etc.

So this seems to be not really an error condition, but a signal that the
entire file has been read? Is this exception a red herring?




On Mon, Jul 7, 2014 at 11:50 AM, Raúl Gutiérrez Segalés <rgs@itevenworks.net
> wrote:

> On 7 July 2014 09:39, Aaron Zimmerman <az...@sproutsocial.com> wrote:
>
> > What I don't understand is how the entire cluster could die in such a
> > situation.  I was able to load zookeeper locally using the snapshot and
> 10g
> > log file without apparent issue.
>
>
> Sure, but it's syncing up with other learners that becomes challenging when
> having either big snapshots or too many txnlogs, right?
>
>
> >  I can see how large amounts of data could
> > cause latency issues in syncing causing a single worker to die, but how
> > would that explain the node's inability to restart?  When the server
> > replays the log file, does it have to sync the transactions to other
> nodes
> > while it does so?
> >
>
> Given that your txn churn is so big, by the time it finished up reading
> from disc it'll need
> to catch up with the quorum.. how many txns have happened by that point? By
> the way, we use
> this patch:
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-1804
>
> to measure transaction rate, do you have any approximation of what your
> transaction rate might be?
>
>
> >
> > I can alter the settings as has been discussed, but I worry that I'm just
> > delaying the same thing from happening again, if I deploy another storm
> > topology or something.  How can I get the cluster in a state where I can
> be
> > confident that it won't crash in a similar way as load increases, or at
> > least set up some kind of monitoring that will let me know something is
> > unhealthy?
> >
>
> I think it depends on what your txn rate is, lets measure that first I
> guess.
>
>
> -rgs
>

Re: entire cluster dies with EOFException

Posted by Raúl Gutiérrez Segalés <rg...@itevenworks.net>.
On 7 July 2014 09:39, Aaron Zimmerman <az...@sproutsocial.com> wrote:

> What I don't understand is how the entire cluster could die in such a
> situation.  I was able to load zookeeper locally using the snapshot and 10g
> log file without apparent issue.


Sure, but it's syncing up with other learners that becomes challenging when
having either big snapshots or too many txnlogs, right?


>  I can see how large amounts of data could
> cause latency issues in syncing causing a single worker to die, but how
> would that explain the node's inability to restart?  When the server
> replays the log file, does it have to sync the transactions to other nodes
> while it does so?
>

Given that your txn churn is so big, by the time it finishes reading from
disk it'll need to catch up with the quorum... how many txns have happened by
that point? By the way, we use this patch:

https://issues.apache.org/jira/browse/ZOOKEEPER-1804

to measure transaction rate. Do you have any approximation of what your
transaction rate might be?
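
If applying that patch isn't convenient right away, a rough approximation (just
a sketch; it assumes the four-letter words are reachable and gawk's strtonum is
available) is to sample the zxid twice and divide by the interval, since the
low 32 bits of the zxid count txns within the current epoch:

    # note: a leader election between samples (epoch bump) will skew the result
    z1=$(echo srvr | nc localhost 2181 | awk '/^Zxid:/ {print strtonum($2)}')
    sleep 60
    z2=$(echo srvr | nc localhost 2181 | awk '/^Zxid:/ {print strtonum($2)}')
    echo "approx txns/sec: $(( (z2 - z1) / 60 ))"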


>
> I can alter the settings as has been discussed, but I worry that I'm just
> delaying the same thing from happening again, if I deploy another storm
> topology or something.  How can I get the cluster in a state where I can be
> confident that it won't crash in a similar way as load increases, or at
> least set up some kind of monitoring that will let me know something is
> unhealthy?
>

I think it depends on what your txn rate is; let's measure that first, I
guess.


-rgs

Re: entire cluster dies with EOFException

Posted by Aaron Zimmerman <az...@sproutsocial.com>.
What I don't understand is how the entire cluster could die in such a
situation.  I was able to load zookeeper locally using the snapshot and 10g
log file without apparent issue.  I can see how large amounts of data could
cause latency issues in syncing, killing a single worker, but how
would that explain the node's inability to restart?  When the server
replays the log file, does it have to sync the transactions to other nodes
while it does so?

I can alter the settings as has been discussed, but I worry that I'm just
delaying the same thing from happening again, if I deploy another storm
topology or something.  How can I get the cluster in a state where I can be
confident that it won't crash in a similar way as load increases, or at
least set up some kind of monitoring that will let me know something is
unhealthy?
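
The simplest stopgap I can think of for the monitoring piece (a sketch; paths
and thresholds are guesses) is a cron job on each node that flags oversized
txn logs and pulls the basic mntr counters:

    #!/bin/sh
    # flag any txn log over 5G (the size range where trouble started here)
    find /var/lib/zookeeper/version-2 -name 'log.*' -size +5G -print
    # basic health counters from the mntr four-letter word
    echo mntr | nc localhost 2181 | egrep 'zk_avg_latency|zk_outstanding_requests|zk_approximate_data_size'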



On Mon, Jul 7, 2014 at 3:57 AM, Flavio Junqueira <
fpjunqueira@yahoo.com.invalid> wrote:

> In general, you want to minimize the amount of data you need to process
> during recovery (change of leader). When I've seen problems like yours
> happening, it was due to a large snapshot. In your case the snapshot size
> isn't very large, but the txn log files are. Given that the snapshot is of
> reasonable size but the log files are large, I understand that your app is
> creating and deleting a lot of data over time. I'm a bit worried that
> dropping the snapCount might cause your servers to start generating a lot
> of snapshots during short time periods.
>
> If you can find a value of snapCount that leads to reasonably sized logs
> and snapshots without overwhelming servers with the number of snapshots
> generated, then you're good. If you can't, then I suppose you'll need to
> leave  snapCount as is currently or a bit smaller maybe, and adjust
> initLimit, which controls the amount of time a follower to sync with the
> leader when it connects to the leader.
>
> This discussion actually makes me think that a version of snapCount that
> counts bytes instead of txns might work better in your case, but this isn't
> what we have currently.
>
> -Flavio
>
> On 07 Jul 2014, at 00:32, Aaron Zimmerman <az...@sproutsocial.com>
> wrote:
>
> > This happened again this afternoon.  I captured the log file as well as
> the
> > preceding snapshot, but I can't upload to jira obviously, as its 10G or
> so.
> > Is there an easy way to step through it and see what the last, presumably
> > corrupted line is?  I'll try to get zookeeper running in debug mode or
> > something.
> >
> > I think reducing the snap count seems like a good idea, having just read
> > about this setting.  What has me most worried is that this issue somehow
> > propagates through the entire cluster.  Is there some way to gauge these
> > concerns?  Is there a rule of thumb like "log files should be less than
> 5g"
> > or something that we can watch for to head off this problem before it
> > becomes a production issue?
> >
> > But snap count seems like memory defense, would that cause EOF exceptions
> > when reading a DataInputStream?
> >
> > This is the config file, it looks a bit odd since its auto-generated by
> > exhibitor, but here it is verbatim.  I was just reading about having
> > separate devices for dataLogDir and dataDir.  That is definitely
> something
> > I plan to put in place.
> >
> > server.3=zookeeper02-n3\:2888\:3888
> > server.2=zookeeper02-n2\:2888\:3888
> > server.1=zookeeper02-n1\:2888\:3888
> > initLimit=10
> > syncLimit=5
> > clientPort=2181
> > tickTime=2000
> > dataDir=/var/lib/zookeeper
> > dataLogDir=/var/lib/zookeeper
> > server.5=zookeeper02-n5\:2888\:3888
> > server.4=zookeeper02-n4\:2888\:3888
> >
> >
> > On Sun, Jul 6, 2014 at 5:30 PM, Flavio Junqueira <
> > fpjunqueira@yahoo.com.invalid> wrote:
> >
> >> That's a good point, some znodes in the snapshot Aaron shared have over
> >> 400k bytes. There are some znodes with tens of kbytes too.
> >>
> >> Given that we are giving hints about the config parameters, it'd be nice
> >> if we could see the config file.
> >>
> >> -Flavio
> >>
> >> On 06 Jul 2014, at 22:34, Benjamin Reed <br...@apache.org> wrote:
> >>
> >>> the log files seem very large to me. is the snap count really large? or
> >> do
> >>> you have a lot of data in each znode? you might try reducing the snap
> >> count
> >>> to make the log files smaller.
> >>> On Jul 6, 2014 2:27 PM, "Flavio Junqueira"
> <fpjunqueira@yahoo.com.invalid
> >>>
> >>> wrote:
> >>>
> >>>> But what is it that was causing problems in your scenario, Raul? Is it
> >>>> reading the log? In any case, it sounds like initLimit is the
> parameter
> >> you
> >>>> want to change, no?
> >>>>
> >>>> -Flavio
> >>>>
> >>>> On 06 Jul 2014, at 19:09, Raúl Gutiérrez Segalés <rgs@itevenworks.net
> >
> >>>> wrote:
> >>>>
> >>>>> Oh, storm right. Yeah I've seen this. The transaction rate is so huge
> >> the
> >>>>> the initial sync fails.. perhaps you could try bigger tickTime,
> >> initLimit
> >>>>> and syncLimit params...
> >>>>>
> >>>>>
> >>>>> -rgs
> >>>>>
> >>>>>
> >>>>> On 6 July 2014 04:48, Aaron Zimmerman <az...@sproutsocial.com>
> >>>> wrote:
> >>>>>
> >>>>>> Raúl,
> >>>>>>
> >>>>>> zk_approximate_data_size 4899392
> >>>>>>
> >>>>>> That is about the size of the snapshots also.
> >>>>>>
> >>>>>> Benjamin,
> >>>>>>
> >>>>>> We are not running out of disk space.
> >>>>>> But the log.XXXX files are quite large, is this normal?  In less
> than
> >> 3
> >>>>>> hours, the log file since the last snapshot is 8.2G, and the older
> log
> >>>>>> files are as large as 12G.
> >>>>>>
> >>>>>> We are using Storm Trident, this uses zookeeper pretty heavily for
> >>>> tracking
> >>>>>> transactional state, but i'm not sure if that could account for this
> >>>> much
> >>>>>> storage.  Is there an easy way to track which znodes are being
> updated
> >>>> most
> >>>>>> frequently?
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Aaron
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <
> >>>>>> rgs@itevenworks.net>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> What's the total size of the data in your ZK cluster? i.e.:
> >>>>>>>
> >>>>>>> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
> >>>>>>>
> >>>>>>> And/or the size of the snapshot?
> >>>>>>>
> >>>>>>>
> >>>>>>> -rgs
> >>>>>>>
> >>>>>>>
> >>>>>>> On 4 July 2014 06:29, Aaron Zimmerman <azimmerman@sproutsocial.com
> >
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> We have a 5 node zookeeper cluster that has been operating
> normally
> >>>> for
> >>>>>>>> several months.  Starting a few days ago, the entire cluster
> >> crashes a
> >>>>>>> few
> >>>>>>>> times per day, all nodes at the exact same time.  We can't track
> >> down
> >>>>>> the
> >>>>>>>> exact issue, but deleting the snapshots and logs and restarting
> >>>>>> resolves.
> >>>>>>>>
> >>>>>>>> We are running exhibitor to monitor the cluster.
> >>>>>>>>
> >>>>>>>> It appears that something bad gets into the logs, causing an
> >>>>>> EOFException
> >>>>>>>> and this cascades through the entire cluster:
> >>>>>>>>
> >>>>>>>> 2014-07-04 12:55:26,328 [myid:1] - WARN
> >>>>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
> >>>>>> when
> >>>>>>>> following the leader
> >>>>>>>> java.io.EOFException
> >>>>>>>>      at java.io.DataInputStream.readInt(DataInputStream.java:375)
> >>>>>>>>      at
> >>>>>>>>
> >> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >>>>>>>>      at
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> >>>>>>>>      at
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >>>>>>>>      at
> >>>>>>>>
> >>>>
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> >>>>>>>>      at
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> >>>>>>>>      at
> >>>>>>>>
> >> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> >>>>>>>> 2014-07-04 12:55:26,328 [myid:1] - INFO
> >>>>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
> >>>>>>> called
> >>>>>>>> java.lang.Exception: shutdown Follower
> >>>>>>>>      at
> >>>>>>>>
> >>>>
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> >>>>>>>>      at
> >>>>>>>>
> >> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Then the server dies, exhibitor tries to restart each node, and
> they
> >>>>>> all
> >>>>>>>> get stuck trying to replay the bad transaction, logging things
> like:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] -
> >> Reading
> >>>>>>>> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> >>>>>>>> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> >>>>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> >>>>>>>> /var/lib/zookeeper/version-2/log.300000021
> >>>>>>>> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> >>>>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> >>>>>>>> /var/lib/zookeeper/version-2/log.300000021
> >>>>>>>> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> >>>>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> >>>>>>> java.io.EOFException:
> >>>>>>>> Failed to read /var/lib/zookeeper/version-2/log.300000021
> >>>>>>>> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> >>>>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> >>>>>>>> /var/lib/zookeeper/version-2/log.300011fc2
> >>>>>>>> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> >>>>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> >>>>>>>> /var/lib/zookeeper/version-2/log.300011fc2
> >>>>>>>> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> >>>>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> >>>>>>> java.io.EOFException:
> >>>>>>>> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> >>>>>>>>
> >>>>>>>> And the cluster is dead.  The only way we have found to recover is
> >> to
> >>>>>>>> delete all of the data and restart.
> >>>>>>>>
> >>>>>>>> Anyone seen this before?  Any ideas how I can track down what is
> >>>>>> causing
> >>>>>>>> the EOFException, or insulate zookeeper from completely crashing?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Aaron Zimmerman
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>

Re: entire cluster dies with EOFException

Posted by Flavio Junqueira <fp...@yahoo.com.INVALID>.
In general, you want to minimize the amount of data you need to process during recovery (change of leader). When I've seen problems like yours happening, it was due to a large snapshot. In your case the snapshot size isn't very large, but the txn log files are. Given that the snapshot is of reasonable size but the log files are large, I understand that your app is creating and deleting a lot of data over time. I'm a bit worried that dropping the snapCount might cause your servers to start generating a lot of snapshots during short time periods.

If you can find a value of snapCount that leads to reasonably sized logs and snapshots without overwhelming servers with the number of snapshots generated, then you're good. If you can't, then I suppose you'll need to leave snapCount as it is currently (or make it a bit smaller) and adjust initLimit, which controls the amount of time a follower has to sync with the leader when it connects.

This discussion actually makes me think that a version of snapCount that counts bytes instead of txns might work better in your case, but this isn't what we have currently. 
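
To make that concrete, the knobs in question land in zoo.cfg roughly like this (illustrative values only, not a recommendation for this workload):

    # illustrative values only
    tickTime=2000
    # initLimit * tickTime is how long a follower gets to connect and sync (here 60s)
    initLimit=30
    syncLimit=5

snapCount itself can also be lowered via the zookeeper.snapCount Java system property (the default is 100000 txns), e.g. -Dzookeeper.snapCount=50000 in the server JVM flags, if setting it in the config file doesn't take effect in your version.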

-Flavio

On 07 Jul 2014, at 00:32, Aaron Zimmerman <az...@sproutsocial.com> wrote:

> This happened again this afternoon.  I captured the log file as well as the
> preceding snapshot, but I can't upload to jira obviously, as its 10G or so.
> Is there an easy way to step through it and see what the last, presumably
> corrupted line is?  I'll try to get zookeeper running in debug mode or
> something.
> 
> I think reducing the snap count seems like a good idea, having just read
> about this setting.  What has me most worried is that this issue somehow
> propagates through the entire cluster.  Is there some way to gauge these
> concerns?  Is there a rule of thumb like "log files should be less than 5g"
> or something that we can watch for to head off this problem before it
> becomes a production issue?
> 
> But snap count seems like memory defense, would that cause EOF exceptions
> when reading a DataInputStream?
> 
> This is the config file, it looks a bit odd since its auto-generated by
> exhibitor, but here it is verbatim.  I was just reading about having
> separate devices for dataLogDir and dataDir.  That is definitely something
> I plan to put in place.
> 
> server.3=zookeeper02-n3\:2888\:3888
> server.2=zookeeper02-n2\:2888\:3888
> server.1=zookeeper02-n1\:2888\:3888
> initLimit=10
> syncLimit=5
> clientPort=2181
> tickTime=2000
> dataDir=/var/lib/zookeeper
> dataLogDir=/var/lib/zookeeper
> server.5=zookeeper02-n5\:2888\:3888
> server.4=zookeeper02-n4\:2888\:3888
> 
> 
> On Sun, Jul 6, 2014 at 5:30 PM, Flavio Junqueira <
> fpjunqueira@yahoo.com.invalid> wrote:
> 
>> That's a good point, some znodes in the snapshot Aaron shared have over
>> 400k bytes. There are some znodes with tens of kbytes too.
>> 
>> Given that we are giving hints about the config parameters, it'd be nice
>> if we could see the config file.
>> 
>> -Flavio
>> 
>> On 06 Jul 2014, at 22:34, Benjamin Reed <br...@apache.org> wrote:
>> 
>>> the log files seem very large to me. is the snap count really large? or
>> do
>>> you have a lot of data in each znode? you might try reducing the snap
>> count
>>> to make the log files smaller.
>>> On Jul 6, 2014 2:27 PM, "Flavio Junqueira" <fpjunqueira@yahoo.com.invalid
>>> 
>>> wrote:
>>> 
>>>> But what is it that was causing problems in your scenario, Raul? Is it
>>>> reading the log? In any case, it sounds like initLimit is the parameter
>> you
>>>> want to change, no?
>>>> 
>>>> -Flavio
>>>> 
>>>> On 06 Jul 2014, at 19:09, Raúl Gutiérrez Segalés <rg...@itevenworks.net>
>>>> wrote:
>>>> 
>>>>> Oh, storm right. Yeah I've seen this. The transaction rate is so huge
>> the
>>>>> the initial sync fails.. perhaps you could try bigger tickTime,
>> initLimit
>>>>> and syncLimit params...
>>>>> 
>>>>> 
>>>>> -rgs
>>>>> 
>>>>> 
>>>>> On 6 July 2014 04:48, Aaron Zimmerman <az...@sproutsocial.com>
>>>> wrote:
>>>>> 
>>>>>> Raúl,
>>>>>> 
>>>>>> zk_approximate_data_size 4899392
>>>>>> 
>>>>>> That is about the size of the snapshots also.
>>>>>> 
>>>>>> Benjamin,
>>>>>> 
>>>>>> We are not running out of disk space.
>>>>>> But the log.XXXX files are quite large, is this normal?  In less than
>> 3
>>>>>> hours, the log file since the last snapshot is 8.2G, and the older log
>>>>>> files are as large as 12G.
>>>>>> 
>>>>>> We are using Storm Trident, this uses zookeeper pretty heavily for
>>>> tracking
>>>>>> transactional state, but i'm not sure if that could account for this
>>>> much
>>>>>> storage.  Is there an easy way to track which znodes are being updated
>>>> most
>>>>>> frequently?
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Aaron
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <
>>>>>> rgs@itevenworks.net>
>>>>>> wrote:
>>>>>> 
>>>>>>> What's the total size of the data in your ZK cluster? i.e.:
>>>>>>> 
>>>>>>> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
>>>>>>> 
>>>>>>> And/or the size of the snapshot?
>>>>>>> 
>>>>>>> 
>>>>>>> -rgs
>>>>>>> 
>>>>>>> 
>>>>>>> On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com>
>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi all,
>>>>>>>> 
>>>>>>>> We have a 5 node zookeeper cluster that has been operating normally
>>>> for
>>>>>>>> several months.  Starting a few days ago, the entire cluster
>> crashes a
>>>>>>> few
>>>>>>>> times per day, all nodes at the exact same time.  We can't track
>> down
>>>>>> the
>>>>>>>> exact issue, but deleting the snapshots and logs and restarting
>>>>>> resolves.
>>>>>>>> 
>>>>>>>> We are running exhibitor to monitor the cluster.
>>>>>>>> 
>>>>>>>> It appears that something bad gets into the logs, causing an
>>>>>> EOFException
>>>>>>>> and this cascades through the entire cluster:
>>>>>>>> 
>>>>>>>> 2014-07-04 12:55:26,328 [myid:1] - WARN
>>>>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
>>>>>> when
>>>>>>>> following the leader
>>>>>>>> java.io.EOFException
>>>>>>>>      at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>>>>>>      at
>>>>>>>> 
>> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>>>>>>>>      at
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>>>>>>>>      at
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>>>>>>>>      at
>>>>>>>> 
>>>> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>>>>>>>>      at
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>>>>>>>>      at
>>>>>>>> 
>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
>>>>>>>> 2014-07-04 12:55:26,328 [myid:1] - INFO
>>>>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
>>>>>>> called
>>>>>>>> java.lang.Exception: shutdown Follower
>>>>>>>>      at
>>>>>>>> 
>>>> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>>>>>>>>      at
>>>>>>>> 
>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Then the server dies, exhibitor tries to restart each node, and they
>>>>>> all
>>>>>>>> get stuck trying to replay the bad transaction, logging things like:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] -
>> Reading
>>>>>>>> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
>>>>>>>> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>>>>>>>> /var/lib/zookeeper/version-2/log.300000021
>>>>>>>> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>>>>>>>> /var/lib/zookeeper/version-2/log.300000021
>>>>>>>> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>>>>>>> java.io.EOFException:
>>>>>>>> Failed to read /var/lib/zookeeper/version-2/log.300000021
>>>>>>>> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>>>>>>>> /var/lib/zookeeper/version-2/log.300011fc2
>>>>>>>> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>>>>>>>> /var/lib/zookeeper/version-2/log.300011fc2
>>>>>>>> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>>>>>>> java.io.EOFException:
>>>>>>>> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
>>>>>>>> 
>>>>>>>> And the cluster is dead.  The only way we have found to recover is
>> to
>>>>>>>> delete all of the data and restart.
>>>>>>>> 
>>>>>>>> Anyone seen this before?  Any ideas how I can track down what is
>>>>>> causing
>>>>>>>> the EOFException, or insulate zookeeper from completely crashing?
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> 
>>>>>>>> Aaron Zimmerman
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: entire cluster dies with EOFException

Posted by Aaron Zimmerman <az...@sproutsocial.com>.
This happened again this afternoon.  I captured the log file as well as the
preceding snapshot, but I can't upload it to jira obviously, as it's 10G or so.
Is there an easy way to step through it and see what the last, presumably
corrupted, line is?  I'll try to get zookeeper running in debug mode or
something.
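
One way to step through a txn log offline is the LogFormatter class that ships
in the server jar; something like the following (jar names and paths are from a
stock 3.4 install and may need adjusting) prints one line per txn with its
zxid, so the last txn it manages to print is the one right before the bad
record:

    java -cp 'zookeeper-3.4.6.jar:lib/*' org.apache.zookeeper.server.LogFormatter \
        /var/lib/zookeeper/version-2/log.300011fc2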

I think reducing the snap count seems like a good idea, having just read
about this setting.  What has me most worried is that this issue somehow
propagates through the entire cluster.  Is there some way to gauge these
concerns?  Is there a rule of thumb like "log files should be less than 5g"
or something that we can watch for to head off this problem before it
becomes a production issue?

But snap count seems like a memory defense; would that cause EOF exceptions
when reading a DataInputStream?

This is the config file; it looks a bit odd since it's auto-generated by
exhibitor, but here it is verbatim.  I was just reading about having
separate devices for dataLogDir and dataDir.  That is definitely something
I plan to put in place.

server.3=zookeeper02-n3\:2888\:3888
server.2=zookeeper02-n2\:2888\:3888
server.1=zookeeper02-n1\:2888\:3888
initLimit=10
syncLimit=5
clientPort=2181
tickTime=2000
dataDir=/var/lib/zookeeper
dataLogDir=/var/lib/zookeeper
server.5=zookeeper02-n5\:2888\:3888
server.4=zookeeper02-n4\:2888\:3888
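
For reference, the dataDir/dataLogDir split mentioned above would look something like this (the second path is made up; the point is that the txn logs live on a separate, ideally dedicated, device):

    dataDir=/var/lib/zookeeper
    # txn logs on their own disk so log fsyncs don't compete with snapshot writes
    dataLogDir=/mnt/zk-txnlog/zookeeper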


On Sun, Jul 6, 2014 at 5:30 PM, Flavio Junqueira <
fpjunqueira@yahoo.com.invalid> wrote:

> That's a good point, some znodes in the snapshot Aaron shared have over
> 400k bytes. There are some znodes with tens of kbytes too.
>
> Given that we are giving hints about the config parameters, it'd be nice
> if we could see the config file.
>
> -Flavio
>
> On 06 Jul 2014, at 22:34, Benjamin Reed <br...@apache.org> wrote:
>
> > the log files seem very large to me. is the snap count really large? or
> do
> > you have a lot of data in each znode? you might try reducing the snap
> count
> > to make the log files smaller.
> > On Jul 6, 2014 2:27 PM, "Flavio Junqueira" <fpjunqueira@yahoo.com.invalid
> >
> > wrote:
> >
> >> But what is it that was causing problems in your scenario, Raul? Is it
> >> reading the log? In any case, it sounds like initLimit is the parameter
> you
> >> want to change, no?
> >>
> >> -Flavio
> >>
> >> On 06 Jul 2014, at 19:09, Raúl Gutiérrez Segalés <rg...@itevenworks.net>
> >> wrote:
> >>
> >>> Oh, storm right. Yeah I've seen this. The transaction rate is so huge
> the
> >>> the initial sync fails.. perhaps you could try bigger tickTime,
> initLimit
> >>> and syncLimit params...
> >>>
> >>>
> >>> -rgs
> >>>
> >>>
> >>> On 6 July 2014 04:48, Aaron Zimmerman <az...@sproutsocial.com>
> >> wrote:
> >>>
> >>>> Raúl,
> >>>>
> >>>> zk_approximate_data_size 4899392
> >>>>
> >>>> That is about the size of the snapshots also.
> >>>>
> >>>> Benjamin,
> >>>>
> >>>> We are not running out of disk space.
> >>>> But the log.XXXX files are quite large, is this normal?  In less than
> 3
> >>>> hours, the log file since the last snapshot is 8.2G, and the older log
> >>>> files are as large as 12G.
> >>>>
> >>>> We are using Storm Trident, this uses zookeeper pretty heavily for
> >> tracking
> >>>> transactional state, but i'm not sure if that could account for this
> >> much
> >>>> storage.  Is there an easy way to track which znodes are being updated
> >> most
> >>>> frequently?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Aaron
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <
> >>>> rgs@itevenworks.net>
> >>>> wrote:
> >>>>
> >>>>> What's the total size of the data in your ZK cluster? i.e.:
> >>>>>
> >>>>> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
> >>>>>
> >>>>> And/or the size of the snapshot?
> >>>>>
> >>>>>
> >>>>> -rgs
> >>>>>
> >>>>>
> >>>>> On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com>
> >>>> wrote:
> >>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> We have a 5 node zookeeper cluster that has been operating normally
> >> for
> >>>>>> several months.  Starting a few days ago, the entire cluster
> crashes a
> >>>>> few
> >>>>>> times per day, all nodes at the exact same time.  We can't track
> down
> >>>> the
> >>>>>> exact issue, but deleting the snapshots and logs and restarting
> >>>> resolves.
> >>>>>>
> >>>>>> We are running exhibitor to monitor the cluster.
> >>>>>>
> >>>>>> It appears that something bad gets into the logs, causing an
> >>>> EOFException
> >>>>>> and this cascades through the entire cluster:
> >>>>>>
> >>>>>> 2014-07-04 12:55:26,328 [myid:1] - WARN
> >>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
> >>>> when
> >>>>>> following the leader
> >>>>>> java.io.EOFException
> >>>>>>       at java.io.DataInputStream.readInt(DataInputStream.java:375)
> >>>>>>       at
> >>>>>>
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >>>>>>       at
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> >>>>>>       at
> >>>>>>
> >>>>>
> >>>>
> >>
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >>>>>>       at
> >>>>>>
> >> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> >>>>>>       at
> >>>>>>
> >>>>>
> >>>>
> >>
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> >>>>>>       at
> >>>>>>
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> >>>>>> 2014-07-04 12:55:26,328 [myid:1] - INFO
> >>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
> >>>>> called
> >>>>>> java.lang.Exception: shutdown Follower
> >>>>>>       at
> >>>>>>
> >> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> >>>>>>       at
> >>>>>>
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
> >>>>>>
> >>>>>>
> >>>>>> Then the server dies, exhibitor tries to restart each node, and they
> >>>> all
> >>>>>> get stuck trying to replay the bad transaction, logging things like:
> >>>>>>
> >>>>>>
> >>>>>> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] -
> Reading
> >>>>>> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> >>>>>> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> >>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> >>>>>> /var/lib/zookeeper/version-2/log.300000021
> >>>>>> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> >>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> >>>>>> /var/lib/zookeeper/version-2/log.300000021
> >>>>>> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> >>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> >>>>> java.io.EOFException:
> >>>>>> Failed to read /var/lib/zookeeper/version-2/log.300000021
> >>>>>> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> >>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> >>>>>> /var/lib/zookeeper/version-2/log.300011fc2
> >>>>>> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> >>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> >>>>>> /var/lib/zookeeper/version-2/log.300011fc2
> >>>>>> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> >>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> >>>>> java.io.EOFException:
> >>>>>> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> >>>>>>
> >>>>>> And the cluster is dead.  The only way we have found to recover is
> to
> >>>>>> delete all of the data and restart.
> >>>>>>
> >>>>>> Anyone seen this before?  Any ideas how I can track down what is
> >>>> causing
> >>>>>> the EOFException, or insulate zookeeper from completely crashing?
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Aaron Zimmerman
> >>>>>>
> >>>>>
> >>>>
> >>
> >>
>
>

Re: entire cluster dies with EOFException

Posted by Flavio Junqueira <fp...@yahoo.com.INVALID>.
That's a good point: some znodes in the snapshot Aaron shared have over 400k bytes, and there are some znodes with tens of kbytes too.

Given that we are giving hints about the config parameters, it'd be nice if we could see the config file.
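
In case it helps reproduce that inspection, the snapshot can be dumped offline with the SnapshotFormatter class from the server jar (a sketch; classpath as in a stock 3.4 install), which prints each znode path along with its dataLength:

    java -cp 'zookeeper-3.4.6.jar:lib/*' org.apache.zookeeper.server.SnapshotFormatter \
        /var/lib/zookeeper/version-2/snapshot.300011fc0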

-Flavio

On 06 Jul 2014, at 22:34, Benjamin Reed <br...@apache.org> wrote:

> the log files seem very large to me. is the snap count really large? or do
> you have a lot of data in each znode? you might try reducing the snap count
> to make the log files smaller.
> On Jul 6, 2014 2:27 PM, "Flavio Junqueira" <fp...@yahoo.com.invalid>
> wrote:
> 
>> But what is it that was causing problems in your scenario, Raul? Is it
>> reading the log? In any case, it sounds like initLimit is the parameter you
>> want to change, no?
>> 
>> -Flavio
>> 
>> On 06 Jul 2014, at 19:09, Raúl Gutiérrez Segalés <rg...@itevenworks.net>
>> wrote:
>> 
>>> Oh, storm right. Yeah I've seen this. The transaction rate is so huge the
>>> the initial sync fails.. perhaps you could try bigger tickTime, initLimit
>>> and syncLimit params...
>>> 
>>> 
>>> -rgs
>>> 
>>> 
>>> On 6 July 2014 04:48, Aaron Zimmerman <az...@sproutsocial.com>
>> wrote:
>>> 
>>>> Raúl,
>>>> 
>>>> zk_approximate_data_size 4899392
>>>> 
>>>> That is about the size of the snapshots also.
>>>> 
>>>> Benjamin,
>>>> 
>>>> We are not running out of disk space.
>>>> But the log.XXXX files are quite large, is this normal?  In less than 3
>>>> hours, the log file since the last snapshot is 8.2G, and the older log
>>>> files are as large as 12G.
>>>> 
>>>> We are using Storm Trident, this uses zookeeper pretty heavily for
>> tracking
>>>> transactional state, but i'm not sure if that could account for this
>> much
>>>> storage.  Is there an easy way to track which znodes are being updated
>> most
>>>> frequently?
>>>> 
>>>> Thanks,
>>>> 
>>>> Aaron
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <
>>>> rgs@itevenworks.net>
>>>> wrote:
>>>> 
>>>>> What's the total size of the data in your ZK cluster? i.e.:
>>>>> 
>>>>> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
>>>>> 
>>>>> And/or the size of the snapshot?
>>>>> 
>>>>> 
>>>>> -rgs
>>>>> 
>>>>> 
>>>>> On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com>
>>>> wrote:
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> We have a 5 node zookeeper cluster that has been operating normally
>> for
>>>>>> several months.  Starting a few days ago, the entire cluster crashes a
>>>>> few
>>>>>> times per day, all nodes at the exact same time.  We can't track down
>>>> the
>>>>>> exact issue, but deleting the snapshots and logs and restarting
>>>> resolves.
>>>>>> 
>>>>>> We are running exhibitor to monitor the cluster.
>>>>>> 
>>>>>> It appears that something bad gets into the logs, causing an
>>>> EOFException
>>>>>> and this cascades through the entire cluster:
>>>>>> 
>>>>>> 2014-07-04 12:55:26,328 [myid:1] - WARN
>>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
>>>> when
>>>>>> following the leader
>>>>>> java.io.EOFException
>>>>>>       at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>>>>       at
>>>>>> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>>>>>>       at
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>>>>>>       at
>>>>>> 
>>>>> 
>>>> 
>> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>>>>>>       at
>>>>>> 
>> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>>>>>>       at
>>>>>> 
>>>>> 
>>>> 
>> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>>>>>>       at
>>>>>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
>>>>>> 2014-07-04 12:55:26,328 [myid:1] - INFO
>>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
>>>>> called
>>>>>> java.lang.Exception: shutdown Follower
>>>>>>       at
>>>>>> 
>> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>>>>>>       at
>>>>>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>>>>>> 
>>>>>> 
>>>>>> Then the server dies, exhibitor tries to restart each node, and they
>>>> all
>>>>>> get stuck trying to replay the bad transaction, logging things like:
>>>>>> 
>>>>>> 
>>>>>> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
>>>>>> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
>>>>>> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
>>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>>>>>> /var/lib/zookeeper/version-2/log.300000021
>>>>>> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
>>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>>>>>> /var/lib/zookeeper/version-2/log.300000021
>>>>>> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
>>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>>>>> java.io.EOFException:
>>>>>> Failed to read /var/lib/zookeeper/version-2/log.300000021
>>>>>> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
>>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>>>>>> /var/lib/zookeeper/version-2/log.300011fc2
>>>>>> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
>>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>>>>>> /var/lib/zookeeper/version-2/log.300011fc2
>>>>>> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
>>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>>>>> java.io.EOFException:
>>>>>> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
>>>>>> 
>>>>>> And the cluster is dead.  The only way we have found to recover is to
>>>>>> delete all of the data and restart.
>>>>>> 
>>>>>> Anyone seen this before?  Any ideas how I can track down what is
>>>> causing
>>>>>> the EOFException, or insulate zookeeper from completely crashing?
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Aaron Zimmerman
>>>>>> 
>>>>> 
>>>> 
>> 
>> 


Re: entire cluster dies with EOFException

Posted by Benjamin Reed <br...@apache.org>.
The log files seem very large to me. Is the snap count really large? Or do
you have a lot of data in each znode? You might try reducing the snap count
to make the log files smaller.
On Jul 6, 2014 2:27 PM, "Flavio Junqueira" <fp...@yahoo.com.invalid>
wrote:

> But what is it that was causing problems in your scenario, Raul? Is it
> reading the log? In any case, it sounds like initLimit is the parameter you
> want to change, no?
>
> -Flavio
>
> On 06 Jul 2014, at 19:09, Raúl Gutiérrez Segalés <rg...@itevenworks.net>
> wrote:
>
> > Oh, storm right. Yeah I've seen this. The transaction rate is so huge the
> > the initial sync fails.. perhaps you could try bigger tickTime, initLimit
> > and syncLimit params...
> >
> >
> > -rgs
> >
> >
> > On 6 July 2014 04:48, Aaron Zimmerman <az...@sproutsocial.com>
> wrote:
> >
> >> Raúl,
> >>
> >> zk_approximate_data_size 4899392
> >>
> >> That is about the size of the snapshots also.
> >>
> >> Benjamin,
> >>
> >> We are not running out of disk space.
> >> But the log.XXXX files are quite large, is this normal?  In less than 3
> >> hours, the log file since the last snapshot is 8.2G, and the older log
> >> files are as large as 12G.
> >>
> >> We are using Storm Trident, this uses zookeeper pretty heavily for
> tracking
> >> transactional state, but i'm not sure if that could account for this
> much
> >> storage.  Is there an easy way to track which znodes are being updated
> most
> >> frequently?
> >>
> >> Thanks,
> >>
> >> Aaron
> >>
> >>
> >>
> >>
> >>
> >> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <
> >> rgs@itevenworks.net>
> >> wrote:
> >>
> >>> What's the total size of the data in your ZK cluster? i.e.:
> >>>
> >>> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
> >>>
> >>> And/or the size of the snapshot?
> >>>
> >>>
> >>> -rgs
> >>>
> >>>
> >>> On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com>
> >> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> We have a 5 node zookeeper cluster that has been operating normally
> for
> >>>> several months.  Starting a few days ago, the entire cluster crashes a
> >>> few
> >>>> times per day, all nodes at the exact same time.  We can't track down
> >> the
> >>>> exact issue, but deleting the snapshots and logs and restarting
> >> resolves.
> >>>>
> >>>> We are running exhibitor to monitor the cluster.
> >>>>
> >>>> It appears that something bad gets into the logs, causing an
> >> EOFException
> >>>> and this cascades through the entire cluster:
> >>>>
> >>>> 2014-07-04 12:55:26,328 [myid:1] - WARN
> >>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
> >> when
> >>>> following the leader
> >>>> java.io.EOFException
> >>>>        at java.io.DataInputStream.readInt(DataInputStream.java:375)
> >>>>        at
> >>>> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >>>>        at
> >>>>
> >>>>
> >>>
> >>
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> >>>>        at
> >>>>
> >>>
> >>
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >>>>        at
> >>>>
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> >>>>        at
> >>>>
> >>>
> >>
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> >>>>        at
> >>>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> >>>> 2014-07-04 12:55:26,328 [myid:1] - INFO
> >>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
> >>> called
> >>>> java.lang.Exception: shutdown Follower
> >>>>        at
> >>>>
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> >>>>        at
> >>>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
> >>>>
> >>>>
> >>>> Then the server dies, exhibitor tries to restart each node, and they
> >> all
> >>>> get stuck trying to replay the bad transaction, logging things like:
> >>>>
> >>>>
> >>>> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
> >>>> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> >>>> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> >>>> /var/lib/zookeeper/version-2/log.300000021
> >>>> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> >>>> /var/lib/zookeeper/version-2/log.300000021
> >>>> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> >>> java.io.EOFException:
> >>>> Failed to read /var/lib/zookeeper/version-2/log.300000021
> >>>> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> >>>> /var/lib/zookeeper/version-2/log.300011fc2
> >>>> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> >>>> /var/lib/zookeeper/version-2/log.300011fc2
> >>>> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> >>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> >>> java.io.EOFException:
> >>>> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> >>>>
> >>>> And the cluster is dead.  The only way we have found to recover is to
> >>>> delete all of the data and restart.
> >>>>
> >>>> Anyone seen this before?  Any ideas how I can track down what is
> >> causing
> >>>> the EOFException, or insulate zookeeper from completely crashing?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Aaron Zimmerman
> >>>>
> >>>
> >>
>
>

Re: entire cluster dies with EOFException

Posted by Flavio Junqueira <fp...@yahoo.com.INVALID>.
But what is it that was causing problems in your scenario, Raul? Is it reading the log? In any case, it sounds like initLimit is the parameter you want to change, no?

-Flavio
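
As a rough illustration of why initLimit is the knob here (the numbers are
only an example, not taken from this cluster): a follower has to finish its
initial sync within initLimit * tickTime. With the common sample values
tickTime=2000 ms and initLimit=10, that is 10 * 2000 ms = 20 s, which is
easy to blow through when the leader has to ship a multi-gigabyte snapshot
plus transaction log; raising initLimit to, say, 300 stretches the window
to 600 s.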

On 06 Jul 2014, at 19:09, Raúl Gutiérrez Segalés <rg...@itevenworks.net> wrote:

> Oh, storm right. Yeah I've seen this. The transaction rate is so huge the
> the initial sync fails.. perhaps you could try bigger tickTime, initLimit
> and syncLimit params...
> 
> 
> -rgs
> 
> 
> On 6 July 2014 04:48, Aaron Zimmerman <az...@sproutsocial.com> wrote:
> 
>> Raúl,
>> 
>> zk_approximate_data_size 4899392
>> 
>> That is about the size of the snapshots also.
>> 
>> Benjamin,
>> 
>> We are not running out of disk space.
>> But the log.XXXX files are quite large, is this normal?  In less than 3
>> hours, the log file since the last snapshot is 8.2G, and the older log
>> files are as large as 12G.
>> 
>> We are using Storm Trident, this uses zookeeper pretty heavily for tracking
>> transactional state, but i'm not sure if that could account for this much
>> storage.  Is there an easy way to track which znodes are being updated most
>> frequently?
>> 
>> Thanks,
>> 
>> Aaron
>> 
>> 
>> 
>> 
>> 
>> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <
>> rgs@itevenworks.net>
>> wrote:
>> 
>>> What's the total size of the data in your ZK cluster? i.e.:
>>> 
>>> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
>>> 
>>> And/or the size of the snapshot?
>>> 
>>> 
>>> -rgs
>>> 
>>> 
>>> On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com>
>> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> We have a 5 node zookeeper cluster that has been operating normally for
>>>> several months.  Starting a few days ago, the entire cluster crashes a
>>> few
>>>> times per day, all nodes at the exact same time.  We can't track down
>> the
>>>> exact issue, but deleting the snapshots and logs and restarting
>> resolves.
>>>> 
>>>> We are running exhibitor to monitor the cluster.
>>>> 
>>>> It appears that something bad gets into the logs, causing an
>> EOFException
>>>> and this cascades through the entire cluster:
>>>> 
>>>> 2014-07-04 12:55:26,328 [myid:1] - WARN
>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
>> when
>>>> following the leader
>>>> java.io.EOFException
>>>>        at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>>        at
>>>> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>>>>        at
>>>> 
>>>> 
>>> 
>> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>>>>        at
>>>> 
>>> 
>> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>>>>        at
>>>> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>>>>        at
>>>> 
>>> 
>> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>>>>        at
>>>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
>>>> 2014-07-04 12:55:26,328 [myid:1] - INFO
>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
>>> called
>>>> java.lang.Exception: shutdown Follower
>>>>        at
>>>> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>>>>        at
>>>> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>>>> 
>>>> 
>>>> Then the server dies, exhibitor tries to restart each node, and they
>> all
>>>> get stuck trying to replay the bad transaction, logging things like:
>>>> 
>>>> 
>>>> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
>>>> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
>>>> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>>>> /var/lib/zookeeper/version-2/log.300000021
>>>> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>>>> /var/lib/zookeeper/version-2/log.300000021
>>>> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>>> java.io.EOFException:
>>>> Failed to read /var/lib/zookeeper/version-2/log.300000021
>>>> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>>>> /var/lib/zookeeper/version-2/log.300011fc2
>>>> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>>>> /var/lib/zookeeper/version-2/log.300011fc2
>>>> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>>> java.io.EOFException:
>>>> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
>>>> 
>>>> And the cluster is dead.  The only way we have found to recover is to
>>>> delete all of the data and restart.
>>>> 
>>>> Anyone seen this before?  Any ideas how I can track down what is
>> causing
>>>> the EOFException, or insulate zookeeper from completely crashing?
>>>> 
>>>> Thanks,
>>>> 
>>>> Aaron Zimmerman
>>>> 
>>> 
>> 


Re: entire cluster dies with EOFException

Posted by Raúl Gutiérrez Segalés <rg...@itevenworks.net>.
Oh, Storm, right. Yeah, I've seen this. The transaction rate is so high
that the initial sync fails. Perhaps you could try bigger tickTime,
initLimit, and syncLimit params...


-rgs
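
A minimal zoo.cfg sketch of the three knobs mentioned above (values are
illustrative only and would need tuning against the real sync times):

  # base time unit, in milliseconds
  tickTime=2000
  # ticks a follower may take to connect to and sync with the leader
  # (300 * 2000 ms = 10 minutes here)
  initLimit=300
  # ticks a follower may lag behind the leader before being dropped
  # (30 * 2000 ms = 60 seconds here)
  syncLimit=30

These are server-side settings, so they go into zoo.cfg on every node and
take effect after a restart.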


On 6 July 2014 04:48, Aaron Zimmerman <az...@sproutsocial.com> wrote:

> Raúl,
>
> zk_approximate_data_size 4899392
>
> That is about the size of the snapshots also.
>
> Benjamin,
>
> We are not running out of disk space.
> But the log.XXXX files are quite large, is this normal?  In less than 3
> hours, the log file since the last snapshot is 8.2G, and the older log
> files are as large as 12G.
>
> We are using Storm Trident, this uses zookeeper pretty heavily for tracking
> transactional state, but i'm not sure if that could account for this much
> storage.  Is there an easy way to track which znodes are being updated most
> frequently?
>
> Thanks,
>
> Aaron
>
>
>
>
>
> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <
> rgs@itevenworks.net>
> wrote:
>
> > What's the total size of the data in your ZK cluster? i.e.:
> >
> > $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
> >
> > And/or the size of the snapshot?
> >
> >
> > -rgs
> >
> >
> > On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com>
> wrote:
> >
> > > Hi all,
> > >
> > > We have a 5 node zookeeper cluster that has been operating normally for
> > > several months.  Starting a few days ago, the entire cluster crashes a
> > few
> > > times per day, all nodes at the exact same time.  We can't track down
> the
> > > exact issue, but deleting the snapshots and logs and restarting
> resolves.
> > >
> > > We are running exhibitor to monitor the cluster.
> > >
> > > It appears that something bad gets into the logs, causing an
> EOFException
> > > and this cascades through the entire cluster:
> > >
> > > 2014-07-04 12:55:26,328 [myid:1] - WARN
> > >  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
> when
> > > following the leader
> > > java.io.EOFException
> > >         at java.io.DataInputStream.readInt(DataInputStream.java:375)
> > >         at
> > > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> > >         at
> > >
> > >
> >
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> > >         at
> > >
> >
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> > >         at
> > > org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> > >         at
> > >
> >
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> > >         at
> > > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> > > 2014-07-04 12:55:26,328 [myid:1] - INFO
> > >  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
> > called
> > > java.lang.Exception: shutdown Follower
> > >         at
> > > org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> > >         at
> > > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
> > >
> > >
> > > Then the server dies, exhibitor tries to restart each node, and they
> all
> > > get stuck trying to replay the bad transaction, logging things like:
> > >
> > >
> > > 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
> > > snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> > > 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> > > /var/lib/zookeeper/version-2/log.300000021
> > > 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> > > /var/lib/zookeeper/version-2/log.300000021
> > > 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> > java.io.EOFException:
> > > Failed to read /var/lib/zookeeper/version-2/log.300000021
> > > 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> > > /var/lib/zookeeper/version-2/log.300011fc2
> > > 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> > > /var/lib/zookeeper/version-2/log.300011fc2
> > > 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> > > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> > java.io.EOFException:
> > > Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> > >
> > > And the cluster is dead.  The only way we have found to recover is to
> > > delete all of the data and restart.
> > >
> > > Anyone seen this before?  Any ideas how I can track down what is
> causing
> > > the EOFException, or insulate zookeeper from completely crashing?
> > >
> > > Thanks,
> > >
> > > Aaron Zimmerman
> > >
> >
>

Re: entire cluster dies with EOFException

Posted by Aaron Zimmerman <az...@sproutsocial.com>.
Raúl,

zk_approximate_data_size 4899392

That is about the size of the snapshots also.

Benjamin,

We are not running out of disk space.
But the log.XXXX files are quite large; is this normal?  In less than 3
hours, the log file since the last snapshot is 8.2G, and the older log
files are as large as 12G.

We are using Storm Trident, which uses ZooKeeper pretty heavily for tracking
transactional state, but I'm not sure whether that could account for this
much storage.  Is there an easy way to track which znodes are being updated
most frequently?

Thanks,

Aaron
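
One low-tech way to see which znodes dominate the write traffic is to dump
a transaction log with the LogFormatter class that ships with ZooKeeper and
count the paths. A sketch, assuming a 3.4.x tarball layout (the jar name,
classpath, and the grep pattern for extracting paths are guesses that may
need adjusting to the local install and to LogFormatter's exact output):

  cd /opt/zookeeper
  java -cp zookeeper-3.4.6.jar:lib/*:conf \
      org.apache.zookeeper.server.LogFormatter \
      /var/lib/zookeeper/version-2/log.300011fc2 \
    | grep -o ' /[^ ]*' | sort | uniq -c | sort -rn | head -20

If a handful of Trident state paths show up with very large counts, that
points at the source of the 8-12G log files.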





On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés <rg...@itevenworks.net>
wrote:

> What's the total size of the data in your ZK cluster? i.e.:
>
> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
>
> And/or the size of the snapshot?
>
>
> -rgs
>
>
> On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com> wrote:
>
> > Hi all,
> >
> > We have a 5 node zookeeper cluster that has been operating normally for
> > several months.  Starting a few days ago, the entire cluster crashes a
> few
> > times per day, all nodes at the exact same time.  We can't track down the
> > exact issue, but deleting the snapshots and logs and restarting resolves.
> >
> > We are running exhibitor to monitor the cluster.
> >
> > It appears that something bad gets into the logs, causing an EOFException
> > and this cascades through the entire cluster:
> >
> > 2014-07-04 12:55:26,328 [myid:1] - WARN
> >  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
> > following the leader
> > java.io.EOFException
> >         at java.io.DataInputStream.readInt(DataInputStream.java:375)
> >         at
> > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >         at
> >
> >
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> >         at
> >
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >         at
> > org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> >         at
> >
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> >         at
> > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> > 2014-07-04 12:55:26,328 [myid:1] - INFO
> >  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
> called
> > java.lang.Exception: shutdown Follower
> >         at
> > org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> >         at
> > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
> >
> >
> > Then the server dies, exhibitor tries to restart each node, and they all
> > get stuck trying to replay the bad transaction, logging things like:
> >
> >
> > 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
> > snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> > 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> > /var/lib/zookeeper/version-2/log.300000021
> > 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> > /var/lib/zookeeper/version-2/log.300000021
> > 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> java.io.EOFException:
> > Failed to read /var/lib/zookeeper/version-2/log.300000021
> > 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> > /var/lib/zookeeper/version-2/log.300011fc2
> > 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> > /var/lib/zookeeper/version-2/log.300011fc2
> > 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> > [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
> java.io.EOFException:
> > Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> >
> > And the cluster is dead.  The only way we have found to recover is to
> > delete all of the data and restart.
> >
> > Anyone seen this before?  Any ideas how I can track down what is causing
> > the EOFException, or insulate zookeeper from completely crashing?
> >
> > Thanks,
> >
> > Aaron Zimmerman
> >
>

Re: entire cluster dies with EOFException

Posted by Raúl Gutiérrez Segalés <rg...@itevenworks.net>.
What's the total size of the data in your ZK cluster? i.e.:

$ echo mntr | nc localhost 2181 | grep zk_approximate_data_size

And/or the size of the snapshot?


-rgs
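
For the snapshot part, the on-disk files can be checked directly (using the
data directory that appears elsewhere in this thread):

  ls -lh /var/lib/zookeeper/version-2/snapshot.*
  du -sh /var/lib/zookeeper/version-2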


On 4 July 2014 06:29, Aaron Zimmerman <az...@sproutsocial.com> wrote:

> Hi all,
>
> We have a 5 node zookeeper cluster that has been operating normally for
> several months.  Starting a few days ago, the entire cluster crashes a few
> times per day, all nodes at the exact same time.  We can't track down the
> exact issue, but deleting the snapshots and logs and restarting resolves.
>
> We are running exhibitor to monitor the cluster.
>
> It appears that something bad gets into the logs, causing an EOFException
> and this cascades through the entire cluster:
>
> 2014-07-04 12:55:26,328 [myid:1] - WARN
>  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
> following the leader
> java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:375)
>         at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at
>
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>         at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>         at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>         at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> 2014-07-04 12:55:26,328 [myid:1] - INFO
>  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
>         at
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>         at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>
>
> Then the server dies, exhibitor tries to restart each node, and they all
> get stuck trying to replay the bad transaction, logging things like:
>
>
> 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading
> snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:
> Failed to read /var/lib/zookeeper/version-2/log.300000021
> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
> /var/lib/zookeeper/version-2/log.300011fc2
> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
> /var/lib/zookeeper/version-2/log.300011fc2
> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException:
> Failed to read /var/lib/zookeeper/version-2/log.300011fc2
>
> And the cluster is dead.  The only way we have found to recover is to
> delete all of the data and restart.
>
> Anyone seen this before?  Any ideas how I can track down what is causing
> the EOFException, or insulate zookeeper from completely crashing?
>
> Thanks,
>
> Aaron Zimmerman
>