Posted to user@zookeeper.apache.org by Srikanth R <rs...@gmail.com> on 2013/10/10 19:31:06 UTC

zkclient timeout issue

Hi Guys,

Need some expert advice :)

I have a 3-server ZooKeeper ensemble. Only one client (hadoop-zkfc, with a
5-second timeout) is connected to ZooKeeper. There is no other activity;
one txn is recorded every 10 minutes in the ZooKeeper txn log. ZooKeeper is
as close to idle as it can be, except for the session traffic from the
hadoop-zkfc client.

Even with no writes happening to the ZooKeeper data dir, if I start a
disk-intensive process on the same partition that holds the datadir (like
raid-check or cat'ing huge files), I observe zkclient session timeouts.
(Error: Client has not heard back from Server in 3334 ms, so
disconnecting and reconnecting)
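
The 3334 ms in that error is consistent with how the ZooKeeper Java client
appears to derive its read timeout: two thirds of the negotiated session
timeout, so a 5000 ms session allows roughly 3333 ms of silence from the
server. A minimal sketch of that arithmetic, assuming the negotiated timeout
equals the requested 5000 ms (the function name is illustrative, not a
ZooKeeper API):

```python
def read_timeout_ms(session_timeout_ms: int) -> int:
    """Approximate client read timeout: two thirds of the session timeout.

    Assumption: the server did not adjust the requested session timeout
    during negotiation, so the value passed in is the effective one.
    """
    return session_timeout_ms * 2 // 3


if __name__ == "__main__":
    # Session timeout mentioned in this thread (5 seconds).
    print(read_timeout_ms(5000))
```

If this holds, the client gives up as soon as the server has been silent for
just over 3333 ms, well before the full 5 seconds elapse.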

strace on the ZooKeeper process shows that it received the 12-byte heartbeat
from the client but did not respond. strace also shows no disk activity
from ZooKeeper.

So the question is: does ZooKeeper do anything on the disk even when it's
idle (not writing any txns to disk)? Why does disk utilization on the datadir
partition affect ZooKeeper even without any traffic?

Any help is appreciated in this regard.

Thanks,
Srikanth

Re: zkclient timeout issue

Posted by Patrick Hunt <ph...@apache.org>.
Are you sure that the JVM is not swapping? Perhaps the "disk intensive
process" is also eating up a lot of memory, and that's pushing the ZK server
out to swap?

Turn swappiness off and retry.
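
One way to check this on Linux is to look at the VmSwap field in
/proc/<pid>/status for the ZooKeeper JVM and at the current value of
/proc/sys/vm/swappiness. A minimal sketch, assuming a kernel that exposes
VmSwap; the helper names are illustrative, and the pid has to be supplied by
you:

```python
import re
import sys
from pathlib import Path
from typing import Optional


def vm_swap_kb(status_text: str) -> Optional[int]:
    """Return the VmSwap value (kB) from /proc/<pid>/status text, or None if absent."""
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def swapped_kb_for_pid(pid: int) -> Optional[int]:
    """Read /proc/<pid>/status and report how much of the process is in swap."""
    return vm_swap_kb(Path(f"/proc/{pid}/status").read_text())


def current_swappiness() -> int:
    """Read the kernel's vm.swappiness setting."""
    return int(Path("/proc/sys/vm/swappiness").read_text())


if __name__ == "__main__":
    # Usage: python check_swap.py <pid of the ZooKeeper JVM>
    pid = int(sys.argv[1])
    print("VmSwap (kB):", swapped_kb_for_pid(pid))
    print("vm.swappiness:", current_swappiness())
```

A non-zero VmSwap that grows while the disk-intensive job runs would support
the swapping theory. The setting itself can be changed with
`sysctl -w vm.swappiness=0` as root.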

Patrick
