Posted to user@zookeeper.apache.org by Gunnar Wagenknecht <gu...@wagenknecht.org> on 2011/06/21 09:14:31 UTC
ZooKeeper Clients waiting forever (hanging threads)
Hi,
I have an issue with ZK clients waiting forever. The stack for the
waiting threads looks like the following.
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:485)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1317)
> - locked <0x00002aab19a019b0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1241)
> at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1271)
> ...
Looking further at the thread dump, I noticed many more hung threads, all
with a similar call stack (though from different client calls).
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:485)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1317)
> - locked <0x00002aab19a013a8> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:804)
> ...
Looking at the logs, it seems that all this started with a connection
loss during the night.
03:31:32.855 [Worker-76] WARN ... KeeperErrorCode = ConnectionLoss ...
03:31:32.867 [Worker-65] WARN ... KeeperErrorCode = ConnectionLoss ...
However, then I found this:
03:32:49.417 [ZooKeeper Gate Connect Thread-SendThread(zk-03:2181)] ERROR org.apache.zookeeper.ClientCnxn - from ZooKeeper Gate Connect Thread-SendThread(zk-03:2181)
java.lang.OutOfMemoryError: Java heap space
at java.util.HashMap.resize(HashMap.java:462) ~[na:1.6.0_24]
at java.util.HashMap.addEntry(HashMap.java:755) ~[na:1.6.0_24]
at java.util.HashMap.put(HashMap.java:385) ~[na:1.6.0_24]
at java.util.HashSet.add(HashSet.java:200) ~[na:1.6.0_24]
at java.util.AbstractCollection.addAll(AbstractCollection.java:305) ~[na:1.6.0_24]
at org.apache.zookeeper.ZooKeeper$ZKWatchManager.materialize(ZooKeeper.java:165) ~[na:na]
at org.apache.zookeeper.ClientCnxn$EventThread.queueEvent(ClientCnxn.java:474) ~[na:na]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1172) ~[na:na]
The OutOfMemoryError was thrown on the SendThread. Could this have
triggered a race condition in the ZK client that leaves the waiting
threads hung?
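
For illustration, here is a simplified model of the kind of hang suspected
above. It is a sketch, not ZooKeeper's actual code: `HangSketch`, `Packet`,
`awaitReply` and `startSender` are hypothetical names, and the assumption is
that a caller blocks on a per-request object until an I/O thread marks it
finished. If that I/O thread dies first (here simulated with an
OutOfMemoryError), an untimed wait() would never return. The bounded wait
shown here lets the caller give up instead.

```java
import java.util.concurrent.TimeUnit;

/**
 * Simplified model of a client request waiting for its reply.
 * NOT ZooKeeper's code: all names here are hypothetical.
 */
public class HangSketch {

    /** Stand-in for a queued request; 'finished' is set by the sender thread. */
    static final class Packet {
        boolean finished;
    }

    /**
     * Waits until the packet is finished or timeoutMillis has elapsed.
     * Returns true if a reply arrived, false on timeout or interrupt.
     */
    static boolean awaitReply(Packet p, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (p) {
            while (!p.finished) {
                long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remaining <= 0) {
                    // Never call wait(0): that would block without a timeout.
                    return false;
                }
                try {
                    p.wait(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    /** Starts a sender thread that either completes the packet or dies first. */
    static Thread startSender(Packet p, boolean dieBeforeReply) {
        Thread t = new Thread(() -> {
            if (dieBeforeReply) {
                // Simulates the sender being killed by an Error before replying.
                throw new OutOfMemoryError("simulated");
            }
            synchronized (p) {
                p.finished = true;
                p.notifyAll();
            }
        }, "sender");
        // Keep the simulated error from printing a stack trace.
        t.setUncaughtExceptionHandler((thread, error) -> { });
        t.start();
        return t;
    }

    public static void main(String[] args) {
        Packet ok = new Packet();
        startSender(ok, false);
        System.out.println("healthy sender, reply arrived: " + awaitReply(ok, 1000));

        Packet lost = new Packet();
        startSender(lost, true);
        System.out.println("dead sender, reply arrived: " + awaitReply(lost, 200));
    }
}
```

With a healthy sender the call returns true. With a sender that dies before
notifying, the call returns false after the timeout rather than blocking
indefinitely, as the threads in the dumps above appear to do.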
-Gunnar
--
Gunnar Wagenknecht
gunnar@wagenknecht.org
http://wagenknecht.org/