Posted to dev@curator.apache.org by "Amir Gur (JIRA)" <ji...@apache.org> on 2015/03/24 10:45:52 UTC
[jira] [Updated] (CURATOR-194) Deadlock in ConnectionState.checkTimeouts
[ https://issues.apache.org/jira/browse/CURATOR-194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amir Gur updated CURATOR-194:
-----------------------------
Summary: Deadlock in ConnectionState.checkTimeouts (was: Deadlock ConnectionState.checkTimeouts)
> Deadlock in ConnectionState.checkTimeouts
> -----------------------------------------
>
> Key: CURATOR-194
> URL: https://issues.apache.org/jira/browse/CURATOR-194
> Project: Apache Curator
> Issue Type: Bug
> Components: Client
> Affects Versions: 2.6.0
> Reporter: Amir Gur
>
> When ConnectionState.checkTimeouts detects a timeout, it calls 'reset', which
> (via HandleHolder.closeAndReset and ZooKeeper.close) calls org.apache.zookeeper.ClientCnxn.close.
> That sends a ZooDefs.OpCode.closeSession request and then waits on the request's packet
> until SendThread calls 'notifyAll' on it. The calling thread holds the ConnectionState monitor during this wait.
> Meanwhile, SendThread is blocked trying to enter the synchronized method 'ConnectionState.checkTimeouts',
> so it never notifies the packet and both threads hang.
> Here is the thread dump:
> "job-scheduler_Worker-19-CheckHealthTask" prio=10 tid=0x00007f260609c000 nid=0x5a97 in Object.wait() [0x00007f25723e1000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x0000000725fc0580> (a org.apache.zookeeper.ClientCnxn$Packet)
> at java.lang.Object.wait(Object.java:503)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
> - locked <0x0000000725fc0580> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.close(ClientCnxn.java:1314)
> at org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:677)
> - locked <0x0000000723949c88> (a org.apache.zookeeper.ZooKeeper)
> at org.apache.curator.HandleHolder.internalClose(HandleHolder.java:139)
> at org.apache.curator.HandleHolder.closeAndReset(HandleHolder.java:77)
> at org.apache.curator.ConnectionState.reset(ConnectionState.java:218)
> - locked <0x000000071651de48> (a org.apache.curator.ConnectionState)
> at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:194)
> - locked <0x000000071651de48> (a org.apache.curator.ConnectionState)
> at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88)
> at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:474)
> at org.apache.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:172)
> at org.apache.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:161)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
> at org.apache.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:157)
> at org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:148)
> at org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:36)
> at com.alu.dal.zooKeeper.ZooKeeperSession.checkHealth(ZooKeeperSession.java:350)
> at com.alu.dal.zooKeeper.ZooKeeperSession.check(ZooKeeperSession.java:86)
> at com.alu.orchestration.cluster.ClusterInstanceServiceImpl.checkQuorum(ClusterInstanceServiceImpl.java:464)
> at com.alu.orchestration.cluster.ClusterInstanceServiceImpl.checkHealthState(ClusterInstanceServiceImpl.java:400)
> at com.alu.tasks.health.CheckHealthTaskImpl.doWork(CheckHealthTaskImpl.java:37)
> at com.alu.scheduler.JobSchedulerDetails$QuartzJob.executeInternal(JobSchedulerDetails.java:95)
> at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:114)
> at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
> at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
>
> "localhost-startStop-1-SendThread(11.1.1.11:2181)" daemon prio=10 tid=0x00007f257c61a000 nid=0x7c3 waiting for monitor entry [0x00007f2562e65000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:177)
> - waiting to lock <0x000000071651de48> (a org.apache.curator.ConnectionState)
> at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88)
> at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:793)
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.doSyncForSuspendedConnection(CuratorFrameworkImpl.java:668)
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$800(CuratorFrameworkImpl.java:58)
> at org.apache.curator.framework.imps.CuratorFrameworkImpl$7.retriesExhausted(CuratorFrameworkImpl.java:664)
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:683)
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:496)
> at org.apache.curator.framework.imps.BackgroundSyncImpl$1.processResult(BackgroundSyncImpl.java:50)
> at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:609)
> at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:478)
> - locked <0x0000000714935b18> (a java.util.concurrent.LinkedBlockingQueue)
> at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:630)
> at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:648)
> at org.apache.zookeeper.ClientCnxn.access$2400(ClientCnxn.java:85)
> at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1194)
> - locked <0x000000071b205bf0> (a java.util.LinkedList)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1122)
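
The lock ordering in the dump above can be reproduced in isolation. The following is only a minimal sketch with hypothetical stand-in classes (DeadlockSketch, State, Packet); none of it is Curator or ZooKeeper code. It assumes the pattern is exactly what the two stacks show: one thread waits on a packet while holding the state monitor, and the thread that would notify the packet has to enter that same monitor first. The real wait in ClientCnxn.submitRequest is unbounded; the sketch uses a timed wait so that it terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins only; nothing here is Curator or ZooKeeper API.
class DeadlockSketch {

    // Stands in for the ClientCnxn$Packet that the closing thread waits on.
    static final class Packet {
        boolean finished;
    }

    // Stands in for ConnectionState: both code paths need this object's monitor.
    static final class State {
        private final Packet packet = new Packet();
        private final CountDownLatch closerHoldsMonitor = new CountDownLatch(1);

        // Like checkTimeouts -> reset -> close: waits on the packet while
        // still holding the State monitor. The real wait is unbounded; a
        // timeout is used here only so the demo terminates.
        synchronized boolean closeWhileHoldingMonitor(long timeoutMs) throws InterruptedException {
            synchronized (packet) {
                closerHoldsMonitor.countDown();
                long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
                while (!packet.finished) {
                    long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                    if (remainingMs <= 0) {
                        return false; // nobody finished the packet in time
                    }
                    packet.wait(remainingMs); // releases only the packet monitor
                }
                return true;
            }
        }

        // Like SendThread: it must enter the State monitor before it can
        // finish the packet and wake the waiter.
        void finishPacket() throws InterruptedException {
            closerHoldsMonitor.await();
            synchronized (this) { // blocked for as long as the closer is inside
                synchronized (packet) {
                    packet.finished = true;
                    packet.notifyAll();
                }
            }
        }
    }

    // Returns true when the closer gave up without ever being notified.
    static boolean closerTimedOut(long timeoutMs) {
        State state = new State();
        Thread sender = new Thread(() -> {
            try {
                state.finishPacket();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "send-thread-stand-in");
        sender.setDaemon(true);
        sender.start();
        try {
            boolean notified = state.closeWhileHoldingMonitor(timeoutMs);
            sender.join(); // the sender gets through once the monitor is free
            return !notified;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("closer timed out: " + closerTimedOut(200));
    }
}
```

In this sketch the closer always times out, because the stand-in sender cannot enter the State monitor until the closer has left it. With an unbounded wait, as in the reported stacks, neither thread would make progress.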