You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@tez.apache.org by "Jeff Zhang (JIRA)" <ji...@apache.org> on 2015/04/07 09:23:12 UTC
[jira] [Resolved] (TEZ-2267) Deadlock caused by TEZ-2149
[ https://issues.apache.org/jira/browse/TEZ-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Zhang resolved TEZ-2267.
-----------------------------
Resolution: Won't Fix
Fixed in TEZ-2269
> Deadlock caused by TEZ-2149
> ---------------------------
>
> Key: TEZ-2267
> URL: https://issues.apache.org/jira/browse/TEZ-2267
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jeff Zhang
> Priority: Critical
> Attachments: jstack.txt, jstack2.txt, syslog_dag_1427965027460_0001_1
>
>
> {code}
> "TaskSchedulerAppCaller #0" daemon prio=10 tid=0x00007f044005e800 nid=0x7be6 waiting on condition [0x00007f04350ce000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000fc279e18> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
> at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
> at org.apache.tez.dag.app.dag.impl.TaskImpl.isFinished(TaskImpl.java:394)
> at org.apache.tez.dag.app.dag.impl.VertexImpl.computeProgress(VertexImpl.java:1064)
> at org.apache.tez.dag.app.dag.impl.VertexImpl.getProgress(VertexImpl.java:1002)
> at org.apache.tez.dag.app.dag.impl.DAGImpl.getProgress(DAGImpl.java:676)
> at org.apache.tez.dag.app.DAGAppMaster.getProgress(DAGAppMaster.java:1067)
> at org.apache.tez.dag.app.rm.TaskSchedulerEventHandler.getProgress(TaskSchedulerEventHandler.java:558)
> at org.apache.tez.dag.app.rm.TaskSchedulerAppCallbackWrapper$GetProgressCallable.call(TaskSchedulerAppCallbackWrapper.java:291)
> at org.apache.tez.dag.app.rm.TaskSchedulerAppCallbackWrapper$GetProgressCallable.call(TaskSchedulerAppCallbackWrapper.java:1)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Locked ownable synchronizers:
> - <0x00000000fc20aed8> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "IPC Server handler 0 on 47949" daemon prio=10 tid=0x00007f0448036800 nid=0x7bc0 waiting on condition [0x00007f04372f0000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000fc200160> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2090)
> at org.apache.tez.dag.app.dag.impl.DAGImpl.getDAGStatus(DAGImpl.java:763)
> at org.apache.tez.dag.api.client.DAGClientHandler.getDAGStatus(DAGClientHandler.java:67)
> at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.getDAGStatus(DAGClientAMProtocolBlockingPBServerImpl.java:99)
> at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7465)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
> Locked ownable synchronizers:
> - None
> {code}
> Or maybe the timeoutNanos is a very large number.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)