Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2016/02/13 01:48:18 UTC
[jira] [Created] (TEZ-3117) Deadlock in Edge and Vertex code
Bikas Saha created TEZ-3117:
-------------------------------
Summary: Deadlock in Edge and Vertex code
Key: TEZ-3117
URL: https://issues.apache.org/jira/browse/TEZ-3117
Project: Apache Tez
Issue Type: Bug
Reporter: Yesha Vora
Assignee: Bikas Saha
{code}
Java-level deadlocks detected
This means that some threads are blocked waiting to enter a synchronization block or
waiting to reenter a synchronization block after an Object.wait() call, where each thread
owns one monitor while trying to obtain another monitor already held by another thread.
Deadlock:
App Shared Pool - #1 is waiting to lock java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@18a7c819 which is held by Dispatcher thread {Central}
Dispatcher thread {Central} is waiting to lock org.apache.tez.dag.app.dag.impl.Edge@3e6ba2db which is held by App Shared Pool - #1
Deadlock:
Dispatcher thread {Central} is waiting to lock org.apache.tez.dag.app.dag.impl.Edge@3e6ba2db which is held by App Shared Pool - #1
App Shared Pool - #1 is waiting to lock java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@18a7c819 which is held by Dispatcher thread {Central}
Thread stacks
App Shared Pool - #1 [WAITING]
sun.misc.Unsafe.park(native method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
org.apache.tez.dag.app.dag.impl.VertexImpl.getTotalTasks(VertexImpl.java:1098)
org.apache.tez.dag.app.dag.impl.Edge$EdgeManagerPluginContextImpl.getDestinationVertexNumTasks(Edge.java:99)
org.apache.tez.dag.app.dag.impl.Edge.routingToBegin(Edge.java:214)
org.apache.tez.dag.app.dag.impl.VertexImpl.setupEdgeRouting(VertexImpl.java:1447)
org.apache.tez.dag.app.dag.impl.VertexImpl.unsetTasksNotYetScheduled(VertexImpl.java:1453)
org.apache.tez.dag.app.dag.impl.VertexImpl.scheduleTasks(VertexImpl.java:1496)
org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerPluginContextImpl.scheduleTasks(VertexManager.java:216)
org.apache.tez.dag.library.vertexmanager.InputReadyVertexManager.handleSourceTaskFinished(InputReadyVertexManager.java:275)
org.apache.tez.dag.library.vertexmanager.InputReadyVertexManager.onSourceTaskCompleted(InputReadyVertexManager.java:196)
org.apache.tez.dag.library.vertexmanager.InputReadyVertexManager.trySchedulingPendingCompletions(InputReadyVertexManager.java:146)
org.apache.tez.dag.library.vertexmanager.InputReadyVertexManager.onVertexStarted(InputReadyVertexManager.java:187)
org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventOnVertexStarted.invoke(VertexManager.java:578)
org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:647)
org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:642)
java.security.AccessController.doPrivileged(native method)
javax.security.auth.Subject.doAs(Subject.java:422)
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:642)
org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:631)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.<null>(unknown source)
Dispatcher thread {Central} [BLOCKED; waiting to lock org.apache.tez.dag.app.dag.impl.Edge@3e6ba2db]
org.apache.tez.dag.app.dag.impl.Edge.getEdgeProperty(Edge.java:241)
org.apache.tez.dag.app.dag.impl.VertexImpl.logVertexConfigurationDoneEvent(VertexImpl.java:1886)
org.apache.tez.dag.app.dag.impl.VertexImpl.maybeSendConfiguredEvent(VertexImpl.java:3020)
org.apache.tez.dag.app.dag.impl.VertexImpl.startVertex(VertexImpl.java:3055)
org.apache.tez.dag.app.dag.impl.VertexImpl.access$4500(VertexImpl.java:204)
org.apache.tez.dag.app.dag.impl.VertexImpl$StartTransition.transition(VertexImpl.java:3007)
org.apache.tez.dag.app.dag.impl.VertexImpl$StartTransition.transition(VertexImpl.java:2996)
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:59)
org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1799)
org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:203)
org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2214)
org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2200)
org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
java.lang.Thread.<null>(unknown source)
Frozen threads found (potential deadlock)
It seems that the following threads have not changed their stack for more than 10 seconds.
These threads are possibly (but not necessarily!) in a deadlock or hung.
client DomainSocketWatcher <--- Frozen for at least 20m 33 sec
org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(int, DomainSocketWatcher$FdSet) DomainSocketWatcher.java (native)
org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(int, DomainSocketWatcher$FdSet) DomainSocketWatcher.java:52
org.apache.hadoop.net.unix.DomainSocketWatcher$2.run() DomainSocketWatcher.java:511
java.lang.Thread.run() Thread.java:745
{code}
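Reading the dump above: the two threads take the same pair of locks in opposite order. App Shared Pool - #1 owns the Edge monitor (presumably taken on the Edge.routingToBegin path) and waits for the vertex read lock in VertexImpl.getTotalTasks, while the Dispatcher thread owns the vertex's ReentrantReadWriteLock (presumably the write lock taken on the VertexImpl.handle path) and waits for the Edge monitor in Edge.getEdgeProperty. The following is only a minimal sketch of one way to break that cycle, not the actual Tez code and not necessarily the fix for this issue. SketchVertex, SketchEdge and runDemo are hypothetical names made up for illustration. The idea shown is to read the vertex state before entering the edge monitor, so no thread ever waits for the vertex lock while holding the edge monitor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockOrderSketch {
    // Hypothetical stand-in for the vertex: state guarded by a read/write lock.
    static class SketchVertex {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final int totalTasks = 4;

        int getTotalTasks() {
            lock.readLock().lock();
            try {
                return totalTasks;
            } finally {
                lock.readLock().unlock();
            }
        }
    }

    // Hypothetical stand-in for the edge: state guarded by its own monitor.
    static class SketchEdge {
        private final SketchVertex destination;
        private int routedTasks = -1;

        SketchEdge(SketchVertex destination) {
            this.destination = destination;
        }

        // Query the vertex BEFORE entering the edge monitor, so this thread
        // never holds the edge monitor while waiting for the vertex lock.
        void routingToBegin() {
            int numTasks = destination.getTotalTasks();
            synchronized (this) {
                routedTasks = numTasks;
            }
        }

        synchronized String getEdgeProperty() {
            return "sketch-property";
        }

        synchronized int getRoutedTasks() {
            return routedTasks;
        }
    }

    // Replays the interleaving from the dump; returns true if both threads finish.
    static boolean runDemo() {
        SketchVertex vertex = new SketchVertex();
        SketchEdge edge = new SketchEdge(vertex);
        CountDownLatch writeLockHeld = new CountDownLatch(1);

        // Plays the dispatcher: vertex write lock first, then the edge monitor.
        Thread dispatcher = new Thread(() -> {
            vertex.lock.writeLock().lock();
            try {
                writeLockHeld.countDown();
                // Pause briefly so the pool thread has time to reach the read lock.
                LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(100));
                edge.getEdgeProperty();
            } finally {
                vertex.lock.writeLock().unlock();
            }
        }, "dispatcher-sketch");

        // Plays the shared-pool thread: starts routing while the write lock is held.
        Thread pool = new Thread(() -> {
            try {
                writeLockHeld.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            edge.routingToBegin();
        }, "pool-sketch");

        dispatcher.setDaemon(true);
        pool.setDaemon(true);
        dispatcher.start();
        pool.start();
        try {
            dispatcher.join(5000);
            pool.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !dispatcher.isAlive() && !pool.isAlive() && edge.getRoutedTasks() == 4;
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + runDemo());
    }
}
```

If routingToBegin in this sketch were declared synchronized and called getTotalTasks from inside the monitor, the same interleaving would reproduce the cycle reported in the dump. Whether moving the vertex query outside the monitor is acceptable in the real Edge code depends on invariants that the dump does not show.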