Posted to dev@tez.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2016/07/20 20:28:20 UTC

[jira] [Created] (TEZ-3368) NPE in DelayedContainerManager

Jason Lowe created TEZ-3368:
-------------------------------

             Summary: NPE in DelayedContainerManager
                 Key: TEZ-3368
                 URL: https://issues.apache.org/jira/browse/TEZ-3368
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.7.1
            Reporter: Jason Lowe


Saw a Tez AM hang due to an NPE in the DelayedContainerManager:
{noformat}
2016-07-17 01:53:23,157 [ERROR] [DelayedContainerManager] |yarn.YarnUncaughtExceptionHandler|: Thread Thread[DelayedContainerManager,5,main] threw an Exception.
java.lang.NullPointerException
        at org.apache.tez.dag.app.rm.TezAMRMClientAsync.getMatchingRequestsForTopPriority(TezAMRMClientAsync.java:142)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.getMatchingRequestWithoutPriority(YarnTaskSchedulerService.java:1474)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$500(YarnTaskSchedulerService.java:84)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$NodeLocalContainerAssigner.assignReUsedContainer(YarnTaskSchedulerService.java:1869)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignReUsedContainerWithLocation(YarnTaskSchedulerService.java:1753)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignDelayedContainer(YarnTaskSchedulerService.java:733)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$600(YarnTaskSchedulerService.java:84)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.run(YarnTaskSchedulerService.java:2030)
{noformat}

After the DelayedContainerManager thread exited, the AM kept receiving the containers it had requested, but nothing was left to assign them, so they sat unused until their allocations expired.  The containers were then re-requested, and the cycle repeated indefinitely.
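
To illustrate the failure mode, here is a minimal, self-contained sketch of how one uncaught NullPointerException ends a worker thread's loop, so later work is never picked up. This is not Tez code and not a proposed patch: WorkerLoopSketch, lengthOf and runLoop are hypothetical names, and the null list entry merely stands in for whatever was null at TezAMRMClientAsync.java:142. The guarded variant shows one possible mitigation (catch per iteration and keep looping); whether that is appropriate for the real scheduler is an open question.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in Tez.
class WorkerLoopSketch {

    // Stand-in for a lookup that dereferences a possibly-null value.
    static int lengthOf(String request) {
        return request.length(); // throws NullPointerException when request is null
    }

    // Runs a worker thread over the items and returns how many were handled
    // before the thread finished (normally or because of an uncaught exception).
    static int runLoop(List<String> items, boolean guarded) {
        AtomicInteger handled = new AtomicInteger();
        Thread worker = new Thread(() -> {
            for (String item : items) {
                if (guarded) {
                    try {
                        lengthOf(item);
                        handled.incrementAndGet();
                    } catch (RuntimeException e) {
                        // Report the bad item and keep the loop alive.
                        System.err.println("skipping item: " + e);
                    }
                } else {
                    // An exception here propagates out of run() and ends the thread.
                    lengthOf(item);
                    handled.incrementAndGet();
                }
            }
        }, "worker-sketch");
        // Comparable in spirit to the handler that logged the error above:
        // it reports the exception, but the thread is still gone afterwards.
        worker.setUncaughtExceptionHandler((t, e) ->
                System.err.println("Thread " + t.getName() + " threw " + e));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled.get();
    }

    public static void main(String[] args) {
        List<String> items = java.util.Arrays.asList("a", null, "b", "c");
        System.out.println("unguarded handled: " + runLoop(items, false));
        System.out.println("guarded handled:   " + runLoop(items, true));
    }
}
```

In the unguarded run the items after the null entry are never processed, which mirrors how containers arriving after the thread's exit went unassigned.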


