You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-dev@hadoop.apache.org by "Tao Yang (JIRA)" <ji...@apache.org> on 2018/04/27 07:18:00 UTC

[jira] [Created] (YARN-8222) NPE in RMContainerImpl$FinishedTransition#updateAttemptMetrics

Tao Yang created YARN-8222:
------------------------------

             Summary: NPE in RMContainerImpl$FinishedTransition#updateAttemptMetrics
                 Key: YARN-8222
                 URL: https://issues.apache.org/jira/browse/YARN-8222
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Tao Yang
            Assignee: Tao Yang


This NPE looks like may happen when node heartbeat delay and try to update attempt metrics for a non-exist application.
Reference code of RMContainerImpl$FinishedTransition#updateAttemptMetrics:
{code:java}
private static void updateAttemptMetrics(RMContainerImpl container) {
      Resource resource = container.getContainer().getResource();
      RMAppAttempt rmAttempt = container.rmContext.getRMApps()
          .get(container.getApplicationAttemptId().getApplicationId())
          .getCurrentAppAttempt();

      if (rmAttempt != null) {
         //....
      }
}
{code}
We can add a null check for application before getting attempt form it.

Error log:
{noformat}
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$FinishedTransition.updateAttemptMetrics(RMContainerImpl.java:742)
        at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$FinishedTransition.transition(RMContainerImpl.java:715)
        at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$FinishedTransition.transition(RMContainerImpl.java:699)
        at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
        at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:482)
        at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:64)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.containerCompleted(FiCaSchedulerApp.java:195)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1793)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:2624)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:663)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.doneApplicationAttempt(CapacityScheduler.java:1514)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:2396)
        at org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:205)
        at org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:60)
        at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
        at java.lang.Thread.run(Thread.java:834)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-dev-help@hadoop.apache.org