You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2017/10/09 12:32:11 UTC

[jira] [Commented] (YARN-2612) Some completed containers are not reported to NM

    [ https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196892#comment-16196892 ] 

Hudson commented on YARN-2612:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13051 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13051/])
YARN-2612 addendum: fixed javadoc error. (templedf: rev 6d6ca4c923142a893870fc8a6309c02f60dc2d24)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ConfigurableResource.java


> Some completed containers are not reported to NM
> ------------------------------------------------
>
>                 Key: YARN-2612
>                 URL: https://issues.apache.org/jira/browse/YARN-2612
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Jun Gong
>             Fix For: 2.6.0
>
>
> We are testing RM work preserving restart and found the following logs when we ran a simple MapReduce task "PI". Some completed containers which already pulled by AM never reported back to NM, so NM continuously report the completed containers while AM had finished. 
> {code}
> 2014-09-26 17:00:42,228 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2014-09-26 17:00:42,228 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2014-09-26 17:00:43,230 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2014-09-26 17:00:43,230 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2014-09-26 17:00:44,233 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2014-09-26 17:00:44,233 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> {code}
> In YARN-1372, NM will report completed containers to RM until it gets ACK from RM.  If AM does not call allocate, which means that AM does not ack RM, RM will not ack NM. We([~chenchun]) have observed these two cases when running Mapreduce task 'pi':
> 1) RM sends completed containers to AM. After receiving it, AM thinks it has done the work and does not need resource, so it does not call allocate.
> 2) When AM finishes, it could not ack to RM because AM itself has not finished yet.
> We think when RMAppAttempt call BaseFinalTransition, it means AppAttempt finishes, then RM could send this AppAttempt's completed containers to NM.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org