Posted to yarn-issues@hadoop.apache.org by "Naganarasimha G R (JIRA)" <ji...@apache.org> on 2016/01/06 20:02:40 UTC

[jira] [Commented] (YARN-3995) Some of the NM events are not getting published due race condition when AM container finishes in NM

    [ https://issues.apache.org/jira/browse/YARN-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086072#comment-15086072 ] 

Naganarasimha G R commented on YARN-3995:
-----------------------------------------

bq. Are you thinking of cases where the AM crashes? If the app finishes normally, this sequence does not happen, right?
Well, it was just a hunch: suppose the AM finishes before its containers finish. For example, the AM marks a container as done once the container informs it through the umbilical protocol that it has finished, but the container process may not actually have exited yet. One possible reason is that the Timeline client has not yet finished flushing the ATS events; it could also be held up by some other cleanup.
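
To make the suspected ordering concrete, here is a minimal single-threaded sketch that simply replays the interleaving in question. It is not NM or timeline collector code: the class CollectorRaceSketch, its methods, the app id, and the event names are all hypothetical stand-ins, and the only assumption taken from this issue is that an event published after the app's collector has been removed is not recorded.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-app collector registry; not the actual YARN classes.
class CollectorRaceSketch {
    // appId -> events accepted by that app's collector
    private final Map<String, List<String>> collectors = new HashMap<>();
    // events that arrived after their collector was removed
    private final List<String> dropped = new ArrayList<>();

    void addCollector(String appId) {
        collectors.putIfAbsent(appId, new ArrayList<>());
    }

    // In the scenario above, this is triggered when the AM container finishes.
    void removeCollector(String appId) {
        collectors.remove(appId);
    }

    // Returns true if the event reached a collector, false if it was dropped.
    boolean publish(String appId, String event) {
        List<String> sink = collectors.get(appId);
        if (sink == null) {
            dropped.add(event);
            return false;
        }
        sink.add(event);
        return true;
    }

    List<String> droppedEvents() {
        return dropped;
    }

    public static void main(String[] args) {
        CollectorRaceSketch nm = new CollectorRaceSketch();
        String appId = "app_1"; // hypothetical id

        nm.addCollector(appId);
        nm.publish(appId, "CONTAINER_STARTED");  // arrives while the collector exists

        // The AM container finishes first and the collector is removed ...
        nm.removeCollector(appId);

        // ... while another container is still flushing its last event.
        nm.publish(appId, "CONTAINER_FINISHED"); // arrives too late

        System.out.println("dropped: " + nm.droppedEvents());
    }
}
```

If the collector removal were delayed until in-flight events had drained, the second publish in this replay would find its collector and nothing would be dropped.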


> Some of the NM events are not getting published due race condition when AM container finishes in NM 
> ----------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3995
>                 URL: https://issues.apache.org/jira/browse/YARN-3995
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-3995-feature-YARN-2928.v1.001.patch
>
>
> As discussed in YARN-3045: while testing with TestDistributedShell, we found that a few of the container metrics events were failing to be published because of a race condition. When the AM container finishes and the collector for the app is removed, events published for the app by the current NM and other NMs may still be in the pipeline, 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)