Posted to yarn-issues@hadoop.apache.org by "Junping Du (JIRA)" <ji...@apache.org> on 2014/07/31 16:09:41 UTC

[jira] [Commented] (YARN-2371) Wrong NMToken is issued when NM preserving restarts with containers running

    [ https://issues.apache.org/jira/browse/YARN-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080925#comment-14080925 ] 

Junping Du commented on YARN-2371:
----------------------------------

Nice finding, [~zhiguohong]! The fix here looks reasonable to me. It reminds me that we also recently changed NMToken authorization to check the appID instead of the appAttemptID, for a similar reason. For the unit test, I suggest a separate test method, or at least a separate code segment, for your case, with proper documentation of the scenario it covers.

> Wrong NMToken is issued when NM preserving restarts with containers running
> ---------------------------------------------------------------------------
>
>                 Key: YARN-2371
>                 URL: https://issues.apache.org/jira/browse/YARN-2371
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Hong Zhiguo
>            Assignee: Hong Zhiguo
>         Attachments: YARN-2371.patch
>
>
> When an application is submitted with "ApplicationSubmissionContext.getKeepContainersAcrossApplicationAttempts() == true" and the NM is restarted with containers running, the wrong NMToken is issued to the AM through RegisterApplicationMasterResponse.
> See the NM log:
> {code}
> 2014-07-30 11:59:58,941 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Unauthorized request to start container.-
> NMToken for application attempt : appattempt_1406691610864_0002_000001 was used for starting container with container token issued for application attempt : appattempt_1406691610864_0002_000002
> {code}
> The cause is in the code below:
> {code} 
> createAndGetNMToken(String applicationSubmitter,
>       ApplicationAttemptId appAttemptId, Container container) {
>       ......
>           Token token =
>               createNMToken(container.getId().getApplicationAttemptId(),
>                 container.getNodeId(), applicationSubmitter);
>      ......
> }
> {code} 
> "appAttemptId" instead of "container.getId().getApplicationAttemptId()" should be passed to "createNMToken".
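
The difference between the two arguments can be illustrated with a minimal, self-contained sketch. It does not use the real YARN classes: AttemptId, Container, NMToken, issueFromContainerAttempt and issueFromCurrentAttempt are hypothetical stand-ins introduced only for this illustration, and the node address is made up. The sketch assumes a container that was created by attempt 1 and kept alive for attempt 2, as in the log above.

```java
public class NMTokenSketch {
    // Hypothetical stand-in for an application attempt id (not the YARN class).
    static final class AttemptId {
        final String appId;
        final int attemptNumber;

        AttemptId(String appId, int attemptNumber) {
            this.appId = appId;
            this.attemptNumber = attemptNumber;
        }

        @Override
        public String toString() {
            return "appattempt_" + appId + "_" + String.format("%06d", attemptNumber);
        }
    }

    // Hypothetical stand-in for a container: it remembers which attempt created it.
    static final class Container {
        final String nodeId;
        final AttemptId createdBy;

        Container(String nodeId, AttemptId createdBy) {
            this.nodeId = nodeId;
            this.createdBy = createdBy;
        }
    }

    // Hypothetical stand-in for an NMToken: it is bound to exactly one attempt.
    static final class NMToken {
        final AttemptId attempt;
        final String nodeId;
        final String user;

        NMToken(AttemptId attempt, String nodeId, String user) {
            this.attempt = attempt;
            this.nodeId = nodeId;
            this.user = user;
        }
    }

    // Mirrors the reported behavior: the token is bound to the attempt that
    // created the container, which may be an earlier attempt.
    static NMToken issueFromContainerAttempt(String user, AttemptId current, Container c) {
        return new NMToken(c.createdBy, c.nodeId, user);
    }

    // Mirrors the proposed fix: the token is bound to the attempt that is
    // currently registering, regardless of who created the container.
    static NMToken issueFromCurrentAttempt(String user, AttemptId current, Container c) {
        return new NMToken(current, c.nodeId, user);
    }

    public static void main(String[] args) {
        AttemptId first = new AttemptId("1406691610864_0002", 1);
        AttemptId second = new AttemptId("1406691610864_0002", 2);
        // Container started by attempt 1 and preserved for attempt 2.
        Container kept = new Container("nm-host:45454", first);

        System.out.println("reported: " + issueFromContainerAttempt("user", second, kept).attempt);
        System.out.println("proposed: " + issueFromCurrentAttempt("user", second, kept).attempt);
    }
}
```

In this sketch only the second variant yields a token that names the attempt currently talking to the NM, which is the property the attached patch is meant to restore.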



--
This message was sent by Atlassian JIRA
(v6.2#6252)