Posted to yarn-issues@hadoop.apache.org by "Junping Du (JIRA)" <ji...@apache.org> on 2014/07/15 11:47:05 UTC

[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart

    [ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061891#comment-14061891 ] 

Junping Du commented on YARN-1341:
----------------------------------

Hey [~jlowe], I agree it is better to discuss the inconsistency scenarios for each case in separate JIRAs. However, for now, the conclusions from these discussions hold only in theory and may still have bugs/issues in practice. So I also suggest we have a central place to document the assumptions/conclusions from these discussions; it would help us and others in the community identify potential issues when we later add UTs or other integration tests for negative cases. What do you think? If you agree, we can split the documentation effort into another JIRA (an umbrella or a dedicated one, whichever you like) and continue the discussion of this particular case here.
For this particular case, the assumptions from the discussion above seem to be:
if the NM restarts with stale keys,
a. if currentMasterKey is stale, it will be updated and overwritten soon afterwards when the NM registers with the RM. Nothing is affected.
b. if previousMasterKey is stale, then the real previous master key is lost, so the impact is: AMs with the real previous master key cannot connect to the NM to launch containers.
c. if applicationMasterKeys are stale, then the old keys tracked in applicationMasterKeys are lost after restart. The impact is: AMs with old keys cannot connect to the NM to launch containers.
I would prefer option 1 too, given all the impacts listed here. Am I missing anything?
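
To make the three cases concrete, here is a minimal sketch in Java. It is a toy model, not the actual NM secret manager code: StaleKeyModel, NMKeyState, registerWithRM and accepts are hypothetical names, key ids are plain integers, and registration is assumed to simply overwrite the current key, as case (a) assumes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Toy model only: all names here are hypothetical, not the real YARN classes.
public class StaleKeyModel {

    public static final class NMKeyState {
        public Integer currentMasterKeyId;
        public Integer previousMasterKeyId;
        // appId -> id of the older key that app's AM last authenticated with
        public final Map<String, Integer> applicationMasterKeyIds = new HashMap<>();

        public NMKeyState(Integer current, Integer previous) {
            this.currentMasterKeyId = current;
            this.previousMasterKeyId = previous;
        }

        // Assumption from case (a): registering with the RM overwrites the current key.
        public void registerWithRM(int rmCurrentKeyId) {
            currentMasterKeyId = rmCurrentKeyId;
        }

        // A token is accepted if its key id matches any key the NM still knows.
        public boolean accepts(String appId, int tokenKeyId) {
            return Objects.equals(currentMasterKeyId, tokenKeyId)
                || Objects.equals(previousMasterKeyId, tokenKeyId)
                || Objects.equals(applicationMasterKeyIds.get(appId), tokenKeyId);
        }
    }

    public static void main(String[] args) {
        // Real keys in this example: current = 3, previous = 2, app_1 still uses key 1.
        NMKeyState a = new NMKeyState(0, 2);   // case a: stale current key
        a.registerWithRM(3);                   // overwritten on registration
        System.out.println("a: token with key 3 accepted = " + a.accepts("app_2", 3));

        NMKeyState b = new NMKeyState(3, 1);   // case b: stale previous key
        System.out.println("b: token with key 2 accepted = " + b.accepts("app_2", 2));

        NMKeyState c = new NMKeyState(3, 2);   // case c: app_1's tracked key was lost
        System.out.println("c: token with key 1 accepted = " + c.accepts("app_1", 1));
    }
}
```

In this model only case (a) recovers by itself; in cases (b) and (c) the AM's token is rejected because the NM no longer knows the key it was created with, which matches the impacts listed above.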

> Recover NMTokens upon nodemanager restart
> -----------------------------------------
>
>                 Key: YARN-1341
>                 URL: https://issues.apache.org/jira/browse/YARN-1341
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.3.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch, YARN-1341v4-and-YARN-1987.patch, YARN-1341v5.patch, YARN-1341v6.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)