Posted to yarn-issues@hadoop.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2013/04/24 04:41:16 UTC

[jira] [Commented] (YARN-556) RM Restart phase 2 - Design for work preserving restart

    [ https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640012#comment-13640012 ] 

Bikas Saha commented on YARN-556:
---------------------------------

Adding a brief description of the proposal from the YARN-128 design document, for Google Summer of Code.
{noformat}
Work preserving restart - The RM will have to make the RMAppAttempt state machine enter 
the Running state before starting the internal services. The RM can obtain information 
about all running containers from the NMs when they heartbeat with it. This 
information can be used to repopulate the allocation state of the scheduler. When the 
running AMs heartbeat with the RM, the RM can ask them to resend their container 
requests so that it can repopulate all the pending requests. Repopulating the 
running container and pending container information completes all the data needed by 
the RM to start normal operations.
{noformat}
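
A rough sketch of the recovery flow described above may help make the steps concrete. This is not code from YARN or from the YARN-128 design document. The class RecoveringRM and all of its methods are hypothetical names invented for illustration, and container sizes are reduced to a single memory figure in MB. It only assumes the three steps in the proposal: restore attempts to Running, rebuild allocations from NM heartbeats, and rebuild pending requests by asking AMs to resend them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: these are not real YARN classes or APIs.
class RecoveringRM {
    // containerId -> memory (MB) of containers that NMs report as running
    private final Map<String, Integer> runningContainers = new HashMap<>();
    // appAttemptId -> pending container asks (MB) resent by that attempt's AM
    private final Map<String, List<Integer>> pendingRequests = new HashMap<>();
    // attempts restored to Running that have not yet resent their requests
    private final Set<String> awaitingResend = new HashSet<>();

    /** Step 1: treat every saved attempt as Running before services start. */
    RecoveringRM(List<String> savedRunningAttempts) {
        awaitingResend.addAll(savedRunningAttempts);
    }

    /** Step 2: an NM heartbeat lists its running containers; rebuild allocations. */
    void onNodeHeartbeat(Map<String, Integer> containersOnNode) {
        runningContainers.putAll(containersOnNode);
    }

    /** Step 3a: on an AM heartbeat, should the RM ask this AM to resend its requests? */
    boolean shouldResend(String appAttemptId) {
        return awaitingResend.contains(appAttemptId);
    }

    /** Step 3b: the AM resends its full set of asks; rebuild the pending requests. */
    void onResentRequests(String appAttemptId, List<Integer> asksMb) {
        pendingRequests.put(appAttemptId, new ArrayList<>(asksMb));
        awaitingResend.remove(appAttemptId);
    }

    /** Total memory currently accounted for from NM reports. */
    int allocatedMemoryMb() {
        int total = 0;
        for (int mb : runningContainers.values()) {
            total += mb;
        }
        return total;
    }

    /** Pending asks recovered for one attempt (empty if none were resent). */
    List<Integer> pendingFor(String appAttemptId) {
        return pendingRequests.getOrDefault(appAttemptId, List.of());
    }

    /** True once every restored attempt has resent its requests. */
    boolean allAttemptsResynced() {
        return awaitingResend.isEmpty();
    }

    public static void main(String[] args) {
        RecoveringRM rm = new RecoveringRM(List.of("attempt_1"));
        rm.onNodeHeartbeat(Map.of("container_1", 1024, "container_2", 2048));
        if (rm.shouldResend("attempt_1")) {
            rm.onResentRequests("attempt_1", List.of(512, 512));
        }
        System.out.println("allocated MB: " + rm.allocatedMemoryMb());
        System.out.println("pending asks: " + rm.pendingFor("attempt_1"));
        System.out.println("resynced: " + rm.allAttemptsResynced());
    }
}
```

The sketch leaves open how the RM decides that enough NMs have reported in. It tracks only whether the known attempts have resent their requests, and a real design would also need a policy for nodes and AMs that never heartbeat again after the restart.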
                
> RM Restart phase 2 - Design for work preserving restart
> -------------------------------------------------------
>
>                 Key: YARN-556
>                 URL: https://issues.apache.org/jira/browse/YARN-556
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>              Labels: gsoc2013
>
> The basic idea is already documented on YARN-128. This will describe further details.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira