Posted to yarn-issues@hadoop.apache.org by "Daniel Zhi (JIRA)" <ji...@apache.org> on 2016/06/27 17:58:52 UTC

[jira] [Comment Edited] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

    [ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351488#comment-15351488 ] 

Daniel Zhi edited comment on YARN-4676 at 6/27/16 5:58 PM:
-----------------------------------------------------------

Our team attended Hadoop Summit 2014, but there was no plan to attend this year, so I will miss the opportunity to discuss the design in person with both of you. Robert has a great grasp of the code change, so hopefully he can clarify a few things on my behalf. I will be responding to emails as well.


was (Author: danzhi):
Our team attended Hadoop summit 2014 but there was no plan to attend this year. I will miss the opportunity to discussion the design in person with both of you. Robert had great grasp of the code change so hopefully he can clarify a few things on behave of me. I will be responding to emails as well.

> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> ----------------------------------------------------------------
>
>                 Key: YARN-4676
>                 URL: https://issues.apache.org/jira/browse/YARN-4676
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Daniel Zhi
>            Assignee: Daniel Zhi
>              Labels: features
>         Attachments: GracefulDecommissionYarnNode.pdf, GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch, YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch, YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch, YARN-4676.012.patch, YARN-4676.013.patch, YARN-4676.014.patch, YARN-4676.015.patch, YARN-4676.016.patch
>
>
> YARN-4676 implements an automatic, asynchronous, and flexible mechanism to gracefully decommission
> YARN nodes. After a user issues the refreshNodes request, the ResourceManager (RM) automatically evaluates
> the status of all affected nodes to kick off decommission or recommission actions. The RM asynchronously
> tracks container and application status related to DECOMMISSIONING nodes so that it can decommission the
> nodes as soon as they are ready to be decommissioned. Decommissioning timeouts are supported at
> individual-node granularity and can be updated dynamically. The mechanism naturally supports multiple
> independent graceful decommissioning “sessions”, where each one involves a different set of nodes with
> different timeout settings. Such support is ideal and necessary for graceful decommission requests issued
> by external cluster management software rather than by a human.
> DecommissioningNodeWatcher inside ResourceTrackingService tracks the status of DECOMMISSIONING nodes automatically and asynchronously after a client/admin makes the graceful decommission request. It uses that status to decide when a node, after all running containers on it have completed, will be transitioned into the DECOMMISSIONED state. NodesListManager detects and handles include and exclude list changes to kick off decommission or recommission as necessary.
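
The tracking behaviour described above can be illustrated with a small, self-contained Java sketch. This is not the code from the YARN-4676 patches: the class DecommissionTrackerSketch and all of its methods are hypothetical names invented for illustration. It also simplifies the design by tracking only a running-container count per node (application status is ignored) and by assuming a node becomes ready either when its containers have all completed or when its per-node timeout expires.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of per-node graceful decommission tracking.
// None of these names come from the actual YARN code base.
class DecommissionTrackerSketch {

    // Bookkeeping for one node that is currently DECOMMISSIONING.
    private static final class Tracked {
        int runningContainers;
        long deadlineMillis;

        Tracked(int runningContainers, long deadlineMillis) {
            this.runningContainers = runningContainers;
            this.deadlineMillis = deadlineMillis;
        }
    }

    private final Map<String, Tracked> decommissioning = new HashMap<>();

    // Start tracking a node. Calling this again for the same node replaces
    // its deadline, which models a dynamically updated per-node timeout.
    void startDecommission(String node, int runningContainers,
                           long nowMillis, long timeoutMillis) {
        decommissioning.put(node,
            new Tracked(runningContainers, nowMillis + timeoutMillis));
    }

    // Record the latest container count reported for a tracked node.
    void updateRunningContainers(String node, int runningContainers) {
        Tracked t = decommissioning.get(node);
        if (t != null) {
            t.runningContainers = runningContainers;
        }
    }

    // Stop tracking a node that was removed from the exclude list.
    void recommission(String node) {
        decommissioning.remove(node);
    }

    boolean isTracking(String node) {
        return decommissioning.containsKey(node);
    }

    // Return the nodes that are ready to move to DECOMMISSIONED (assumed
    // here: no running containers, or timeout reached) and stop tracking them.
    List<String> poll(long nowMillis) {
        List<String> ready = new ArrayList<>();
        Iterator<Map.Entry<String, Tracked>> it =
            decommissioning.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Tracked> e = it.next();
            Tracked t = e.getValue();
            if (t.runningContainers == 0 || nowMillis >= t.deadlineMillis) {
                ready.add(e.getKey());
                it.remove();
            }
        }
        return ready;
    }
}
```

Because each node carries its own deadline, several independent "sessions" with different node sets and timeouts can coexist in the same map without any extra session object, which is one plausible reading of the per-node granularity mentioned in the description.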



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
