Posted to yarn-issues@hadoop.apache.org by "Ming Ma (JIRA)" <ji...@apache.org> on 2015/01/06 21:21:35 UTC

[jira] [Commented] (YARN-914) Support graceful decommission of nodemanager

    [ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266692#comment-14266692 ] 

Ming Ma commented on YARN-914:
------------------------------

Thanks, Junping. The timeout is definitely necessary.

* Sounds like we need a new NM state, "decommission_in_progress", for when the NM is draining its containers. Once the RM considers the decommission complete, the node will be marked "decommissioned".

* To clarify my earlier comment, "all its map output are fetched or until all the applications the node touches have completed": the question is when YARN can declare that a node's state has been gracefully drained, and thus that the node is gracefully decommissioned (admins can shut down the whole machine without any impact on jobs). For MR, the state could be running tasks/containers or mapper outputs. Say we have a decommission timeout of 30 minutes. If it takes 3 minutes to finish the mappers on the node and another 5 minutes for the job to finish, then YARN can declare the node gracefully decommissioned after 8 minutes instead of waiting the full 30. The RM knows all applications on any given NM, so once all applications on a node have completed, the RM can mark the node "decommissioned".

* Yes, I meant long-running services. If YARN just kills the containers upon a decommission request, the impact could vary. Some services might not have any state to drain, or they might handle state migration on their own without YARN's help. For such services, maybe we can just use ResourceOption's timeout: set the timeout to 0 and the NM will just kill the containers.

* Given that we don't plan to have applications checkpoint and migrate state, it doesn't seem necessary to have YARN notify applications upon decommission requests. Just calling it out.

* It might be useful to have a new state called "decommissioned_timeout", so that admins know whether the node was gracefully decommissioned or hit the timeout.
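
To make the proposed transitions concrete, here is a rough sketch in Python. It is only an illustration of the points above, not YARN code: the `Node` class, its method names, and the use of plain minutes for time are all made up for the example. Only the three state names come from this comment.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeState(Enum):
    RUNNING = "running"
    # State names below are the ones proposed in this comment.
    DECOMMISSION_IN_PROGRESS = "decommission_in_progress"
    DECOMMISSIONED = "decommissioned"
    DECOMMISSIONED_TIMEOUT = "decommissioned_timeout"


@dataclass
class Node:
    """Hypothetical model of what the RM tracks for one NM (not a YARN class)."""
    active_apps: set = field(default_factory=set)
    state: NodeState = NodeState.RUNNING
    deadline: float = 0.0

    def request_decommission(self, now, timeout_minutes):
        # Start draining; a timeout of 0 falls straight through to the kill path.
        self.deadline = now + timeout_minutes
        self.state = NodeState.DECOMMISSION_IN_PROGRESS
        self.poll(now)

    def app_completed(self, app_id, now):
        # The RM knows which applications touch the node, so it can re-check here.
        self.active_apps.discard(app_id)
        self.poll(now)

    def poll(self, now):
        if self.state is not NodeState.DECOMMISSION_IN_PROGRESS:
            return
        if not self.active_apps:
            # Everything drained before the deadline: graceful decommission.
            self.state = NodeState.DECOMMISSIONED
        elif now >= self.deadline:
            # Deadline hit with work remaining: containers are killed.
            self.active_apps.clear()
            self.state = NodeState.DECOMMISSIONED_TIMEOUT


# The 30-minute example: the job finishes at minute 8, so the node is done then.
node = Node(active_apps={"job_1"})
node.request_decommission(now=0, timeout_minutes=30)
node.poll(now=3)                    # mappers done, job still running
node.app_completed("job_1", now=8)
print(node.state.value)             # decommissioned
```

With `timeout_minutes=0` and a service still running, the same code ends in "decommissioned_timeout" immediately, which matches the "set timeout to 0 and the NM will just kill the containers" case.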

Thoughts?

> Support graceful decommission of nodemanager
> --------------------------------------------
>
>                 Key: YARN-914
>                 URL: https://issues.apache.org/jira/browse/YARN-914
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.0.4-alpha
>            Reporter: Luke Lu
>            Assignee: Junping Du
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.), it's desirable to minimize the impact to running applications.
> Currently, if an NM is decommissioned, all running containers on the NM need to be rescheduled on other NMs. Furthermore, for finished map tasks, if their map outputs have not been fetched by the reducers of the job, these map tasks will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a node manager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)