Posted to yarn-issues@hadoop.apache.org by "Junping Du (JIRA)" <ji...@apache.org> on 2013/11/12 15:04:18 UTC

[jira] [Commented] (YARN-558) Add ability to completely remove nodemanager from resourcemanager.

    [ https://issues.apache.org/jira/browse/YARN-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820113#comment-13820113 ] 

Junping Du commented on YARN-558:
---------------------------------

+1 on the cloud services use case. I think one feasible (although inconvenient) way to achieve this today is:
- First, decommission the nodes by putting them on the decommission list and calling refreshNodes().
- Then, wait for at least one heartbeat from each node to make sure the decommissioned nodes have been cleared.
- Finally, remove the nodes from the decommission list and call refreshNodes() again.
We do need something simpler.
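
The three-step workaround above could be scripted roughly as follows. This is only a minimal sketch under stated assumptions, not something taken from the YARN code base: it assumes the decommission list is the plain-text exclude file the resourcemanager is configured to read (one host per line), that `yarn rmadmin -refreshNodes` is available on the PATH, and that a fixed wait is long enough to cover at least one heartbeat from every node. The function names and the `wait_seconds` parameter are hypothetical.

```python
import subprocess
import time
from pathlib import Path


def refresh_nodes():
    """Ask the resourcemanager to re-read its node lists (assumes the yarn CLI is on PATH)."""
    subprocess.run(["yarn", "rmadmin", "-refreshNodes"], check=True)


def read_hosts(path):
    """Return the non-empty, stripped lines of a host list file (empty list if missing)."""
    p = Path(path)
    if not p.exists():
        return []
    return [line.strip() for line in p.read_text().splitlines() if line.strip()]


def write_hosts(path, hosts):
    """Write one host per line."""
    Path(path).write_text("".join(h + "\n" for h in hosts))


def decommission_then_forget(exclude_file, hosts, wait_seconds,
                             refresh=refresh_nodes, sleep=time.sleep):
    """Hypothetical helper: decommission `hosts`, wait, then drop them from the list.

    `refresh` and `sleep` are injectable so the flow can be exercised without a cluster.
    """
    to_remove = set(hosts)

    # Step 1: put the nodes on the decommission list and refresh.
    current = read_hosts(exclude_file)
    added = [h for h in dict.fromkeys(hosts) if h not in current]
    write_hosts(exclude_file, current + added)
    refresh()

    # Step 2: wait long enough for every node to heartbeat at least once.
    # A fixed sleep is an assumption; it does not verify that heartbeats arrived.
    sleep(wait_seconds)

    # Step 3: take the nodes off the decommission list and refresh again.
    remaining = [h for h in read_hosts(exclude_file) if h not in to_remove]
    write_hosts(exclude_file, remaining)
    refresh()
    return remaining
```

A call such as `decommission_then_forget("/path/to/excludes", ["node-a", "node-b"], wait_seconds=10)` would run the whole cycle once; the path, host names, and wait value here are placeholders. The fixed wait is the weakest part of the sketch, which is one more reason something simpler is needed.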

> Add ability to completely remove nodemanager from resourcemanager.
> ------------------------------------------------------------------
>
>                 Key: YARN-558
>                 URL: https://issues.apache.org/jira/browse/YARN-558
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Garth Goodson
>            Priority: Minor
>              Labels: feature
>
> I would like to add the ability to completely remove a nodemanager from the resourcemanager's state.
> I run a cloud service where I want to dynamically bring up nodes to act as nodemanagers and then bring them down again when they are not needed. These nodes have dynamically assigned IPs, so the alternative of decommissioning them via an excludes file leads to a large (unbounded) list of decommissioned nodes that may never be commissioned again. I would like the ability to take a node in the decommissioned state and remove it completely from the resourcemanager.
> I have thought of two ways of implementing this.
> 1) Add an optional timeout after which a decommissioned node is removed from the resourcemanager.
> 2) Add an explicit RPC to remove a node that is decommissioned.
> Any additional thoughts/discussion are welcome.


