Posted to yarn-issues@hadoop.apache.org by "Sunil Govindan (JIRA)" <ji...@apache.org> on 2019/08/06 06:08:00 UTC

[jira] [Commented] (YARN-9721) An easy method to exclude a nodemanager from the yarn cluster cleanly

    [ https://issues.apache.org/jira/browse/YARN-9721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900661#comment-16900661 ] 

Sunil Govindan commented on YARN-9721:
--------------------------------------

Looping [~tangzhankun] to this thread.

[~yuan_zac], ideally node decommissioning will help you make sure all containers are drained so the node can be taken out smoothly. Once the node is decommissioned, you can remove it as your use case requires. And as you mentioned, nodes that are forced out like this should not remain in the inactive list.
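
As a rough illustration (an approximate sketch against the public YarnClient API, not part of any patch here; it assumes the cluster configuration is on the classpath and the class name is made up), this is one way to see which nodes the RM is still reporting as decommissioned after the drain:

{code:java}
import java.util.List;

import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Illustrative sketch: list the nodes the RM still reports as DECOMMISSIONED.
public class ListDecommissionedNodes {
  public static void main(String[] args) throws Exception {
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new YarnConfiguration());
    yarnClient.start();
    try {
      // Decommissioned nodes that are still tracked by the RM
      // (the "inactive" list mentioned above) show up here.
      List<NodeReport> reports =
          yarnClient.getNodeReports(NodeState.DECOMMISSIONED);
      for (NodeReport report : reports) {
        System.out.println(report.getNodeId() + " -> " + report.getNodeState());
      }
    } finally {
      yarnClient.stop();
    }
  }
}
{code}

If forced-out nodes keep showing up in that list long after removal, that is the inactive-map growth the description below is about.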

cc [~leftnoteasy] [~cheersyang]

> An easy method to exclude a nodemanager from the yarn cluster cleanly
> ---------------------------------------------------------------------
>
>                 Key: YARN-9721
>                 URL: https://issues.apache.org/jira/browse/YARN-9721
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zac Zhou
>            Priority: Major
>         Attachments: decommission nodes.png
>
>
> If we want to take a nodemanager server offline, nodes.exclude-path
>  and the "rmadmin -refreshNodes" command are used to decommission the server.
>  But this method cannot clean up the node completely. Nodemanager servers still show up under Decommissioned Nodes, as the attachment shows.
>   !decommission nodes.png!
> YARN-4311 enables a removalTimer to clean up untracked nodes.
>  But the logic of the isUntrackedNode method is too restrictive: if include-path is not used, no server can meet the criteria, and using an include file would introduce a potential maintenance risk.
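> For illustration only (a simplified, hypothetical sketch of that criterion, not the actual NodesListManager code), the check amounts to something like:
>
> {code:java}
> import java.util.Set;
>
> // Simplified sketch of the "untracked node" criterion described above:
> // a host only counts as untracked when an include file is configured
> // and the host appears in neither the include nor the exclude list.
> public final class UntrackedNodeCheck {
>   private UntrackedNodeCheck() {
>   }
>
>   public static boolean isUntracked(String host,
>       Set<String> includedHosts, Set<String> excludedHosts) {
>     return !includedHosts.isEmpty()
>         && !includedHosts.contains(host)
>         && !excludedHosts.contains(host);
>   }
> }
> {code}
>
> With an empty include list the check can never return true, so the removal timer from YARN-4311 has nothing to clean up.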
> If the yarn cluster is installed on cloud, nodemanager servers are created and deleted frequently, so we need a way to exclude a nodemanager from the yarn cluster cleanly. Otherwise, the map returned by rmContext.getInactiveRMNodes() keeps growing, which would cause a memory issue in the RM.


