Posted to yarn-issues@hadoop.apache.org by "Eric Badger (Jira)" <ji...@apache.org> on 2020/06/05 19:12:00 UTC

[jira] [Commented] (YARN-9809) NMs should supply a health status when registering with RM

    [ https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127025#comment-17127025 ] 

Eric Badger commented on YARN-9809:
-----------------------------------

Patch 001 adds the feature but makes it opt-in via the config {{yarn.nodemanager.health-checker.run-before-startup}}. I did not add the retries flag that would shut down the NM after a certain number of health check failures. I can do that in a subsequent patch if you'd like. I tested this patch and it seems to work.
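
To illustrate the behavior the opt-in is meant to change, here is a minimal toy model in Java. It is a sketch under stated assumptions, not the code in the patch: {{ToyResourceManager}}, {{ToyNodeManager}}, and {{HealthAtRegistrationSketch}} are hypothetical names, and the boolean {{runBeforeStartup}} only stands in for the config above. The model assumes the RM treats a node as healthy until it is told otherwise.

```java
import java.util.function.BooleanSupplier;

// Toy model only: these classes are hypothetical and are not YARN APIs.
final class ToyResourceManager {
    // Without a status at registration, the RM treats the node as healthy.
    private boolean nodeHealthy = true;
    private int runningContainers = 0;

    // A null argument models a registration that carries no health status.
    void register(Boolean healthAtRegistration) {
        if (healthAtRegistration != null) {
            nodeHealthy = healthAtRegistration;
        }
    }

    // Places one container on the node if the RM believes it is healthy.
    boolean trySchedule() {
        if (!nodeHealthy) {
            return false;
        }
        runningContainers++;
        return true;
    }

    // Processes a heartbeat; an unhealthy report kills all running containers.
    // Returns the number of containers killed.
    int heartbeat(boolean healthy) {
        nodeHealthy = healthy;
        if (healthy) {
            return 0;
        }
        int killed = runningContainers;
        runningContainers = 0;
        return killed;
    }
}

final class ToyNodeManager {
    private final boolean runBeforeStartup; // stands in for the opt-in config
    private final BooleanSupplier healthCheck;

    ToyNodeManager(boolean runBeforeStartup, BooleanSupplier healthCheck) {
        this.runBeforeStartup = runBeforeStartup;
        this.healthCheck = healthCheck;
    }

    // Runs the health check before registering only when the opt-in is set.
    void registerWith(ToyResourceManager rm) {
        Boolean status = runBeforeStartup
                ? Boolean.valueOf(healthCheck.getAsBoolean())
                : null;
        rm.register(status);
    }

    boolean isHealthy() {
        return healthCheck.getAsBoolean();
    }
}

final class HealthAtRegistrationSketch {
    // Simulates an always-unhealthy node and returns how many containers
    // the RM kills when the first heartbeat arrives.
    static int containersKilledAtFirstHeartbeat(boolean runBeforeStartup, int attempts) {
        ToyResourceManager rm = new ToyResourceManager();
        ToyNodeManager nm = new ToyNodeManager(runBeforeStartup, () -> false);
        nm.registerWith(rm);
        for (int i = 0; i < attempts; i++) {
            rm.trySchedule();
        }
        return rm.heartbeat(nm.isHealthy());
    }

    public static void main(String[] args) {
        System.out.println("opt-in off: killed " + containersKilledAtFirstHeartbeat(false, 5));
        System.out.println("opt-in on:  killed " + containersKilledAtFirstHeartbeat(true, 5));
    }
}
```

In this model, with the opt-in off, all 5 containers are scheduled and then killed at the first heartbeat. With it on, the RM already knows the node is unhealthy at registration, so nothing is scheduled and nothing is killed.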

> NMs should supply a health status when registering with RM
> ----------------------------------------------------------
>
>                 Key: YARN-9809
>                 URL: https://issues.apache.org/jira/browse/YARN-9809
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>            Priority: Major
>         Attachments: YARN-9809.001.patch
>
>
> Currently, if an unhealthy NM registers with the RM, the RM can schedule many containers on it before the first heartbeat. After the first heartbeat, the RM will mark the NM as unhealthy and kill all of those containers.


