Posted to yarn-issues@hadoop.apache.org by "Wangda Tan (JIRA)" <ji...@apache.org> on 2015/02/11 22:59:13 UTC

[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

    [ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317048#comment-14317048 ] 

Wangda Tan commented on YARN-2495:
----------------------------------

Hi [~cwelch],
Thanks for jumping in and providing your thoughts, and really sorry for the late response.
I think your biggest concern is about DECENTRALIZED_CONFIGURATION_ENABLED, so let me share my thinking :)

IMHO, mixing decentralized and centralized configuration is dangerous and will produce non-deterministic results. You might consider merging them, so that some labels are set by the admin using RMAdminCLI and others are set by the NM. But here is an example showing that the result is still non-deterministic even if we add +/- operations to the ResourceTracker protocol:
- Assume a node has labels x,y (the NM reported +x,+y)
- RMAdmin removes y from the node (-y)
- The NM fails, then restarts and reports that it has x,y (+x,+y). What should the labels on the node be?
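
To make the three steps above concrete, here is a minimal sketch in Java. It is not actual YARN code: the class MixedLabelSourcesDemo and its methods are hypothetical, and it assumes the simplest possible merge rule, where the RM applies every +/- operation in arrival order no matter who sent it.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model, not actual YARN code: one label set per node, updated by
// "+label" / "-label" operations coming from either the NM or the admin CLI.
class MixedLabelSourcesDemo {
    private final Set<String> labels = new TreeSet<>();

    // Apply operations such as "+x" or "-y" in the order they arrive.
    void apply(String... ops) {
        for (String op : ops) {
            String label = op.substring(1);
            if (op.charAt(0) == '+') {
                labels.add(label);
            } else if (op.charAt(0) == '-') {
                labels.remove(label);
            } else {
                throw new IllegalArgumentException("Bad operation: " + op);
            }
        }
    }

    Set<String> labels() {
        return new TreeSet<>(labels);
    }

    // Replays the scenario from the list above and returns the final labels.
    static Set<String> replay() {
        MixedLabelSourcesDemo node = new MixedLabelSourcesDemo();
        node.apply("+x", "+y"); // NM registers and reports x,y
        node.apply("-y");       // admin removes y (assumed to go through RMAdminCLI)
        node.apply("+x", "+y"); // NM restarts and reports its local config again
        return node.labels();
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints [x, y]
    }
}
```

Under this assumed rule the admin's removal of y is silently undone by the restart. The opposite rule (admin edits always win) would instead leave the node permanently disagreeing with its own NM configuration, so neither outcome is obviously the right one.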

I also don't like adding too many switches to the configuration, but a switch seems a good way to support both modes with deterministic behavior.

For your other suggestions:
- Name changes (is -> are),
- Making RegisterNodeManagerRequest consistent with NodeHeartbeatRequest
I agree with both.

One more suggestion (as suggested by [~vinodkv]): when there is anything wrong with the node labels reported by an NM, we should fail the NM (ask it to shut down and give it a proper diagnostic message). This is because if an NM reports a label and it is rejected, then even if the RM tells the NM so, the NM cannot handle it properly beyond printing some error messages (we don't have smart logic for this now). That will make debugging harder (an NM reported some label to the RM, but the scheduler failed to allocate containers on the NM). To avoid this, a simple way is to shut down the NM so the admin can take a look at what happened.
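
As a rough illustration of the fail-fast idea, here is a minimal Java sketch. All names in it (NodeLabelCheckDemo, Action, Response, checkReportedLabels) are hypothetical and not taken from the patch, and it assumes that "anything wrong" means the NM reported a label that is not in the RM's known set of cluster labels.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not actual YARN code: the RM-side check that decides
// whether an NM may continue or must shut down because of its reported labels.
class NodeLabelCheckDemo {
    enum Action { NORMAL, SHUTDOWN }

    static final class Response {
        final Action action;
        final String diagnostics;

        Response(Action action, String diagnostics) {
            this.action = action;
            this.diagnostics = diagnostics;
        }
    }

    // Assumption: a reported label is invalid if the RM does not know it.
    static Response checkReportedLabels(Set<String> clusterLabels, Set<String> reported) {
        Set<String> unknown = new TreeSet<>(reported);
        unknown.removeAll(clusterLabels);
        if (unknown.isEmpty()) {
            return new Response(Action.NORMAL, "");
        }
        // Fail the NM and tell the admin exactly which labels were rejected.
        return new Response(Action.SHUTDOWN,
            "NM reported node labels unknown to the RM: " + unknown);
    }
}
```

With this shape the NM needs no label-specific recovery logic: it only has to honor the shutdown action and log the diagnostic message, which the admin can then read on the node.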

Thoughts?
Wangda

> Allow admin specify labels from each NM (Distributed configuration)
> -------------------------------------------------------------------
>
>                 Key: YARN-2495
>                 URL: https://issues.apache.org/jira/browse/YARN-2495
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Naganarasimha G R
>         Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)