Posted to yarn-issues@hadoop.apache.org by "shenhong (JIRA)" <ji...@apache.org> on 2013/08/13 20:40:57 UTC

[jira] [Commented] (YARN-435) Make it easier to access cluster topology information in an AM

    [ https://issues.apache.org/jira/browse/YARN-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738647#comment-13738647 ] 

shenhong commented on YARN-435:
-------------------------------

First, if the AM gets all the nodes in the cluster, including their rack information, by calling the RM, this will increase pressure on the RM's network, for example when the cluster has more than 5000 datanodes.

Second, if the YARN cluster has only 100 NodeManagers but the HDFS it accesses is a cluster with more than 5000 datanodes, the RM can't give us all the nodes and their rack information. However, the AM needs information about all the datanodes in its job.splitmetainfo file in order to initialize TaskAttempts, so in this case calling the RM is not enough.
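
To illustrate the second point, here is a minimal sketch in plain Python (not Hadoop code). The host names, the rack map, and the helper hosts_missing_from_rm are all hypothetical, and the cluster is scaled down to a handful of nodes. It only shows that split hosts which are HDFS datanodes but not NodeManagers have no rack entry in what the RM reports.

```python
def hosts_missing_from_rm(split_hosts, rm_node_racks):
    """Return the split hosts that have no rack entry in the RM's node report.

    split_hosts   -- host names taken from block locations (assumed input)
    rm_node_racks -- mapping host -> rack, as an RM node report might provide
    """
    return sorted(h for h in set(split_hosts) if h not in rm_node_racks)


# Hypothetical scaled-down setup: the RM knows 3 NodeManagers,
# while the block locations of the input splits span other datanodes too.
rm_node_racks = {"nm1": "/rack1", "nm2": "/rack1", "nm3": "/rack2"}
split_hosts = ["nm1", "dn4", "dn5", "nm3", "dn6", "dn4"]

missing = hosts_missing_from_rm(split_hosts, rm_node_racks)
print(missing)
```

Any host in the returned list would still need its rack resolved by some other means than the RM's node list.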
                
> Make it easier to access cluster topology information in an AM
> --------------------------------------------------------------
>
>                 Key: YARN-435
>                 URL: https://issues.apache.org/jira/browse/YARN-435
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Hitesh Shah
>            Assignee: Omkar Vinit Joshi
>
> ClientRMProtocol exposes a getClusterNodes api that provides a report on all nodes in the cluster including their rack information. 
> However, this requires the AM to open and establish a separate connection to the RM in addition to one for the AMRMProtocol. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira