Posted to common-issues@hadoop.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2018/04/24 20:50:08 UTC

[jira] [Commented] (HADOOP-14412) HostsFileReader#getHostDetails is very expensive on large clusters

    [ https://issues.apache.org/jira/browse/HADOOP-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450742#comment-16450742 ] 

Hudson commented on HADOOP-14412:
---------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14057 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14057/])
HADOOP-14412. HostsFileReader#getHostDetails is very expensive on large (xyao: rev a1ad4ea273ecc10cc5dad6465ee9bdff233e7666)
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/HostsFileReader.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestHostsFileReader.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java


> HostsFileReader#getHostDetails is very expensive on large clusters
> ------------------------------------------------------------------
>
>                 Key: HADOOP-14412
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14412
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.8.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>             Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
>         Attachments: HADOOP-14412-branch-2.001.patch, HADOOP-14412-branch-2.002.patch, HADOOP-14412-branch-2.002.patch, HADOOP-14412-branch-2.8.002.patch, HADOOP-14412.001.patch, HADOOP-14412.002.patch
>
>
> After upgrading one of our large clusters to 2.8, we noticed many IPC server threads of the ResourceManager spending time in NodesListManager#isValidNode, which in turn calls HostsFileReader#getHostDetails.  The latter creates complete copies of the include and exclude sets on every node heartbeat, and these sets are not small given the size of the cluster.  The copies trigger multiple resizes of the underlying HashSets as they are filled and generate a lot of garbage.
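
One common way to avoid per-call copying of this kind is to build an immutable snapshot of both sets once, when the host files are re-read, and to hand the same snapshot to every caller. The sketch below illustrates that idea only; it is not the committed patch. The names HostListsSketch, Snapshot, refresh and getSnapshot are hypothetical, and the validity check is a simplified assumption (an empty include list admits every host).

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a hosts-file reader; not the Hadoop class. */
final class HostListsSketch {

  /** Immutable snapshot of both lists, built once per refresh. */
  static final class Snapshot {
    private final Set<String> includes;
    private final Set<String> excludes;

    Snapshot(Set<String> includes, Set<String> excludes) {
      // Copy once here, then expose only read-only views of the copies.
      this.includes = Collections.unmodifiableSet(new HashSet<>(includes));
      this.excludes = Collections.unmodifiableSet(new HashSet<>(excludes));
    }

    Set<String> getIncludes() { return includes; }
    Set<String> getExcludes() { return excludes; }
  }

  // Readers see either the old or the new snapshot, never a half-built one.
  private final AtomicReference<Snapshot> current =
      new AtomicReference<>(
          new Snapshot(Collections.emptySet(), Collections.emptySet()));

  /** Rare path: called when the include/exclude files are re-read. */
  void refresh(Set<String> includes, Set<String> excludes) {
    current.set(new Snapshot(includes, excludes));
  }

  /** Hot path: returns the shared snapshot without copying anything. */
  Snapshot getSnapshot() {
    return current.get();
  }

  /** Simplified validity check (assumed semantics, hostname only). */
  boolean isValidNode(String host) {
    Snapshot s = getSnapshot();
    boolean included =
        s.getIncludes().isEmpty() || s.getIncludes().contains(host);
    return included && !s.getExcludes().contains(host);
  }
}
```

With this shape, the cost of copying is paid once per refresh rather than once per heartbeat, and callers cannot modify the shared sets because only unmodifiable views are exposed.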


