Posted to hdfs-dev@hadoop.apache.org by "Karthik Palanisamy (JIRA)" <ji...@apache.org> on 2019/03/20 00:00:00 UTC

[jira] [Created] (HDFS-14383) Compute datanode load based on StoragePolicy

Karthik Palanisamy created HDFS-14383:
-----------------------------------------

             Summary: Compute datanode load based on StoragePolicy
                 Key: HDFS-14383
                 URL: https://issues.apache.org/jira/browse/HDFS-14383
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs, namenode
    Affects Versions: 3.1.2, 2.7.3
            Reporter: Karthik Palanisamy
            Assignee: Karthik Palanisamy


The DataNode load check logic needs to change because the existing computation does not take StoragePolicy into account.

DatanodeManager#getInServiceXceiverAverage

{code}

public double getInServiceXceiverAverage() {
  double avgLoad = 0;
  final int nodes = getNumDatanodesInService();
  if (nodes != 0) {
    final int xceivers = heartbeatManager
        .getInServiceXceiverCount();
    avgLoad = (double) xceivers / nodes;
  }
  return avgLoad;
}

{code}

 

For example: with 10 nodes (HOT) averaging 50 xceivers each and 90 nodes (COLD) averaging 10 xceivers each, the threshold calculated by the NN is 28 (((500 + 900) / 100) * 2). This means all 10 HOT nodes (the whole HOT tier) become unavailable for placement while the COLD tier nodes are barely in use. Turning this check off mitigates the issue, but dfs.namenode.replication.considerLoad exists to balance the load across DNs, so disabling it can lead to situations where specific DNs become overloaded.
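The arithmetic above can be sketched as follows. This is a minimal, illustrative example, not the actual HDFS code: the class and method names are hypothetical, and the factor of 2 corresponds to the default considerLoad multiplier. It contrasts the current cluster-wide threshold with what a per-StoragePolicy threshold would look like.

```java
// Hypothetical sketch of the load-threshold arithmetic; names here are
// illustrative, not real HDFS API.
public class XceiverLoadExample {
    // Default considerLoad multiplier: a DN is "overloaded" when its
    // xceiver count exceeds 2x the average.
    static final double LOAD_FACTOR = 2.0;

    static double threshold(int totalXceivers, int nodes) {
        return LOAD_FACTOR * ((double) totalXceivers / nodes);
    }

    public static void main(String[] args) {
        int hotNodes = 10,  hotXceivers  = 10 * 50; // HOT tier: avg 50 per node
        int coldNodes = 90, coldXceivers = 90 * 10; // COLD tier: avg 10 per node

        // Current cluster-wide computation: ((500 + 900) / 100) * 2 = 28,
        // so every HOT node (running ~50 xceivers) is excluded.
        double global = threshold(hotXceivers + coldXceivers, hotNodes + coldNodes);
        System.out.println("global threshold = " + global);

        // A per-StoragePolicy computation would keep the HOT tier usable:
        double hot  = threshold(hotXceivers, hotNodes);   // 100.0
        double cold = threshold(coldXceivers, coldNodes); //  20.0
        System.out.println("HOT threshold = " + hot + ", COLD threshold = " + cold);
    }
}
```

With per-tier averages, a HOT node at 50 xceivers stays well under its own tier's threshold of 100 instead of being rejected against the global value of 28.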



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org