Posted to hdfs-dev@hadoop.apache.org by "Ming Ma (JIRA)" <ji...@apache.org> on 2016/03/24 18:05:25 UTC
[jira] [Created] (HDFS-10206) getBlockLocations might not sort
datanodes properly by distance
Ming Ma created HDFS-10206:
------------------------------
Summary: getBlockLocations might not sort datanodes properly by distance
Key: HDFS-10206
URL: https://issues.apache.org/jira/browse/HDFS-10206
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Ming Ma
If the DFSClient machine is not a datanode but shares a rack with some of the datanodes holding the requested HDFS block, {{DatanodeManager#sortLocatedBlocks}} might not put the local-rack datanodes at the beginning of the sorted list. This is because the function does not call {{networktopology.add(client);}} to set the client node's parent node, which {{networktopology.sortByDistance}} requires in order to compute the distance between two nodes in the same topology tree.
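To illustrate why the missing parent matters, here is a toy model of the situation. It is only a sketch: {{ToyTopology}}, {{Rack}} and {{Node}} are hypothetical classes, not Hadoop's, and it assumes the same-rack check compares parent pointers, so a node that was never added to the topology cannot match any rack.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these classes are hypothetical and are not Hadoop's NetworkTopology.
class ToyTopology {
    static final class Rack {
        final String path;
        Rack(String path) { this.path = path; }
    }

    static final class Node {
        final String name;
        final String rackPath; // resolved network location, e.g. "/rack1"
        Rack parent;           // stays null until the node is added to the topology
        Node(String name, String rackPath) {
            this.name = name;
            this.rackPath = rackPath;
        }
    }

    private final Map<String, Rack> racks = new HashMap<>();

    // Attach the node to its rack, which sets its parent pointer.
    void add(Node node) {
        node.parent = racks.computeIfAbsent(node.rackPath, Rack::new);
    }

    // Same-rack test by parent identity; a node with no parent never matches.
    boolean isOnSameRack(Node a, Node b) {
        return a.parent != null && a.parent == b.parent;
    }
}
```

In this model, a client created with rack path "/rack1" is not reported as sharing a rack with a datanode in "/rack1" until {{add(client)}} has been called, even though both resolve to the same location.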
Another issue with {{networktopology.sortByDistance}} is that it only distinguishes the local rack from remote racks; it does not support a general distance calculation that tells how remote a rack is.
{noformat}
NetworkTopology.java
protected int getWeight(Node reader, Node node) {
  // 0 is local, 1 is same rack, 2 is off rack
  // Start off by initializing to off rack
  int weight = 2;
  if (reader != null) {
    if (reader.equals(node)) {
      weight = 0;
    } else if (isOnSameRack(reader, node)) {
      weight = 1;
    }
  }
  return weight;
}
{noformat}
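For comparison, a more general weight could be derived from the number of tree edges between the reader and each candidate node. The following is a minimal sketch under stated assumptions, not a proposed patch: {{PathDistance}} and its methods are hypothetical, and each node is assumed to be identified by a full path that starts with "/", such as "/d1/r1/n1" (datacenter, rack, node).

```java
import java.util.Comparator;
import java.util.List;

// Sketch only: hypothetical helper, not part of Hadoop's NetworkTopology.
class PathDistance {
    // Number of tree edges between two nodes given as full paths like "/d1/r1/n1".
    // Assumes both paths start with "/".
    static int distance(String a, String b) {
        String[] pa = a.substring(1).split("/");
        String[] pb = b.substring(1).split("/");
        int common = 0;
        // Count the leading path components shared by both nodes.
        while (common < pa.length && common < pb.length
                && pa[common].equals(pb[common])) {
            common++;
        }
        // Edges up from a to the common ancestor, plus edges down to b.
        return (pa.length - common) + (pb.length - common);
    }

    // Sort candidate nodes in place, nearest to the reader first.
    static void sortByDistance(String reader, List<String> nodes) {
        nodes.sort(Comparator.comparingInt((String n) -> distance(reader, n)));
    }
}
```

With three-level paths like these, the same node yields 0, a node on the same rack yields 2, a node on another rack in the same datacenter yields 4, and a node in another datacenter yields 6, so remote racks are no longer all treated alike.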
HDFS-10203 has suggested moving the sorting from the namenode to DFSClient to address another issue. Regardless of where we do the sorting, we still need to fix the issues outlined here.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)