Posted to hdfs-dev@hadoop.apache.org by "He Tianyi (JIRA)" <ji...@apache.org> on 2016/04/14 10:53:25 UTC
[jira] [Created] (HDFS-10290) Move getBlocks calls to DataNode in Balancer
He Tianyi created HDFS-10290:
--------------------------------
Summary: Move getBlocks calls to DataNode in Balancer
Key: HDFS-10290
URL: https://issues.apache.org/jira/browse/HDFS-10290
Project: Hadoop HDFS
Issue Type: New Feature
Components: balancer & mover
Affects Versions: 2.6.0
Reporter: He Tianyi
In the current implementation, the Balancer asks the NameNode for the list of blocks on a specific DataNode. This increases the NameNode's workload, and it actually caused the NameNode to flap once the average number of blocks per DataNode reached 1,000,000 (NameNode heap size: 192GB, CPU: Xeon E5-2630 * 2).
Recently I investigated whether the {{getBlocks}} invocation from the Balancer can be handled by DataNodes instead, and it turned out to be practical.
The only pitfall is that, since a DataNode has no information about the other locations of each block it holds, some block moves may fail (the target node may already have a replica of that particular block).
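To make the pitfall concrete, here is a minimal sketch of how a Balancer-side loop might tolerate such failures. It is an illustration only: the class, method, and field names are hypothetical and are not Hadoop APIs, and the target's "already has a replica" check is simulated with an in-memory set. The assumption is that a rejected move is cheap and the Balancer simply proceeds to the next candidate block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names exist in Hadoop.
final class DataNodeGetBlocksSketch {

    /** Outcome of one balancing pass from a source node to a target node. */
    static final class MoveResult {
        final List<Long> moved = new ArrayList<>();     // blocks accepted by the target
        final List<Long> rejected = new ArrayList<>();  // blocks the target already held
    }

    /**
     * @param sourceBlocks block IDs reported by the source DataNode itself; it
     *                     knows nothing about where other replicas live
     * @param targetBlocks block IDs already stored on the target (simulated);
     *                     only the target can detect a duplicate, at move time
     * @param maxMoves     stop after this many successful moves
     */
    static MoveResult moveBlocks(List<Long> sourceBlocks, Set<Long> targetBlocks, int maxMoves) {
        MoveResult result = new MoveResult();
        for (Long blockId : sourceBlocks) {
            if (result.moved.size() >= maxMoves) {
                break;
            }
            if (targetBlocks.contains(blockId)) {
                // The target already has a replica, so this move fails.
                // Record it and fall through to the next candidate.
                result.rejected.add(blockId);
                continue;
            }
            targetBlocks.add(blockId);
            result.moved.add(blockId);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> source = Arrays.asList(1L, 2L, 3L, 4L);
        Set<Long> target = new HashSet<>(Arrays.asList(2L));
        MoveResult r = moveBlocks(source, target, 2);
        // prints: moved=[1, 3] rejected=[2]
        System.out.println("moved=" + r.moved + " rejected=" + r.rejected);
    }
}
```

In this sketch the cost of the pitfall is one wasted attempt per duplicate block. Whether that is acceptable in practice depends on how expensive a failed move is on a real cluster, which this example does not model.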
I think this may be beneficial for large clusters.
Any suggestions or comments?
Thanks.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)