Posted to hdfs-dev@hadoop.apache.org by "Daryn Sharp (JIRA)" <ji...@apache.org> on 2015/03/21 00:17:38 UTC

[jira] [Created] (HDFS-7967) Reduce the performance impact of the balancer

Daryn Sharp created HDFS-7967:
---------------------------------

             Summary: Reduce the performance impact of the balancer
                 Key: HDFS-7967
                 URL: https://issues.apache.org/jira/browse/HDFS-7967
             Project: Hadoop HDFS
          Issue Type: Sub-task
    Affects Versions: 2.0.0-alpha
            Reporter: Daryn Sharp
            Assignee: Daryn Sharp


The balancer needs to query for blocks to move from overly full DNs.  The block lookup is extremely inefficient.  An iterator over the node's blocks is built by chaining the iterators of its storages' blocks.  A random number is then chosen that determines how many blocks the iterator skips before any are returned.  Each skip requires a costly scan of triplets.
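
To make the cost concrete, here is a minimal sketch of the skip-then-collect pattern described above.  It is not the actual HDFS code: SkipCostSketch, Result and collect are hypothetical names, plain lists of block ids stand in for the storages, and every iterator step is counted as one unit of work (in the real code each step is the triplet scan, so it is more expensive than shown here).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not taken from the HDFS source.
public class SkipCostSketch {

    /** Blocks that were picked, plus the number of iterator steps it took. */
    static final class Result {
        final List<Long> blocks;
        final long steps;

        Result(List<Long> blocks, long steps) {
            this.blocks = blocks;
            this.steps = steps;
        }
    }

    /**
     * Walks the per-storage block lists as one chained sequence, discards
     * the first {@code skip} blocks, then collects up to {@code want} blocks.
     * Every call to next() is counted, including the discarded ones.
     */
    static Result collect(List<List<Long>> storages, int skip, int want) {
        List<Long> picked = new ArrayList<>();
        long steps = 0;
        int skipped = 0;
        for (List<Long> storage : storages) {
            Iterator<Long> it = storage.iterator();
            while (it.hasNext() && picked.size() < want) {
                long blockId = it.next();
                steps++;
                if (skipped < skip) {
                    skipped++;          // work spent on a block we throw away
                } else {
                    picked.add(blockId);
                }
            }
        }
        return new Result(picked, steps);
    }
}
```

With a random skip of s and a request for n blocks, the walk takes s + n steps, so most of the work goes into blocks that are never returned whenever s is large relative to n.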

The current design also considers only imbalances between nodes while ignoring imbalances within a node's storages.  A more efficient and intelligent design may eliminate the costly skipping of blocks via round-robin selection of blocks from the storages based on remaining capacity.
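
One possible reading of the round-robin idea, as a rough sketch rather than a proposed patch: order the node's storages by remaining capacity, fullest first, and take one block from each in turn until enough blocks are collected.  RoundRobinSketch, Storage and pickBlocks are hypothetical names, and the "fullest first" ordering is an assumption; the description above only says the selection is based on remaining capacity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; not taken from the HDFS source.
public class RoundRobinSketch {

    /** Stand-in for one storage of a DN: remaining capacity and its blocks. */
    static final class Storage {
        final String id;
        final long remainingBytes;
        final Deque<Long> blocks;

        Storage(String id, long remainingBytes, Long... blockIds) {
            this.id = id;
            this.remainingBytes = remainingBytes;
            this.blocks = new ArrayDeque<>(Arrays.asList(blockIds));
        }
    }

    /**
     * Picks up to {@code want} blocks by visiting the storages round-robin,
     * least remaining capacity first (assumed ordering).  Picked blocks are
     * removed from the storage's queue, so no block is visited twice and
     * nothing is skipped over.
     */
    static List<Long> pickBlocks(List<Storage> storages, int want) {
        List<Storage> order = new ArrayList<>(storages);
        order.sort(Comparator.comparingLong((Storage s) -> s.remainingBytes));

        List<Long> picked = new ArrayList<>();
        boolean progress = true;
        while (picked.size() < want && progress) {
            progress = false;           // stop once every storage is empty
            for (Storage s : order) {
                if (picked.size() >= want) {
                    break;
                }
                Long blockId = s.blocks.pollFirst();
                if (blockId != null) {
                    picked.add(blockId);
                    progress = true;
                }
            }
        }
        return picked;
    }
}
```

In this sketch the work is proportional to the number of blocks returned, and the fullest storages contribute blocks first, which is one way the selection could also address imbalance within a node.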



