Posted to hdfs-dev@hadoop.apache.org by "Max Lapan (JIRA)" <ji...@apache.org> on 2015/09/03 16:37:46 UTC

[jira] [Created] (HDFS-9014) Block placement policy with respect to DN free space

Max Lapan created HDFS-9014:
-------------------------------

             Summary: Block placement policy with respect to DN free space
                 Key: HDFS-9014
                 URL: https://issues.apache.org/jira/browse/HDFS-9014
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: namenode
            Reporter: Max Lapan


The default block allocation policy (also known as the 'replication policy') implemented in the NN selects randomly from suitable candidates (rack-local or 'other rack'). This is fine when all DNs in a cluster have nearly equal amounts of storage, but it leads to problems when some DNs are significantly larger than others. In that situation, because the NN places new blocks at random, the extra space is almost unusable. In the extreme case, all the 'small' DNs can reach 100% usage while the 'large' ones are almost empty, which can lead to various HDFS and MR problems.
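
The effect can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: one replica per block, no racks, capacities counted in whole blocks, and a hypothetical class name UniformPlacementDemo that is not part of HDFS. It places blocks uniformly at random and stops as soon as any node is full.

```java
import java.util.Random;

public class UniformPlacementDemo {
    /**
     * Places single-replica blocks uniformly at random across nodes until
     * any one node becomes full, then returns the block count per node.
     * Assumes every capacity is positive. This is a simplification for
     * illustration, not the actual NN placement code.
     */
    static long[] fillUntilFirstFull(long[] capacityInBlocks, Random rng) {
        long[] used = new long[capacityInBlocks.length];
        while (true) {
            int i = rng.nextInt(capacityInBlocks.length); // every node equally likely
            used[i]++;
            if (used[i] == capacityInBlocks[i]) {
                return used; // first node has run out of space
            }
        }
    }

    public static void main(String[] args) {
        // Three 'small' nodes and one 'large' node with 10x the capacity.
        long[] capacity = {1_000, 1_000, 1_000, 10_000};
        long[] used = fillUntilFirstFull(capacity, new Random(1));
        for (int i = 0; i < used.length; i++) {
            System.out.printf("node %d: %d of %d blocks (%.1f%%)%n",
                    i, used[i], capacity[i], 100.0 * used[i] / capacity[i]);
        }
    }
}
```

Because every node receives roughly the same number of blocks, the large node is only around a tenth full when the first small node runs out of space.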

Having datanodes of different sizes is quite realistic in large, long-lived systems, where different generations of machines are put in a single cluster.

To overcome this, I implemented a different block allocation policy that places blocks with respect to the free space available on a DN. Please consider it for inclusion in the HDFS codebase.
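
One possible way to take free space into account is to pick a candidate with probability proportional to its remaining space. The sketch below shows that idea only; it is not the policy attached to this issue, and the names FreeSpaceWeightedChooser, Node and choose are hypothetical, not HDFS classes or methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class FreeSpaceWeightedChooser {
    /** Hypothetical stand-in for a datanode; not an HDFS class. */
    static final class Node {
        final String name;
        final long capacity; // bytes
        final long used;     // bytes
        Node(String name, long capacity, long used) {
            this.name = name;
            this.capacity = capacity;
            this.used = used;
        }
        long remaining() { return capacity - used; }
    }

    /**
     * Picks one candidate with probability proportional to its free space.
     * Nodes that cannot hold a block of blockSize bytes are skipped.
     * Returns null if no candidate has room.
     */
    static Node choose(List<Node> candidates, long blockSize, Random rng) {
        long totalFree = 0;
        for (Node n : candidates) {
            if (n.remaining() >= blockSize) totalFree += n.remaining();
        }
        if (totalFree == 0) return null;

        long pick = (long) (rng.nextDouble() * totalFree); // point in [0, totalFree)
        Node lastEligible = null;
        for (Node n : candidates) {
            if (n.remaining() < blockSize) continue;
            lastEligible = n;
            pick -= n.remaining();
            if (pick < 0) return n; // the point falls inside this node's share
        }
        return lastEligible; // guards against floating-point rounding at the edge
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("small", 1_000, 900));  // 100 bytes free
        nodes.add(new Node("large", 10_000, 100)); // 9,900 bytes free
        Random rng = new Random(42);
        int large = 0;
        for (int i = 0; i < 10_000; i++) {
            if (choose(nodes, 10, rng).name.equals("large")) large++;
        }
        System.out.println("large chosen " + large + " times out of 10000");
    }
}
```

With this weighting, emptier nodes receive proportionally more new blocks, so utilisation tends to even out over time. A real policy would still have to respect rack placement and the other constraints the NN already applies.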



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)