Posted to hdfs-dev@hadoop.apache.org by "Jihoon Son (JIRA)" <ji...@apache.org> on 2013/06/25 07:40:19 UTC

[jira] [Created] (HDFS-4931) Extend the block placement policy interface to utilize the location information of previously stored files

Jihoon Son created HDFS-4931:
--------------------------------

             Summary: Extend the block placement policy interface to utilize the location information of previously stored files  
                 Key: HDFS-4931
                 URL: https://issues.apache.org/jira/browse/HDFS-4931
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Jihoon Son


I am currently implementing a locality-preserving block placement policy that stores the files of a directory on the same datanode. That is, given a root directory, the files under it are grouped by the path of their parent directory, and the files of each group are stored on the same datanode.

When a new file is stored in HDFS, the block placement policy chooses the target datanode by considering the locations of previously stored files.
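
To make the grouping rule concrete, here is a minimal sketch in plain Java. It is not the attached patch and does not use any HDFS classes: LocalityGroupSketch, groupKey and chooseTarget are hypothetical names, datanodes are represented by plain strings, and the hash-based choice for the first file of a group is only an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the grouping rule; not part of HDFS. */
final class LocalityGroupSketch {
    private final String root;
    // Group key (parent directory) -> datanode chosen for that group.
    private final Map<String, String> groupToDatanode = new HashMap<>();

    LocalityGroupSketch(String root) {
        // Normalize so that "/data" and "/data/" behave the same.
        this.root = root.endsWith("/") ? root : root + "/";
    }

    /** Returns the parent directory of a file under the root, or null if the file is outside the root. */
    String groupKey(String filePath) {
        if (!filePath.startsWith(root)) {
            return null;
        }
        int lastSlash = filePath.lastIndexOf('/');
        return lastSlash == 0 ? "/" : filePath.substring(0, lastSlash);
    }

    /** Picks the datanode already used by the file's group, if it is still live. */
    String chooseTarget(String filePath, List<String> liveDatanodes) {
        if (liveDatanodes.isEmpty()) {
            throw new IllegalArgumentException("no live datanodes");
        }
        String key = groupKey(filePath);
        if (key == null) {
            // Outside the root: no grouping applies, fall back to any node.
            return liveDatanodes.get(0);
        }
        String previous = groupToDatanode.get(key);
        if (previous != null && liveDatanodes.contains(previous)) {
            return previous;
        }
        // First file of the group, or its node is gone: pick one and remember it.
        String chosen = liveDatanodes.get(Math.floorMod(key.hashCode(), liveDatanodes.size()));
        groupToDatanode.put(key, chosen);
        return chosen;
    }
}
```

With this sketch, two files such as /data/a/f1 and /data/a/f2 share the group key /data/a and therefore get the same target, as long as that datanode stays in the live list.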

The current block placement policy interface has two problems for this use case. First, there is no way for the policy to retain the locations of previously stored files across an HDFS restart. The location information of all files has to be restored while the namenode is in safe mode.

To solve the first problem, I modified the block placement policy interface and FSNamesystem. Before the namenode leaves safe mode, all the necessary location information is sent to the block placement policy.
However, my implementation changes too many access modifiers from private to public, which may violate the design of the interface.
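
One possible shape for such a hook, shown only as a sketch and not as what the attached patch does: the namenode pushes each known file location into the policy through a narrow callback, so the policy never has to reach into private namenode state. LocationRestorable, restoreLocation, restoreFinished and RestoringPolicySketch are hypothetical names that do not exist in HDFS, and datanodes are again plain strings.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical extension point; none of these names exist in HDFS. */
interface LocationRestorable {
    /** Called once per known file while the namenode is still in safe mode. */
    void restoreLocation(String filePath, String datanode);

    /** Called after the last restoreLocation call, before safe mode ends. */
    void restoreFinished();
}

/** Hypothetical policy that rebuilds its group-to-datanode map on restart. */
final class RestoringPolicySketch implements LocationRestorable {
    private final Map<String, String> groupToDatanode = new HashMap<>();
    private boolean ready = false;

    @Override
    public void restoreLocation(String filePath, String datanode) {
        if (ready) {
            throw new IllegalStateException("restore already finished");
        }
        int slash = filePath.lastIndexOf('/');
        String group = slash <= 0 ? "/" : filePath.substring(0, slash);
        // Keep the first location seen for each group.
        groupToDatanode.putIfAbsent(group, datanode);
    }

    @Override
    public void restoreFinished() {
        ready = true;
    }

    boolean isReady() {
        return ready;
    }

    String datanodeFor(String group) {
        return groupToDatanode.get(group);
    }
}
```

The point of the design is that only the two callback methods would need to be visible to the namenode side; everything else stays private to the policy.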

The second problem occurs when blocks are moved by the balancer or because of node failures. In this case, the block placement policy should recognize the current state and return a new datanode to which the blocks should be moved. However, the current interface does not support this.
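
As an illustration of what the missing call might look like, the sketch below asks the policy for a relocation target when a block of a group has to leave its current datanode. RelocationSketch, assign and chooseRelocationTarget are hypothetical names, not existing HDFS APIs, and picking the first live node other than the source is only a placeholder for a real selection rule.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a relocation hook; not part of HDFS. */
final class RelocationSketch {
    // Group key (parent directory) -> datanode currently holding the group.
    private final Map<String, String> groupToDatanode = new HashMap<>();

    void assign(String group, String datanode) {
        groupToDatanode.put(group, datanode);
    }

    /**
     * Returns the datanode a block of the given group should move to when it
     * has to leave the source node, or null if no other live node exists.
     */
    String chooseRelocationTarget(String group, String source, List<String> liveDatanodes) {
        String home = groupToDatanode.get(group);
        if (home != null && !home.equals(source) && liveDatanodes.contains(home)) {
            // The group's node is still usable: move the block back to it.
            return home;
        }
        for (String candidate : liveDatanodes) {
            if (!candidate.equals(source)) {
                // The group needs a new node: record it so later blocks follow.
                groupToDatanode.put(group, candidate);
                return candidate;
            }
        }
        return null;
    }
}
```

Because the new node is recorded, later blocks of the same group are sent to the same place, which keeps the group together after a failure or a balancer move.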

The attached patch addresses the first problem but, as mentioned above, it may violate the design of the interface.
Do you have any good ideas?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira