Posted to hdfs-dev@hadoop.apache.org by "Yong Zhang (JIRA)" <ji...@apache.org> on 2015/09/01 10:29:46 UTC

[jira] [Created] (HDFS-8998) Small files storage supported inside HDFS

Yong Zhang created HDFS-8998:
--------------------------------

             Summary: Small files storage supported inside HDFS
                 Key: HDFS-8998
                 URL: https://issues.apache.org/jira/browse/HDFS-8998
             Project: Hadoop HDFS
          Issue Type: New Feature
            Reporter: Yong Zhang
            Assignee: Yong Zhang


HDFS handles small files poorly, as described in this blog post (http://blog.cloudera.com/blog/2009/02/the-small-files-problem).
The post also suggests some ways to store small files in HDFS, but none of them is a good general solution; HAR files and Sequence Files seem better suited to read-only files.

Currently each HDFS block belongs to exactly one HDFS file, so a large number of small files produces a large number of small blocks on the DataNodes, which puts a heavy load on them.
This JIRA will describe how to merge small blocks into a large one online, how to delete small files, and so on.
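
Since the design document is not yet available, the following is only a rough sketch of the general idea of packing several small files into one larger block and deleting them later. It is not the design proposed in this JIRA and it does not use any HDFS API; the class MergedBlock and its methods are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MergedBlock:
    """Hypothetical container that packs many small-file payloads into one block."""
    capacity: int                                  # maximum bytes the block may hold
    data: bytearray = field(default_factory=bytearray)
    index: dict = field(default_factory=dict)      # file name -> (offset, length)
    deleted_bytes: int = 0                         # space held by deleted files

    def append(self, name, payload):
        """Append a small file; return False if it does not fit in this block."""
        if name in self.index:
            raise KeyError(f"{name} already stored in this block")
        if len(self.data) + len(payload) > self.capacity:
            return False
        self.index[name] = (len(self.data), len(payload))
        self.data.extend(payload)
        return True

    def read(self, name):
        """Return the bytes of one small file using its recorded offset and length."""
        offset, length = self.index[name]
        return bytes(self.data[offset:offset + length])

    def delete(self, name):
        """Drop the index entry only; the bytes stay until compact() runs."""
        _, length = self.index.pop(name)
        self.deleted_bytes += length

    def compact(self):
        """Rewrite the block so that only live files remain, back to back."""
        new_data, new_index = bytearray(), {}
        for name, (offset, length) in self.index.items():
            new_index[name] = (len(new_data), length)
            new_data.extend(self.data[offset:offset + length])
        self.data, self.index, self.deleted_bytes = new_data, new_index, 0
```

In this sketch a delete is cheap because it only removes an index entry, and the space is reclaimed later by compact(). Whether the real design reclaims space this way is an open question until the design document is posted.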

Currently there are many open JIRAs for improving HDFS scalability on the NameNode, such as HDFS-7836 and HDFS-8286.
So small-file metadata (INode and BlocksMap) will still be kept in the NameNode.
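
To give a feel for what keeping this metadata in the NameNode costs, here is a back-of-the-envelope estimate. It assumes the rule of thumb from the blog post linked above, about 150 bytes of NameNode memory per file, directory, or block object. That figure is an approximation, and the function name is made up for illustration.

```python
# Rule of thumb from the cited blog post: roughly 150 bytes of NameNode
# memory per namespace object (file, directory, or block). Approximate only.
BYTES_PER_OBJECT = 150


def namenode_heap_bytes(num_files, blocks_per_file=1):
    """Estimate NameNode memory for num_files files (hypothetical helper).

    Each file contributes one file object plus one object per block.
    Directories are ignored to keep the estimate simple.
    """
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    estimate = namenode_heap_bytes(10_000_000)
    print(f"{estimate / 1e9:.1f} GB")  # 10 million single-block files
```

With these assumptions, 10 million single-block files come to about 3 GB, which matches the figure given in the blog post.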

A design document will be uploaded soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)