Posted to hdfs-dev@hadoop.apache.org by "Yong Zhang (JIRA)" <ji...@apache.org> on 2015/09/01 10:29:46 UTC
[jira] [Created] (HDFS-8998) Small files storage supported inside HDFS
Yong Zhang created HDFS-8998:
--------------------------------
Summary: Small files storage supported inside HDFS
Key: HDFS-8998
URL: https://issues.apache.org/jira/browse/HDFS-8998
Project: Hadoop HDFS
Issue Type: New Feature
Reporter: Yong Zhang
Assignee: Yong Zhang
HDFS has problems storing small files, as described in this blog post (http://blog.cloudera.com/blog/2009/02/the-small-files-problem).
The post also describes several ways to store small files in HDFS, but none of them is a good general solution; HAR files and SequenceFiles seem better suited to read-only files.
Currently each HDFS block belongs to exactly one HDFS file, so a large number of small files produces a large number of small blocks on the DataNodes, which puts a heavy load on them.
This JIRA will describe how to merge small blocks into larger ones online, how to delete small files, and so on.
Currently there are many open JIRAs for improving HDFS scalability on the NameNode, such as HDFS-7836 and HDFS-8286.
Since that work is already under way, small-file metadata (INode and BlocksMap entries) will remain in the NameNode under this proposal.
A design document will be uploaded soon.
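
To put the block-count concern above in rough numbers, here is a minimal back-of-the-envelope sketch. It is not taken from the design document (which had not been uploaded when this was posted), and the workload figures (10 million files of 1 MB each) are purely hypothetical. It assumes a 128 MB block size and ignores any index or padding overhead that a real merge scheme would add.

```python
MB = 1024 * 1024


def blocks_one_per_file(num_files: int) -> int:
    """Today: every file smaller than one block still occupies its own block."""
    return num_files


def blocks_if_merged(num_files: int, file_size: int, block_size: int) -> int:
    """Idealized merge: small files packed back to back into full-size blocks.

    Ceiling division; ignores per-file index or padding overhead (assumption).
    """
    total_bytes = num_files * file_size
    return -(-total_bytes // block_size)


# Hypothetical workload, chosen only for illustration.
NUM_FILES = 10_000_000
FILE_SIZE = 1 * MB
BLOCK_SIZE = 128 * MB  # assumed block size

separate = blocks_one_per_file(NUM_FILES)
merged = blocks_if_merged(NUM_FILES, FILE_SIZE, BLOCK_SIZE)
print(f"one block per file: {separate} blocks")
print(f"merged into full blocks: {merged} blocks")
```

Under these assumptions the DataNodes would track 128 times fewer blocks after merging; the actual savings depend on the file-size distribution and on how the eventual design lays out merged blocks.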
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)