Posted to common-dev@hadoop.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/03/25 21:04:21 UTC
[jira] Commented: (HADOOP-50) dfs datanode should store blocks in multiple directories
[ http://issues.apache.org/jira/browse/HADOOP-50?page=comments#action_12371869 ]
Andrzej Bialecki commented on HADOOP-50:
-----------------------------------------
I think this is a valid concern. Most filesystems perform poorly with thousands of files in a single directory; my recent tests on ext3 show that listing a data directory containing 50,000 blocks takes several seconds.
FSDataset:80 contains a commented-out section that seems to address this issue. Does anyone know why it isn't used?
> dfs datanode should store blocks in multiple directories
> --------------------------------------------------------
>
> Key: HADOOP-50
> URL: http://issues.apache.org/jira/browse/HADOOP-50
> Project: Hadoop
> Type: Bug
> Components: dfs
> Versions: 0.2
> Reporter: Doug Cutting
> Assignee: Mike Cafarella
> Fix For: 0.2
>
> The datanode currently stores all file blocks in a single directory. With 32MB blocks and terabyte filesystems, this will create too many files in a single directory for many filesystems. Thus blocks should be stored in multiple directories, perhaps even a shallow hierarchy.
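A minimal sketch of the kind of shallow hierarchy the issue describes: hashing a block ID into a two-level directory path so no single directory accumulates too many block files. This is illustrative only, not the actual FSDataset code; the class and method names and the fan-out constant are assumptions.

```java
// Hypothetical sketch (not Hadoop's actual implementation): derive a
// shallow two-level subdirectory for a block from its numeric ID, so
// block files spread across at most DIRS_PER_LEVEL^2 directories.
public class BlockPlacement {
    // Assumed fan-out per level; a real value would be tuned for the
    // target filesystem's per-directory performance.
    static final int DIRS_PER_LEVEL = 64;

    /** Map a block ID to a relative path like "subdir12/subdir37". */
    public static String blockDir(long blockId) {
        // Use two different byte ranges of the ID so nearby IDs
        // still spread across both directory levels.
        int level1 = (int) ((blockId >>> 8) & 0xFF) % DIRS_PER_LEVEL;
        int level2 = (int) (blockId & 0xFF) % DIRS_PER_LEVEL;
        return "subdir" + level1 + "/subdir" + level2;
    }

    public static void main(String[] args) {
        System.out.println(blockDir(123456789L));
    }
}
```

With 32MB blocks on a terabyte volume (roughly 30,000+ blocks per datanode), a 64x64 layout keeps each leaf directory to a handful of files, which avoids the slow-listing behavior observed on ext3.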
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira