Posted to hdfs-dev@hadoop.apache.org by "Matt Foley (JIRA)" <ji...@apache.org> on 2010/10/08 00:41:32 UTC
[jira] Created: (HDFS-1446) Refactor the start-time Directory Tree
and Replicas Map constructors to share data and run volume-parallel
Refactor the start-time Directory Tree and Replicas Map constructors to share data and run volume-parallel
----------------------------------------------------------------------------------------------------------
Key: HDFS-1446
URL: https://issues.apache.org/jira/browse/HDFS-1446
Project: Hadoop HDFS
Issue Type: Sub-task
Components: data-node
Affects Versions: 0.20.2
Reporter: Matt Foley
Assignee: Matt Foley
Fix For: 0.22.0
Refactor the FSDir() and getVolumeMap() call chains in FSDataset so that they share data and run volume-parallel. Currently, the two constructors for the in-memory directory tree and the replicas map run THREE full scans of the entire disk: once in FSDir(), once in recoverTempUnlinkedBlock(), and once in addToReplicasMap(). During each scan, a new File object is created for each of the 100,000 or so items in the native file system (for a 50,000-block node). This adds GC pressure as well as disk traffic.
This work item is one of four sub-tasks for HDFS-1443, Improve Datanode startup time.
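The idea described above can be illustrated with a minimal sketch, which is not the FSDataset code and not the eventual patch. It assumes that block files are named blk_<id>, that each volume is an independent directory root, and that a single listFiles() pass per directory can feed both the directory tree and the replicas map. All class and method names here (VolumeScanSketch, DirNode, scanVolume, scanAll) are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one scan per volume feeds both data structures,
// and the volumes are scanned in parallel.
class VolumeScanSketch {

    /** In-memory directory tree node (stand-in for what FSDir builds). */
    static final class DirNode {
        final File dir;
        final List<DirNode> children = new ArrayList<>();
        int numBlocks; // block files found directly in this directory
        DirNode(File dir) { this.dir = dir; }
    }

    /** Returns the block id for names like blk_123, or null for anything else. */
    static Long parseBlockId(String name) {
        if (!name.matches("blk_-?\\d+")) {
            return null; // e.g. blk_123_1001.meta is not a block file
        }
        try {
            return Long.parseLong(name.substring("blk_".length()));
        } catch (NumberFormatException e) {
            return null; // digits out of long range
        }
    }

    /** Walks one volume once, filling the tree and the shared replicas map together. */
    static DirNode scanVolume(File dir, Map<Long, File> replicas) {
        DirNode node = new DirNode(dir);
        File[] entries = dir.listFiles(); // the only listing of this directory
        if (entries == null) {
            return node; // unreadable, or not a directory
        }
        for (File f : entries) {
            if (f.isDirectory()) {
                node.children.add(scanVolume(f, replicas));
            } else {
                Long id = parseBlockId(f.getName());
                if (id != null) {
                    replicas.put(id, f); // reuse the File object from this scan
                    node.numBlocks++;
                }
            }
        }
        return node;
    }

    /** Scans every volume on its own thread; the replicas map must be thread-safe. */
    static Map<File, DirNode> scanAll(List<File> volumes, ConcurrentMap<Long, File> replicas)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, volumes.size()));
        try {
            Map<File, Future<DirNode>> pending = new LinkedHashMap<>();
            for (File v : volumes) {
                Callable<DirNode> task = () -> scanVolume(v, replicas);
                pending.put(v, pool.submit(task));
            }
            Map<File, DirNode> trees = new LinkedHashMap<>();
            for (Map.Entry<File, Future<DirNode>> e : pending.entrySet()) {
                trees.put(e.getKey(), e.getValue().get()); // wait for each volume
            }
            return trees;
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch each directory is listed once and each File object is created once, then shared between the tree and the map. One thread per volume is only one possible choice; it relies on the assumption that volumes sit on separate physical disks, so their scans do not compete for the same spindle.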