Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2006/11/06 20:41:38 UTC
[jira] Commented: (HADOOP-334) Redesign the dfs namespace datastructures to be copy on write

    [ http://issues.apache.org/jira/browse/HADOOP-334?page=comments#action_12447513 ] 
            
dhruba borthakur commented on HADOOP-334:
-----------------------------------------

I was thinking of the different ways to checkpoint the fsimage.

The simplest, most naive way is to have the checkpointing process acquire the global lock, flush the entire contents to persistent store, and then release the lock. This blocks all currently running operations for the duration of the flush and is sub-optimal.

Another simple way also uses a global lock. At the start of the checkpointing process, we acquire the lock, make a clone (i.e. a copy) of the entire FSImage in memory, and then release the lock. The clone is then lazily written to persistent store. The disadvantage of this approach is that we need enough memory to hold a second copy of the entire FSImage.
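The clone-then-release approach might look roughly like the following. This is a minimal sketch, assuming a flat map from path to file length stands in for the real FSImage; `CloningNamespace`, `put`, `snapshot` and `checkpoint` are hypothetical names, not Hadoop APIs.

```java
import java.io.PrintWriter;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the namespace; not the actual Hadoop classes.
final class CloningNamespace {
    private final ReentrantLock globalLock = new ReentrantLock();
    private final Map<String, Long> image = new TreeMap<>();

    // Every modification holds the global lock.
    void put(String path, long length) {
        globalLock.lock();
        try {
            image.put(path, length);
        } finally {
            globalLock.unlock();
        }
    }

    // The lock is held only while the in-memory copy is made.
    Map<String, Long> snapshot() {
        globalLock.lock();
        try {
            return new TreeMap<>(image);
        } finally {
            globalLock.unlock();
        }
    }

    // The copy is written out with no lock held, so writers are not blocked.
    void checkpoint(PrintWriter out) {
        Map<String, Long> copy = snapshot();
        for (Map.Entry<String, Long> e : copy.entrySet()) {
            out.print(e.getKey() + "\t" + e.getValue() + "\n");
        }
        out.flush();
    }
}
```

The memory cost is visible in `snapshot()`: the whole map is duplicated on every checkpoint, regardless of how little changes while the copy is being written.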

If we adopt a copy-on-write approach, we might get an optimal solution: one that neither requires loads of memory nor impacts concurrent transactions.

Each node in the tree has a clone member, which is initially set to null. There is a single global lock associated with the entire tree. No other fine-grained locks are necessary.

Every node modification request holds the global lock (just like in the current code). Suppose a node modification request finds a node that is not yet cloned. It clones the node and starts using the clone data. If the node is already cloned, it uses the clone data and ignores the original data of that node.

The checkpointing thread traverses the tree and starts writing nodes to persistent store. No locks are held during this flush to persistent store. The checkpointing thread does not mess around with any clone data (if present). At the end of the checkpointing process, the global lock is reacquired, all data from the clones is copied back to the original nodes, the clone members are set back to null and the global lock is released.

In this approach, we keep the simplicity associated with a single global lock while optimising on memory usage.
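The copy-on-write scheme described above could be sketched as follows. This is an illustration under stated assumptions, not the actual Hadoop code: `NodeData`, `CowNode`, `CowTree` and all method names are hypothetical, each node carries a single string value plus a child list, and only one checkpointing thread is assumed to run at a time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Immutable per-node payload; a hypothetical stand-in for real inode fields.
final class NodeData {
    final String value;
    final List<CowNode> children;
    NodeData(String value, List<CowNode> children) {
        this.value = value;
        this.children = Collections.unmodifiableList(new ArrayList<>(children));
    }
}

final class CowNode {
    volatile NodeData original; // the only state the checkpointer reads
    NodeData clone;             // null until the node is first modified
    CowNode(String value) {
        original = new NodeData(value, Collections.<CowNode>emptyList());
    }
    // Modifiers and readers prefer the clone and ignore the original if cloned.
    NodeData current() { return clone != null ? clone : original; }
}

final class CowTree {
    private final ReentrantLock globalLock = new ReentrantLock();
    final CowNode root = new CowNode("/");

    // Modifications hold the global lock and only ever write to the clone.
    void setValue(CowNode n, String v) {
        globalLock.lock();
        try {
            n.clone = new NodeData(v, n.current().children);
        } finally {
            globalLock.unlock();
        }
    }

    CowNode addChild(CowNode parent, String v) {
        globalLock.lock();
        try {
            CowNode child = new CowNode(v);
            List<CowNode> kids = new ArrayList<>(parent.current().children);
            kids.add(child);
            parent.clone = new NodeData(parent.current().value, kids);
            return child;
        } finally {
            globalLock.unlock();
        }
    }

    String read(CowNode n) {
        globalLock.lock();
        try {
            return n.current().value;
        } finally {
            globalLock.unlock();
        }
    }

    // Checkpoint traversal: no lock held, clone data is never touched.
    void flushOriginals(CowNode n, List<String> out) {
        NodeData d = n.original;
        out.add(d.value);
        for (CowNode c : d.children) flushOriginals(c, out);
    }

    // End of checkpoint: copy clones back to originals and null the clones.
    void mergeClones() {
        globalLock.lock();
        try {
            merge(root);
        } finally {
            globalLock.unlock();
        }
    }

    private void merge(CowNode n) {
        if (n.clone != null) { n.original = n.clone; n.clone = null; }
        for (CowNode c : n.original.children) merge(c);
    }
}
```

A checkpoint would call `flushOriginals(root, out)` followed by `mergeClones()`. Because the sketch follows the description literally, the flush sees only state that an earlier `mergeClones()` has copied back; extra memory is needed only for nodes modified since then.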

> Redesign the dfs namespace datastructures to be copy on write
> -------------------------------------------------------------
>
>                 Key: HADOOP-334
>                 URL: http://issues.apache.org/jira/browse/HADOOP-334
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.4.0
>            Reporter: Owen O'Malley
>         Assigned To: Konstantin Shvachko
>
> The namespace datastructures should be copy on write so that the namespace does not need to be completely locked down from user changes while the checkpoint is being made.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira