Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2009/01/13 00:41:59 UTC

[jira] Created: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Separate block/replica management code from FSNamesystem
--------------------------------------------------------

                 Key: HADOOP-5015
                 URL: https://issues.apache.org/jira/browse/HADOOP-5015
             Project: Hadoop Core
          Issue Type: Improvement
          Components: dfs
            Reporter: Hairong Kuang
             Fix For: 0.21.0


Currently, FSNamesystem contains a large amount of code that manages blocks and replicas. This code is scattered throughout FSNamesystem, which makes it hard to read and maintain. It would be nice to move it into a separate class called, for example, BlockManager.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706649#action_12706649 ] 

Raghu Angadi edited comment on HADOOP-5015 at 5/6/09 4:01 PM:
--------------------------------------------------------------

> To make code review simpler, I have also retained the structure of the code moved from FSNamesystem.java as it is in BlockManager.java.

There are a lot of formatting changes that break existing patches for HDFS.

For example, take this segment (ideally it should have been a clean cut-n-paste):

{noformat}
     synchronized (neededReplications) {
-      out.println("Metasave: Blocks waiting for replication: " +
-                  neededReplications.size());
+      out.println("Metasave: Blocks waiting for replication: "
+          + neededReplications.size());
       for (Block block : neededReplications) {
-        List<DatanodeDescriptor> containingNodes =
-                                          new ArrayList<DatanodeDescriptor>();
+        List<DatanodeDescriptor> containingNodes = new ArrayList<DatanodeDescriptor>();
         NumberReplicas numReplicas = new NumberReplicas();
         // source node returned is not used
         chooseSourceDatanode(block, containingNodes, numReplicas);
-        int usableReplicas = numReplicas.liveReplicas() +
-                             numReplicas.decommissionedReplicas();
+        int usableReplicas = numReplicas.liveReplicas()
+            + numReplicas.decommissionedReplicas();
         // l: == live:, d: == decommissioned c: == corrupt e: == excess
-        out.print(block + " (replicas:" +
-                  " l: " + numReplicas.liveReplicas() +
-                  " d: " + numReplicas.decommissionedReplicas() +
-                  " c: " + numReplicas.corruptReplicas() +
-                  " e: " + numReplicas.excessReplicas() +
-                  ((usableReplicas > 0)? "" : " MISSING") + ")");
+        out.print(block + " (replicas:" + " l: " + numReplicas.liveReplicas()
+            + " d: " + numReplicas.decommissionedReplicas() + " c: "
+            + numReplicas.corruptReplicas() + " e: "
+            + numReplicas.excessReplicas()
+            + ((usableReplicas > 0) ? "" : " MISSING") + ")");
 
-        for (Iterator<DatanodeDescriptor> jt = blocksMap.nodeIterator(block);
-             jt.hasNext();) {
+        for (Iterator<DatanodeDescriptor> jt = blocksMap.nodeIterator(block); jt
+            .hasNext();) {
           DatanodeDescriptor node = jt.next();
           out.print(" " + node + " : ");
         }
         out.println("");
{noformat}

Without such changes, it would have been much simpler to port the patches (just by changing the file name in the patch file).
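
To illustrate the kind of mechanical port meant here, below is a minimal Java sketch that retargets a unified diff at a renamed file. It assumes the moved code had been left byte-for-byte identical, so only the file headers need to change. The class name PatchRetargeter and the shortened paths used with it are hypothetical, and real patches may carry other headers (for example Index: lines) that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: retargets a unified diff at a renamed file. */
class PatchRetargeter {

  /**
   * Rewrites the file name on "--- " and "+++ " header lines only, leaving
   * hunk bodies untouched. Limitation: a removed line whose own text starts
   * with "-- " (or an added line starting with "++ ") looks like a header
   * to this simple check and would be rewritten as well.
   */
  static List<String> retarget(List<String> patchLines, String oldName, String newName) {
    List<String> out = new ArrayList<String>();
    for (String line : patchLines) {
      boolean isHeader = line.startsWith("--- ") || line.startsWith("+++ ");
      out.add(isHeader ? line.replace(oldName, newName) : line);
    }
    return out;
  }
}
```

Calling PatchRetargeter.retarget(lines, "FSNamesystem.java", "BlockManager.java") would then be enough to port a pending patch, but only if the hunks still match the moved code exactly, which the reformatting above prevents.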



  



[jira] Updated: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-5015:
-------------------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I have committed this.  Thanks, Suresh!




[jira] Commented: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706649#action_12706649 ] 

Raghu Angadi commented on HADOOP-5015:
--------------------------------------

> To make code review simpler, I have also retained the structure of the code moved from FSNamesystem.java as it is in BlockManager.java.

There are a lot of formatting changes that break existing patches for HDFS.
For example, take this segment (ideally it should have been a clean cut-n-paste):

{noformat}
     synchronized (neededReplications) {
-      out.println("Metasave: Blocks waiting for replication: " +
-                  neededReplications.size());
+      out.println("Metasave: Blocks waiting for replication: "
+          + neededReplications.size());
       for (Block block : neededReplications) {
-        List<DatanodeDescriptor> containingNodes =
-                                          new ArrayList<DatanodeDescriptor>();
+        List<DatanodeDescriptor> containingNodes = new ArrayList<DatanodeDescriptor>();
         NumberReplicas numReplicas = new NumberReplicas();
         // source node returned is not used
         chooseSourceDatanode(block, containingNodes, numReplicas);
-        int usableReplicas = numReplicas.liveReplicas() +
-                             numReplicas.decommissionedReplicas();
+        int usableReplicas = numReplicas.liveReplicas()
+            + numReplicas.decommissionedReplicas();
         // l: == live:, d: == decommissioned c: == corrupt e: == excess
-        out.print(block + " (replicas:" +
-                  " l: " + numReplicas.liveReplicas() +
-                  " d: " + numReplicas.decommissionedReplicas() +
-                  " c: " + numReplicas.corruptReplicas() +
-                  " e: " + numReplicas.excessReplicas() +
-                  ((usableReplicas > 0)? "" : " MISSING") + ")");
+        out.print(block + " (replicas:" + " l: " + numReplicas.liveReplicas()
+            + " d: " + numReplicas.decommissionedReplicas() + " c: "
+            + numReplicas.corruptReplicas() + " e: "
+            + numReplicas.excessReplicas()
+            + ((usableReplicas > 0) ? "" : " MISSING") + ")");
 
-        for (Iterator<DatanodeDescriptor> jt = blocksMap.nodeIterator(block);
-             jt.hasNext();) {
+        for (Iterator<DatanodeDescriptor> jt = blocksMap.nodeIterator(block); jt
+            .hasNext();) {
           DatanodeDescriptor node = jt.next();
           out.print(" " + node + " : ");
         }
         out.println("");
{noformat}

Without such changes, it would have been much simpler to port the patches (just by changing the file name in the patch file).





[jira] Commented: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705330#action_12705330 ] 

Jeff Hammerbacher commented on HADOOP-5015:
-------------------------------------------

Hey,

Could someone with a deeper knowledge of HDFS internals comment on what work would need to be done after this refactoring to complete https://issues.apache.org/jira/browse/HADOOP-3799? Any insight here would be much appreciated!

Later,
Jeff




[jira] Commented: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663156#action_12663156 ] 

dhruba borthakur commented on HADOOP-5015:
------------------------------------------

This could be the first step toward running the BlockManager separately from the namespace manager! That would be pretty cool!




[jira] Commented: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663162#action_12663162 ] 

Hong Tang commented on HADOOP-5015:
-----------------------------------

My wish list:
- Blocks are addressed through numeric IDs (160-bit or even longer, ID never recycled).
- Inode Block (a block that contains the Block IDs of a file's data blocks.)
- NN maintains mapping of path => Inode Block ID.
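
A rough Java sketch of how these three wish-list items could fit together, purely as an illustration. The names InodeBlockSketch, createFile and dataBlocksOf are made up here, and a sequential counter is only one assumed way of making sure an ID is never recycled.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the wish list: wide block IDs, inode blocks, path => inode block ID. */
class InodeBlockSketch {
  static final int ID_BITS = 160;

  // Sequential counter: an ID that has been handed out is never handed out again.
  private BigInteger nextId = BigInteger.ONE;

  // What the NN would keep: path => ID of the file's inode block.
  private final Map<String, BigInteger> pathToInodeBlock = new HashMap<String, BigInteger>();

  // Contents of each inode block: the IDs of the file's data blocks.
  private final Map<BigInteger, List<BigInteger>> inodeBlockContents =
      new HashMap<BigInteger, List<BigInteger>>();

  private BigInteger allocateId() {
    if (nextId.bitLength() > ID_BITS) {
      throw new IllegalStateException("block ID space exhausted");
    }
    BigInteger id = nextId;
    nextId = nextId.add(BigInteger.ONE);
    return id;
  }

  /** Creates an inode block plus numDataBlocks data block IDs for the given path. */
  BigInteger createFile(String path, int numDataBlocks) {
    BigInteger inodeBlockId = allocateId();
    List<BigInteger> dataBlockIds = new ArrayList<BigInteger>();
    for (int i = 0; i < numDataBlocks; i++) {
      dataBlockIds.add(allocateId());
    }
    inodeBlockContents.put(inodeBlockId, dataBlockIds);
    pathToInodeBlock.put(path, inodeBlockId);
    return inodeBlockId;
  }

  /** Resolves path => inode block => data block IDs; empty if the path is unknown. */
  List<BigInteger> dataBlocksOf(String path) {
    BigInteger inodeBlockId = pathToInodeBlock.get(path);
    if (inodeBlockId == null) {
      return Collections.emptyList();
    }
    return inodeBlockContents.get(inodeBlockId);
  }
}
```

In this sketch the only per-path state is a single ID, and the list of data blocks lives in the inode block itself.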





[jira] Commented: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705225#action_12705225 ] 

dhruba borthakur commented on HADOOP-5015:
------------------------------------------

This refactoring looks great!




[jira] Updated: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HADOOP-5015:
------------------------------------

    Hadoop Flags: [Reviewed]
          Status: Patch Available  (was: Open)




[jira] Updated: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-5015:
-------------------------------------------


+1 patch looks good.

I will commit this later today.




[jira] Commented: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706243#action_12706243 ] 

Hadoop QA commented on HADOOP-5015:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12407169/blkmanager.patch
  against trunk revision 771661.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 18 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/291/console

This message is automatically generated.




[jira] Commented: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705359#action_12705359 ] 

dhruba borthakur commented on HADOOP-5015:
------------------------------------------

This patch is not related to HADOOP-3799. This patch attempts to separate out the block manager code from the namespace manager code whereas HADOOP-3799 makes the replica-placement pluggable.




[jira] Commented: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706879#action_12706879 ] 

Hudson commented on HADOOP-5015:
--------------------------------

Integrated in Hadoop-trunk #828 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/828/])
    . Separate block management code from FSNamesystem.  Contributed by Suresh Srinivas





[jira] Assigned: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas reassigned HADOOP-5015:
---------------------------------------

    Assignee: Suresh Srinivas




[jira] Updated: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HADOOP-5015:
------------------------------------

    Attachment: blkmanager.patch

Incorporated all the comments in the attached patch, except the following:
# Not moving {{generationStamp}} to {{BlockManager}}, as it is used in {{FSNamesystem}} and updated during file creation
# {{blockInvalidateLimit}} can be renamed in a separate change
# {{replicationRecheckInterval}} should be moved into {{ReplicationMonitor}} in a separate change





[jira] Commented: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705205#action_12705205 ] 

Konstantin Shvachko commented on HADOOP-5015:
---------------------------------------------

This is a nice refactoring. A few comments:
# FSDirectory should have a method {{getBlockManager()}}. This will let you avoid unnecessary transient methods in {{FSNamesystem}} that only exist to call the respective {{BlockManager}} methods.
# {{BlocksWithLocations getBlocks()}} should remain in {{FSNamesystem}}. I think {{BlockManager}} should not be exposed to external types like {{BlocksWithLocations}}. Besides, {{getBlocks()}} does not work with any intrinsic fields of {{BlockManager}}.
# It seems to me that {{ReplicationTargetChooser replicator}} should also be moved into {{BlockManager}}.
# I am not sure about the following fields, which may be related to block management as well, but the patch left them inside {{FSNamesystem}}:
#- {{generationStamp}} seems to be related.
#- {{blockInvalidateLimit}} looks unrelated; we may later want to rename it to {{datanodeInvalidateLimit}}.
#- {{ReplicationMonitor}}, along with {{replicationRecheckInterval}}, is probably related.
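
The first point can be sketched roughly as follows. This is only an illustration of the accessor idea, not the code in the patch: the three Stub classes and the methods removeBlock, queuedForRemoval and deleteFile are hypothetical stand-ins for {{BlockManager}}, {{FSNamesystem}} and {{FSDirectory}}.

```java
/** Hypothetical stand-in for the new block-management class. */
class BlockManagerStub {
  private int blocksQueuedForRemoval;

  void removeBlock(long blockId) {
    blocksQueuedForRemoval++;
  }

  int queuedForRemoval() {
    return blocksQueuedForRemoval;
  }
}

/** Hypothetical stand-in for the namesystem, which owns the block manager. */
class NamesystemStub {
  final BlockManagerStub blockManager = new BlockManagerStub();
  // No removeBlock(...) pass-through method is needed here.
}

/** Hypothetical stand-in for the directory, which reaches the block manager directly. */
class DirectoryStub {
  private final NamesystemStub namesystem;

  DirectoryStub(NamesystemStub namesystem) {
    this.namesystem = namesystem;
  }

  /** The accessor suggested in the first comment. */
  BlockManagerStub getBlockManager() {
    return namesystem.blockManager;
  }

  void deleteFile(long[] blockIds) {
    for (long blockId : blockIds) {
      // Direct call; otherwise the namesystem would need a forwarding method.
      getBlockManager().removeBlock(blockId);
    }
  }
}
```

With the accessor in place, each block-management operation needed by the directory code costs no extra method on the namesystem side.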




[jira] Updated: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-5015:
----------------------------------------

    Summary: Separate block/replica management code from FSNamesystem  (was: Seprate block/replica management code from FSNamesystem)




[jira] Updated: (HADOOP-5015) Separate block/replica management code from FSNamesystem

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HADOOP-5015:
------------------------------------

    Attachment: blkmanager.patch

As a first step towards separating out block management functionality from FSNamesystem.java, I have introduced a new class BlockManager.java. This new class is to be used only by {{FSNamesystem}}, using the synchronization as it exists today. To make code review simpler, I have also retained the structure of the code moved from FSNamesystem.java as it is in BlockManager.java. After this change, we could have more iterations to organize the code better within BlockManager.java.
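
As a simplified picture of this arrangement (not the code in the attached patch), the sketch below shows a block-management class that does no locking of its own and is only reached through synchronized methods of its owner. The names BlockManagerSketch, NamesystemSketch, addReplica, countReplicas, addStoredBlock and numReplicas are hypothetical, as is the use of plain strings for datanodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the block/replica state moved out of the namesystem. */
class BlockManagerSketch {
  // Not thread-safe on its own: callers are expected to hold the namesystem lock.
  private final Map<Long, List<String>> replicasByBlock = new HashMap<Long, List<String>>();

  void addReplica(long blockId, String datanode) {
    List<String> nodes = replicasByBlock.get(blockId);
    if (nodes == null) {
      nodes = new ArrayList<String>();
      replicasByBlock.put(blockId, nodes);
    }
    if (!nodes.contains(datanode)) {
      nodes.add(datanode);
    }
  }

  int countReplicas(long blockId) {
    List<String> nodes = replicasByBlock.get(blockId);
    return nodes == null ? 0 : nodes.size();
  }
}

/** Hypothetical stand-in for the namesystem: the only user of the block manager. */
class NamesystemSketch {
  private final BlockManagerSketch blockManager = new BlockManagerSketch();

  // Synchronization stays where it was; the block manager just holds the moved code.
  synchronized void addStoredBlock(long blockId, String datanode) {
    blockManager.addReplica(blockId, datanode);
  }

  synchronized int numReplicas(long blockId) {
    return blockManager.countReplicas(blockId);
  }
}
```

Keeping the lock in the owner means the move itself does not change threading behaviour, which is what makes a pure code-motion review possible.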

