Posted to common-dev@hadoop.apache.org by "Lohit Vijayarenu (JIRA)" <ji...@apache.org> on 2008/08/13 19:04:44 UTC

[jira] Created: (HADOOP-3948) Separate Namenodes edits and fsimage

Separate Namenodes edits and fsimage 
-------------------------------------

                 Key: HADOOP-3948
                 URL: https://issues.apache.org/jira/browse/HADOOP-3948
             Project: Hadoop Core
          Issue Type: Improvement
          Components: dfs
            Reporter: Lohit Vijayarenu


NameNode's _edits_ and _fsimage_ should be separated with an option of having them in their own separate directories.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625584#action_12625584 ] 

Konstantin Shvachko commented on HADOOP-3948:
---------------------------------------------

# {{StorageDirType}} does not belong to {{Storage}}. {{Storage}} is a common class for different storages, so name-node specific 
{{FSIMAGE, FSEDITS}} look extraneous in it.
Let us define an interface {{StorageDirType}} in {{Storage}} and then define {{enum NameNodeDirType implements StorageDirType}} in FSImage.
I would also rename the enum fields: {{UNDEFINED, IMAGE, EDITS, IMAGE_AND_EDITS}}
{code}
Storage {
  interface StorageDirType {
    public StorageDirType getStorageDirType();
    public boolean isOfType(StorageDirType type);
  }
}

FSImage {
  static enum NameNodeDirType implements StorageDirType {
    UNDEFINED,
    IMAGE,
    EDITS,
    IMAGE_AND_EDITS;
    public StorageDirType getStorageDirType() {
      return this;
    }
    public boolean isOfType(StorageDirType type) {
      if(type == IMAGE_AND_EDITS && (this == IMAGE || this == EDITS))
        return true;
      return this == type;
    }
  }
}
{code}
Then Storage is able to operate entirely in terms of StorageDirType, and FSImage will pass NameNodeDirType as a value of StorageDirType.
# {{StorageDirectory.root}} should not be public. It is better to introduce {{public File getRoot()}}
# {{EditlogFileOutputStream.getFile()}} JavaDoc comment should say
{code}
* Returns the file associated with this stream
{code}
# In setStorageDirectories() you can loop like this
{code}
for (File dirName : fsNameDirs) {
  boolean isAlsoEdits = false;
...............
}
{code}
# Change parameter name
{code}
void processIOError(File dirName) { .... }
{code}
I am a bit confused that we have so many processIOError() methods both in FSImage and EditsLog. Can something be done about it?
# {{FSImage.incrementCheckpointTime()}} iterates through storage directories and removes those that have problems. 
Removing within iterator should break the iterator. Does it?
# In {{FSImage.loadFSImage()}} both latestSD for image directories and latestEditsDir should be found and checked for consistency (same time) before loading the image.
# And you should be looking for the latest edits dir rather than the one that has the image dir's latestCheckpointTime. In your implementation, if I by mistake specify an old image dir, an old edits dir, and a new edits dir, the cluster will start with no problem and will remove the new edits. We should rather detect this inconsistency and fail to start.
We should have a test case for that too.
# Debug LOG messages in startCheckpoint() should be removed.
# In {{SecondaryNameNode.doMerge()}} you should not scan till the end of the storage directories but rather pick the first one
{code}
sdImage = dirIterator(StorageDirType.FSIMAGE).next();
sdEdits = dirIterator(StorageDirType.FSEDITS).next();
{code}
Checking of course that dirIterator()s do not return empty collections.
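
The consistency check requested in the points on {{FSImage.loadFSImage()}} above can be sketched as follows. This is only a minimal illustration, not the patch's code: {{DirInfo}} and {{CheckpointConsistency}} are hypothetical stand-ins for the real storage classes, and an unchecked exception is used for brevity where the name-node would more likely raise an IOException.

```java
import java.util.List;

/** Hypothetical stand-in for a storage directory; not the Hadoop class. */
final class DirInfo {
  final String path;
  final boolean holdsImage;
  final boolean holdsEdits;
  final long checkpointTime;

  DirInfo(String path, boolean holdsImage, boolean holdsEdits, long checkpointTime) {
    this.path = path;
    this.holdsImage = holdsImage;
    this.holdsEdits = holdsEdits;
    this.checkpointTime = checkpointTime;
  }
}

/** Hypothetical helper: compares the newest image against the newest edits. */
final class CheckpointConsistency {
  /** Returns the newest checkpoint time among image (or edits) dirs, or -1 if none. */
  static long latest(List<DirInfo> dirs, boolean wantImage) {
    long latest = -1L;
    for (DirInfo d : dirs) {
      boolean matches = wantImage ? d.holdsImage : d.holdsEdits;
      if (matches && d.checkpointTime > latest) {
        latest = d.checkpointTime;
      }
    }
    return latest;
  }

  /** Returns the common checkpoint time, or throws if image and edits disagree. */
  static long verify(List<DirInfo> dirs) {
    long image = latest(dirs, true);
    long edits = latest(dirs, false);
    if (image < 0 || edits < 0) {
      throw new IllegalStateException("No image or no edits directory found");
    }
    if (image != edits) {
      // Refuse to start rather than silently discarding the newer edits.
      throw new IllegalStateException("Inconsistent storage: latest image checkpoint "
          + image + ", latest edits checkpoint " + edits);
    }
    return image;
  }
}
```

With an old image dir, an old edits dir and a new edits dir, {{verify}} throws instead of letting startup proceed, which is the behaviour asked for above.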

> Separate Namenodes edits and fsimage 
> -------------------------------------
>
>                 Key: HADOOP-3948
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3948
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Lohit Vijayarenu
>         Attachments: HADOOP-3948.patch, hadoop-core-trunk.patch, hadoop-core-trunk.patch, hadoop-core-trunk.patch
>
>
> NameNode's _edits_ and _fsimage_ should be separated with an option of having them in their own separate directories.



[jira] Commented: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622767#action_12622767 ] 

Runping Qi commented on HADOOP-3948:
------------------------------------


+1.

This way, we can use different disks/volume partitions for them.




[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3948:
-------------------------------------

    Attachment: HADOOP-3948.patch

The attached patch removes FSImage::processIOError(StorageDirectory) and instead uses Iterator::remove(). I have included a test case which tests various config scenarios.
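
For readers unfamiliar with the difference, here is a minimal sketch of removing failed directories through Iterator::remove(). {{StorageDirPruner}} and the {{hasFailed}} predicate are hypothetical names used only for illustration; the patch itself operates on StorageDirectory objects.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper; illustrates the pattern only. */
final class StorageDirPruner {
  /**
   * Removes every element for which hasFailed is true and returns the
   * removed elements. Calling dirs.remove(dir) inside a for-each loop over
   * an ArrayList would usually trigger ConcurrentModificationException on
   * the next step; going through the iterator keeps the traversal valid.
   */
  static <T> List<T> removeFailed(List<T> dirs, Predicate<? super T> hasFailed) {
    List<T> removed = new ArrayList<>();
    for (Iterator<T> it = dirs.iterator(); it.hasNext();) {
      T dir = it.next();
      if (hasFailed.test(dir)) {
        it.remove();        // structural change made via the iterator itself
        removed.add(dir);
      }
    }
    return removed;
  }
}
```

The list passed in must support removal (for example an ArrayList); a fixed-size list such as the one returned by Arrays.asList would throw UnsupportedOperationException.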



[jira] Commented: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623748#action_12623748 ] 

Lohit Vijayarenu commented on HADOOP-3948:
------------------------------------------

_FSEditLog::processIOError(int)_ removes the stream and directory like this
{noformat}
editStreams.remove(index);
fsimage.processIOError(index);
{noformat}

How about changing _fsimage.processIOError_ to accept a File descriptor to take care of cleanup? Since directory handling is done at the FSImage level, EditOutputStream need not worry about it. We could add an additional call to get the File descriptor from EditFileOutputStream.





[jira] Commented: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627135#action_12627135 ] 

Konstantin Shvachko commented on HADOOP-3948:
---------------------------------------------

+1
I agree we should fix it in another jira.

> Separate Namenodes edits and fsimage 
> -------------------------------------
>
>                 Key: HADOOP-3948
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3948
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.19.0
>            Reporter: Lohit Vijayarenu
>            Assignee: Lohit Vijayarenu
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3948.patch, HADOOP-3948.patch, HADOOP-3948.patch, HADOOP-3948.patch, HADOOP-3948.patch, hadoop-core-trunk.patch, hadoop-core-trunk.patch, hadoop-core-trunk.patch
>
>
> NameNode's _edits_ and _fsimage_ should be separated with an option of having them in their own separate directories.



[jira] Commented: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626829#action_12626829 ] 

Konstantin Shvachko commented on HADOOP-3948:
---------------------------------------------

- I think classes StorageDirType and DirIterator need JavaDoc explaining what they are for.
It would be good to indicate that DirIterator can be dirType specific or iterate through all.
- StorageDirectory.setStorageDirType() is not used anywhere. Let's not introduce public methods if they do not have immediate usage.
- In FSEditLog.getEditLogSize() it is better to replace {{idx < getNumStorageDirs()}} by {{idx < editStreams.size()}}
- I see indentation problems; it looks like you have tabs in the new code. You can try setting tab size to 2 and you will see where indentation is not right.
- In case of FSEditLog.rollEditLog() you call it.remove() to remove StorageDirectory in which a file cannot be created, but never incrementCheckpointTime().
Can that be a problem?
- FSImage does need to import StorageDirectory, List and AbstractList.
- FSImage.getFileNames() should return {{List<File>}} rather than {{File []}}. We should file a separate jira for that. Not related to the patch.




[jira] Commented: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627493#action_12627493 ] 

Hudson commented on HADOOP-3948:
--------------------------------

Integrated in Hadoop-trunk #589 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/589/])



[jira] Commented: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627137#action_12627137 ] 

Hadoop QA commented on HADOOP-3948:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12389149/HADOOP-3948.patch
  against trunk revision 690142.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 12 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3145/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3145/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3145/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3145/console

This message is automatically generated.



[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3948:
-------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627116#action_12627116 ] 

Lohit Vijayarenu commented on HADOOP-3948:
------------------------------------------

Passes all test-patch and all tests on my system.
[exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 12 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
[lohit@ hadoop-core-trunk]$ (ant test > test.log ) >& test.log
[lohit@ hadoop-core-trunk]$ grep FAIL test.log 
[lohit@ hadoop-core-trunk]$ grep SUCC test.log 
BUILD SUCCESSFUL
[lohit@ hadoop-core-trunk]$ 



[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3948:
-------------------------------------

    Attachment: hadoop-core-trunk.patch

Initial testing: the namenode starts and works with different image and edits directories. I still need to fix a lot of other test cases.



[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3948:
-------------------------------------

    Attachment: HADOOP-3948.patch

Thanks Konstantin. Updated patch with all your inputs. For
bq. In case of FSEditLog.rollEditLog() you call it.remove() to remove StorageDirectory in which a file cannot be created, but never incrementCheckpointTime().
This was the case in trunk also; we did not increment the checkpoint time. I would like to fix this in another JIRA, HADOOP-4045, and also investigate whether we miss this case in any other place.



[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3948:
-------------------------------------

    Attachment: HADOOP-3948.patch

Updated patch fixing findbugs warnings. Also added a few more checks in the test for the secondary namenode.



[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3948:
----------------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this. Thank you Lohit.



[jira] Commented: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622281#action_12622281 ] 

Lohit Vijayarenu commented on HADOOP-3948:
------------------------------------------

Running a few experiments and looking at logs of past runs shows that there was IO contention during checkpoint. It is clear that _edits_ and _fsimage_ should be separated, and we should have an option to configure them into their own directories. For example, _dfs.name.dir_ could have just the image and _dfs.edits.dir_ could have the edits. A few key points:

- This enables one to use a combination of different directories/drives/mounts
- It also enables the administrator to choose how many copies of edits and fsimage one should maintain
- Splitting this might help for the HA feature
- We should think about how we support upgrade and deal with cases when someone uses the same directory for both _edits_ and _fsimage_ or changes the configuration during restarts
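
A rough sketch of how the two settings might be resolved is shown below. Several things here are assumptions rather than the patch's behaviour: java.util.Properties stands in for the Hadoop configuration object, {{NameDirConfig}} is a hypothetical helper, the edits key uses the example name from this comment (the initial patch elsewhere in this thread calls it dfs.name.edits.dir), and the fallback to the image directories when the edits key is unset follows the default described for that initial patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper; Properties stands in for the Hadoop configuration. */
final class NameDirConfig {
  static final String IMAGE_KEY = "dfs.name.dir";
  // Example name from the comment above; the real key may differ.
  static final String EDITS_KEY = "dfs.edits.dir";

  /** Splits a comma-separated directory list, dropping empty entries. */
  static List<String> parseDirs(String value) {
    List<String> dirs = new ArrayList<>();
    if (value == null) {
      return dirs;
    }
    for (String part : value.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        dirs.add(trimmed);
      }
    }
    return dirs;
  }

  /** Directories that hold the fsimage. */
  static List<String> imageDirs(Properties conf) {
    return parseDirs(conf.getProperty(IMAGE_KEY));
  }

  /** Directories that hold edits; assumed to default to the image directories. */
  static List<String> editsDirs(Properties conf) {
    return parseDirs(conf.getProperty(EDITS_KEY, conf.getProperty(IMAGE_KEY)));
  }
}
```

With only dfs.name.dir set, both lists are the same, so an existing single-directory setup keeps working; setting the edits key moves the edits without touching the image directories.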



[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Viraj Bhat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Bhat updated HADOOP-3948:
-------------------------------

    Attachment: HADOOP-3948.patch

Attached patch incorporating inputs from Konstantin. Also added a test case.
We still need to see how processIOError could be remodeled to use Iterator::remove().



[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3948:
-------------------------------------

    Attachment: hadoop-core-trunk.patch

Another version (untested), which compares directory names to solve the [above|https://issues.apache.org/jira/browse/HADOOP-3948?focusedCommentId=12623748#action_12623748] problem.



[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3948:
-------------------------------------

    Attachment: HADOOP-3948.patch

Updated patch; manually tested different directory combinations. Passes all existing tests. Will add new test cases.



[jira] Assigned: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu reassigned HADOOP-3948:
----------------------------------------

    Assignee: Lohit Vijayarenu



[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3948:
-------------------------------------

    Attachment: hadoop-core-trunk.patch

Attached is an initial version (untested) of the patch. This incorporates ideas from Konstantin, who suggested Iterators to traverse the list of Storage Directories. The patch
- adds a new config variable dfs.name.edits.dir whose default value would be dfs.name.dir. One could have different combinations of directories for both the variables
- adds a new type StorageDirType which identifies if the directory is of type _NORMAL_, _FSEDITS_, _FSIMAGE_, _FSEDITS_FSIMAGE_. This way we can have a directory to store either edits only, images only or both.
- adds an Iterator to go over the list of storage directories. This accepts a StorageDirType as argument and selects only the matching directories.
- changes a few constructors and methods to accept two sets of dirs
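
The type-selective iterator described in the list above could look roughly like the following. This is a sketch under stated assumptions, not the patch's code: {{DirType}}, {{TypedDir}} and {{TypedDirIterator}} are hypothetical names, a null requested type is assumed to mean "iterate through all", and a directory of the combined type is assumed to match a request for either edits or image.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical mirror of the directory types named in the patch description. */
enum DirType {
  NORMAL, FSEDITS, FSIMAGE, FSEDITS_FSIMAGE;

  /** True if a directory of this type should be returned for the wanted type. */
  boolean matches(DirType wanted) {
    if (wanted == null) {
      return true;                       // assumption: null selects every directory
    }
    if (this == FSEDITS_FSIMAGE && (wanted == FSEDITS || wanted == FSIMAGE)) {
      return true;                       // combined directories serve both roles
    }
    return this == wanted;
  }
}

/** Hypothetical stand-in for a storage directory. */
final class TypedDir {
  final String path;
  final DirType type;

  TypedDir(String path, DirType type) {
    this.path = path;
    this.type = type;
  }
}

/** Iterates only over directories matching the wanted type (look-ahead based). */
final class TypedDirIterator implements Iterator<TypedDir> {
  private final Iterator<TypedDir> inner;
  private final DirType wanted;
  private TypedDir next;

  TypedDirIterator(List<TypedDir> dirs, DirType wanted) {
    this.inner = dirs.iterator();
    this.wanted = wanted;
    advance();
  }

  /** Moves to the next matching directory, or sets next to null at the end. */
  private void advance() {
    next = null;
    while (inner.hasNext()) {
      TypedDir candidate = inner.next();
      if (candidate.type.matches(wanted)) {
        next = candidate;
        return;
      }
    }
  }

  @Override
  public boolean hasNext() {
    return next != null;
  }

  @Override
  public TypedDir next() {
    if (next == null) {
      throw new NoSuchElementException();
    }
    TypedDir result = next;
    advance();
    return result;
  }
  // remove() is left at the interface default (unsupported) in this sketch.
}
```

Because the sketch looks ahead by one element, supporting remove() would need extra bookkeeping; the real iterator in the patch is expected to support it, since later comments rely on Iterator::remove().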

I am looking into how to change FSEdits::processIOError(int), which used to accept only an index. In a few places this could be changed to accept a storage directory, but in a few others it is not straightforward.



[jira] Updated: (HADOOP-3948) Separate Namenodes edits and fsimage

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3948:
-------------------------------------

    Affects Version/s: 0.19.0
        Fix Version/s: 0.19.0
