Posted to hdfs-dev@hadoop.apache.org by "Colin Patrick McCabe (Created) (JIRA)" <ji...@apache.org> on 2012/03/05 22:51:57 UTC

[jira] [Created] (HDFS-3049) During the normal loading NN startup process, fall back on a different image or EditLog if we see one that is corrupt

During the normal loading NN startup process, fall back on a different image or EditLog if we see one that is corrupt
---------------------------------------------------------------------------------------------------------------------

Key: HDFS-3049
URL: https://issues.apache.org/jira/browse/HDFS-3049
Project: Hadoop HDFS
Issue Type: New Feature
Components: name-node
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
Fix For: 0.24.0

During the NameNode startup process, we load an image, and then apply edit logs to it until we believe that we have all the latest changes. Unfortunately, if there is an I/O error while reading any of these files, in most cases, we simply abort the startup process. We should try harder to locate a readable edit log and/or image file.

*There are three main use cases for this feature:*
1. If the operating system does not honor fsync (usually due to a misconfiguration), a file may end up in an inconsistent state.
2. In certain older releases where we did not use fallocate() or similar to pre-reserve blocks, a disk full condition may cause a truncated log in one edit directory.
3. There may be a bug in HDFS which results in some of the data directories receiving corrupt data, but not all. This is the least likely use case.

*Proposed changes to normal NN startup*
* We should try a different FSImage if we can't load the first one we try.
* We should examine other FSEditLogs if we can't load the first one(s) we try.
* We should fail if we can't find EditLogs that would bring us up to what we believe is the latest transaction ID.
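
The fallback described in the bullets above could be sketched roughly as follows. This is only an illustration, not the patch itself: `FallbackLoader`, `Loader`, and `loadFirstReadable` are hypothetical names, and the candidates stand in for whatever FSImage or edit log files the NameNode finds in its storage directories.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Sketch only: these names are hypothetical, not HDFS APIs. */
final class FallbackLoader {
    /** Something that can try to load one candidate (an image or an edit log). */
    interface Loader<T> {
        void load(T candidate) throws IOException;
    }

    /**
     * Tries each candidate in order and returns the first one that loads.
     * Throws only after every candidate has failed, keeping each failure
     * as a suppressed exception so the operator can see what went wrong.
     */
    static <T> T loadFirstReadable(List<T> candidates, Loader<T> loader)
            throws IOException {
        List<IOException> failures = new ArrayList<>();
        for (T candidate : candidates) {
            try {
                loader.load(candidate);
                return candidate;
            } catch (IOException e) {
                // Remember the error and fall back to the next candidate.
                failures.add(e);
            }
        }
        IOException all = new IOException(
            "No readable candidate among " + candidates.size());
        for (IOException e : failures) {
            all.addSuppressed(e);
        }
        throw all;
    }
}
```

The point of the sketch is that a single I/O error no longer aborts startup; only running out of candidates does.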

*Proposed changes to recovery mode NN startup*
We should list all the available storage directories and allow the operator to select which one to use.
Something like this:
{code}
Multiple storage directories found.
1. /foo/bar
    edits__current__XYZ    size:213421345    md5:2345345
    image                  size:213421345    md5:2345345
2. /foo/baz
    edits__current__XYZ    size:213421345    md5:2345345345
    image                  size:213421345    md5:2345345
Which one would you like to use? (1/2)
{code}
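
A minimal sketch of how such a prompt might be rendered and the operator's answer validated. `StorageDirPrompt`, `render`, and `parseChoice` are hypothetical names; the per-file size and md5 lines from the mock-up are left out to keep the example short.

```java
import java.util.List;

/** Sketch only: these names are hypothetical, not HDFS APIs. */
final class StorageDirPrompt {
    /** Renders the numbered menu for the given storage directory paths. */
    static String render(List<String> dirs) {
        StringBuilder sb = new StringBuilder("Multiple storage directories found.\n");
        StringBuilder options = new StringBuilder();
        for (int i = 0; i < dirs.size(); i++) {
            sb.append(i + 1).append(". ").append(dirs.get(i)).append('\n');
            if (i > 0) {
                options.append('/');
            }
            options.append(i + 1);
        }
        sb.append("Which one would you like to use? (").append(options).append(")");
        return sb.toString();
    }

    /**
     * Maps the operator's answer to a directory, or returns null if the
     * answer is not a valid menu number, so the caller can ask again.
     */
    static String parseChoice(String answer, List<String> dirs) {
        try {
            int n = Integer.parseInt(answer.trim());
            return (n >= 1 && n <= dirs.size()) ? dirs.get(n - 1) : null;
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Returning null for a bad answer, rather than throwing, lets the caller keep re-prompting.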

As usual in recovery mode, we want to be flexible about error handling. In this case, this means that we should NOT fail if we can't find EditLogs that would bring us up to what we believe is the latest transaction ID.
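
The difference between the two modes could be sketched like this, assuming edit log segments are described by inclusive transaction ID ranges and arrive sorted by their first transaction ID. `EditLogCoverage`, `Segment`, and `lastReachableTxId` are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.util.List;

/** Sketch only: these names are hypothetical, not HDFS classes. */
final class EditLogCoverage {
    /** A readable edit log segment covering firstTxId..lastTxId inclusive. */
    static final class Segment {
        final long firstTxId;
        final long lastTxId;

        Segment(long firstTxId, long lastTxId) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
        }
    }

    /**
     * Walks the segments (sorted by firstTxId), starting just after the last
     * transaction in the image, and returns the highest transaction ID that
     * can be reached without a gap. In normal mode a shortfall is an error;
     * in recovery mode we return whatever we could reach.
     */
    static long lastReachableTxId(long imageTxId, List<Segment> sorted,
                                  long expectedLastTxId, boolean recoveryMode)
            throws IOException {
        long reached = imageTxId;
        for (Segment s : sorted) {
            if (s.firstTxId > reached + 1) {
                break; // gap: nothing after this point can be applied
            }
            reached = Math.max(reached, s.lastTxId);
        }
        if (reached < expectedLastTxId && !recoveryMode) {
            throw new IOException("Edit logs end at txid " + reached
                + " but expected " + expectedLastTxId);
        }
        return reached;
    }
}
```

In recovery mode the caller would then decide, possibly with the operator's input, whether to continue from the returned transaction ID.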

*Not addressed by this feature*
This feature will not address the case where an attempt to access the NameNode name directory or directories hangs because of an I/O error. This may happen, for example, when trying to load an image from a hard-mounted NFS directory, when the NFS server has gone away. Just as now, the operator will have to notice this problem and take steps to correct it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (HDFS-3049) During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HDFS-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved HDFS-3049.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 2.0.3-alpha

Fixed the extra imports and committed to branch-2. Thanks for the reviews.
                
> During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3049
>                 URL: https://issues.apache.org/jira/browse/HDFS-3049
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: namenode
>    Affects Versions: 0.23.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>             Fix For: 3.0.0, 2.0.3-alpha
>
>         Attachments: HDFS-3049.001.patch, HDFS-3049.002.patch, HDFS-3049.003.patch, HDFS-3049.005.against3335.patch, HDFS-3049.006.against3335.patch, HDFS-3049.007.against3335.patch, HDFS-3049.010.patch, HDFS-3049.011.patch, HDFS-3049.012.patch, HDFS-3049.013.patch, HDFS-3049.015.patch, HDFS-3049.017.patch, HDFS-3049.018.patch, HDFS-3049.021.patch, HDFS-3049.023.patch, HDFS-3049.025.patch, HDFS-3049.026.patch, HDFS-3049.027.patch, HDFS-3049.028.patch, HDFS-3049.028.patch, HDFS-3049.028.patch, hdfs-3049-branch-2.txt


[jira] [Reopened] (HDFS-3049) During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HDFS-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon reopened HDFS-3049:
-------------------------------


Reopening for commit to branch-2 (this code is needed for QJM support).
                
> During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3049
>                 URL: https://issues.apache.org/jira/browse/HDFS-3049
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: namenode
>    Affects Versions: 0.23.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: HDFS-3049.001.patch, HDFS-3049.002.patch, HDFS-3049.003.patch, HDFS-3049.005.against3335.patch, HDFS-3049.006.against3335.patch, HDFS-3049.007.against3335.patch, HDFS-3049.010.patch, HDFS-3049.011.patch, HDFS-3049.012.patch, HDFS-3049.013.patch, HDFS-3049.015.patch, HDFS-3049.017.patch, HDFS-3049.018.patch, HDFS-3049.021.patch, HDFS-3049.023.patch, HDFS-3049.025.patch, HDFS-3049.026.patch, HDFS-3049.027.patch, HDFS-3049.028.patch, HDFS-3049.028.patch, HDFS-3049.028.patch, hdfs-3049-branch-2.txt


[jira] [Reopened] (HDFS-3049) During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HDFS-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy reopened HDFS-3049:
---------------------------------


Looks like this patch made an incompatible change and broke the MR (non-maven) tests; e.g., TestMapredGroupMappingServiceRefresh doesn't compile.

In any case, we *should* not remove public constructors from the mini clusters, since we don't know how that affects downstream consumers.

I'll revert this for now.
                
> During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3049
>                 URL: https://issues.apache.org/jira/browse/HDFS-3049
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: HDFS-3049.001.patch, HDFS-3049.002.patch, HDFS-3049.003.patch, HDFS-3049.005.against3335.patch, HDFS-3049.006.against3335.patch, HDFS-3049.007.against3335.patch, HDFS-3049.010.patch, HDFS-3049.011.patch, HDFS-3049.012.patch, HDFS-3049.013.patch, HDFS-3049.015.patch, HDFS-3049.017.patch, HDFS-3049.018.patch, HDFS-3049.021.patch, HDFS-3049.023.patch, HDFS-3049.025.patch, HDFS-3049.026.patch, HDFS-3049.027.patch, HDFS-3049.028.patch, HDFS-3049.028.patch, HDFS-3049.028.patch


[jira] [Resolved] (HDFS-3049) During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt

Posted by "Colin Patrick McCabe (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HDFS-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe resolved HDFS-3049.
----------------------------------------

    Resolution: Fixed

The build failures in the un-mavenized MR tests were handled by Arun in HDFS-3614.
                
> During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3049
>                 URL: https://issues.apache.org/jira/browse/HDFS-3049
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HDFS-3049.001.patch, HDFS-3049.002.patch, HDFS-3049.003.patch, HDFS-3049.005.against3335.patch, HDFS-3049.006.against3335.patch, HDFS-3049.007.against3335.patch, HDFS-3049.010.patch, HDFS-3049.011.patch, HDFS-3049.012.patch, HDFS-3049.013.patch, HDFS-3049.015.patch, HDFS-3049.017.patch, HDFS-3049.018.patch, HDFS-3049.021.patch, HDFS-3049.023.patch, HDFS-3049.025.patch, HDFS-3049.026.patch, HDFS-3049.027.patch, HDFS-3049.028.patch, HDFS-3049.028.patch, HDFS-3049.028.patch
