Posted to common-issues@hadoop.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2010/10/04 23:36:35 UTC

[jira] Created: (HADOOP-6986) SequenceFile.Reader should distinguish between Network IOE and Parsing IOE

SequenceFile.Reader should distinguish between Network IOE and Parsing IOE
--------------------------------------------------------------------------

                 Key: HADOOP-6986
                 URL: https://issues.apache.org/jira/browse/HADOOP-6986
             Project: Hadoop Common
          Issue Type: Bug
          Components: io
    Affects Versions: 0.21.1, 0.22.0, 0.20-append
            Reporter: Nicolas Spiegelberg
            Priority: Minor
             Fix For: 0.21.1, 0.22.0, 0.20-append


The SequenceFile.Reader API should give the user an easy way to distinguish between a network/low-level IOE and a parsing IOE.  The use case appeared recently in the HBase project:

Originally, if a RegionServer got an IOE from HDFS while opening a region file, it would abort the open and let the HMaster reassign the region, on the assumption that this is a network failure that will likely disappear at a later time or on a different partition of the network.  However, if HBase gets a parsing exception, we want to log the problem and continue opening the region anyway, because a parsing failure is persistent (the same bytes fail the same way on every attempt) and retries won't fix it.

Although this problem was found in HBase, it seems to be a generic problem: callers need an easier way to tell persistent errors from transient ones.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6986) SequenceFile.Reader should distinguish between Network IOE and Parsing IOE

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HADOOP-6986:
----------------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-6986) SequenceFile.Reader should distinguish between Network IOE and Parsing IOE

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HADOOP-6986:
----------------------------------------

    Attachment: HADOOP-6986_0.21.patch
                HADOOP-6986_20-append.patch

Two patch versions: one applies to the 20-append branch; the 0.21 patch applies to both 0.21 and 0.22.



[jira] Commented: (HADOOP-6986) SequenceFile.Reader should distinguish between Network IOE and Parsing IOE

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928493#action_12928493 ] 

Hadoop QA commented on HADOOP-6986:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12456328/HADOOP-6986_0.21.patch
  against trunk revision 1031422.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/15//testReport/
Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/15//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/15//console

This message is automatically generated.



[jira] Commented: (HADOOP-6986) SequenceFile.Reader should distinguish between Network IOE and Parsing IOE

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917795#action_12917795 ] 

Nicolas Spiegelberg commented on HADOOP-6986:
---------------------------------------------

To fix this issue, I kept all the existing error types & messages, but I added ParseException as the cause to all parsing-related IOEs.  None of the changed exceptions had an associated cause before.  This lets us maintain 100% backwards compatibility (in case any users were doing deep inspection of the IOE text) while giving new users an easy way to check:  if (ioe.getCause() instanceof ParseException)
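
A minimal sketch of how a caller might use that check. It does not reproduce the attached patches: it assumes the cause is java.text.ParseException (the comment does not say which ParseException class is used), and the class name, helper methods, and messages below are hypothetical stand-ins for what SequenceFile.Reader would throw.

```java
import java.io.IOException;
import java.text.ParseException;

// Hypothetical illustration only; none of these names come from Hadoop.
class IoeCauseSketch {

    // Stand-in for a parsing-related IOE: same type and message as before,
    // with a ParseException attached as the cause (assumed java.text.ParseException).
    static IOException parsingFailure(String message) {
        return new IOException(message, new ParseException(message, 0));
    }

    // Stand-in for a network/low-level IOE: no cause attached.
    static IOException networkFailure(String message) {
        return new IOException(message);
    }

    // The check suggested in the comment above.
    static boolean isParsingFailure(IOException ioe) {
        return ioe.getCause() instanceof ParseException;
    }

    public static void main(String[] args) {
        IOException[] failures = {
            parsingFailure("file is not a SequenceFile"),
            networkFailure("connection reset")
        };
        for (IOException ioe : failures) {
            if (isParsingFailure(ioe)) {
                // Persistent error: retrying elsewhere will not help.
                System.out.println("log and continue: " + ioe.getMessage());
            } else {
                // Possibly transient error: abort and let the open be retried.
                System.out.println("abort and retry: " + ioe.getMessage());
            }
        }
    }
}
```

Callers that only look at the IOE type or message see no difference, since the exception type and text are unchanged and only the cause is new.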
