Posted to mapreduce-issues@hadoop.apache.org by "Ramkumar Vadali (JIRA)" <ji...@apache.org> on 2010/07/01 22:42:50 UTC

[jira] Created: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

DistributedRaidFileSystem does not handle ChecksumException correctly
---------------------------------------------------------------------

                 Key: MAPREDUCE-1908
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1908
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Ramkumar Vadali


ChecksumException reports the offset of corruption within a block,
whereas DistributedRaidFileSystem.setAlternateLocations was expecting it
to report the offset of corruption within the file.

The best way of dealing with a missing block/corrupt block is to just
use the current seek offset in the file as the position of corruption.
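
To illustrate the mismatch described above, here is a minimal sketch. It is not the code from the patch; the class name, the helper methods, and the 64 MB block size are all assumptions made for illustration. It shows how interpreting a block-relative position as a file offset selects the wrong block, while the reader's current seek offset selects the block actually being read.

```java
final class CorruptionOffsetSketch {
    /** Index of the block that contains the given file offset. */
    static long blockIndexOf(long fileOffset, long blockSize) {
        return fileOffset / blockSize;
    }

    /** Converts a block-relative position into a file offset, given the block's index. */
    static long toFileOffset(long blockIndex, long posInBlock, long blockSize) {
        return blockIndex * blockSize + posInBlock;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;         // assumed block size
        long seekOffset = 2 * blockSize + 4096;     // reader's current position in the file
        long reportedPos = seekOffset % blockSize;  // what a block-relative report would carry

        // Treating the block-relative value as a file offset points at the wrong block.
        System.out.println("block from reported position: "
            + blockIndexOf(reportedPos, blockSize));
        // Using the current seek offset points at the block being read.
        System.out.println("block from seek offset: "
            + blockIndexOf(seekOffset, blockSize));
    }
}
```

In this sketch the reader is positioned in the third block of the file, yet the block-relative value alone maps to the first block. That is the kind of misidentification the description attributes to setAlternateLocations.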


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915981#action_12915981 ] 

Ramkumar Vadali commented on MAPREDUCE-1908:
--------------------------------------------

ant test result:

All but org.apache.hadoop.streaming.TestUlimit passed. TestUlimit failure is unrelated to this.


ant test-patch result:

     [exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 2 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec] 
     [exec]     +1 system tests framework.  The patch passed system tests framework compile.
     [exec] 
     [exec] 
     [exec] 
     [exec] 
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]     Finished build.
     [exec] ======================================================================
     [exec] ======================================================================




[jira] Commented: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919051#action_12919051 ] 

Scott Chen commented on MAPREDUCE-1908:
---------------------------------------

Ram: Can you add one more unit test that corrupts more than one block in the stripe?



[jira] Commented: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919099#action_12919099 ] 

Ramkumar Vadali commented on MAPREDUCE-1908:
--------------------------------------------

Test results under src/contrib/raid:


test-junit:
    [junit] WARNING: multiple versions of ant detected in path for junit 
    [junit]          jar:file:/home/rvadali/local/external/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
    [junit]      and jar:file:/home/rvadali/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
    [junit] Running org.apache.hadoop.hdfs.TestRaidDfs
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 14.223 sec
    [junit] Running org.apache.hadoop.raid.TestRaidHar
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 91.264 sec
    [junit] Running org.apache.hadoop.raid.TestRaidNode
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 62.755 sec
    [junit] Running org.apache.hadoop.raid.TestRaidPurge
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 23.959 sec

test:

BUILD SUCCESSFUL
Total time: 3 minutes 25 seconds


result of ant test-patch:


     [exec]
     [exec]
     [exec]
     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 2 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec]
     [exec]     +1 system tests framework.  The patch passed system tests framework compile.
     [exec]
     [exec]
     [exec]
     [exec]
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]     Finished build.
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]
     [exec]

BUILD SUCCESSFUL
Total time: 18 minutes 27 seconds




[jira] Assigned: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali reassigned MAPREDUCE-1908:
------------------------------------------

    Assignee: Ramkumar Vadali



[jira] Updated: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1908:
---------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919068#action_12919068 ] 

Scott Chen commented on MAPREDUCE-1908:
---------------------------------------

+1 Looks good to me. Let's wait for your test results.



[jira] Commented: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919107#action_12919107 ] 

Scott Chen commented on MAPREDUCE-1908:
---------------------------------------

Looks good.
Let's wait for a few days to see if we get further comments. Then I will commit it.



[jira] Updated: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-1908:
----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.22.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Ram.



[jira] Updated: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1908:
---------------------------------------

    Attachment: MAPREDUCE-1908.2.patch

Modified test to corrupt two blocks in the same stripe and ensure failure.
The test found an additional issue: caching needs to be disabled to force the use of DFS.
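
As a rough illustration of why corrupting two blocks in one stripe is expected to fail, the sketch below assumes a single XOR parity block per stripe. The thread does not state the parity scheme, so this is an assumption, and the class and method names are hypothetical. With one parity block, exactly one missing block can be rebuilt; a second missing block leaves too little information.

```java
final class XorStripeSketch {
    /** XORs equally sized blocks together; used here to compute the parity block. */
    static byte[] xor(byte[]... blocks) {
        byte[] out = new byte[blocks[0].length];
        for (byte[] block : blocks) {
            for (int i = 0; i < out.length; i++) {
                out[i] ^= block[i];
            }
        }
        return out;
    }

    /**
     * Rebuilds the single missing block of a stripe (marked by a null entry)
     * from the surviving blocks and the parity block.
     * Throws IllegalStateException if more than one block is missing.
     */
    static byte[] recover(byte[][] stripe, byte[] parity) {
        int missing = -1;
        byte[] acc = parity.clone();
        for (int i = 0; i < stripe.length; i++) {
            if (stripe[i] == null) {
                if (missing >= 0) {
                    throw new IllegalStateException("more than one block missing in stripe");
                }
                missing = i;
            } else {
                for (int j = 0; j < acc.length; j++) {
                    acc[j] ^= stripe[i][j];
                }
            }
        }
        if (missing < 0) {
            throw new IllegalArgumentException("no missing block to recover");
        }
        return acc;
    }
}
```

Under that assumption, a test that corrupts two blocks of the same stripe should see the read fail rather than return reconstructed data.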



[jira] Updated: (MAPREDUCE-1908) DistributedRaidFileSystem does not handle ChecksumException correctly

Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1908:
---------------------------------------

    Attachment: MAPREDUCE-1908.patch

This patch makes DistributedRaidFileSystem find the location of missing/corrupt data correctly.
