Posted to mapreduce-issues@hadoop.apache.org by "Ramkumar Vadali (JIRA)" <ji...@apache.org> on 2010/07/01 22:42:50 UTC
[jira] Created: (MAPREDUCE-1908) DistributedRaidFileSystem does not
handle ChecksumException correctly
DistributedRaidFileSystem does not handle ChecksumException correctly
---------------------------------------------------------------------
Key: MAPREDUCE-1908
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1908
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Ramkumar Vadali
ChecksumException reports the offset of corruption within a block,
whereas DistributedRaidFileSystem.setAlternateLocations was expecting it
to report the offset of corruption within the file.
The best way to handle a missing or corrupt block is to use the
current seek offset in the file as the position of corruption.
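To illustrate the mismatch, here is a minimal Java sketch. It is not the patch; the class name, the method names and the 64 MB block size are hypothetical and chosen only for the example. It shows why an offset reported relative to a block points at the wrong block when it is treated as a file offset, and why the reader's current position in the file identifies the right one.

```java
public class CorruptionOffsetSketch {
    // Hypothetical block size, for illustration only.
    static final long BLOCK_SIZE = 64L * 1024 * 1024;

    /** Index of the block that contains the given file offset. */
    static long blockIndex(long fileOffset, long blockSize) {
        return fileOffset / blockSize;
    }

    /** Offset in the file at which the block containing fileOffset starts. */
    static long blockStart(long fileOffset, long blockSize) {
        return blockIndex(fileOffset, blockSize) * blockSize;
    }

    public static void main(String[] args) {
        // Assume the reader is 4096 bytes into the fourth block (index 3).
        long currentPos = 3 * BLOCK_SIZE + 4096;
        // Assume the exception reports the offset relative to the block.
        long reportedOffset = 4096;

        // Treating the block-relative offset as a file offset selects block 0.
        System.out.println("from reported offset: block "
            + blockIndex(reportedOffset, BLOCK_SIZE));
        // Using the current position in the file selects block 3.
        System.out.println("from current position: block "
            + blockIndex(currentPos, BLOCK_SIZE));
    }
}
```

Under these assumptions the two lookups disagree whenever the corruption lies beyond the first block of the file.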
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1908) DistributedRaidFileSystem does
not handle ChecksumException correctly
Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915981#action_12915981 ]
Ramkumar Vadali commented on MAPREDUCE-1908:
--------------------------------------------
ant test result:
All tests passed except org.apache.hadoop.streaming.TestUlimit. The TestUlimit failure is unrelated to this patch.
ant test-patch result:
[exec] +1 overall.
[exec] +1 @author. The patch does not contain any @author tags.
[exec] +1 tests included. The patch appears to include 2 new or modified tests.
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec] +1 system tests framework. The patch passed system tests framework compile.
[exec] ======================================================================
[exec] Finished build.
[exec] ======================================================================
[jira] Commented: (MAPREDUCE-1908) DistributedRaidFileSystem does
not handle ChecksumException correctly
Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919051#action_12919051 ]
Scott Chen commented on MAPREDUCE-1908:
---------------------------------------
Ram: Can you add one more unit test that corrupts more than one block in the stripe?
[jira] Commented: (MAPREDUCE-1908) DistributedRaidFileSystem does
not handle ChecksumException correctly
Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919099#action_12919099 ]
Ramkumar Vadali commented on MAPREDUCE-1908:
--------------------------------------------
Test results under src/contrib/raid:
test-junit:
[junit] WARNING: multiple versions of ant detected in path for junit
[junit] jar:file:/home/rvadali/local/external/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
[junit] and jar:file:/home/rvadali/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
[junit] Running org.apache.hadoop.hdfs.TestRaidDfs
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 14.223 sec
[junit] Running org.apache.hadoop.raid.TestRaidHar
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 91.264 sec
[junit] Running org.apache.hadoop.raid.TestRaidNode
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 62.755 sec
[junit] Running org.apache.hadoop.raid.TestRaidPurge
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 23.959 sec
test:
BUILD SUCCESSFUL
Total time: 3 minutes 25 seconds
result of ant test-patch:
[exec] +1 overall.
[exec] +1 @author. The patch does not contain any @author tags.
[exec] +1 tests included. The patch appears to include 2 new or modified tests.
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec] +1 system tests framework. The patch passed system tests framework compile.
[exec] ======================================================================
[exec] Finished build.
[exec] ======================================================================
BUILD SUCCESSFUL
Total time: 18 minutes 27 seconds
[jira] Assigned: (MAPREDUCE-1908) DistributedRaidFileSystem does
not handle ChecksumException correctly
Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ramkumar Vadali reassigned MAPREDUCE-1908:
------------------------------------------
Assignee: Ramkumar Vadali
[jira] Updated: (MAPREDUCE-1908) DistributedRaidFileSystem does not
handle ChecksumException correctly
Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ramkumar Vadali updated MAPREDUCE-1908:
---------------------------------------
Status: Patch Available (was: Open)
[jira] Commented: (MAPREDUCE-1908) DistributedRaidFileSystem does
not handle ChecksumException correctly
Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919068#action_12919068 ]
Scott Chen commented on MAPREDUCE-1908:
---------------------------------------
+1 Looks good to me. Let's wait for your test results.
[jira] Commented: (MAPREDUCE-1908) DistributedRaidFileSystem does
not handle ChecksumException correctly
Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919107#action_12919107 ]
Scott Chen commented on MAPREDUCE-1908:
---------------------------------------
Looks good.
Let's wait for a few days to see if we get further comments. Then I will commit it.
[jira] Updated: (MAPREDUCE-1908) DistributedRaidFileSystem does not
handle ChecksumException correctly
Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Scott Chen updated MAPREDUCE-1908:
----------------------------------
Resolution: Fixed
Fix Version/s: 0.22.0
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
I just committed this. Thanks, Ram.
[jira] Updated: (MAPREDUCE-1908) DistributedRaidFileSystem does not
handle ChecksumException correctly
Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ramkumar Vadali updated MAPREDUCE-1908:
---------------------------------------
Attachment: MAPREDUCE-1908.2.patch
Modified the test to corrupt two blocks in the same stripe and ensure that the read fails.
The test found an additional issue: caching needs to be disabled to force the use of DFS.
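As background for why two corrupt blocks in one stripe are expected to fail, here is a minimal Java sketch. It assumes a single XOR parity block per stripe, which this thread does not state; the class and method names are hypothetical and this is not the RAID code from the patch. With one parity block, one missing block can be rebuilt from the survivors, while two missing blocks only yield their XOR.

```java
import java.util.Arrays;

public class XorStripeSketch {
    /** XOR all blocks together; every block must have the same length. */
    static byte[] xor(byte[][] blocks) {
        byte[] out = new byte[blocks[0].length];
        for (byte[] block : blocks) {
            for (int i = 0; i < out.length; i++) {
                out[i] ^= block[i];
            }
        }
        return out;
    }

    /** XOR the surviving data blocks with the parity block. */
    static byte[] recover(byte[][] survivors, byte[] parity) {
        byte[][] all = Arrays.copyOf(survivors, survivors.length + 1);
        all[survivors.length] = parity;
        return xor(all);
    }

    public static void main(String[] args) {
        byte[] b0 = {1, 2};
        byte[] b1 = {3, 4};
        byte[] b2 = {5, 6};
        byte[] parity = xor(new byte[][] {b0, b1, b2});

        // One block lost (b1): the survivors plus parity rebuild it exactly.
        byte[] rebuilt = recover(new byte[][] {b0, b2}, parity);
        System.out.println(Arrays.equals(rebuilt, b1));

        // Two blocks lost (b1 and b2): the result is b1 XOR b2, not either block.
        byte[] mixed = recover(new byte[][] {b0}, parity);
        System.out.println(Arrays.equals(mixed, b1) || Arrays.equals(mixed, b2));
    }
}
```

Under this assumption the first check prints true and the second prints false, which is the behaviour a two-block corruption test would need to confirm as a read failure.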
[jira] Updated: (MAPREDUCE-1908) DistributedRaidFileSystem does not
handle ChecksumException correctly
Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ramkumar Vadali updated MAPREDUCE-1908:
---------------------------------------
Attachment: MAPREDUCE-1908.patch
This patch makes DistributedRaidFileSystem find the location of missing/corrupt data correctly.