Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2007/05/10 00:48:15 UTC
[jira] Commented: (HADOOP-1124) ChecksumFileSystem does not handle ChecksumError correctly
[ https://issues.apache.org/jira/browse/HADOOP-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494594 ]
Hairong Kuang commented on HADOOP-1124:
---------------------------------------
Problems 2 and 3 described above are not critical. But problem 1 causes a job to fail with a ChecksumException when a task gets a ChecksumError on a read after seeking to a position that is not on a checksum chunk boundary, even though non-corrupted replicas are available.
I plan to create a separate issue dealing with problem 1 and mark it as a Blocker, then I will mark this issue as a non-blocker.
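To illustrate why a seek to a mid-chunk position is involved here: checksums are verified over whole chunks, so a reader that seeks into the middle of a chunk has to fetch from the chunk start and discard the leading bytes before verification can succeed. The sketch below shows only that alignment arithmetic; the 512-byte chunk size and the class/method names are illustrative assumptions, not Hadoop's actual code.

```java
// Sketch: aligning a seek position down to a checksum-chunk boundary.
// BYTES_PER_CHECKSUM mirrors the common io.bytes.per.checksum default
// of 512, but is only an assumption here.
public class AlignedSeek {
    static final int BYTES_PER_CHECKSUM = 512;

    // Returns {alignedPos, bytesToDiscard} for a desired seek position:
    // the reader re-positions to alignedPos (the chunk start) and skips
    // bytesToDiscard bytes after verifying the chunk's checksum.
    static long[] align(long desiredPos) {
        long offsetInChunk = desiredPos % BYTES_PER_CHECKSUM;
        return new long[] { desiredPos - offsetInChunk, offsetInChunk };
    }

    public static void main(String[] args) {
        long[] r = align(1000); // a mid-chunk seek
        // Chunk containing byte 1000 starts at 512; 488 bytes are skipped.
        System.out.println(r[0] + " " + r[1]);
    }
}
```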
> ChecksumFileSystem does not handle ChecksumError correctly
> ----------------------------------------------------------
>
> Key: HADOOP-1124
> URL: https://issues.apache.org/jira/browse/HADOOP-1124
> Project: Hadoop
> Issue Type: Bug
> Components: fs
> Affects Versions: 0.12.0
> Reporter: Hairong Kuang
> Assigned To: Hairong Kuang
> Priority: Blocker
> Fix For: 0.13.0
>
>
> When handling a ChecksumError, the checksummed file system tries to recover by rereading from a different replica.
> I have three comments:
> 1. One bug in the code is that, when retrying, the object that computes checksums does not get restored to its old state.
> 2. The code also assumes that the first byte read and the byte being read when the ChecksumError occurs are in the same block.
> 3. It would be more efficient to roll back to the first byte of the chunk being checksummed instead of rolling back to the first byte that was read.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.