Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2007/05/10 01:06:15 UTC
[jira] Created: (HADOOP-1345) Checksum object does not get restored
to the old state in retries when handle ChecksumException
Checksum object does not get restored to the old state in retries when handle ChecksumException
-----------------------------------------------------------------------------------------------
Key: HADOOP-1345
URL: https://issues.apache.org/jira/browse/HADOOP-1345
Project: Hadoop
Issue Type: Bug
Components: dfs
Affects Versions: 0.12.3
Reporter: Hairong Kuang
Assigned To: Hairong Kuang
Priority: Blocker
Fix For: 0.13.0
In ChecksumFile.FSInputChecker, when a ChecksumException occurs, the reader tries to recover from the error by reading a different replica. However, the current code does not restore the Checksum object's old state before retrying. As a result, a read that follows a seek to a position that is not on a checksum chunk boundary cannot recover from a ChecksumException, even when non-corrupted replicas are available.
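To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the Hadoop code and not the attached patch: the class, the method names, and the in-memory "replicas" are all hypothetical, and the buggy path is only an assumed model in which the retry starts from an empty checksum instead of the state that covered the bytes between the chunk start and the seek position.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical model of the retry problem; none of these names come from Hadoop.
class ChecksumRetrySketch {

    // CRC32 over data[off, off+len), standing in for the stored chunk checksum.
    static long crcOf(byte[] data, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(data, off, len);
        return crc.getValue();
    }

    // Verifies one checksum chunk [chunkStart, chunkEnd) for a read that begins
    // at seekPos, trying each replica in order. Returns the index of the first
    // replica whose checksum matches, or -1 if none does.
    static int readWithRetry(List<byte[]> replicas, int chunkStart, int chunkEnd,
                             int seekPos, long expectedCrc, boolean restoreState) {
        CRC32 sum = new CRC32();
        for (int i = 0; i < replicas.size(); i++) {
            byte[] replica = replicas.get(i);
            sum.reset();
            if (i == 0 || restoreState) {
                // State right after the seek: the checksum already covers the
                // bytes from the chunk start up to the seek position.
                sum.update(replica, chunkStart, seekPos - chunkStart);
            }
            // Assumed bug model: when restoreState is false, a retry skips the
            // step above and checksums only the bytes from seekPos onward.
            sum.update(replica, seekPos, chunkEnd - seekPos);
            if (sum.getValue() == expectedCrc) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] good = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);
        byte[] bad = good.clone();
        bad[10] ^= 1; // flip one bit to simulate a corrupted replica
        List<byte[]> replicas = Arrays.asList(bad, good);
        long expected = crcOf(good, 0, good.length);

        System.out.println("mid-chunk seek, no restore: "
                + readWithRetry(replicas, 0, good.length, 5, expected, false));
        System.out.println("mid-chunk seek, restore:    "
                + readWithRetry(replicas, 0, good.length, 5, expected, true));
        System.out.println("boundary seek, no restore:  "
                + readWithRetry(replicas, 0, good.length, 0, expected, false));
    }
}
```

In this model the retry without restoration only succeeds when the seek position is the chunk start, because then there is no earlier state to lose. With restoration, the second (good) replica verifies regardless of where in the chunk the read began.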
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-1345) Checksum object does not get
restored to the old state in retries when handle ChecksumException
Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494787 ]
Raghu Angadi commented on HADOOP-1345:
--------------------------------------
+1. Looks good.
[jira] Updated: (HADOOP-1345) Checksum object does not get restored
to the old state in retries when handle ChecksumException
Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hairong Kuang updated HADOOP-1345:
----------------------------------
Status: Patch Available (was: Open)
[jira] Updated: (HADOOP-1345) Checksum object does not get restored
to the old state in retries when handle ChecksumException
Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hairong Kuang updated HADOOP-1345:
----------------------------------
Attachment: checksum.patch
There is a related bug: SeekToNewSources does not compute the position in the .crc file correctly. The new patch includes a fix for this bug as well.
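For readers unfamiliar with the calculation involved, the sketch below shows one way a data-file offset can be mapped to an offset in a checksum file. It assumes a layout of a fixed-size header followed by one 4-byte CRC per bytesPerSum bytes of data. The header length, the names, and the sample numbers are assumptions for illustration only, and this is not taken from the patch.

```java
// Hypothetical helper; the layout and the names are assumptions, not Hadoop's API.
class CrcPositionSketch {

    static final int CHECKSUM_SIZE = 4; // assumed: one 32-bit CRC per data chunk

    // Offset in the checksum file of the CRC that covers data offset dataPos.
    static long checksumFilePos(long dataPos, int bytesPerSum, int headerLength) {
        if (dataPos < 0 || bytesPerSum <= 0 || headerLength < 0) {
            throw new IllegalArgumentException("invalid offset or layout");
        }
        long chunkIndex = dataPos / bytesPerSum; // integer division rounds down
        return headerLength + CHECKSUM_SIZE * chunkIndex;
    }

    // Data offset at which the chunk containing dataPos begins.
    static long chunkStart(long dataPos, int bytesPerSum) {
        return dataPos - (dataPos % bytesPerSum);
    }

    public static void main(String[] args) {
        int bytesPerSum = 512; // sample value
        int headerLength = 8;  // sample value, assumed
        long dataPos = 1300;   // falls inside the third chunk (index 2)
        System.out.println(checksumFilePos(dataPos, bytesPerSum, headerLength)); // 16
        System.out.println(chunkStart(dataPos, bytesPerSum));                    // 1024
    }
}
```

The point of the example is that a seek to a mid-chunk data offset has to land on the start of the covering CRC entry in the checksum file, and on the start of the covering chunk in the data, before verification can resume.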
[jira] Commented: (HADOOP-1345) Checksum object does not get
restored to the old state in retries when handle ChecksumException
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495963 ]
Hadoop QA commented on HADOOP-1345:
-----------------------------------
Integrated in Hadoop-Nightly #89 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/89/)
[jira] Updated: (HADOOP-1345) Checksum object does not get restored
to the old state in retries when handle ChecksumException
Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Cutting updated HADOOP-1345:
---------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this. Thanks, Hairong!
[jira] Updated: (HADOOP-1345) Checksum object does not get restored
to the old state in retries when handle ChecksumException
Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hairong Kuang updated HADOOP-1345:
----------------------------------
Attachment: checksum.patch
[jira] Commented: (HADOOP-1345) Checksum object does not get
restored to the old state in retries when handle ChecksumException
Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495129 ]
Nigel Daley commented on HADOOP-1345:
-------------------------------------
Can this be unit tested?
[jira] Commented: (HADOOP-1345) Checksum object does not get
restored to the old state in retries when handle ChecksumException
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494807 ]
Hadoop QA commented on HADOOP-1345:
-----------------------------------
+1
http://issues.apache.org/jira/secure/attachment/12357004/checksum.patch applied and successfully tested against trunk revision r536583.
Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/128/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/128/console
[jira] Commented: (HADOOP-1345) Checksum object does not get
restored to the old state in retries when handle ChecksumException
Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495154 ]
Hairong Kuang commented on HADOOP-1345:
---------------------------------------
I thought about it, but with MiniDFSCluster it is hard to decide which replica to corrupt and to produce a ChecksumException deterministically.