Posted to hdfs-dev@hadoop.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2010/04/21 21:17:50 UTC
[jira] Created: (HDFS-1103) Replica recovery doesn't distinguish
between flushed-but-corrupted last chunk and unflushed last chunk
Replica recovery doesn't distinguish between flushed-but-corrupted last chunk and unflushed last chunk
------------------------------------------------------------------------------------------------------
Key: HDFS-1103
URL: https://issues.apache.org/jira/browse/HDFS-1103
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.21.0, 0.22.0
Reporter: Todd Lipcon
When the DN creates a replica under recovery, it calls validateIntegrity, which truncates the last checksum chunk from the replica if that chunk is found to be invalid. When block recovery then runs, this shortened replica wins over a longer replica from another node where there was no corruption. Thus, if just one of the DNs has an invalid last checksum chunk, data that has been sync()ed to the other datanodes can be lost.
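
To make the failure mode concrete, here is a minimal sketch of the arithmetic involved. It is not the actual DataNode code: the class name, the method names, the 512-byte chunk size, and the rule that recovery settles on the shortest replica length are all assumptions made for illustration, based only on the description above.

```java
// Hypothetical model of the behavior described in this issue.
// None of these names come from the HDFS source tree.
class ReplicaRecoverySketch {
    // Assumed checksum chunk size in bytes (illustrative value).
    static final int BYTES_PER_CHECKSUM = 512;

    // Models the described validateIntegrity behavior: if the last
    // checksum chunk is invalid, the replica is cut back to the start
    // of that chunk. It does not know whether the chunk was flushed
    // and later corrupted, or simply never fully flushed.
    static long validatedLength(long onDiskLength, boolean lastChunkValid) {
        if (lastChunkValid || onDiskLength == 0) {
            return onDiskLength;
        }
        // Start offset of the chunk that contains the last byte.
        return ((onDiskLength - 1) / BYTES_PER_CHECKSUM) * BYTES_PER_CHECKSUM;
    }

    // Assumption: recovery agrees on the shortest replica length,
    // which is how a truncated replica "wins" over longer ones.
    static long recoveredLength(long[] replicaLengths) {
        if (replicaLengths.length == 0) {
            throw new IllegalArgumentException("no replicas to recover");
        }
        long min = Long.MAX_VALUE;
        for (long len : replicaLengths) {
            min = Math.min(min, len);
        }
        return min;
    }

    public static void main(String[] args) {
        long synced = 1300; // bytes sync()ed to all three DNs
        long[] lengths = {
            validatedLength(synced, true),
            validatedLength(synced, true),
            validatedLength(synced, false) // one DN has a bad last chunk
        };
        // Prints "recovered length = 1024": 276 sync()ed bytes are lost.
        System.out.println("recovered length = " + recoveredLength(lengths));
    }
}
```

In this model two of the three replicas still hold all 1300 bytes, yet the recovered block is 1024 bytes long, because the one truncated replica determines the outcome.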
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.