Posted to hdfs-dev@hadoop.apache.org by "Kihwal Lee (JIRA)" <ji...@apache.org> on 2015/05/14 00:34:59 UTC

[jira] [Created] (HDFS-8395) Verify on-disk data after transferring block data

Kihwal Lee created HDFS-8395:
--------------------------------

             Summary: Verify on-disk data after transferring block data
                 Key: HDFS-8395
                 URL: https://issues.apache.org/jira/browse/HDFS-8395
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Kihwal Lee
            Priority: Critical


Currently, the integrity of on-disk data is not checked during pipeline recovery or replication. The target of a pipeline-recovery transfer can detect corruption, but sometimes only long after the corruption occurred (e.g. HDFS-4660). If multiple pipeline failures occur, this delayed detection can cause data loss.

During replication involving multiple destinations, if a middle node corrupts the data, the healthy source can be marked corrupt. Because replication lacks a full ack mechanism, the corrupt replica continues to be written and is then finalized. This replica now becomes the source of further replication, because the original source is marked corrupt. All subsequent replications fail, and the result is a missing block.

Adding on-disk corruption detection at the appropriate places can improve the situation.
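
To illustrate the kind of check proposed here, the following is a minimal sketch, not the actual HDFS code path. It assumes the data written to disk is re-read after a transfer and compared, chunk by chunk, against checksums that are known to be good. The class OnDiskVerifier, its method names, and the use of java.util.zip.CRC32 are assumptions made for illustration only; it needs Java 9 or later for InputStream.readNBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical helper, not part of Hadoop: re-reads a file from disk
// and compares per-chunk CRC32 values against expected ones.
final class OnDiskVerifier {

    // Computes one CRC32 value per bytesPerChunk-sized chunk of the file.
    // The last chunk may be shorter than bytesPerChunk.
    static long[] chunkChecksums(Path file, int bytesPerChunk) throws IOException {
        List<Long> sums = new ArrayList<>();
        byte[] buf = new byte[bytesPerChunk];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // readNBytes fills the buffer unless end of stream is reached;
            // it returns 0 once no bytes remain.
            while ((n = in.readNBytes(buf, 0, bytesPerChunk)) > 0) {
                CRC32 crc = new CRC32();
                crc.update(buf, 0, n);
                sums.add(crc.getValue());
            }
        }
        return sums.stream().mapToLong(Long::longValue).toArray();
    }

    // Returns the index of the first chunk whose on-disk checksum differs
    // from the expected one, or -1 if every chunk matches. A difference in
    // chunk count is reported at the first chunk index that is unmatched.
    static int firstCorruptChunk(Path file, long[] expected, int bytesPerChunk)
            throws IOException {
        long[] actual = chunkChecksums(file, bytesPerChunk);
        int common = Math.min(actual.length, expected.length);
        for (int i = 0; i < common; i++) {
            if (actual[i] != expected[i]) {
                return i;
            }
        }
        return actual.length == expected.length ? -1 : common;
    }
}
```

In this sketch, a source node would call firstCorruptChunk on its own replica before or after sending it, so that corruption is reported against the node that actually holds the bad data rather than being discovered later by a downstream target.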



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)