Posted to common-dev@hadoop.apache.org by "Sameer Paranjpye (JIRA)" <ji...@apache.org> on 2007/07/02 20:24:04 UTC

[jira] Commented: (HADOOP-1557) Deletion of excess replicas should prefer to delete corrupted replicas before deleting valid replicas

    [ https://issues.apache.org/jira/browse/HADOOP-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509636 ] 

Sameer Paranjpye commented on HADOOP-1557:
------------------------------------------

Why not have clients and/or datanodes report corrupt replicas to the Namenode? Is this not already done?

The Namenode should remove and replace corrupt replicas when they are reported by clients/datanodes, probably starting with reports made on read. This can subsequently be enhanced with periodic block scanning.
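
To make the preference concrete, here is a rough sketch (plain Java, not the actual FSNamesystem/DatanodeDescriptor code; class and method names are hypothetical) of how excess-replica selection could drop reported-corrupt replicas before touching valid ones:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: illustrates the selection policy, not Hadoop's internal API.
public class ExcessReplicaChooser {

    // Datanodes whose replica of this block has been reported corrupt.
    private final Set<String> reportedCorrupt = new HashSet<String>();

    // Hypothetical hook called when a client or datanode reports a bad replica.
    public void reportCorruptReplica(String datanode) {
        reportedCorrupt.add(datanode);
    }

    // Pick replicas to delete when the block is over-replicated:
    // corrupt replicas go first, valid ones only if still over target.
    public List<String> chooseExcessReplicas(List<String> replicaLocations,
                                             int targetReplication) {
        List<String> toDelete = new ArrayList<String>();
        // Pass 1: schedule all known-corrupt replicas for deletion
        // (they would be re-replicated from a valid copy afterwards).
        for (String node : replicaLocations) {
            if (reportedCorrupt.contains(node)) {
                toDelete.add(node);
            }
        }
        // Pass 2: only if still above the target, remove valid replicas too.
        int remaining = replicaLocations.size() - toDelete.size();
        for (String node : replicaLocations) {
            if (remaining <= targetReplication) {
                break;
            }
            if (!toDelete.contains(node)) {
                toDelete.add(node);
                remaining--;
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        ExcessReplicaChooser chooser = new ExcessReplicaChooser();
        chooser.reportCorruptReplica("dn2");
        chooser.reportCorruptReplica("dn3");
        // Block lives on dn1..dn3, replication factor reduced to 2:
        // the two corrupt replicas are chosen, the valid one on dn1 is kept.
        System.out.println(
            chooser.chooseExcessReplicas(Arrays.asList("dn1", "dn2", "dn3"), 2));
    }
}

With this ordering, the scenario in the issue description (three replicas, two corrupt, replication factor lowered to 2) never deletes the only valid replica.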

> Deletion of excess replicas should prefer to delete corrupted replicas before deleting valid replicas
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1557
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1557
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>
> Suppose a block has three replicas and two of the replicas are corrupted. If the replication factor of the file is reduced to 2, the filesystem should preferably delete the two corrupted replicas; otherwise it could end up with a corrupted file.
> One option would be to make the datanode periodically validate all blocks with their corresponding CRCs. The other option would be to make the setReplication call validate existing replicas before deleting excess replicas.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.