Posted to common-dev@hadoop.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2007/10/31 23:14:50 UTC

[jira] Commented: (HADOOP-2114) Checksums for Namenode image files

    [ https://issues.apache.org/jira/browse/HADOOP-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539196 ] 

Raghu Angadi commented on HADOOP-2114:
--------------------------------------


The basic proposal is to keep crc for each "record" (e.g. in FSImage a file entry is a record) and in case of errors, try to recover pristine records from multiple copies of the image files.
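To make the per-record proposal concrete, here is a minimal sketch (not Hadoop's actual FSImage code; the class and record layout are assumptions) of writing each record as [length][payload][CRC32], so a reader can verify every record independently:

```java
import java.io.*;
import java.util.zip.CRC32;

// Hypothetical sketch: each image "record" (e.g. a file entry) is written as
//   [4-byte length][payload bytes][8-byte CRC32 of payload]
// so the reader can detect corruption at record granularity.
public class RecordChecksum {
    static void writeRecord(DataOutputStream out, byte[] payload) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        out.writeInt(payload.length);
        out.write(payload);
        out.writeLong(crc.getValue());   // checksum stored right after the record
    }

    // Returns the payload, or null if the stored CRC does not match.
    static byte[] readRecord(DataInputStream in) throws IOException {
        int len = in.readInt();
        byte[] payload = new byte[len];
        in.readFully(payload);
        long stored = in.readLong();
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue() == stored ? payload : null;
    }
}
```

A corrupted byte anywhere in the payload changes the computed CRC32, so the reader can reject just that record and fall back to another image copy.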

Before I proceed further with design or implementation: are there any standard solutions for keeping a few gigabytes of data resistant to errors such as on-disk corruption? It is fine even if an existing solution is not license-compatible with Apache; we can read about the method and see how it compares.

The only requirement is that the NameNode should be confident that the image files it writes and reads contain correct data.
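The cross-copy recovery step could then look roughly like the following (a hypothetical sketch, not the proposed implementation): for each record, read the candidate bytes from every image copy and keep the first one whose stored CRC verifies.

```java
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical recovery sketch: the same logical record, read from several
// image copies, each paired with the CRC stored alongside it. As long as the
// corruptions in the copies do not overlap, some copy is pristine for every
// record, so the full image can be reconstructed.
public class RecordRecovery {
    public static class Candidate {
        public final byte[] payload;
        public final long storedCrc;
        public Candidate(byte[] payload, long storedCrc) {
            this.payload = payload;
            this.storedCrc = storedCrc;
        }
    }

    static byte[] recover(List<Candidate> copies) {
        for (Candidate c : copies) {
            CRC32 crc = new CRC32();
            crc.update(c.payload, 0, c.payload.length);
            if (crc.getValue() == c.storedCrc) {
                return c.payload;        // first copy that verifies wins
            }
        }
        return null;                     // corrupted in every copy: unrecoverable
    }
}
```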

thanks.


> Checksums for Namenode image files
> ----------------------------------
>
>                 Key: HADOOP-2114
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2114
>             Project: Hadoop
>          Issue Type: Improvement
>            Reporter: Raghu Angadi
>
> Currently DFS can write multiple copies of image files, but we do not automatically recover well from corrupted or truncated image files. This jira intends to keep a CRC for each image file record, so that the Namenode can recover an accurate image as long as the data survives in at least one of the copies (e.g. it should be OK to have non-overlapping corruptions across the copies). Will add more details in the next comment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.