Posted to common-dev@hadoop.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2007/06/14 22:55:26 UTC

[jira] Commented: (HADOOP-1491) After successful distcp, couple of checksum error files

    [ https://issues.apache.org/jira/browse/HADOOP-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12504914 ] 

Raghu Angadi commented on HADOOP-1491:
--------------------------------------

My impression from looking at the one case of Koji's investigation:

Two files are involved: A and B. On the source side of distcp both are fine. On the destination side, A (A_dest) is fine but B_dest is corrupted. .B_dest.crc is the same as .B_src.crc, but B_dest has the same content as A_src. Both A and B are small and have only one block. It looks like, while writing B_dest, it somehow wrote the block corresponding to A.
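
To look for more cases that match this pattern (a destination file whose bytes equal a different source file), something like the following sketch could help. It is not part of distcp or HDFS: it assumes the source and destination files have first been copied to two local directories, and the names content_digest and find_swapped are made up for illustration.

```python
import hashlib
from pathlib import Path


def content_digest(path):
    """Return a hex MD5 digest of the file's bytes, read in 1 MB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def find_swapped(src_dir, dest_dir):
    """Map each destination file name to the source file names whose
    content it matches, when none of those names is its own.

    Hypothetical helper: assumes both trees are flat local directories.
    Hidden files such as .B.crc are skipped.
    """
    src_by_digest = {}
    for p in sorted(Path(src_dir).iterdir()):
        if p.is_file() and not p.name.startswith("."):
            src_by_digest.setdefault(content_digest(p), []).append(p.name)

    swapped = {}
    for p in sorted(Path(dest_dir).iterdir()):
        if not p.is_file() or p.name.startswith("."):
            continue
        matches = src_by_digest.get(content_digest(p), [])
        # A destination file that matches some source file, but not the
        # one with its own name, is the A/B mix-up described above.
        if matches and p.name not in matches:
            swapped[p.name] = matches
    return swapped
```

For the case above, find_swapped would report B as matching A, while a correctly copied A would not be reported.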

One possible bug that can result in this situation is HADOOP-1396. If both A_dest and B_dest were created around the same time, then it is an even more likely culprit (we can check the file creation times from the creation times of the blocks).
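
A rough way to run the timing check on a datanode is sketched below. It assumes the two block files have already been located on local disk and uses their modification times as a stand-in for creation time; the function name and the 5 second window are arbitrary choices for illustration, not anything defined by Hadoop.

```python
import os


def written_close_together(block_path_a, block_path_b, window_seconds=5.0):
    """Return True if the two block files were last modified within
    window_seconds of each other.

    Assumption: mtime approximates the block's creation time, which
    holds only if the block file was not touched after it was written.
    """
    mtime_a = os.path.getmtime(block_path_a)
    mtime_b = os.path.getmtime(block_path_b)
    return abs(mtime_a - mtime_b) <= window_seconds
```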


> After successful distcp, couple of checksum error files
> -------------------------------------------------------
>
>                 Key: HADOOP-1491
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1491
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.12.3
>            Reporter: Koji Noguchi
>
> Tried copying 700,000 files with distcp. 8 mappers per node. Single dfs.client.buffer.dir.
> Distcp ran on a 25-node MapReduce cluster.
> A couple of tasks failed, but the job was successful.
> When checked, 12 files were corrupted (checksum error).
> This is repeatable.
> I'll add more information as we find it.
