Posted to mapreduce-issues@hadoop.apache.org by "Rosie Li (JIRA)" <ji...@apache.org> on 2011/03/28 22:45:05 UTC

[jira] [Created] (MAPREDUCE-2406) Failed validate copy in distcp

Failed validate copy in distcp
------------------------------

                 Key: MAPREDUCE-2406
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2406
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.21.0
            Reporter: Rosie Li
            Priority: Minor


Each time a distcp copy completes, {{validateCopy(srcstat, absdst)}} is called.
If -pb (preserve block size) is not set, distcp writes the dst file with the default block size. So when the src file uses a block size other than the default and -pb is not set, src and dst end up with different block sizes after the copy, and the copy does not pass the validateCopy check.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-2406) Failed validate copy in distcp

Posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172226#comment-13172226 ] 

Markus Jelsma commented on MAPREDUCE-2406:
------------------------------------------

I can confirm -pb solves this problem:
http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201112.mbox/%3C201112190922.39970.markus.jelsma@openindex.io%3E

During the first run many files fail. When I retry, most files are skipped because they were already copied, and a few files succeed. On a third try all files fail consistently. When I finally try again with -pb set, all files are copied properly, which solves the problem.
                
> Failed validate copy in distcp
> ------------------------------
>
>                 Key: MAPREDUCE-2406
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2406
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>            Reporter: Rosie Li
>            Priority: Minor
>
> Each time the distcp is done, {{validateCopy(srcstat, absdst)}} will be called. 
> When doing distcp, if the -pb(preserve block size) is not set, the dst will use the default block size. However, if the src file use block size other than the default block size, and -pb is not set, after copying, the src and dst will have different block size. It will not pass the validateCopy check in this case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-2406) Failed validate copy in distcp

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062613#comment-13062613 ] 

Harsh J commented on MAPREDUCE-2406:
------------------------------------

Looking at validateCopy in trunk, it does not seem to check anything beyond the CRC (if available, and if asked for) and the length of whole files. So differing block sizes alone don't seem to be a cause for a failure here?
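
One possible way to reconcile the report with this observation, offered as an assumption rather than a confirmed diagnosis: if the checksum being compared is a composite that is built up block by block (as HDFS file checksums are, to my understanding), then two files with identical bytes and identical length can still produce different checksums when their block sizes differ. The sketch below is a simplified, self-contained model of that idea. The class BlockChecksumSketch and the method fileChecksum are hypothetical names for illustration only; this is not the distcp or HDFS code, and the byte-per-CRC and block-size values are made up.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

// Hypothetical model, NOT the HDFS implementation: a file checksum built as
// MD5 over per-block MD5s, where each block MD5 covers the CRC32 of every chunk.
public class BlockChecksumSketch {

    public static String fileChecksum(byte[] data, int bytesPerCrc, int blockSize) {
        MessageDigest fileMd5 = newMd5();
        for (int blockStart = 0; blockStart < data.length; blockStart += blockSize) {
            int blockEnd = Math.min(blockStart + blockSize, data.length);
            MessageDigest blockMd5 = newMd5();
            // One CRC32 per chunk inside the current block.
            for (int off = blockStart; off < blockEnd; off += bytesPerCrc) {
                int len = Math.min(bytesPerCrc, blockEnd - off);
                CRC32 crc = new CRC32();
                crc.update(data, off, len);
                blockMd5.update(ByteBuffer.allocate(4).putInt((int) crc.getValue()).array());
            }
            // The file-level digest depends on where the block boundaries fall.
            fileMd5.update(blockMd5.digest());
        }
        return toHex(fileMd5.digest());
    }

    private static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same content and length for "src" and "dst"; only the block size varies.
        byte[] data = new byte[64 * 1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 31 + 7);
        }
        String src = fileChecksum(data, 512, 16 * 1024);          // non-default block size
        String dstDefault = fileChecksum(data, 512, 32 * 1024);   // copied without -pb
        String dstPreserved = fileChecksum(data, 512, 16 * 1024); // copied with -pb

        System.out.println("default block size matches src:   " + src.equals(dstDefault));
        System.out.println("preserved block size matches src: " + src.equals(dstPreserved));
    }
}
```

In this model the length check always passes, and only the checksum comparison is sensitive to block size. That would be consistent with the report that setting -pb avoids the failure, but whether it is what actually happens in 0.21.0 would need to be confirmed against the real validateCopy and checksum code.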
