Posted to common-issues@hadoop.apache.org by "Stephen O'Donnell (Jira)" <ji...@apache.org> on 2020/02/20 12:02:00 UTC

[jira] [Assigned] (HADOOP-15273) distcp can't handle remote stores with different checksum algorithms

     [ https://issues.apache.org/jira/browse/HADOOP-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen O'Donnell reassigned HADOOP-15273:
------------------------------------------

    Assignee: Stephen O'Donnell  (was: Steve Loughran)

> distcp can't handle remote stores with different checksum algorithms
> --------------------------------------------------------------------
>
>                 Key: HADOOP-15273
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15273
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: tools/distcp
>    Affects Versions: 3.1.0
>            Reporter: Steve Loughran
>            Assignee: Stephen O'Donnell
>            Priority: Critical
>             Fix For: 3.1.0, 3.0.3
>
>         Attachments: HADOOP-15273-001.patch, HADOOP-15273-002.patch, HADOOP-15273-003.patch
>
>
> When using distcp without {{-skipcrcchecks}}, if there's a checksum mismatch between src and dest store types (e.g. hdfs to s3), the error message talks about block size, even when it's the underlying checksum protocol itself that causes the failure:
> bq. Source and target differ in block-size. Use -pb to preserve block-sizes during copy. Alternatively, skip checksum-checks altogether, using -skipCrc. (NOTE: By skipping checksums, one runs the risk of masking data-corruption during file-transfer.)
> Update: the CRC check always takes place on a distcp upload before the file is renamed into place, *and you can't disable it then*.
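
The failure mode described above can be illustrated with a minimal sketch. It is not distcp's code: the `FileChecksum` type, the helper names, and the choice of CRC32 versus MD5 are assumptions made purely for illustration. The point is that two stores holding identical bytes can still report different checksums when they use different algorithms, so a comparison should check the algorithm first and report that case separately rather than blaming block size.

```python
import hashlib
import zlib
from typing import NamedTuple, Optional


class FileChecksum(NamedTuple):
    """Hypothetical stand-in for a store's file checksum: algorithm name plus digest."""
    algorithm: str
    value: bytes


def crc32_checksum(data: bytes) -> FileChecksum:
    # Assumed "source store" algorithm for this sketch.
    return FileChecksum("CRC32", zlib.crc32(data).to_bytes(4, "big"))


def md5_checksum(data: bytes) -> FileChecksum:
    # Assumed "destination store" algorithm for this sketch.
    return FileChecksum("MD5", hashlib.md5(data).digest())


def explain_mismatch(src: FileChecksum, dst: FileChecksum) -> Optional[str]:
    """Return None if the checksums match, otherwise a message naming the real cause."""
    if src.algorithm != dst.algorithm:
        # Different algorithms: the digests cannot be compared at all.
        return (f"checksum algorithms differ ({src.algorithm} vs {dst.algorithm}); "
                "values are not comparable")
    if src.value != dst.value:
        # Same algorithm, different digest: the data really differs.
        return f"checksum values differ under {src.algorithm}; possible corruption"
    return None
```

With identical data, `explain_mismatch(crc32_checksum(data), md5_checksum(data))` reports an algorithm mismatch instead of suggesting corruption or a block-size difference.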


