Posted to common-dev@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2020/05/21 12:54:00 UTC
[jira] [Resolved] (HADOOP-15300) distcp -update to WASB and ADL copies up all the files, always
[ https://issues.apache.org/jira/browse/HADOOP-15300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran resolved HADOOP-15300.
-------------------------------------
Resolution: Duplicate
> distcp -update to WASB and ADL copies up all the files, always
> --------------------------------------------------------------
>
> Key: HADOOP-15300
> URL: https://issues.apache.org/jira/browse/HADOOP-15300
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/adl, fs/azure
> Affects Versions: 3.1.0
> Reporter: Steve Loughran
> Priority: Major
>
> If you repeatedly run {{distcp -update}} against an adl or wasb store, all the source files are copied up every time. In contrast, with hdfs:// or s3a:// as the destination, only the new files are uploaded. hdfs uses checksums for the diff; s3a only returns the file length and relies on the distcp logic of "if either src or dest doesn't do checksums, only compare file len".
> Somehow that isn't kicking in for wasb and adl. Tested with file: and hdfs sources and with wasb and adl destinations.
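
The comparison rule quoted in the description can be modelled in a few lines. This is a minimal sketch of the behaviour the report expects, not the actual DistCp code: `FileInfo` and `needs_copy` are hypothetical names introduced for illustration, and a checksum of `None` is assumed to stand for a store that does not expose one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical stand-in for what a filesystem reports about a file."""
    length: int
    checksum: Optional[str] = None  # None models a store with no checksum


def needs_copy(src: FileInfo, dest: Optional[FileInfo]) -> bool:
    """Return True if an update-style copy should upload src again."""
    if dest is None:
        return True  # nothing at the destination yet
    if src.length != dest.length:
        return True  # lengths differ, so the content differs
    if src.checksum is None or dest.checksum is None:
        # Either side lacks a checksum: fall back to the length match alone.
        return False
    return src.checksum != dest.checksum
```

Under this model, a destination that reports only the file length should skip any file whose length already matches, which is the behaviour the report describes for s3a and finds missing for wasb and adl.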