Posted to common-dev@hadoop.apache.org by "Ayush Saxena (Jira)" <ji...@apache.org> on 2022/01/05 18:30:00 UTC
[jira] [Resolved] (HADOOP-18056) DistCp: Filter duplicates in the source paths
[ https://issues.apache.org/jira/browse/HADOOP-18056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ayush Saxena resolved HADOOP-18056.
-----------------------------------
Fix Version/s: 3.4.0, 3.3.3
Hadoop Flags: Reviewed
Resolution: Fixed
> DistCp: Filter duplicates in the source paths
> ---------------------------------------------
>
> Key: HADOOP-18056
> URL: https://issues.apache.org/jira/browse/HADOOP-18056
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> Add basic filtering to remove exact duplicate paths from the list of source paths submitted for copying.
> If the same srcPath, say /tmp/file1, is passed in the list twice, DistCp fails with DuplicateFileException after building the listing.
> It would be better to do basic filtering of duplicate paths up front.
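
The filtering described above can be sketched as follows. This is a minimal illustration, not the actual HADOOP-18056 patch: the class SourcePathDedup and the method filterDuplicates are hypothetical names, and plain strings stand in for Hadoop Path objects so that the example runs without Hadoop on the classpath. It removes only exact duplicates and keeps the first occurrence of each path in its original position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class SourcePathDedup {

    // Hypothetical helper (not part of DistCp): drops exact duplicate
    // entries while preserving the order in which paths first appear.
    public static List<String> filterDuplicates(List<String> sourcePaths) {
        // LinkedHashSet ignores repeated elements and keeps insertion order.
        return new ArrayList<>(new LinkedHashSet<>(sourcePaths));
    }

    public static void main(String[] args) {
        // /tmp/file1 is listed twice, as in the scenario from the issue.
        List<String> sources =
            Arrays.asList("/tmp/file1", "/tmp/file2", "/tmp/file1");
        System.out.println(filterDuplicates(sources));
        // prints [/tmp/file1, /tmp/file2]
    }
}
```

Because the comparison is by string equality, entries that differ only textually (for example a trailing slash, or a qualified versus unqualified URI) are not treated as duplicates in this sketch.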