Posted to mapreduce-issues@hadoop.apache.org by "Frederick Tucker (JIRA)" <ji...@apache.org> on 2016/07/18 21:39:20 UTC

[jira] [Comment Edited] (MAPREDUCE-6734) Add option to distcp to preserve file path structure of source files at the destination

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382832#comment-15382832 ] 

Frederick Tucker edited comment on MAPREDUCE-6734 at 7/18/16 9:39 PM:
----------------------------------------------------------------------

Added -preservepath and -sourceprefixmask options to distcp to control how the file structure is copied to the target fs.

-preservepath: Preserve the absolute path of the source file at the target.
-sourceprefixmask: Remove the start of a source's absolute path when running distcp with -preservepath.
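
To illustrate how the two options described above could map source paths to target paths, here is a minimal Python sketch. It is not the patch's code: the function names, the example paths, and the exact matching rule for the prefix mask (strip the mask only when it matches whole leading path components) are assumptions made for illustration.

```python
import posixpath


def flat_target(source: str, target_dir: str) -> str:
    """Behaviour described in the issue: only the file name survives."""
    return posixpath.join(target_dir, posixpath.basename(source))


def preserved_target(source: str, target_dir: str, prefix_mask: str = "") -> str:
    """Assumed semantics of -preservepath with an optional -sourceprefixmask.

    The source's absolute path is re-rooted under target_dir. If prefix_mask
    matches the leading path components of source, it is stripped first.
    """
    relative = source
    mask = prefix_mask.rstrip("/")
    if mask and (source == mask or source.startswith(mask + "/")):
        relative = source[len(mask):]
    # Drop the leading slash so join() appends instead of replacing target_dir.
    return posixpath.join(target_dir, relative.lstrip("/"))


# Hypothetical files matched by a glob such as /data/2016/*/part-0000
sources = ["/data/2016/07/part-0000", "/data/2016/08/part-0000"]

# Flat copy: both files map to the same target name (the duplicate file case).
flat = [flat_target(s, "/backup") for s in sources]

# Preserved paths: each file keeps its directory structure.
kept = [preserved_target(s, "/backup") for s in sources]

# Preserved paths with the leading /data masked off.
masked = [preserved_target(s, "/backup", "/data") for s in sources]
```

In this sketch the flat copy sends both files to /backup/part-0000, while the preserved variants keep them apart under /backup/data/2016/... or, with the mask, under /backup/2016/....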

Tests, javadoc, and wiki are updated as well.


was (Author: fctucker):
Added -preservepath and -sourceprefixmask options to distcp to control how to the filestructure is copied to target fs when using distcp. 

-preservepath: Preserve the absolute path of the source file at the target.
-sourceprefixmas: Remove the start of a source's absolute path when running distcp with -preservepath.

Tests, javadoc, and wiki are updated as well.

> Add option to distcp to preserve file path structure of source files at the destination
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6734
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6734
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: distcp
>    Affects Versions: 3.0.0-alpha2
>         Environment: Software platform
>            Reporter: Frederick Tucker
>            Priority: Critical
>              Labels: distcp, newbie, patch
>             Fix For: 3.0.0-alpha2
>
>         Attachments: MAPREDUCE-6734.3.0.0-alpha2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When distcp is run with globbed source paths, every file matched by the glob is copied into a single flat directory. This causes problems when the file structure at the source is important. It is also an issue when two files matched by the glob have the same name, because that causes a duplicate file error at the target. I'd like to have an option to preserve the file structure of the source files when globbing inputs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
