Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2020/12/04 12:53:00 UTC

[jira] [Commented] (HADOOP-16775) DistCp reuses the same temp file within the task attempt for different files.

    [ https://issues.apache.org/jira/browse/HADOOP-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243980#comment-17243980 ] 

Steve Loughran commented on HADOOP-16775:
-----------------------------------------

Note: now that S3 is consistent, this fix is moot. Older releases without it are safe to use.

> DistCp reuses the same temp file within the task attempt for different files.
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-16775
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16775
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools/distcp
>    Affects Versions: 3.0.0
>            Reporter: Amir Shenavandeh
>            Assignee: Amir Shenavandeh
>            Priority: Major
>              Labels: DistCp, S3, hadoop-tools
>             Fix For: 3.2.2
>
>         Attachments: HADOOP-16775-v1.patch, HADOOP-16775.patch
>
>
> Hadoop DistCp reuses the same temp file name for every file copied within a task attempt, then moves each temp file to its target name, which is also a server-side copy. For copies to S3 this causes inconsistency, because S3 guarantees read-after-write consistency only for brand-new objects. The contents of overwritten objects on S3 can also be read inconsistently.
> To avoid this, we should randomize the temp file name so that each temp file gets a different name.
>  
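The difference between the two naming schemes can be sketched in plain Java. This is a minimal illustration, not the attached patch: the class name, the method names, the ".distcp.tmp." prefix, and the use of java.util.UUID as the source of randomness are all assumptions made for the example.

```java
import java.util.UUID;

// Hypothetical sketch; none of these names come from the DistCp source.
final class TempNameSketch {
    // Assumed prefix, used here only for illustration.
    static final String PREFIX = ".distcp.tmp.";

    // Behaviour described in the issue: the name depends only on the task
    // attempt, so every file copied by that attempt shares one temp path.
    static String sharedTempName(String taskAttemptId) {
        return PREFIX + taskAttemptId;
    }

    // Proposed behaviour: append a random suffix so each call yields a new
    // name, and no temp object is ever overwritten by a later copy.
    static String uniqueTempName(String taskAttemptId) {
        return PREFIX + taskAttemptId + "." + UUID.randomUUID();
    }

    public static void main(String[] args) {
        String attempt = "attempt_0001_m_000000_0"; // made-up attempt id
        // Same name for two files: the second copy overwrites the first temp object.
        System.out.println(sharedTempName(attempt).equals(sharedTempName(attempt)));
        // Different name per file: each temp object is brand new.
        System.out.println(uniqueTempName(attempt).equals(uniqueTempName(attempt)));
    }
}
```

With a fresh name per file, every temp object is a brand-new key, which is the case the description says S3 handled consistently at the time.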



--
This message was sent by Atlassian Jira
(v8.3.4#803005)