You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Syed Shameerur Rahman (Jira)" <ji...@apache.org> on 2023/09/06 06:31:00 UTC

[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

    [ https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762288#comment-17762288 ] 

Syed Shameerur Rahman commented on HADOOP-18797:
------------------------------------------------

[~stevel@apache.org] Could you please review the changes?

> S3A committer fix lost data on concurrent jobs
> ----------------------------------------------
>
>                 Key: HADOOP-18797
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18797
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>            Reporter: Emanuel Velzi
>            Assignee: Syed Shameerur Rahman
>            Priority: Major
>              Labels: pull-request-available
>
> There is a failure in the commit process when multiple jobs are writing to a s3 directory *concurrently* using {*}magic committers{*}.
> This issue is closely related HADOOP-17318.
> When multiple Spark jobs write to the same S3A directory, they upload files simultaneously using "__magic" as the base directory for staging. Inside this directory, there are multiple "/job-some-uuid" directories, each representing a concurrently running job.
> To fix some preoblems related to concunrrency a property was introduced in the previous fix: "spark.hadoop.fs.s3a.committer.abort.pending.uploads". When set to false, it ensures that during the cleanup stage, finalizing jobs do not abort pending uploads from other jobs. So we see in logs this line: 
> {code:java}
> DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up pending uploads to s3a ...{code}
> (from [AbstractS3ACommitter.java#L952|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952])
> However, in the next step, the {*}"__magic" directory is recursively deleted{*}:
> {code:java}
> INFO  [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s {code}
> (from [AbstractS3ACommitter.java#L1112 |https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L1112]and [MagicS3GuardCommitter.java#L137)|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137)]
> This deletion operation *affects the second job* that is still running because it loses pending uploads (i.e., ".pendingset" and ".pending" files).
> The consequences can range from an exception in the best case to a silent loss of data in the worst case. The latter occurs when Job_1 deletes files just before Job_2 executes "listPendingUploadsToCommit" to list ".pendingset" files in the job attempt directory previous to complete the uploads with POST requests.
> To resolve this issue, it's important {*}to ensure that only the prefix associated with the job currently finalizing is cleaned{*}.
> Here's a possible solution:
> {code:java}
> /**
>  * Delete the magic directory.
>  */
> public void cleanupStagingDirs() {
>   final Path out = getOutputPath();
>  //Path path = magicSubdir(getOutputPath());
>   Path path = new Path(magicSubdir(out), formatJobDir(getUUID()));
>   try(DurationInfo ignored = new DurationInfo(LOG, true,
>       "Deleting magic directory %s", path)) {
>     Invoker.ignoreIOExceptions(LOG, "cleanup magic directory", path.toString(),
>         () -> deleteWithWarning(getDestFS(), path, true));
>   }
> } {code}
>  
> The side effect of this issue is that the "__magic" directory is never cleaned up. However, I believe this is a minor concern, even considering that other folders such as "_SUCCESS" also persist after jobs end.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org