Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2018/07/24 17:56:00 UTC

[jira] [Commented] (HADOOP-15628) S3A Filesystem does not check return from AmazonS3Client deleteObjects

    [ https://issues.apache.org/jira/browse/HADOOP-15628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554589#comment-16554589 ] 

Steve Loughran commented on HADOOP-15628:
-----------------------------------------

That's interesting. 

# How did you manage to replicate it? By not granting enough permissions on an object?
# What version have you actually seen this on? <= 2.8.x? HADOOP-11572 tried to handle this better, and that is in 2.9+.
# In HADOOP-15176 and the 3.1 release we've done a lot of work there.

1. We actually rely on a MultiObjectDeleteException being raised on a delete failure, which the API says is thrown "if one or more of the objects couldn't be deleted."
2. We don't have a complete policy on what to do here; currently it's catch-log-rethrow.
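
To make "catch-log-rethrow" concrete, here is a minimal sketch of the pattern. It is not the S3A code: PartialDeleteException and BulkDeleter are hypothetical stand-ins for the SDK's multi-object delete exception and the S3 client, and LOG is an in-memory list standing in for a real logger, so the example runs without the AWS SDK on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CatchLogRethrowSketch {

  /** Hypothetical stand-in for the SDK exception raised on a partial delete failure. */
  static class PartialDeleteException extends RuntimeException {
    final List<String> failedKeys;

    PartialDeleteException(List<String> failedKeys) {
      super(failedKeys.size() + " object(s) could not be deleted");
      this.failedKeys = failedKeys;
    }
  }

  /** Hypothetical stand-in for the client call that deletes a batch of keys. */
  interface BulkDeleter {
    void deleteObjects(List<String> keys);
  }

  /** In-memory log so the behaviour can be inspected; a real client would use a logger. */
  static final List<String> LOG = new ArrayList<>();

  /** Issue the bulk delete; on a partial failure, log each failed key and rethrow. */
  static void deleteBatch(BulkDeleter client, List<String> keys) {
    try {
      client.deleteObjects(keys);
    } catch (PartialDeleteException e) {
      for (String key : e.failedKeys) {
        LOG.add("delete failed: " + key);
      }
      throw e; // no recovery is attempted, the caller sees the failure
    }
  }

  public static void main(String[] args) {
    // A deleter that always reports key "b" as undeletable.
    BulkDeleter denying = keys -> {
      throw new PartialDeleteException(Arrays.asList("b"));
    };
    try {
      deleteBatch(denying, Arrays.asList("a", "b"));
    } catch (PartialDeleteException e) {
      System.out.println(LOG + " / rethrown: " + e.getMessage());
    }
  }
}
```

The point of the sketch is only that the failure is never swallowed: the per-key details are logged and the same exception reaches the caller.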

I'm actually going to do some work on this in the next 10 days, because we need to handle this for rename() too (it's a copy & delete, after all): HADOOP-13936 and HADOOP-15193 are the issues there. It'd be great if you could help by testing the 3.2 RC0 against your buckets (expect this in September).

At the same time, we aren't going to be able to recover from the failure. All we're going to do is make S3Guard consistent with the remote state (i.e. mark deleted files as deleted) and throw again. I don't believe I need to worry about the response from DeleteObjects, assuming the SDK's assertion that "partial delete failures raise an exception" holds. If you have evidence that this is not always the case, and that sometimes partial deletes surface as an incomplete result *with no exception raised in the client*, well, that would be cause for concern.
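
As a rough illustration of "make S3Guard consistent with the remote state", the sketch below works out which keys of a failed bulk delete were actually removed and marks only those as deleted. Everything in it is an assumption for illustration: the Map is a hypothetical stand-in for the metadata store (S3Guard's real API is not shown here), and the failed keys are assumed to come from whatever the raised exception reports.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReconcileSketch {

  /** Keys that were requested and not reported as failed are treated as deleted remotely. */
  static List<String> deletedKeys(List<String> requested, Collection<String> failed) {
    List<String> deleted = new ArrayList<>(requested);
    deleted.removeAll(failed);
    return deleted;
  }

  /** Mark the given keys as deleted in the (hypothetical) metadata map. */
  static void markDeleted(Map<String, String> metadata, List<String> deleted) {
    for (String key : deleted) {
      metadata.put(key, "deleted");
    }
  }

  public static void main(String[] args) {
    // Hypothetical metadata state before the bulk delete.
    Map<String, String> metadata = new HashMap<>();
    for (String key : Arrays.asList("a", "b", "c")) {
      metadata.put(key, "present");
    }

    // Assume the delete of "b" was rejected, e.g. by a bucket policy.
    List<String> requested = Arrays.asList("a", "b", "c");
    Collection<String> failed = Collections.singleton("b");

    markDeleted(metadata, deletedKeys(requested, failed));
    System.out.println(metadata);
    // After reconciling, the caller would still rethrow the original failure.
  }
}
```

The failed keys keep their old state, so the metadata never claims an object is gone when the store still has it.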

> S3A Filesystem does not check return from AmazonS3Client deleteObjects
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-15628
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15628
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.9.1, 2.8.4, 3.1.1, 3.0.3
>         Environment: Hadoop 3.0.2 / Hadoop 2.8.3
> Hive 2.3.2 / Hive 2.3.3 / Hive 3.0.0
>            Reporter: Steve Jacobs
>            Priority: Minor
>
> Deletes in S3A that use the Multi-Delete functionality in the Amazon S3 API do not check whether all objects have been successfully deleted. In the event of a failure, the API will still return a 200 OK (which isn't checked currently):
> [Current Delete Code|https://github.com/apache/hadoop/blob/a0da1ec01051108b77f86799dd5e97563b2a3962/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L574] 
> {code:java}
> if (keysToDelete.size() == MAX_ENTRIES_TO_DELETE) {
>   DeleteObjectsRequest deleteRequest =
>       new DeleteObjectsRequest(bucket).withKeys(keysToDelete);
>   s3.deleteObjects(deleteRequest);
>   statistics.incrementWriteOps(1);
>   keysToDelete.clear();
> }
> {code}
> This should be converted to use the DeleteObjectsResult class from the S3Client: 
> [Amazon Code Example|https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingMultipleObjectsUsingJava.htm]
> {code:java}
> // Verify that the objects were deleted successfully.
> DeleteObjectsResult delObjRes = s3Client.deleteObjects(multiObjectDeleteRequest);
> int successfulDeletes = delObjRes.getDeletedObjects().size();
> System.out.println(successfulDeletes + " objects successfully deleted.");
> {code}
> Bucket policies can be misconfigured, and deletes will fail without warning by S3A clients.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
