Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2022/03/15 09:59:00 UTC

[jira] [Updated] (HADOOP-18112) Implement paging during S3 multi object delete.

     [ https://issues.apache.org/jira/browse/HADOOP-18112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-18112:
------------------------------------
    Summary: Implement paging during S3 multi object delete.  (was: Implement paging during multi object delete.)

> Implement paging during S3 multi object delete.
> -----------------------------------------------
>
>                 Key: HADOOP-18112
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18112
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.1
>            Reporter: Mukund Thakur
>            Assignee: Mukund Thakur
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.3.3
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
>  
> {*}Error{*}:
> A rename operation fails when it issues a multi-object delete of more than 1000 keys in a single request. The exception below is raised:
>  
> {noformat}
> org.apache.hadoop.fs.s3a.AWSBadRequestException: rename s3a://ms-targeting-prod-cdp-aws-dr-bkt/data/ms-targeting-prod-hbase/hbase/.tmp/data/default/dr-productionL.Address to s3a://ms-targeting-prod-cdp-aws-dr-bkt/user/root/.Trash/Current/data/ms-targeting-prod-hbase/hbase/.tmp/data/default/dr-productionL.Address16438377847941643837797901 on s3a://ms-targeting-prod-cdp-aws-dr-bkt/data/ms-targeting-prod-hbase/hbase/.tmp/data/default/dr-productionL.Address: com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was not well-formed or did not validate against our published schema (Service: Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: XZ8PGAQHP0FGHPYS; S3 Extended Request ID: vTG8c+koukzQ8yMRGd9BvWfmRwkCZ3fAs/EOiAV5S9EJjLqFTNCgDOKokuus5W600Z5iOa/iQBI=; Proxy: null), S3 Extended Request ID: vTG8c+koukzQ8yMRGd9BvWfmRwkCZ3fAs/EOiAV5S9EJjLqFTNCgDOKokuus5W600Z5iOa/iQBI=:MalformedXML: The XML you provided was not well-formed or did not validate against our published schema (Service: Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: XZ8PGAQHP0FGHPYS; S3 Extended Request ID: vTG8c+koukzQ8yMRGd9BvWfmRwkCZ3fAs/EOiAV5S9EJjLqFTNCgDOKokuus5W600Z5iOa/iQBI=; Proxy: null)
>         at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:247)
>         at org.apache.hadoop.fs.s3a.s3guard.RenameTracker.convertToIOException(RenameTracker.java:267)
>         at org.apache.hadoop.fs.s3a.s3guard.RenameTracker.deleteFailed(RenameTracker.java:198)
>         at org.apache.hadoop.fs.s3a.impl.RenameOperation.removeSourceObjects(RenameOperation.java:706)
>         at org.apache.hadoop.fs.s3a.impl.RenameOperation.completeActiveCopiesAndDeleteSources(RenameOperation.java:274)
>         at org.apache.hadoop.fs.s3a.impl.RenameOperation.recursiveDirectoryRename(RenameOperation.java:484)
>         at org.apache.hadoop.fs.s3a.impl.RenameOperation.execute(RenameOperation.java:312)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.innerRename(S3AFileSystem.java:1912)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$rename$7(S3AFileSystem.java:1759)
>         at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499)
>         at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2250)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.rename(S3AFileSystem.java:1757)
>         at org.apache.hadoop.fs.FileSystem.rename(FileSystem.java:1605)
>         at org.apache.hadoop.fs.TrashPolicyDefault.moveToTrash(TrashPolicyDefault.java:186)
>         at org.apache.hadoop.fs.Trash.moveToTrash(Trash.java:110){noformat}
>  
> {*}Solution{*}:
> Implement paging of delete requests to reduce the number of keys sent in a single request. The page size can be configured
> using "fs.s3a.bulk.delete.page.size".
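
The paging idea in the Solution can be illustrated with a minimal sketch. This is not the code from the patch: the class BulkDeletePaging and the helper toPages are hypothetical names chosen for illustration, and the page size is passed in directly rather than read from "fs.s3a.bulk.delete.page.size". The sketch only shows how a key list longer than the 1000 keys that S3 accepts per multi-object delete could be split into smaller pages, each of which would then go out as its own delete request.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkDeletePaging {

    /**
     * Hypothetical helper (not part of the S3A API): split the keys into
     * consecutive pages holding at most pageSize entries each.
     */
    static List<List<String>> toPages(List<String> keys, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be >= 1: " + pageSize);
        }
        List<List<String>> pages = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += pageSize) {
            int end = Math.min(start + pageSize, keys.size());
            // Copy the slice so each page is independent of the source list.
            pages.add(new ArrayList<>(keys.subList(start, end)));
        }
        return pages;
    }

    public static void main(String[] args) {
        // Made-up keys standing in for the source objects of a rename.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            keys.add("data/file-" + i);
        }
        // Assumed page size; a real client would take it from configuration.
        for (List<String> page : toPages(keys, 1000)) {
            // A real implementation would issue one multi-object delete per page here.
            System.out.println("would delete " + page.size() + " keys in one request");
        }
    }
}
```

With 2500 keys and a page size of 1000, the loop produces three pages of 1000, 1000 and 500 keys, so no single request exceeds the limit that triggered the MalformedXML error.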



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
