Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2018/02/22 10:36:02 UTC

[jira] [Commented] (HADOOP-15193) add bulk delete call to metastore API & DDB impl

    [ https://issues.apache.org/jira/browse/HADOOP-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372638#comment-16372638 ] 

Steve Loughran commented on HADOOP-15193:
-----------------------------------------

I'm thinking of doing this with some explicit {{BulkOperationInfo extends Closeable}} class and

{code}
BulkOperationInfo initiateDirectoryDelete(path)
void deleteBatch(BulkOperationInfo, List<Path>) // every path must be under the path specified in the bulk operation
void completeBulkOperation(BulkOperationInfo, boolean wasSuccessful) 
{code}

This lines us up for setting up other bulk ops, like an explicit rename.

Why this way? It allows us to tell the store that the batches are all part of the same rmdir call, and that there is little/no need to create any parent dir markers, etc, etc, because everything is expected to work. The complete call can do that and choose what to use as a success/failure marker.

The base impl will do nothing but verify that, in a batch delete, all paths are valid, then issue the deletes one by one; nothing is done in complete().
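
A rough sketch of what that base implementation might look like, to make the proposal concrete. Everything here beyond the three method names and {{BulkOperationInfo}} is an assumption for illustration: paths are plain strings rather than Hadoop {{Path}} objects so the example stands alone, {{BulkOperationInfo}} is shown as a class rather than an interface, and {{BaseBulkDeleteStore}}, {{delete()}} and the {{deleted}} list are hypothetical stand-ins for the real metastore.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical handle tying a series of batches to one directory delete. */
class BulkOperationInfo implements Closeable {
    private final String root;

    BulkOperationInfo(String root) {
        // Normalise to a trailing slash so "/a/b" does not match "/a/bc".
        this.root = root.endsWith("/") ? root : root + "/";
    }

    /** True if the path lies under the directory being deleted. */
    boolean contains(String path) {
        return path.startsWith(root);
    }

    @Override
    public void close() {
        // Nothing to release in the base sketch.
    }
}

/** Hypothetical base store: validate the batch, then delete one by one. */
class BaseBulkDeleteStore {
    /** Stand-in for the metastore: records which entries were deleted. */
    final List<String> deleted = new ArrayList<>();

    BulkOperationInfo initiateDirectoryDelete(String path) {
        return new BulkOperationInfo(path);
    }

    void deleteBatch(BulkOperationInfo op, List<String> paths) {
        // Check every path first, so an invalid batch deletes nothing.
        for (String p : paths) {
            if (!op.contains(p)) {
                throw new IllegalArgumentException(
                    "Path is not under the bulk operation root: " + p);
            }
        }
        // No real batching in the base implementation: issue 1 by 1.
        for (String p : paths) {
            delete(p);
        }
    }

    /** Single-entry delete; a real store would remove the entry here. */
    void delete(String path) {
        deleted.add(path);
    }

    void completeBulkOperation(BulkOperationInfo op, boolean wasSuccessful) {
        // Base implementation: nothing to do.
    }
}
```

A DDB-backed store would presumably override deleteBatch() to issue a batched table delete and use completeBulkOperation() to decide on parent markers, but that is not shown here.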



> add bulk delete call to metastore API & DDB impl
> ------------------------------------------------
>
>                 Key: HADOOP-15193
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15193
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0
>            Reporter: Steve Loughran
>            Priority: Major
>
> recursive dir delete (and any future bulk delete API like HADOOP-15191) benefits from using the DDB bulk table delete call, which takes a list of deletes and executes. Hopefully this will offer better perf. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
