Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2018/10/08 15:15:00 UTC

[jira] [Commented] (HADOOP-15193) add bulk delete call to metastore API & DDB impl

    [ https://issues.apache.org/jira/browse/HADOOP-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641997#comment-16641997 ] 

Steve Loughran commented on HADOOP-15193:
-----------------------------------------

DDB batch delete just takes the list of operations and runs through them in sequence, retrying if needed. There is no speedup compared to making individual requests.
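A rough sketch of what such a batch loop looks like from the caller's side, to show where the retries come in. This is not the S3Guard code: batch_delete and send_batch are hypothetical names, and send_batch stands in for whatever call submits one batch and hands back the keys it did not process. The limit of 25 is the per-request cap on DynamoDB BatchWriteItem; a real client would also back off between attempts.

```python
from typing import Callable, Iterable, List

# DynamoDB BatchWriteItem accepts at most 25 put/delete requests per call.
BATCH_LIMIT = 25


def batch_delete(keys: Iterable[str],
                 send_batch: Callable[[List[str]], List[str]],
                 max_attempts: int = 5) -> int:
    """Delete keys batch by batch and return the number of requests made.

    send_batch is a hypothetical stand-in for the store call: it takes one
    batch of keys and returns the keys that were NOT processed.
    """
    pending = list(keys)
    calls = 0
    for start in range(0, len(pending), BATCH_LIMIT):
        batch = pending[start:start + BATCH_LIMIT]
        attempts = 0
        while batch:
            if attempts >= max_attempts:
                raise RuntimeError(f"{len(batch)} keys still unprocessed")
            # Resubmit only what the previous attempt left behind.
            batch = send_batch(batch)
            calls += 1
            attempts += 1
    return calls
```

Every key still has to be handled one by one on the far side; the batching only reduces the number of round trips the caller issues.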

We do need a call in the metastore API though, as it can be a bit cleverer about the operation.

In particular: if I delete a directory, do I need to explicitly add deleted markers to all the children, or would a delete marker on the dir be enough? If the latter, you could be very efficient & not create deleted file markers, just those for the directories.
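To make the question concrete, here is a toy in-memory model of the "marker on the dir only" option. It assumes, and this is only an assumption, that a lookup is allowed to walk up the path and treat an entry as deleted when any ancestor carries a newer delete marker. SketchMetastore, delete_tree and the logical clock are all made up for illustration; none of this is the existing metastore API.

```python
from typing import Dict, Iterator


def ancestors(path: str) -> Iterator[str]:
    """Yield the path itself, then each parent, stopping before the root."""
    while path and path != "/":
        yield path
        path = path.rsplit("/", 1)[0]


class SketchMetastore:
    """Hypothetical in-memory store; entries and markers carry a logical time."""

    def __init__(self) -> None:
        self.clock = 0
        self.written: Dict[str, int] = {}   # path -> time of last put
        self.deleted: Dict[str, int] = {}   # path -> time of delete marker

    def _tick(self) -> int:
        self.clock += 1
        return self.clock

    def put(self, path: str) -> None:
        self.written[path] = self._tick()

    def delete_tree(self, directory: str) -> int:
        """Write a single marker on the directory; return markers written."""
        self.deleted[directory] = self._tick()
        return 1

    def exists(self, path: str) -> bool:
        written_at = self.written.get(path)
        if written_at is None:
            return False
        # Visible only if no marker on the path or an ancestor is newer.
        return all(self.deleted.get(p, 0) < written_at
                   for p in ancestors(path))
```

The trade-off in this sketch is that the delete writes one marker however large the tree is, while every lookup has to check each ancestor, so the cost moves from the delete to the read path.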


> add bulk delete call to metastore API & DDB impl
> ------------------------------------------------
>
>                 Key: HADOOP-15193
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15193
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0
>            Reporter: Steve Loughran
>            Priority: Major
>
> recursive dir delete (and any future bulk delete API like HADOOP-15191) benefits from using the DDB bulk table delete call, which takes a list of deletes and executes. Hopefully this will offer better perf. 



