Posted to commits@hudi.apache.org by "Brandon Scheller (Jira)" <ji...@apache.org> on 2019/11/08 00:48:00 UTC

[jira] [Commented] (HUDI-326) Support deleting records with only record_key

    [ https://issues.apache.org/jira/browse/HUDI-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969684#comment-16969684 ] 

Brandon Scheller commented on HUDI-326:
---------------------------------------

Additionally, does anyone have context on how difficult this would be to implement? Hudi doesn't seem to have a way to track its own partitions, so it looks like we'd have to scan all the partitions in the table to implement something like this.

> Support deleting records with only record_key
> ---------------------------------------------
>
>                 Key: HUDI-326
>                 URL: https://issues.apache.org/jira/browse/HUDI-326
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>            Reporter: Brandon Scheller
>            Priority: Major
>
> Currently, issuing a hard delete with EmptyHoodieRecordPayload requires three things: record_key, partition_key, and precombine_key.
> This means that in many real-world scenarios, you must first issue a select query to find the partition_key, and possibly the precombine_key, of a record before you can delete it.
> We would like to avoid this extra step by allowing a delete to be issued with only the record_key of a record.
> Such a delete would blanket delete all records with that record_key across all partitions.
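
To make the request above concrete, here is a minimal sketch of the difference between a delete that needs the partition up front and a record_key-only delete that scans every partition. It is an illustration only: the in-memory `Table` dict, the function names, and the sample keys are all hypothetical and are not Hudi's API or implementation.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for a partitioned table:
# partition path -> {record_key -> row}. This is NOT Hudi's API.
Table = Dict[str, Dict[str, dict]]


def find_partitions(table: Table, record_key: str) -> List[str]:
    """Scan every partition and return the ones that contain record_key."""
    return [path for path, rows in table.items() if record_key in rows]


def delete_with_partition(table: Table, record_key: str, partition_path: str) -> bool:
    """Delete when the caller already knows the partition (the current requirement)."""
    return table.get(partition_path, {}).pop(record_key, None) is not None


def delete_by_record_key(table: Table, record_key: str) -> int:
    """Blanket delete: remove record_key from every partition that holds it."""
    deleted = 0
    for partition_path in find_partitions(table, record_key):
        if delete_with_partition(table, record_key, partition_path):
            deleted += 1
    return deleted


# Sample data (made up): the same record_key appears in two partitions.
table: Table = {
    "2019/11/06": {"uuid-1": {"ts": 1}, "uuid-2": {"ts": 2}},
    "2019/11/07": {"uuid-1": {"ts": 3}},
}
removed = delete_by_record_key(table, "uuid-1")
```

The cost in this sketch is the full scan in `find_partitions`, which mirrors the concern raised in the comment: without some record of which partitions exist or which keys they hold, a record_key-only delete has to look at every partition.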


