Posted to commits@hudi.apache.org by "Ethan Guo (Jira)" <ji...@apache.org> on 2022/04/19 00:15:00 UTC

[jira] [Updated] (HUDI-2459) Support async compaction for metadata table

     [ https://issues.apache.org/jira/browse/HUDI-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Guo updated HUDI-2459:
----------------------------
    Summary: Support async compaction for metadata table  (was: [Impr] Support async compaction for metadata table)

> Support async compaction for metadata table
> -------------------------------------------
>
>                 Key: HUDI-2459
>                 URL: https://issues.apache.org/jira/browse/HUDI-2459
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: writer-core
>            Reporter: sivabalan narayanan
>            Assignee: Ethan Guo
>            Priority: Blocker
>             Fix For: 0.12.0
>
>
> For now, the metadata table (MDT) has only inline compaction, and we need to come up with a strategy to support async compaction.
> MDT compaction is fenced on inflight requests in the data table, so if compaction in the data table keeps failing and never succeeds, the metadata table will never be compacted either. This could turn out to be detrimental.
> So, we should come up with a strategy to support async compaction in the metadata table.
>  
> Some nuances: 
> If there are delta commits corresponding to a rollback, we should ensure that the final state in the base table reflects them and does not miss any details.
> Example: F1 is added by dc1. F1 is then removed by dc3 (a rollback in the data table), and F2 is added by the same commit instant when the action is retried in the data table (a compaction, for example).
> So, the final state should reflect just F2 as added and F1 as deleted, irrespective of whether compaction is complete or not.
>  
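
The F1/F2 nuance above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Hudi's implementation: `DeltaCommit`, `replay`, and `live_files` are hypothetical names, the instants "001", "003", and "004" stand in for dc1, dc3, and the retried action, and the retry is modeled as its own delta commit. It checks that the merged file listing is the same whether or not a compaction has already folded some delta commits into the base state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeltaCommit:
    """Hypothetical stand-in for one metadata-table delta commit."""
    instant: str
    added: frozenset = frozenset()
    deleted: frozenset = frozenset()


def replay(base, commits):
    """Replay delta commits in instant order onto a {file: is_deleted} map.

    `base` models the already-compacted state. The same function serves as
    both "compaction" and "read-time merge" in this sketch.
    """
    state = dict(base)
    for dc in sorted(commits, key=lambda c: c.instant):
        for f in dc.deleted:
            state[f] = True   # keep a tombstone for the removed file
        for f in dc.added:
            state[f] = False  # file is live
    return state


def live_files(state):
    """Return the files that are not marked deleted, sorted by name."""
    return sorted(f for f, deleted in state.items() if not deleted)


# Assumed timeline: dc1 adds F1, dc3 (rollback) removes F1, the retry adds F2.
commits = [
    DeltaCommit("001", added=frozenset({"F1"})),
    DeltaCommit("003", deleted=frozenset({"F1"})),
    DeltaCommit("004", added=frozenset({"F2"})),
]

# No compaction yet: a reader merges every delta commit onto an empty base.
uncompacted_view = replay({}, commits)

# Compaction ran after dc1 only: the later commits are merged at read time.
base_after_dc1 = replay({}, commits[:1])
partially_compacted_view = replay(base_after_dc1, commits[1:])

# Compaction covered everything.
fully_compacted_view = replay({}, commits)
```

In this model all three views agree: F1 ends up as a tombstone and F2 is the only live file. Whatever async compaction strategy is chosen would need to preserve the same invariant for rollback delta commits.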



--
This message was sent by Atlassian Jira
(v8.20.1#820001)