Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2023/03/30 05:38:00 UTC

[jira] [Comment Edited] (HUDI-2458) Relax compaction in metadata being fenced based on inflight requests in data table

    [ https://issues.apache.org/jira/browse/HUDI-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706580#comment-17706580 ] 

sivabalan narayanan edited comment on HUDI-2458 at 3/30/23 5:37 AM:
--------------------------------------------------------------------

Documenting all the intricacies we have between the data table (DT) and the metadata table (MDT):

 
1. Compaction in the metadata table may not kick in if there are any inflight operations in the data table.
2. Rollback, when applied to the metadata table, depends on the last compaction instant in the metadata table. If the commit being rolled back is already synced to MDT, we apply -F records; if not, we don't. This ensures we do -F only if we have done +F before (no spurious deletes).
3. Archival in the data table is fenced by the latest compaction in the metadata table.
 - Not sure if this is really an issue. Any failed commit in DT will stop archival in DT anyway.
4. MDT archival depends on the earliest pending/inflight commit in DT.

5. Rollback from DT, when applied to MDT, depends on archival in MDT: if the commit being rolled back is already archived in MDT, we throw an exception.
 - But I can't think of a reason why we would have archived it, because MDT archival depends on the earliest inflight commit in DT, and MDT compaction also depends on the earliest inflight commit in DT. So this may not be a valid use case or dependency (the MDT archival fix was added later, though).

6. HoodieMetadataLogRecordReader#getValidInstantTimestamps relies on the active instants on the DT timeline for validity filtering. If for some reason we make DT archival proceed beyond any inflight commits in DT, it could result in invalid data being served from MDT.
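
To make item 6 concrete, here is a minimal sketch of validity filtering against the DT active timeline. It is not the actual HoodieMetadataLogRecordReader code: the class ValidInstantFilterSketch, the method validBlockInstants, and the use of plain strings for instant times are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of validity filtering; NOT Hudi's actual implementation.
final class ValidInstantFilterSketch {

    // Keeps only the MDT log-block instants that are completed on the DT active timeline.
    // Blocks written by any other instant (inflight, failed, unknown) are skipped.
    static List<String> validBlockInstants(List<String> mdtBlockInstants,
                                           Set<String> completedActiveDtInstants) {
        List<String> valid = new ArrayList<>();
        for (String instant : mdtBlockInstants) {
            if (completedActiveDtInstants.contains(instant)) {
                valid.add(instant);
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        // Instant "003" wrote a log block to MDT but never completed in DT.
        List<String> mdtBlocks = Arrays.asList("001", "002", "003");
        Set<String> completedInDt = new HashSet<>(Arrays.asList("001", "002"));
        System.out.println(validBlockInstants(mdtBlocks, completedInDt)); // prints [001, 002]
    }
}
```

In this sketch the answer depends entirely on what is still present on the DT active timeline, which is why changing how far DT archival may proceed can change what MDT readers treat as valid.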

1 & 2: these exist because we have a constraint in MDT that any instant which is < the latest compaction is fully synced up with MDT (i.e. either completed or rolled back). In other words, we can't have a situation where a partially failed commit in the data table is not yet synced to MDT and MDT compaction kicks in anyway (especially if the commit is completed in MDT, but not in DT). If MDT compaction did kick in at that point, incomplete data could get committed.
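
Rules 1 and 2 can be sketched roughly as below. This is only an illustration of the two checks as described in this comment; MdtFenceSketch, DtInstant, canCompactMetadata and shouldApplyDeletes are hypothetical names, not Hudi APIs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of rules 1 and 2; names are illustrative, not Hudi APIs.
final class MdtFenceSketch {

    enum State { REQUESTED, INFLIGHT, COMPLETED }

    // Minimal stand-in for an instant on the DT timeline.
    static final class DtInstant {
        final String time;
        final State state;

        DtInstant(String time, State state) {
            this.time = time;
            this.state = state;
        }
    }

    // Rule 1: MDT compaction may run only if no DT instant is still pending.
    static boolean canCompactMetadata(List<DtInstant> dtTimeline) {
        for (DtInstant instant : dtTimeline) {
            if (instant.state != State.COMPLETED) {
                return false;
            }
        }
        return true;
    }

    // Rule 2: emit -F records for a rollback only if the commit's +F records reached MDT.
    static boolean shouldApplyDeletes(String rolledBackCommit, Set<String> instantsSyncedToMdt) {
        return instantsSyncedToMdt.contains(rolledBackCommit);
    }

    public static void main(String[] args) {
        List<DtInstant> timeline = Arrays.asList(
                new DtInstant("001", State.COMPLETED),
                new DtInstant("002", State.INFLIGHT));
        Set<String> synced = new HashSet<>(Arrays.asList("001"));

        System.out.println(canCompactMetadata(timeline));      // false: "002" is still inflight
        System.out.println(shouldApplyDeletes("002", synced)); // false: "002" never did +F in MDT
    }
}
```

A single dangling inflight instant keeps canCompactMetadata false for as long as it stays on the timeline, which is the liveness concern this ticket is about.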


> Relax compaction in metadata being fenced based on inflight requests in data table
> ----------------------------------------------------------------------------------
>
>                 Key: HUDI-2458
>                 URL: https://issues.apache.org/jira/browse/HUDI-2458
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: metadata
>            Reporter: sivabalan narayanan
>            Assignee: Ethan Guo
>            Priority: Blocker
>             Fix For: 0.13.1, 0.12.3
>
>
> Relax compaction in metadata being fenced based on inflight requests in data table.
> Compaction in metadata is triggered only if there are no inflight requests in the data table. This might cause a liveness problem, since very large deployments could always have either compaction or clustering in progress. So, we should try to see how we can relax this constraint.
>  
> Proposal to remove this dependency:
> With the recent addition of the spurious deletes config, we can actually get away with this.
> As of now, we have 3 interlinked nuances.
>  - Compaction in metadata may not kick in if there are any inflight operations in the data table.
>  - Rollback, when applied to the metadata table, depends on the last compaction instant in the metadata table. We might even throw an exception if the instant being rolled back is < the latest metadata compaction instant time.
>  - Archival in the data table is fenced by the latest compaction in the metadata table.
>  
> So, if the data timeline has any dangling inflight operation (let's say someone tried clustering, killed it midway, and never attempted it again), metadata compaction will never kick in at all. I need to check what archival does for such inflight operations in the data table, though, when it tries to archive nearby commits.
>  
> So, with spurious deletes support which we added recently, all these can be much simplified. 
> Whenever we want to apply a rollback commit, we don't need to take different actions based on whether the commit being rolled back is already committed to the metadata table or not. Just go ahead and apply the rollback. Merging of metadata payload records will take care of this. If the commit was already synced, the final merged payload may not have spurious deletes. If the commit being rolled back was never committed to metadata, the final merged payload may have some spurious deletes, which we can ignore.
> With this, compaction in metadata does not need to have any dependency on inflight operations in data table. 
> And we can loosen up the dependency of archival in data table on metadata table compaction as well. 
> So, in summary, all the 3 dependencies quoted above will be moot if we go with this approach. Archival in data table does not have any dependency on metadata table compaction. Rollback when being applied to metadata table does not care about last metadata table compaction. Compaction in metadata table can proceed even if there are inflight operations in data table. 
>  
> In particular, our logic to apply rollback metadata to the metadata table will become a lot simpler and easier to reason about.
>  
>  
>  
>  
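
The proposal in the quoted description (always apply the rollback and let payload merging absorb spurious deletes) can be sketched as follows. This is an assumption-laden toy model, not Hudi's metadata payload: FilesRecord and merge are hypothetical, and a file listing is reduced to a set of file names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model of merging +F / -F records; NOT Hudi's metadata payload.
final class SpuriousDeleteMergeSketch {

    // One instant's contribution to a partition listing: files added (+F) and deleted (-F).
    static final class FilesRecord {
        final Set<String> added;
        final Set<String> deleted;

        FilesRecord(Set<String> added, Set<String> deleted) {
            this.added = added;
            this.deleted = deleted;
        }
    }

    // Merges records in timeline order. Deleting a file that was never added is a
    // no-op here, which is the "spurious delete we can ignore" case.
    static Set<String> merge(List<FilesRecord> recordsInOrder) {
        Set<String> files = new TreeSet<>();
        for (FilesRecord record : recordsInOrder) {
            files.addAll(record.added);
            files.removeAll(record.deleted);
        }
        return files;
    }

    static Set<String> setOf(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        FilesRecord commit1 = new FilesRecord(setOf("f1"), Collections.<String>emptySet());
        FilesRecord failedCommit = new FilesRecord(setOf("f2"), Collections.<String>emptySet());
        FilesRecord rollback = new FilesRecord(Collections.<String>emptySet(), setOf("f2"));

        // Failed commit was synced to MDT before the rollback: -F cancels its +F.
        System.out.println(merge(Arrays.asList(commit1, failedCommit, rollback)));
        // Failed commit never reached MDT: the same -F is a spurious delete and is ignored.
        System.out.println(merge(Arrays.asList(commit1, rollback)));
    }
}
```

Both orderings end with the same listing in this model, so the rollback path would not need to know whether the rolled-back commit was ever synced.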



--
This message was sent by Atlassian Jira
(v8.20.10#820010)