Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2022/03/11 02:00:00 UTC

[jira] [Updated] (HUDI-3604) Missing to apply rollback commits to Metadata table

     [ https://issues.apache.org/jira/browse/HUDI-3604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-3604:
--------------------------------------
    Fix Version/s: 0.11.0

> Missing to apply rollback commits to Metadata table
> ---------------------------------------------------
>
>                 Key: HUDI-3604
>                 URL: https://issues.apache.org/jira/browse/HUDI-3604
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metadata
>            Reporter: sivabalan narayanan
>            Priority: Blocker
>             Fix For: 0.11.0
>
>
> Consider a timeline with commits C1, C2, C3, and C4 (RB_C1), where C4 is the rollback of C1. 
> Suppose that while C4 is running, the process crashes after deleting the data files and after deleting C1's commit files from the timeline, but before applying the rollback to the metadata table (MDT). 
> Even if the user restarts the pipeline, there won't be any pending failed commits to roll back, so new commits will proceed without accounting for C4. The metadata table, however, will never receive this rollback commit. 
>  
> Proposal: 
> We need at least two fixes: 
> a. We should clean the C1 commit files from the data table timeline only after applying the rollback commit to the MDT. This ensures that no commit files in the data table are cleaned up before the rollback is applied to the MDT. 
> b. Whenever we check for failed commits to roll back, we should also check for any dangling rollbacks that need to be re-attempted. This needs some fixes in the rollback executor as well, since the commit to roll back may no longer exist in the data table timeline at all, yet we still need to re-attempt the rollback and get it to completion. It is not easy to distinguish a pending rollback from a dangling rollback, so I can't think of a way to detect a dangling rollback just by looking at the data table's active timeline. 
>  
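
The crash scenario and fix (a) above can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyTable, SimulatedCrash, rollback and commits_to_rollback are hypothetical names invented for illustration and are not Hudi classes or APIs, and the model ignores the rollback instant's own files on the timeline. It only shows why the order of "apply to MDT" and "remove commit files from the timeline" matters when the process dies between the two steps.

```python
class SimulatedCrash(Exception):
    """Hypothetical: stands in for the process dying mid-rollback."""


class ToyTable:
    """Toy stand-in for a data table plus its metadata table (MDT)."""

    def __init__(self, commits, failed):
        # commit name -> data files written by that commit
        self.data_timeline = {c: set(files) for c, files in commits.items()}
        self.failed = set(failed)
        self.data_files = set().union(*self.data_timeline.values())
        # Assumption: the MDT starts in sync with the data table.
        self.mdt_files = set(self.data_files)

    def commits_to_rollback(self):
        # A restart can only see failed commits still present in the timeline.
        return sorted(c for c in self.data_timeline if c in self.failed)

    def rollback(self, commit, mdt_first, crash_after_first_step=False):
        files = set(self.data_timeline[commit])
        self.data_files -= files  # delete the data files of the failed commit

        def apply_to_mdt():
            self.mdt_files -= files

        def clean_timeline():
            del self.data_timeline[commit]

        if mdt_first:
            steps = [apply_to_mdt, clean_timeline]  # ordering proposed in (a)
        else:
            steps = [clean_timeline, apply_to_mdt]  # ordering described in the bug
        steps[0]()
        if crash_after_first_step:
            raise SimulatedCrash(commit)
        steps[1]()


def crash_during_rollback(mdt_first):
    table = ToyTable({"C1": ["f1"], "C2": ["f2"], "C3": ["f3"]}, failed={"C1"})
    try:
        table.rollback("C1", mdt_first=mdt_first, crash_after_first_step=True)
    except SimulatedCrash:
        pass  # the pipeline is restarted after this point
    return table


buggy = crash_during_rollback(mdt_first=False)
fixed = crash_during_rollback(mdt_first=True)

# Files the MDT still lists although they no longer exist in the data table.
print("buggy order, stale MDT files:", sorted(buggy.mdt_files - buggy.data_files))
print("buggy order, commits to roll back:", buggy.commits_to_rollback())
print("fixed order, stale MDT files:", sorted(fixed.mdt_files - fixed.data_files))
print("fixed order, commits to roll back:", fixed.commits_to_rollback())
```

In this toy model, the buggy ordering leaves nothing for the restarted pipeline to roll back while the MDT still lists C1's file. With the MDT-first ordering, C1 stays visible in the timeline after the crash, so the rollback can be retried to completion. Fix (b), detecting a dangling rollback, is not modeled here.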



--
This message was sent by Atlassian Jira
(v8.20.1#820001)