Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2022/02/01 16:15:00 UTC

[jira] [Updated] (HUDI-2917) Rollback may be incorrect for canIndexLogFile index

     [ https://issues.apache.org/jira/browse/HUDI-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-2917:
--------------------------------------
    Sprint: Cont' improve -  2021/01/10, Cont' improve -  2021/01/18, Cont' improve -  2021/01/24, Cont' improve -  2021/01/31  (was: Cont' improve -  2021/01/10, Cont' improve -  2021/01/18, Cont' improve -  2021/01/24)

> Rollback may be incorrect for canIndexLogFile index
> ---------------------------------------------------
>
>                 Key: HUDI-2917
>                 URL: https://issues.apache.org/jira/browse/HUDI-2917
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Common Core
>            Reporter: ZiyueGuan
>            Assignee: ZiyueGuan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.11.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Problem:
> After a rollback, the Hudi table may still contain data that should have been rolled back.
> Root cause:
> Let's first recall how the rollback plan is generated for the log blocks of a deltaCommit. Hudi considers two cases.
>  # Log files with no base file consist entirely of inserted records, so they are deleted directly. Here we assume that all inserted records are covered this way.
>  # For the file IDs that are marked as updated in the inflight commit metadata of the instant being rolled back, a command block is appended to their log files to roll them back. This handles all updated records.
> However, the first assumption does not always hold. An index that can index log files may insert records into an existing log file. In the current process, the inflight hoodieCommitMeta is generated before records are assigned to specific file groups, so these inserts are not reflected in it and neither case covers them.
>  
> Fix: 
> To fix this, we need to generate hoodieCommitMeta from the result of the partitioner rather than from workProfile. We may also need more comments in the rollback code as a reminder of this case.
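
The two rollback cases and the gap between them can be illustrated with a small, self-contained Java sketch. This is not Hudi's actual rollback code: the class RollbackSketch, the methods planRollback and uncovered, the action names, and the file IDs fg-1, fg-2, fg-3 are all hypothetical, and the model assumes that the existing file group receiving the inserts has a base file. It only shows why metadata built before file-group assignment can leave a written file group out of the rollback plan, and why metadata built from the partitioner's assignment would not.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class RollbackSketch {

    // Hypothetical, simplified planner: maps each covered file ID to a rollback action.
    static Map<String, String> planRollback(Set<String> writtenFileIds,
                                            Set<String> fileIdsWithBaseFile,
                                            Set<String> updatedFileIdsInMeta) {
        Map<String, String> plan = new TreeMap<>();
        for (String fileId : writtenFileIds) {
            if (!fileIdsWithBaseFile.contains(fileId)) {
                // Case 1: log files with no base file hold only inserts, so delete them.
                plan.put(fileId, "DELETE_LOG_FILES");
            } else if (updatedFileIdsInMeta.contains(fileId)) {
                // Case 2: file IDs listed as updated in the inflight metadata
                // get a rollback command block appended to their log files.
                plan.put(fileId, "APPEND_ROLLBACK_BLOCK");
            }
            // Anything else is silently skipped; this is the gap described above.
        }
        return plan;
    }

    // File IDs the failed commit wrote to that the plan does not roll back.
    static Set<String> uncovered(Set<String> writtenFileIds, Map<String, String> plan) {
        Set<String> missed = new TreeSet<>(writtenFileIds);
        missed.removeAll(plan.keySet());
        return missed;
    }

    public static void main(String[] args) {
        // fg-1: updated records; fg-2: new log-only file group;
        // fg-3: existing file group that received inserts via the index.
        Set<String> written = Set.of("fg-1", "fg-2", "fg-3");
        Set<String> withBaseFile = Set.of("fg-1", "fg-3");

        // Assumption: metadata built before assignment only knows about the update.
        Set<String> metaFromWorkProfile = Set.of("fg-1");
        // Assumption: metadata built after assignment lists every appended file group.
        Set<String> metaFromPartitioner = Set.of("fg-1", "fg-3");

        // prints [fg-3]
        System.out.println(uncovered(written,
                planRollback(written, withBaseFile, metaFromWorkProfile)));
        // prints []
        System.out.println(uncovered(written,
                planRollback(written, withBaseFile, metaFromPartitioner)));
    }
}
```

In this toy model the only difference between the two runs is the set of file IDs recorded in the inflight metadata, which mirrors the proposed fix of deriving that metadata from the partitioner's result.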



--
This message was sent by Atlassian Jira
(v8.20.1#820001)