Posted to commits@hudi.apache.org by "Danny Chen (Jira)" <ji...@apache.org> on 2022/02/21 06:02:00 UTC

[jira] [Commented] (HUDI-2752) The MOR DELETE block breaks the event time sequence of CDC

    [ https://issues.apache.org/jira/browse/HUDI-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17495331#comment-17495331 ] 

Danny Chen commented on HUDI-2752:
----------------------------------

I found that these two issues are a little different: within one batch, if there are out-of-order delete messages, data is lost because the append handle always writes the insert data block before the delete block. So I am re-opening this issue.

> The MOR DELETE block breaks the event time sequence of CDC
> ----------------------------------------------------------
>
>                 Key: HUDI-2752
>                 URL: https://issues.apache.org/jira/browse/HUDI-2752
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: flink
>            Reporter: Danny Chen
>            Assignee: Alexey Kudinkin
>            Priority: Blocker
>             Fix For: 0.11.0
>
>
> Currently, the DELETE blocks are always written after the data blocks for one batch of data write; when there are INSERT/UPDATEs after the DELETE, the data is lost.
> What I can think of is that the DELETE block should at least keep the event time sequence for #preCombine with other record payloads.
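
The ordering problem described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not Hudi's actual writer or reader code: `Event`, `write_batch`, `read_in_block_order`, and `read_with_event_time` are hypothetical names, and the integer `event_time` stands in for whatever ordering field #preCombine would compare.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical change event; not a Hudi class."""
    key: str
    op: str                      # "UPSERT" or "DELETE"
    event_time: int              # stand-in for the ordering field
    value: Optional[str] = None


def write_batch(events):
    """Model the reported behaviour: one batch becomes a data block
    followed by a delete block, regardless of event order."""
    data_block = [e for e in events if e.op == "UPSERT"]
    delete_block = [e for e in events if e.op == "DELETE"]
    return [data_block, delete_block]


def read_in_block_order(blocks):
    """Apply blocks strictly in written order, ignoring event time."""
    state = {}
    for block in blocks:
        for e in block:
            if e.op == "DELETE":
                state.pop(e.key, None)
            else:
                state[e.key] = e.value
    return state


def read_with_event_time(blocks):
    """Keep the event with the greatest event time per key, so a DELETE
    only wins if it is at least as recent as the competing record."""
    latest = {}
    for block in blocks:
        for e in block:
            current = latest.get(e.key)
            if current is None or e.event_time >= current.event_time:
                latest[e.key] = e
    return {k: e.value for k, e in latest.items() if e.op == "UPSERT"}


# One batch: the DELETE (t=1) happens before the re-insert (t=2).
batch = [
    Event("k1", "DELETE", 1),
    Event("k1", "UPSERT", 2, "v2"),
]
blocks = write_batch(batch)

lost = read_in_block_order(blocks)        # the re-inserted record disappears
preserved = read_with_event_time(blocks)  # the re-inserted record survives
```

In this sketch, reading in block order drops `k1` even though its latest event is an upsert, while comparing event times keeps it. That mirrors the suggestion that DELETE records carry enough ordering information to take part in the same comparison as other payloads.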


