Posted to commits@hudi.apache.org by "Raymond Xu (Jira)" <ji...@apache.org> on 2022/09/19 10:12:00 UTC

[jira] [Updated] (HUDI-4432) Checkpoint management for multi-writer scenario

     [ https://issues.apache.org/jira/browse/HUDI-4432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-4432:
-----------------------------
    Sprint:   (was: 2022/09/19)

> Checkpoint management for multi-writer scenario
> -----------------------------------------------
>
>                 Key: HUDI-4432
>                 URL: https://issues.apache.org/jira/browse/HUDI-4432
>             Project: Apache Hudi
>          Issue Type: Task
>            Reporter: Sagar Sumit
>            Assignee: Harshal Patil
>            Priority: Major
>             Fix For: 0.13.0
>
>
> Please check [https://github.com/apache/hudi/pull/6098/files#r923232330]
> ```
> Do we need to design/implement this similarly to how Deltastreamer checkpointing is done? With Deltastreamer, it's feasible to run one writer with DS and another writer with the Spark datasource, and Deltastreamer will still be able to fetch the right checkpoint to resume from every time.
> Here I see we are fetching only the latest commit, so this may not work in multi-writer scenarios. Maybe we can create a follow-up ticket and work on it rather than expanding the scope of this patch.
> ```
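
The concern in the quoted comment can be illustrated with a minimal sketch. It is not Hudi code: the `Commit` class, the `CHECKPOINT_KEY` name, and the `latest_checkpoint` helper are hypothetical stand-ins, assuming that each commit carries a metadata map and that only the checkpointing writer stores its checkpoint there. Under those assumptions, reading only the latest commit loses the checkpoint whenever another writer committed last, while walking the timeline backwards recovers it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical metadata key; the key a real writer uses is not stated in the ticket.
CHECKPOINT_KEY = "checkpoint.key"


@dataclass
class Commit:
    """Simplified stand-in for a completed commit on the table timeline."""
    instant_time: str  # fixed-width timestamp, so string order equals time order
    extra_metadata: Dict[str, str] = field(default_factory=dict)


def latest_checkpoint(timeline: List[Commit]) -> Optional[str]:
    """Return the checkpoint from the most recent commit that carries one."""
    # Walk commits newest-first and skip those written without a checkpoint.
    for commit in sorted(timeline, key=lambda c: c.instant_time, reverse=True):
        checkpoint = commit.extra_metadata.get(CHECKPOINT_KEY)
        if checkpoint is not None:
            return checkpoint
    return None


timeline = [
    Commit("20220919100000", {CHECKPOINT_KEY: "offset-100"}),  # checkpointing writer
    Commit("20220919101000"),  # second writer, commits without a checkpoint
]

# Reading only the latest commit finds no checkpoint at all.
naive = timeline[-1].extra_metadata.get(CHECKPOINT_KEY)

# Walking back through the timeline finds the checkpoint to resume from.
resumed = latest_checkpoint(timeline)
```

How far back such a lookup should go, and how it interacts with archived commits, is a design question the sketch leaves open.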



--
This message was sent by Atlassian Jira
(v8.20.10#820010)