Posted to commits@hudi.apache.org by "Vinoth Chandar (Jira)" <ji...@apache.org> on 2021/01/21 06:13:00 UTC

[jira] [Updated] (HUDI-1214) Need ability to set deltastreamer checkpoints when doing Spark datasource writes

     [ https://issues.apache.org/jira/browse/HUDI-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1214:
---------------------------------
    Fix Version/s:     (was: 0.7.0)
                   0.8.0

> Need ability to set deltastreamer checkpoints when doing Spark datasource writes
> --------------------------------------------------------------------------------
>
>                 Key: HUDI-1214
>                 URL: https://issues.apache.org/jira/browse/HUDI-1214
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Spark Integration
>            Reporter: Balaji Varadarajan
>            Assignee: Trevorzhang
>            Priority: Major
>             Fix For: 0.8.0
>
>
> Such support is needed for bootstrapping cases where users perform the initial bootstrap with a Spark datasource write and subsequently switch to DeltaStreamer.
> DeltaStreamer manages checkpoints inside hoodie commit files and expects to find a checkpoint in previously committed metadata. When bootstrapping through DeltaStreamer, users are expected to pass a checkpoint or an initial checkpoint provider. No such support exists when bootstrapping through the Spark datasource.
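
The gap described above can be illustrated with a minimal sketch. It assumes that a commit's metadata is JSON with an `extraMetadata` map and that DeltaStreamer stores its checkpoint there under the key `deltastreamer.checkpoint.key`; both the layout and the key name are assumptions to verify against your Hudi version. The helper `checkpoint_from_latest_commit` and the sample values are hypothetical, and reading only the latest commit is a simplification, not Hudi's actual lookup code.

```python
import json
from typing import Iterable, Optional, Tuple

# Assumed key name for the DeltaStreamer checkpoint; verify for your Hudi version.
CHECKPOINT_KEY = "deltastreamer.checkpoint.key"


def checkpoint_from_latest_commit(commits: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the checkpoint stored in the most recent commit, or None.

    `commits` holds (instant_time, commit_metadata_json) pairs. Instant times
    are fixed-width timestamp strings, so string comparison orders them.
    """
    commits = list(commits)
    if not commits:
        return None
    _, raw = max(commits, key=lambda c: c[0])
    extra = json.loads(raw).get("extraMetadata") or {}
    return extra.get(CHECKPOINT_KEY)


# Hypothetical commit written by a Spark datasource bootstrap: no checkpoint.
spark_bootstrap = ("20210120101500", json.dumps({"extraMetadata": {"schema": "{}"}}))
# Hypothetical later commit written by DeltaStreamer: carries a checkpoint.
deltastreamer_commit = (
    "20210121061300",
    json.dumps({"extraMetadata": {CHECKPOINT_KEY: "topic,0:42"}}),
)

print(checkpoint_from_latest_commit([spark_bootstrap]))
print(checkpoint_from_latest_commit([spark_bootstrap, deltastreamer_commit]))
```

In this model, a table whose only commit came from a Spark datasource write yields no checkpoint, which is the situation the issue asks to address by letting the Spark write set one.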



--
This message was sent by Atlassian Jira
(v8.3.4#803005)