Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2021/01/22 14:22:00 UTC

[jira] [Commented] (HUDI-1214) Need ability to set deltastreamer checkpoints when doing Spark datasource writes

    [ https://issues.apache.org/jira/browse/HUDI-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270166#comment-17270166 ] 

sivabalan narayanan commented on HUDI-1214:
-------------------------------------------

[~vbalaji]: is this a duplicate of https://issues.apache.org/jira/browse/HUDI-1280 ? 

> Need ability to set deltastreamer checkpoints when doing Spark datasource writes
> --------------------------------------------------------------------------------
>
>                 Key: HUDI-1214
>                 URL: https://issues.apache.org/jira/browse/HUDI-1214
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Spark Integration
>            Reporter: Balaji Varadarajan
>            Assignee: Trevorzhang
>            Priority: Major
>             Fix For: 0.8.0
>
>
> Such support is needed for bootstrapping cases where users do the initial bootstrap with a Spark datasource write and subsequently use DeltaStreamer.
> DeltaStreamer manages checkpoints inside Hoodie commit files and expects to find checkpoints in previously committed metadata. Users are expected to pass a checkpoint or an initial checkpoint provider when performing bootstrap through DeltaStreamer. Such support is not present when bootstrapping with the Spark datasource.
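
For illustration only, the gap described above can be sketched as a toy model of a commit timeline. This is not Hudi code: the dictionary layout, the `extraMetadata` field, the helper `latest_checkpoint`, and the sample checkpoint string are hypothetical, and the key name `deltastreamer.checkpoint.key` is assumed here rather than taken from this issue.

```python
from typing import Dict, List, Optional

# Assumed name of the metadata key that holds the checkpoint.
CHECKPOINT_KEY = "deltastreamer.checkpoint.key"


def latest_checkpoint(commits: List[Dict]) -> Optional[str]:
    """Return the checkpoint from the newest commit that carries one."""
    # Walk the timeline from the newest instant to the oldest.
    for commit in sorted(commits, key=lambda c: c["instant"], reverse=True):
        value = commit.get("extraMetadata", {}).get(CHECKPOINT_KEY)
        if value is not None:
            return value
    return None


# Timeline produced by a bootstrap write that stores no checkpoint.
spark_only = [{"instant": "20210122142200", "extraMetadata": {}}]

# Same timeline if the writer could attach a checkpoint to its commit.
with_checkpoint = [
    {
        "instant": "20210122142200",
        "extraMetadata": {CHECKPOINT_KEY: "topic,0:100"},
    }
]

print(latest_checkpoint(spark_only))
print(latest_checkpoint(with_checkpoint))
```

In this model, a reader that resumes from previously committed metadata finds nothing to resume from in the first timeline, which is the situation the request aims to avoid.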



--
This message was sent by Atlassian Jira
(v8.3.4#803005)