Posted to commits@hudi.apache.org by "Prashant Wason (Jira)" <ji...@apache.org> on 2020/10/20 18:05:00 UTC

[jira] [Commented] (HUDI-1317) Fix initialization when Async jobs are scheduled - these jobs have older timestamp than INIT timestamp on metadata table

    [ https://issues.apache.org/jira/browse/HUDI-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217835#comment-17217835 ] 

Prashant Wason commented on HUDI-1317:
--------------------------------------

When creating the metadata table for the first time, the timestamp of the first deltacommit on the metadata MOR table should be chosen such that there are no incomplete instants on the dataset at or before that time.

E.g., if the dataset has the following instants:
t0.commit
t1.commit
t2.commit.inflight
t3.clean

Then we create the metadata table with time t1, the last completed instant before the incomplete t2. It will include all files in the dataset whose time is <= t1. t3.clean will be applied to the metadata table as part of the normal sync, which always happens when the table is opened in write mode.

When t2 completes, its instant will be synced.
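
The selection rule in the example above can be sketched roughly as follows. This is only an illustration in Python, not Hudi code: the Instant class, choose_init_timestamp and instants_to_sync are hypothetical names, and it assumes instant timestamps compare correctly as strings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instant:
    """Hypothetical stand-in for a timeline instant (not a Hudi class)."""
    timestamp: str   # assumed to sort correctly as a string
    action: str      # e.g. "commit", "clean"
    completed: bool  # False for inflight/requested instants


def choose_init_timestamp(timeline: List[Instant]) -> Optional[str]:
    """Return the latest completed timestamp that precedes every incomplete instant."""
    ordered = sorted(timeline, key=lambda i: i.timestamp)
    # Timestamp of the earliest incomplete instant, if there is one.
    earliest_incomplete = next(
        (i.timestamp for i in ordered if not i.completed), None
    )
    candidates = [
        i.timestamp
        for i in ordered
        if i.completed
        and (earliest_incomplete is None or i.timestamp < earliest_incomplete)
    ]
    # None means no completed instant precedes the first incomplete one.
    return candidates[-1] if candidates else None


def instants_to_sync(timeline: List[Instant], init_ts: str) -> List[Instant]:
    """Completed instants newer than the init time, to be applied by the normal sync."""
    return [
        i
        for i in sorted(timeline, key=lambda i: i.timestamp)
        if i.completed and i.timestamp > init_ts
    ]


# The timeline from the example: t2 is still inflight.
timeline = [
    Instant("t0", "commit", True),
    Instant("t1", "commit", True),
    Instant("t2", "commit", False),
    Instant("t3", "clean", True),
]
init_ts = choose_init_timestamp(timeline)
pending_sync = [i.timestamp for i in instants_to_sync(timeline, init_ts)]
```

With this timeline, init_ts is t1 and only t3 is picked up by the sync. Once t2 is marked completed, the same instants_to_sync call would return both t2 and t3.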

 

> Fix initialization when Async jobs are scheduled - these jobs have older timestamp than INIT timestamp on metadata table
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-1317
>                 URL: https://issues.apache.org/jira/browse/HUDI-1317
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Prashant Wason
>            Assignee: Prashant Wason
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)