Posted to dev@gobblin.apache.org by "Hung Tran (JIRA)" <ji...@apache.org> on 2018/04/21 20:56:00 UTC

[jira] [Resolved] (GOBBLIN-464) Enhance LoopingDatasetFinderSource to support global watermark and per-dataset watermark

     [ https://issues.apache.org/jira/browse/GOBBLIN-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hung Tran resolved GOBBLIN-464.
-------------------------------
    Resolution: Fixed

Issue resolved by pull request #2336
[https://github.com/apache/incubator-gobblin/pull/2336]

> Enhance LoopingDatasetFinderSource to support global watermark and per-dataset watermark
> ----------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-464
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-464
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-compliance
>    Affects Versions: 0.13.0
>            Reporter: Sudarshan Vasudevan
>            Assignee: Sudarshan Vasudevan
>            Priority: Major
>             Fix For: 0.13.0
>
>
> We need the ability to support a global watermark that spans all datasets as well as a per-dataset watermark in LoopingDatasetFinderSource. In particular, we need to keep track of:
>  # The last dataset processed in the previous run of a Gobblin job (global watermark), so that the subsequent run can process the "next" dataset (based on some ordering between datasets), and
>  # The last time a particular dataset was processed (per-dataset watermark).
> The actual use case is to limit the time range of queries to an external database based on the last processed time of a dataset.
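
The two watermarks described above can be illustrated with a minimal, self-contained sketch. It does not use Gobblin's API and is not the code from the pull request: the class WatermarkSketch, its State holder, and the methods nextDatasets, queryStart and commit are hypothetical names, and lexicographic ordering of dataset URNs is an assumption standing in for "some ordering between datasets". Wrapping back to the first dataset after the last one is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from Gobblin.
public class WatermarkSketch {

    /** State carried from one job run to the next. */
    public static final class State {
        /** Global watermark: URN of the last dataset processed, null before the first run. */
        public String globalWatermark;
        /** Per-dataset watermarks: URN -> last processed time in epoch milliseconds. */
        public final Map<String, Long> datasetWatermarks = new HashMap<>();
    }

    /** Returns up to maxDatasets URNs that sort strictly after the global watermark. */
    public static List<String> nextDatasets(TreeSet<String> allUrns, State state, int maxDatasets) {
        NavigableSet<String> candidates = state.globalWatermark == null
                ? allUrns
                : allUrns.tailSet(state.globalWatermark, false);
        List<String> picked = new ArrayList<>();
        for (String urn : candidates) {
            if (picked.size() == maxDatasets) {
                break;
            }
            picked.add(urn);
        }
        return picked;
    }

    /** Lower bound of the time range to query for a dataset; 0 if it was never processed. */
    public static long queryStart(State state, String urn) {
        return state.datasetWatermarks.getOrDefault(urn, 0L);
    }

    /** Records a finished run: advances both kinds of watermark. */
    public static void commit(State state, List<String> processed, long nowMillis) {
        for (String urn : processed) {
            state.datasetWatermarks.put(urn, nowMillis);
        }
        if (!processed.isEmpty()) {
            state.globalWatermark = processed.get(processed.size() - 1);
        }
    }
}
```

In this sketch a run would call nextDatasets to choose its work, use queryStart as the lower bound of each dataset's query against the external database, and call commit once the work succeeds so the next run resumes after the last dataset handled.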



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)