Posted to issues@flink.apache.org by "Bowen Li (Jira)" <ji...@apache.org> on 2019/12/12 01:04:00 UTC

[jira] [Updated] (FLINK-15208) client submits multiple sub-jobs for job with dynamic catalog table

     [ https://issues.apache.org/jira/browse/FLINK-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bowen Li updated FLINK-15208:
-----------------------------
    Summary: client submits multiple sub-jobs for job with dynamic catalog table  (was: support client to submit multiple sub-jobs for job with dynamic catalog table)

> client submits multiple sub-jobs for job with dynamic catalog table
> -------------------------------------------------------------------
>
>                 Key: FLINK-15208
>                 URL: https://issues.apache.org/jira/browse/FLINK-15208
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / API, Table SQL / Client
>            Reporter: Bowen Li
>            Assignee: Bowen Li
>            Priority: Major
>
> With the dynamic catalog table of FLINK-15206, users can maintain a single SQL job definition for both their online and offline jobs. However, they still need to change their configuration in order to submit the different jobs over time.
> E.g. when users update the logic of their streaming job, they need to bootstrap both a new online job and a backfill offline job; let's call them the sub-jobs of a job with a dynamic catalog table. Today they would have to:
> 1) manually change the execution mode in the yaml config to "streaming" and submit the streaming job,
> 2) manually change the execution mode in the yaml config to "batch" and submit the batch job.
> We should introduce a mechanism that allows users to submit all or a subset of the sub-jobs at once. In the backfill use case mentioned above, users should ideally execute the SQL just once, and Flink should spin up the two jobs for them.
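> For illustration only, a rough sketch of what "one SQL, two sub-jobs" could boil down to for the backfill case, written against the Blink-planner Table API available in Flink 1.9/1.10. The INSERT statement, table names and job names are made up, catalog and table registration is omitted, and in practice the unbounded streaming sub-job would need detached submission so the client is not blocked before it can submit the backfill; this is a sketch of the idea, not a concrete design:
> {code:java}
> import org.apache.flink.table.api.EnvironmentSettings;
> import org.apache.flink.table.api.TableEnvironment;
>
> public class DualModeSubmitSketch {
>
>     // Hypothetical SQL; "events" and "results" stand in for tables resolved
>     // through the dynamic catalog table of FLINK-15206.
>     private static final String SQL =
>         "INSERT INTO results SELECT user_id, event_time FROM events";
>
>     public static void main(String[] args) throws Exception {
>         // Sub-job 1: the online streaming job.
>         TableEnvironment streamEnv = TableEnvironment.create(
>             EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());
>         // Catalog/table registration omitted for brevity.
>         streamEnv.sqlUpdate(SQL);
>         streamEnv.execute("online-sub-job");
>
>         // Sub-job 2: the offline backfill job, running the same SQL in batch mode.
>         TableEnvironment batchEnv = TableEnvironment.create(
>             EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());
>         batchEnv.sqlUpdate(SQL);
>         batchEnv.execute("backfill-sub-job");
>     }
> }
> {code}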
> Streaming platforms at some big companies like Uber and Netflix are already doing something like this for backfill use cases in one way or another - some do it in the UI, some do it in the planning phase. It would be great to standardize this practice and provide users with maximum simplicity.
> The assumption here is that users are fully aware of the consequences of launching two or more jobs at the same time, e.g. that they need to handle overlapping results, if there are any.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)