Posted to issues@flink.apache.org by "Danny Chen (Jira)" <ji...@apache.org> on 2020/01/10 02:33:00 UTC

[jira] [Commented] (FLINK-15206) support dynamic catalog table for truly unified SQL job

    [ https://issues.apache.org/jira/browse/FLINK-15206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012400#comment-17012400 ] 

Danny Chen commented on FLINK-15206:
------------------------------------

The fact is that there are some variables, bound to each individual query, that are always dynamic. The key question is how a user can pass these variables to the Flink framework; from that angle, a "dynamic catalog table" does not seem to solve the problem.
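For illustration, a minimal sketch of the kind of per-query binding being discussed (the ${...} placeholder syntax below is hypothetical, not existing Flink SQL; the point is only that the starting position changes with every submission and has to be supplied from outside the statement):

{code:sql}
-- hypothetical placeholders, substituted by the submitting client
-- before the statement reaches the SQL planner
INSERT INTO x
SELECT * FROM kafka_table
WHERE event_time >= TIMESTAMP '${job.start-time}';
{code}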

> support dynamic catalog table for truly unified SQL job
> -------------------------------------------------------
>
>                 Key: FLINK-15206
>                 URL: https://issues.apache.org/jira/browse/FLINK-15206
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / API
>            Reporter: Bowen Li
>            Assignee: Bowen Li
>            Priority: Major
>
> Currently, if users have both an online and an offline job with the same business logic in Flink SQL, their codebase is still not unified. They have to keep two SQL statements whose only difference is the source (and/or sink) table, with different params. E.g.
> {code:java}
> // online job
> insert into x select * from kafka_table (starting time) ...;
> // offline backfill job
> insert into x select * from hive_table  (starting and ending time) ...;
> {code}
> We can introduce a "dynamic catalog table". A dynamic catalog table acts like a view: it is an abstract table backed by multiple actual tables, one of which it resolves to depending on configuration flags. When a job is executed, the dynamic catalog table points to the actual source table selected by the configuration.
> A use case for this is the example given above: when executed in streaming mode, {{my_source_dynamic_table}} should point to a Kafka catalog table with a new starting position, and in batch mode it should point to a Hive catalog table with starting/ending positions.
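> As a rough illustration only (the DDL below is hypothetical; no such syntax exists in Flink today), the switching could look something like:
> {code:sql}
> -- hypothetical DDL: one logical table resolving to different physical tables,
> -- chosen by a configuration flag at job submission time
> CREATE DYNAMIC TABLE my_source_dynamic_table AS
>   CASE WHEN job_mode = 'streaming' THEN kafka_table
>        ELSE hive_table
>   END;
> -- the same statement then serves both the online and the backfill job
> INSERT INTO x SELECT * FROM my_source_dynamic_table;
> {code}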
>  
> One thing to note is that the starting position of kafka_table and the starting/ending positions of hive_table are different every time; this needs more thought on how we can accommodate it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)