Posted to issues@flink.apache.org by "lincoln.lee (JIRA)" <ji...@apache.org> on 2017/05/04 09:25:04 UTC
[jira] [Updated] (FLINK-6442) Extend TableAPI Support Sink Table Registration and ‘insert into’ Clause in SQL
[ https://issues.apache.org/jira/browse/FLINK-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
lincoln.lee updated FLINK-6442:
-------------------------------
Description:
Currently the Table API only provides registration methods for source tables. When we write a streaming job in SQL, we must add an extra step for the sink, just as the Table API does:
{code}
val sqlQuery = "SELECT * FROM MyTable WHERE _1 = 3"
val t = StreamTestData.getSmall3TupleDataStream(env)
tEnv.registerDataStream("MyTable", t)

// one way: invoke the Table API's writeToSink method directly
val result = tEnv.sql(sqlQuery)
result.writeToSink(new YourStreamSink)

// another way: convert to a DataStream first and then invoke addSink
val resultStream = tEnv.sql(sqlQuery).toDataStream[Row]
resultStream.addSink(new StreamITCase.StringSink)
{code}
From the API we can see that the sink table is always a derived table, because its schema is inferred from the result type of the upstream query.
In contrast, a traditional RDBMS that supports DML syntax allows a query with a target output to be written like this:
{code}
insert into table target_table_name
[(column_name [ ,...n ])]
query
{code}
The equivalent form of the example above is as follows:
{code}
tEnv.registerTableSink("targetTable", new YourSink)
val sql = "INSERT INTO targetTable SELECT a, b, c FROM sourceTable"
val result = tEnv.sql(sql)
{code}
This syntax is already supported by Calcite's grammar:
{code}
insert:
  ( INSERT | UPSERT ) INTO tablePrimary
  [ '(' column [, column ]* ')' ]
  query
{code}
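Following the grammar above, an INSERT with an explicit column list against the registered sink could then be written as plain SQL (the table and column names here are only illustrative, reusing those from the earlier example):
{code}
INSERT INTO targetTable (a, b, c)
SELECT a, b, c FROM sourceTable
{code}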
I'd like to extend the Flink Table API to support this feature. See the design doc: https://goo.gl/n3phK5
was:
Currently the Table API only provides registration methods for source tables. When we write a streaming job in SQL, we must add an extra step for the sink, just as the Table API does:
{code}
val sqlQuery = "SELECT * FROM MyTable WHERE _1 = 3"
val t = StreamTestData.getSmall3TupleDataStream(env)
tEnv.registerDataStream("MyTable", t)

// one way: invoke the Table API's writeToSink method directly
val result = tEnv.sql(sqlQuery)
result.writeToSink(new YourStreamSink)

// another way: convert to a DataStream first and then invoke addSink
val resultStream = tEnv.sql(sqlQuery).toDataStream[Row]
resultStream.addSink(new StreamITCase.StringSink)
{code}
From the API we can see that the sink table is always a derived table, because its schema is inferred from the result type of the upstream query.
In contrast, a traditional RDBMS that supports DML syntax allows a query with a target output to be written like this:
{code}
insert into table target_table_name
[(column_name [ ,...n ])]
query
{code}
design doc: https://goo.gl/n3phK5
> Extend TableAPI Support Sink Table Registration and ‘insert into’ Clause in SQL
> -------------------------------------------------------------------------------
>
> Key: FLINK-6442
> URL: https://issues.apache.org/jira/browse/FLINK-6442
> Project: Flink
> Issue Type: New Feature
> Components: Table API & SQL
> Reporter: lincoln.lee
> Assignee: lincoln.lee
> Priority: Minor
>
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)