You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user-zh@flink.apache.org by 朱广彬 <zh...@gmail.com> on 2020/11/16 08:38:48 UTC

Flink sql-client/sql-gateway 支持multiple insert

Hi,Community:

目前sql-client和sql-gateway只能支持单条SQL statement,这样就没法利用multiple insert的优化。如下:

INSERT INTO RubberOrders SELECT product, amount FROM Orders WHERE product LIKE '%Rubber%'
INSERT INTO GlassOrders SELECT product, amount FROM Orders WHERE product LIKE '%Glass%'

两条语句会分别生成两个Job,对Orders表读了两次。如果Orders表是Kafka,消费两次的成本是很大的浪费,是否可以优化下?

一种方案是,在SQL 语法层面上支持,如Hive的multiple insert:

FROM Orders INSERT INTO RubberOrders SELECT product, amount WHERE product LIKE '%Rubber%' INSERT INTO GlassOrders SELECT product, amount WHERE product LIKE '%Glass%'

另一种方案是,让sql-client/sql-gateway支持multiple SQL statement,这样即使语法不支持,两条INSERT语句也可以被优化成一个Job,只scan一次Orders表。


不知道社区后续是否有计划针对这个feature优化?



附:目前Table API 支持multiple insert的优化

// run multiple INSERT queries on the registered source table and emit the result to registered sink tables
StatementSet stmtSet = tEnv.createStatementSet();
// only single INSERT query can be accepted by `addInsertSql` method
stmtSet.addInsertSql(
  "INSERT INTO RubberOrders SELECT product, amount FROM Orders WHERE product LIKE '%Rubber%'");
stmtSet.addInsertSql(
  "INSERT INTO GlassOrders SELECT product, amount FROM Orders WHERE product LIKE '%Glass%'");
// execute all statements together
TableResult tableResult2 = stmtSet.execute();