You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Fan weiwen (JIRA)" <ji...@apache.org> on 2018/09/30 06:01:00 UTC
[jira] [Commented] (FLINK-6966) Add maxParallelism and UIDs to all
operators generated by the Table API / SQL
[ https://issues.apache.org/jira/browse/FLINK-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633255#comment-16633255 ]
Fan weiwen commented on FLINK-6966:
-----------------------------------
[~sunjincheng121] This is the problem, is it solved?
i have a similar problem
the job has two process one table one sink
for example
A
// view
select a,b, count(*) as acnt from kafka group by
hop(
rowtime,
interval '2' second,
interval '60' minute
),
a,
b
Table table = tableEnv.sqlQuery(sql);
tableEnv.registerTable(Aname, table);
// sink
sinksql = select a,b,ant from Aname
Table table = tableEnv.sqlQuery(sinksql);
DataStream<Row> ds = tableEnv.toAppendStream(table, Row.class);
ds.addSink(...);
B
// view
sql = select a,c, count(*) as bnt from kafka group by
hop(
rowtime,
interval '2' second,
interval '60' minute
),
a,
c
Table table = tableEnv.sqlQuery(sql);
tableEnv.registerTable(Bname, table);
// sink
sinksql = select a,c,bnt from Bname
Table table = tableEnv.sqlQuery(sinksql);
DataStream<Row> ds = tableEnv.toAppendStream(table, Row.class);
ds.addSink(...);
---------
now the job start A, stop B
then stop A , cancel and savepoint
start B , from savepoint
B sink sinksql = select a,c,bnt from Bname
problem is B sink fetch data from A state
State is not maintained only, not uid and operator id
english is pool , can chinese exchange ?
> Add maxParallelism and UIDs to all operators generated by the Table API / SQL
> -----------------------------------------------------------------------------
>
> Key: FLINK-6966
> URL: https://issues.apache.org/jira/browse/FLINK-6966
> Project: Flink
> Issue Type: Improvement
> Components: Table API & SQL
> Affects Versions: 1.4.0
> Reporter: Fabian Hueske
> Priority: Major
>
> At the moment, the Table API does not assign UIDs and the max parallelism to operators (except for operators with parallelism 1).
> We should do that to avoid problems when rescaling or restarting jobs from savepoints.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)