You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Dian Fu (Jira)" <ji...@apache.org> on 2020/05/27 04:20:00 UTC

[jira] [Commented] (FLINK-17953) OverWindow doesn't support to order by non-time attribute in batch mode for Table API program

    [ https://issues.apache.org/jira/browse/FLINK-17953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117379#comment-17117379 ] 

Dian Fu commented on FLINK-17953:
---------------------------------

In streaming jobs, there are also the same kind of problems if the source is defined through DDL. 

For the following job:
{code}

source_ddl = """
 create table input(
 id INT,
 sales FLOAT,
 word VARCHAR,
 ts TIMESTAMP(3),
WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
 ) with (
 'connector' = 'filesystem',
 'format' = 'csv',
 'path' = '{}'
 )
 """.format(input_data_path)
t_env.execute_sql(source_ddl)

sink_ddl = """
 create table results(
 id INT,
 total_sales FLOAT
 ) with (
 'connector' = 'filesystem',
 'format' = 'csv',
 'path' = '{}'
 )
 """.format(result_data_path)
t_env.execute_sql(sink_ddl)

table = t_env.from_path("input").over_window(
    Over.partition_by("id").order_by("ts").preceding("2.rows").following("current_row").alias('w')) \
 .select("id, sum(sales) over w as total_sales")
{code}

It will also throw the same kind of exception as described in this JIRA.

> OverWindow doesn't support to order by non-time attribute in batch mode for Table API program
> ---------------------------------------------------------------------------------------------
>
>                 Key: FLINK-17953
>                 URL: https://issues.apache.org/jira/browse/FLINK-17953
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>    Affects Versions: 1.9.0, 1.10.0, 1.11.0
>            Reporter: Dian Fu
>            Priority: Major
>
> For a simple batch job tested in blink planner:
> {code:java}
> INSERT INTO results
> SELECT id, sum(sales)
> OVER (PARTITION BY id ORDER BY ts ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING)
> FROM input
> {code}
> It could pass if written in SQL. However, if we rewrite it in Table API, it will throw the following exception:
> {code:java}
> py4j.protocol.Py4JJavaError: An error occurred while calling o85.select.py4j.protocol.Py4JJavaError: An error occurred while calling o85.select.: org.apache.flink.table.api.ValidationException: Ordering must be defined on a time attribute. at org.apache.flink.table.planner.expressions.PlannerTypeInferenceUtilImpl.validateArguments(PlannerTypeInferenceUtilImpl.java:112) at org.apache.flink.table.planner.expressions.PlannerTypeInferenceUtilImpl.runTypeInference(PlannerTypeInferenceUtilImpl.java:71) at org.apache.flink.table.expressions.resolver.rules.ResolveCallByArgumentsRule$ResolvingCallVisitor.runLegacyTypeInference(ResolveCallByArgumentsRule.java:218) at org.apache.flink.table.expressions.resolver.rules.ResolveCallByArgumentsRule$ResolvingCallVisitor.lambda$visit$2(ResolveCallByArgumentsRule.java:134) at java.util.Optional.orElseGet(Optional.java:267) at org.apache.flink.table.expressions.resolver.rules.ResolveCallByArgumentsRule$ResolvingCallVisitor.visit(ResolveCallByArgumentsRule.java:134) at org.apache.flink.table.expressions.resolver.rules.ResolveCallByArgumentsRule$ResolvingCallVisitor.visit(ResolveCallByArgumentsRule.java:89) at org.apache.flink.table.expressions.ApiExpressionVisitor.visit(ApiExpressionVisitor.java:39) at org.apache.flink.table.expressions.UnresolvedCallExpression.accept(UnresolvedCallExpression.java:132) at org.apache.flink.table.expressions.resolver.rules.ResolveCallByArgumentsRule$ResolvingCallVisitor.visit(ResolveCallByArgumentsRule.java:124) at org.apache.flink.table.expressions.resolver.rules.ResolveCallByArgumentsRule$ResolvingCallVisitor.visit(ResolveCallByArgumentsRule.java:89) at org.apache.flink.table.expressions.ApiExpressionVisitor.visit(ApiExpressionVisitor.java:39) at org.apache.flink.table.expressions.UnresolvedCallExpression.accept(UnresolvedCallExpression.java:132) at org.apache.flink.table.expressions.resolver.rules.ResolveCallByArgumentsRule.lambda$apply$0(ResolveCallByArgumentsRule.java:83) at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267) at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) at org.apache.flink.table.expressions.resolver.rules.ResolveCallByArgumentsRule.apply(ResolveCallByArgumentsRule.java:84) at org.apache.flink.table.expressions.resolver.ExpressionResolver.lambda$null$1(ExpressionResolver.java:211) at java.util.function.Function.lambda$andThen$1(Function.java:88) at org.apache.flink.table.expressions.resolver.ExpressionResolver.resolve(ExpressionResolver.java:178) at org.apache.flink.table.operations.utils.OperationTreeBuilder.projectInternal(OperationTreeBuilder.java:191) at org.apache.flink.table.operations.utils.OperationTreeBuilder.project(OperationTreeBuilder.java:170) at org.apache.flink.table.api.internal.TableImpl$OverWindowedTableImpl.select(TableImpl.java:953) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282) at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79) at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238) at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)