You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Xuannan Su (Jira)" <ji...@apache.org> on 2022/07/13 02:20:00 UTC
[jira] [Updated] (FLINK-28528) Table.getSchema fails on table with watermark
[ https://issues.apache.org/jira/browse/FLINK-28528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuannan Su updated FLINK-28528:
-------------------------------
Description:
The bug can be reproduced with the following test. The test can pass if we use the commented way to define the watermark.
{code:python}
def test_flink_2(self):
env = StreamExecutionEnvironment.get_execution_environment()
t_env = StreamTableEnvironment.create(env)
table = t_env.from_descriptor(
TableDescriptor.for_connector("filesystem")
.schema(
Schema.new_builder()
.column("name", DataTypes.STRING())
.column("cost", DataTypes.INT())
.column("distance", DataTypes.INT())
.column("time", DataTypes.TIMESTAMP(3))
.watermark("time", expr.col("time") - expr.lit(60).seconds)
# .watermark("time", "`time` - INTERVAL '60' SECOND")
.build()
)
.format("csv")
.option("path", "./input.csv")
.build()
)
print(table.get_schema())
{code}
It causes the following exception
{code:none}
E pyflink.util.exceptions.TableException: org.apache.flink.table.api.TableException: Expression 'minus(time, 60000)' is not string serializable. Currently, only expressions that originated from a SQL expression have a well-defined string representation.
E at org.apache.flink.table.expressions.ResolvedExpression.asSerializableString(ResolvedExpression.java:51)
E at org.apache.flink.table.api.TableSchema.lambda$fromResolvedSchema$13(TableSchema.java:455)
E at java.util.Collections$SingletonList.forEach(Collections.java:4824)
E at org.apache.flink.table.api.TableSchema.fromResolvedSchema(TableSchema.java:451)
E at org.apache.flink.table.api.Table.getSchema(Table.java:101)
E at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
E at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
E at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
E at java.lang.reflect.Method.invoke(Method.java:498)
E at org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
E at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
E at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
E at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
E at org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
E at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
E at java.lang.Thread.run(Thread.java:748)
{code}
was:
The bug can be reproduced with the following test. The test can pass if we use the commented way to define the watermark.
{code:python}
def test_flink_2(self):
env = StreamExecutionEnvironment.get_execution_environment()
t_env = StreamTableEnvironment.create(env)
table = t_env.from_descriptor(
TableDescriptor.for_connector("filesystem")
.schema(
Schema.new_builder()
.column("name", DataTypes.STRING())
.column("cost", DataTypes.INT())
.column("distance", DataTypes.INT())
.column("time", DataTypes.TIMESTAMP(3))
.watermark("time", expr.col("time") - expr.lit(60).seconds)
# .watermark("time", "`time` - INTERVAL '60' SECOND")
.build()
)
.format("csv")
.option("path", "./input.csv")
.build()
)
print(table.get_schema())
{code}
It causes the following exception
{code:none}
// Some comments here
E pyflink.util.exceptions.TableException: org.apache.flink.table.api.TableException: Expression 'minus(time, 60000)' is not string serializable. Currently, only expressions that originated from a SQL expression have a well-defined string representation.
E at org.apache.flink.table.expressions.ResolvedExpression.asSerializableString(ResolvedExpression.java:51)
E at org.apache.flink.table.api.TableSchema.lambda$fromResolvedSchema$13(TableSchema.java:455)
E at java.util.Collections$SingletonList.forEach(Collections.java:4824)
E at org.apache.flink.table.api.TableSchema.fromResolvedSchema(TableSchema.java:451)
E at org.apache.flink.table.api.Table.getSchema(Table.java:101)
E at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
E at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
E at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
E at java.lang.reflect.Method.invoke(Method.java:498)
E at org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
E at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
E at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
E at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
E at org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
E at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
E at java.lang.Thread.run(Thread.java:748)
{code}
> Table.getSchema fails on table with watermark
> ---------------------------------------------
>
> Key: FLINK-28528
> URL: https://issues.apache.org/jira/browse/FLINK-28528
> Project: Flink
> Issue Type: Bug
> Components: API / Python
> Affects Versions: 1.15.1
> Reporter: Xuannan Su
> Priority: Major
>
> The bug can be reproduced with the following test. The test can pass if we use the commented way to define the watermark.
> {code:python}
> def test_flink_2(self):
> env = StreamExecutionEnvironment.get_execution_environment()
> t_env = StreamTableEnvironment.create(env)
> table = t_env.from_descriptor(
> TableDescriptor.for_connector("filesystem")
> .schema(
> Schema.new_builder()
> .column("name", DataTypes.STRING())
> .column("cost", DataTypes.INT())
> .column("distance", DataTypes.INT())
> .column("time", DataTypes.TIMESTAMP(3))
> .watermark("time", expr.col("time") - expr.lit(60).seconds)
> # .watermark("time", "`time` - INTERVAL '60' SECOND")
> .build()
> )
> .format("csv")
> .option("path", "./input.csv")
> .build()
> )
> print(table.get_schema())
> {code}
> It causes the following exception
> {code:none}
> E pyflink.util.exceptions.TableException: org.apache.flink.table.api.TableException: Expression 'minus(time, 60000)' is not string serializable. Currently, only expressions that originated from a SQL expression have a well-defined string representation.
> E at org.apache.flink.table.expressions.ResolvedExpression.asSerializableString(ResolvedExpression.java:51)
> E at org.apache.flink.table.api.TableSchema.lambda$fromResolvedSchema$13(TableSchema.java:455)
> E at java.util.Collections$SingletonList.forEach(Collections.java:4824)
> E at org.apache.flink.table.api.TableSchema.fromResolvedSchema(TableSchema.java:451)
> E at org.apache.flink.table.api.Table.getSchema(Table.java:101)
> E at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> E at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> E at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> E at java.lang.reflect.Method.invoke(Method.java:498)
> E at org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> E at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
> E at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
> E at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> E at org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
> E at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
> E at java.lang.Thread.run(Thread.java:748)
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)