Posted to issues@flink.apache.org by "Fabian Hueske (JIRA)" <ji...@apache.org> on 2019/01/03 14:59:00 UTC

[jira] [Commented] (FLINK-11220) Can not Select row time field in JOIN query

    [ https://issues.apache.org/jira/browse/FLINK-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16733129#comment-16733129 ] 

Fabian Hueske commented on FLINK-11220:
---------------------------------------

Hi, I think manually assigning timestamps and watermarks should be the last resort. 

Users would need to have detailed knowledge about the execution strategy of the window join (or any other operator that affects timestamps and watermarks) to make a sound choice about the watermark generation strategy. In fact, I don't think that an average user will be able to choose the right watermark generation strategy.

Hence, I think we should always try to forward watermarks and time attributes as record timestamps.

There are a few things that we could do:

1) Add a query configuration switch to drop all timestamps and watermarks and keep the CAST option to choose the timestamp from multiple time attributes.
2) Add a method (or method variant) to choose the timestamp that should be forwarded (in case more than one time attribute is available) and a method (or method variant) to drop all time attributes and watermarks.
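
For reference, the CAST option mentioned in 1) might look roughly like this for the query from the issue description. This is only a sketch: it assumes, as the exception message suggests, that casting a rowtime field to TIMESTAMP turns it into a regular timestamp column, so that orderTime remains the only time attribute. Which field to keep is an arbitrary choice for illustration.

{code:sql}
-- Keep orderTime as the only rowtime attribute of the result;
-- payTime is cast to a plain TIMESTAMP and is no longer a time attribute.
SELECT orderTime, o.orderId, CAST(payTime AS TIMESTAMP) AS payTime
  FROM Orders AS o JOIN Payment AS p
  ON o.orderId = p.orderId AND
     p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR
{code}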

What do you think?
Fabian

> Can not Select row time field in JOIN query
> -------------------------------------------
>
>                 Key: FLINK-11220
>                 URL: https://issues.apache.org/jira/browse/FLINK-11220
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>    Affects Versions: 1.8.0
>            Reporter: sunjincheng
>            Priority: Major
>
> SQL:
> {code:java}
> Orders...toTable(tEnv, 'orderId, 'orderTime.rowtime)
> Payment...toTable(tEnv, 'orderId, 'payTime.rowtime)
> SELECT orderTime, o.orderId, payTime
>   FROM Orders AS o JOIN Payment AS p
>   ON o.orderId = p.orderId AND
>      p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR
> {code}
> Exception:
> {code:java}
> org.apache.flink.table.api.TableException: Found more than one rowtime field: [orderTime, payTime] in the table that should be converted to a DataStream.
> Please select the rowtime field that should be used as event-time timestamp for the DataStream by casting all other fields to TIMESTAMP.
> at org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:906)
> {code}
> The reason for the error is that the result has two time fields, `orderTime` and `payTime`. I think we do not need to throw the exception, and we can remove the logic of `plan.process(new OutputRowtimeProcessFunction[A](conversion, rowtimeFields.head.getIndex))`. If we want to use the timestamp after toDataStream, we should use `assignTimestampsAndWatermarks()`.
> What do you think? [~twalthr] [~fhueske]
>  


