Posted to issues@flink.apache.org by "Dian Fu (JIRA)" <ji...@apache.org> on 2017/07/29 01:31:02 UTC

[jira] [Comment Edited] (FLINK-7293) Support custom order by in PatternStream

    [ https://issues.apache.org/jira/browse/FLINK-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105963#comment-16105963 ] 

Dian Fu edited comment on FLINK-7293 at 7/29/17 1:30 AM:
---------------------------------------------------------

{quote}
Could you explain a bit why this is needed? 
{quote}
We need to support clauses such as the following:
{code}
SELECT *
FROM Ticker MATCH_RECOGNIZE (
     PARTITION BY symbol
     ORDER BY tstamp, price
     MEASURES  STRT.tstamp AS start_tstamp,
               LAST(DOWN.tstamp) AS bottom_tstamp,
               LAST(UP.tstamp) AS end_tstamp
     ONE ROW PER MATCH
     AFTER MATCH SKIP TO LAST UP
     PATTERN (STRT DOWN+ UP+)
     DEFINE
        DOWN AS DOWN.price < PREV(DOWN.price),
        UP AS UP.price > PREV(UP.price)
     ) MR
{code}
There may be multiple columns to order by.

{quote}
I can't see a way to sort an unbounded stream of data. Could you elaborate a bit on how you see it working,
and how this is going to play with the time semantics?
When both event-time and a custom order-by are used, who is going to win?
{quote}
This works the same way as the implementation of {{sort by}} in the Table API. That is, both the event time and the custom order-by are used: the event time takes higher priority and the custom order-by takes lower priority. When both are in use, arriving events are first ordered by event time; when a watermark arrives, the events before the watermark that share the same event time are ordered by the custom order-by before they are emitted. (Please refer to [DataStreamSort.scala|https://github.com/apache/flink/blob/b8c8f204de718e6d5b7c3df837deafaed7c375f5/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala] for more details.)
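
To make the ordering rule above concrete, here is a minimal sketch in plain Java. It is not the DataStreamSort code and does not use any Flink API; {{WatermarkSortBuffer}} and {{Tick}} are hypothetical names made up for illustration, and the watermark is assumed to be passed in as a plain long. Events are buffered by event time, and on each watermark everything at or before it is released in event-time order, with the custom comparator applied only among events that share the same event time.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical event type mirroring the Ticker columns used in the query above. */
final class Tick {
    final String symbol;
    final long tstamp;   // event time
    final double price;

    Tick(String symbol, long tstamp, double price) {
        this.symbol = symbol;
        this.tstamp = tstamp;
        this.price = price;
    }
}

/** Hypothetical helper (not part of Flink): buffers events until a watermark arrives. */
final class WatermarkSortBuffer<T> {
    // Buffered events grouped by event time; TreeMap keeps the timestamps ascending.
    private final TreeMap<Long, List<T>> pending = new TreeMap<>();
    // Custom order-by, applied only among events that share one event time.
    private final Comparator<? super T> customOrder;

    WatermarkSortBuffer(Comparator<? super T> customOrder) {
        this.customOrder = customOrder;
    }

    /** Buffers an event under its event time; nothing is emitted yet. */
    void add(long eventTime, T event) {
        pending.computeIfAbsent(eventTime, t -> new ArrayList<>()).add(event);
    }

    /** Removes and returns all buffered events with eventTime <= watermark, fully ordered. */
    List<T> onWatermark(long watermark) {
        List<T> out = new ArrayList<>();
        Iterator<Map.Entry<Long, List<T>>> it =
                pending.headMap(watermark, true).entrySet().iterator();
        while (it.hasNext()) {
            List<T> sameTime = it.next().getValue();
            sameTime.sort(customOrder);   // lower-priority ordering within one timestamp
            out.addAll(sameTime);
            it.remove();                  // these events are now emitted
        }
        return out;
    }
}
```

For {{ORDER BY tstamp, price}}, the buffer would be created with {{Comparator.comparingDouble((Tick t) -> t.price)}} and each event added with {{add(tick.tstamp, tick)}}; further columns could be chained with {{thenComparing}}. Events later than the watermark stay buffered until a later watermark covers them.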

Thoughts?




was (Author: dian.fu):
{quote}
Could you explain a bit why this is needed? 
{quote}
As we need to support clauses such as
{code}
SELECT *
FROM Ticker MATCH_RECOGNIZE (
     PARTITION BY symbol
     ORDER BY tstamp, price
     MEASURES  STRT.tstamp AS start_tstamp,
               LAST(DOWN.tstamp) AS bottom_tstamp,
               LAST(UP.tstamp) AS end_tstamp
     ONE ROW PER MATCH
     AFTER MATCH SKIP TO LAST UP
     PATTERN (STRT DOWN+ UP+)
     DEFINE
        DOWN AS DOWN.price < PREV(DOWN.price),
        UP AS UP.price > PREV(UP.price)
     ) MR
{code}
There may be multiple columns to order by.

{quote}
I can't see a way to sort an unbounded stream of data. Could you elaborate a bit on how you see it working,
and how this is going to play with the time semantics?
When both event-time and a custom order-by are used, who is going to win?
{quote}
This works the same way as the implementation of {{sort by}} in the Table API. That is, both the event time and the custom order-by are used: the event time takes higher priority and the custom order-by takes lower priority. (Please refer to [DataStreamSort.scala|https://github.com/apache/flink/blob/b8c8f204de718e6d5b7c3df837deafaed7c375f5/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala] for more details.)

Thoughts?



> Support custom order by in PatternStream
> ----------------------------------------
>
>                 Key: FLINK-7293
>                 URL: https://issues.apache.org/jira/browse/FLINK-7293
>             Project: Flink
>          Issue Type: Sub-task
>          Components: CEP
>            Reporter: Dian Fu
>            Assignee: Dian Fu
>
> Currently, when {{ProcessingTime}} is configured, events are fed to the NFA in order of arrival, and when {{EventTime}} is configured, they are fed to the NFA in event-time order. A custom {{order by}} should also be supported so that users can define the order of events beyond these two factors.


