You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2017/05/16 06:27:04 UTC

[jira] [Assigned] (CALCITE-1644) subset clause syntax support for MATCH_RECOGNIZE

     [ https://issues.apache.org/jira/browse/CALCITE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Hyde reassigned CALCITE-1644:
------------------------------------

    Assignee: Julian Hyde  (was: Zhiqiang He)

> subset clause syntax support for MATCH_RECOGNIZE
> ------------------------------------------------
>
>                 Key: CALCITE-1644
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1644
>             Project: Calcite
>          Issue Type: Sub-task
>          Components: core
>    Affects Versions: 1.11.0
>            Reporter: Zhiqiang He
>            Assignee: Julian Hyde
>              Labels: features
>
> h1. SUBSET: Defining Union Row Pattern Variables
> At times, it is helpful to create a grouping of multiple pattern variables that can be referred to with a variable name of its own. These groupings are called union row pattern variables, and you create them with the SUBSET clause. The union row pattern variable created by SUBSET can be used in the MEASURES and DEFINE clauses. The SUBSET clause is optional. It is used to declare union row pattern variables. For example, here is a query using SUBSET to calculate an average based on all rows that are mapped to the union of STRT and DOWN variables, where STRT is the starting point for a pattern, and DOWN is the downward (left) leg of a V shape.
> Example 20-5 illustrates creating a union row pattern variable.
> **Example 20-5 Defining Union Row Pattern Variables**
> {code:sql}
> SELECT *
> FROM Ticker MATCH_RECOGNIZE(
>      PARTITION BY symbol
>      ORDER BY tstamp
>      MEASURES FIRST(STRT.tstamp) AS strt_time,
>               LAST(DOWN.tstamp) AS bottom,
>               AVG(STDN.Price) AS stdn_avgprice
>      ONE ROW PER MATCH
>      AFTER MATCH SKIP TO LAST UP
>      PATTERN (STRT DOWN+ UP+)
>      SUBSET STDN= (STRT, DOWN)
>      DEFINE
>         UP AS UP.Price > PREV(UP.Price),
>         DOWN AS DOWN.Price < PREV (DOWN.Price)
> );
>  
> SYMBOL    STRT_TIME BOTTOM    STDN_AVGPRICE
> ------    --------- --------- -------------
> ACME      05-APR-11 06-APR-11          18.5
> ACME      10-APR-11 12-APR-11    19.6666667
> ACME      14-APR-11 16-APR-11            17
> {code}
> This example declares a single union row pattern variable, STDN, and defines it as the union of the rows mapped to STRT and the rows mapped to DOWN. There can be multiple union row pattern variables in a query. For example:
> {code:sql}
> PATTERN (W+ X+ Y+ Z+)
> SUBSET XY = (X, Y),
>        WZ = (W, Z)
>        {code}
> The right-hand side of a SUBSET item is a comma-separated list of distinct primary row pattern variables within parentheses. This defines the union row pattern variable (on the left-hand side) as the union of the primary row pattern variables (on the right-hand side).
> Note that the list of pattern variables on the right-hand side may not include any union row pattern variables (there are no unions of unions).
> For every match, there is one implicit union row pattern variable called the universal row pattern variable. The universal row pattern variable is the union of all primary row pattern variables. For instance, if your pattern has primary pattern variable A, B, and C, then the universal row pattern variable is equivalent to a SUBSET clause with the argument (A, B, C). Thus, every row of a match is mapped to the universal row pattern variable. Any unqualified column reference within the MEASURES or DEFINE clauses is implicitly qualified by the universal row pattern variable. Note that there is no keyword to explicitly specify the universal row pattern variable.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)