Posted to issues@calcite.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2017/05/22 20:38:04 UTC

[jira] [Comment Edited] (CALCITE-1645) Row per match syntax support for MATCH_RECOGNIZE

    [ https://issues.apache.org/jira/browse/CALCITE-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020155#comment-16020155 ] 

Julian Hyde edited comment on CALCITE-1645 at 5/22/17 8:37 PM:
---------------------------------------------------------------

Thanks for fixing 2, 3, 4.

Regarding 1. I still think it should be boolean, not Boolean.

The goal of a RelNode should be to make it easy to write rules, and that includes canonization: if two expressions are equivalent, they should be structurally the same. It is less important whether the generated SQL looks the same as the input.

Also, we don't want people who write rules to have to know whether ONE ROW PER MATCH or ALL ROWS PER MATCH is the default. That should be sorted out at SQL-to-Rel time.

(Did you know that people can write "COUNT(ALL x)" which means the same as "COUNT(x)" and is the opposite of "COUNT(DISTINCT x)"? But of course we don't record whether they wrote "ALL". AggregateCall.distinct is a boolean, not a Boolean.)
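
A minimal sketch of the canonization argument, under stated assumptions: the names RowsPerMatchSketch, BoxedNode, PrimitiveNode and resolveAllRows are hypothetical and are not Calcite classes or methods. It only illustrates that a three-state Boolean lets two equivalent queries produce structurally different nodes, whereas resolving the default once (as would happen at SQL-to-Rel time) and storing a primitive boolean does not.

```java
import java.util.Objects;

/** Illustrative only; none of these types exist in Calcite. */
final class RowsPerMatchSketch {

  /**
   * Resolves what the parser saw into a two-state flag.
   * Assumption: null means the user wrote neither option, and
   * ONE ROW PER MATCH is the default, so null resolves to false.
   */
  static boolean resolveAllRows(Boolean parsedAllRows) {
    return parsedAllRows != null && parsedAllRows;
  }

  /** Node that keeps the three-state flag exactly as parsed. */
  static final class BoxedNode {
    final Boolean allRows;
    BoxedNode(Boolean allRows) { this.allRows = allRows; }
    @Override public boolean equals(Object o) {
      return o instanceof BoxedNode
          && Objects.equals(allRows, ((BoxedNode) o).allRows);
    }
    @Override public int hashCode() { return Objects.hashCode(allRows); }
  }

  /** Node that stores the already-resolved two-state flag. */
  static final class PrimitiveNode {
    final boolean allRows;
    PrimitiveNode(boolean allRows) { this.allRows = allRows; }
    @Override public boolean equals(Object o) {
      return o instanceof PrimitiveNode
          && allRows == ((PrimitiveNode) o).allRows;
    }
    @Override public int hashCode() { return Boolean.hashCode(allRows); }
  }

  public static void main(String[] args) {
    // Omitted clause vs. explicit ONE ROW PER MATCH: same meaning.
    Boolean omitted = null;
    Boolean explicitOneRow = Boolean.FALSE;

    // Boxed flag: the two equivalent inputs compare as different nodes.
    System.out.println(
        new BoxedNode(omitted).equals(new BoxedNode(explicitOneRow)));

    // Primitive flag, default resolved up front: the nodes are equal.
    System.out.println(
        new PrimitiveNode(resolveAllRows(omitted))
            .equals(new PrimitiveNode(resolveAllRows(explicitOneRow))));
  }
}
```

With the boxed field, a rule author would have to treat null and FALSE as the same case everywhere; with the primitive field, that decision is made once, before any rule sees the node.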

Regarding 5. Merging some test cases in the next pull request would be great. Thanks. It's not urgent. I just want to keep some balance between the number of tests (maintenance burden) and coverage. We can reduce the number of tests quite a bit, I think, without much drop in coverage.



> Row per match syntax support for MATCH_RECOGNIZE
> ------------------------------------------------
>
>                 Key: CALCITE-1645
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1645
>             Project: Calcite
>          Issue Type: Sub-task
>          Components: core
>    Affects Versions: 1.11.0
>            Reporter: Zhiqiang He
>            Assignee: Zhiqiang He
>              Labels: features
>
> h1. [ONE ROW | ALL ROWS] PER MATCH: Choosing Summaries or Details for Each Match
> You will sometimes want summary data about the matches and other times need details. You can do that with the following SQL:
> * ONE ROW PER MATCH
> Each match produces one summary row. This is the default.
> * ALL ROWS PER MATCH
> A match spanning multiple rows will produce one output row for each row in the match.
> The output is explained in "Row Pattern Output".
> The MATCH_RECOGNIZE clause may find a match with zero rows. For an empty match, ONE ROW PER MATCH returns a summary row: the PARTITION BY columns take the values from the row where the empty match occurs, and the measure columns are evaluated over an empty set of rows.
> ALL ROWS PER MATCH has three suboptions:
> * ALL ROWS PER MATCH SHOW EMPTY MATCHES
> * ALL ROWS PER MATCH OMIT EMPTY MATCHES
> * ALL ROWS PER MATCH WITH UNMATCHED ROWS



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)