You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Julian Hyde (Jira)" <ji...@apache.org> on 2022/10/27 01:03:00 UTC

[jira] [Commented] (CALCITE-5268) Add QUALIFY clause

    [ https://issues.apache.org/jira/browse/CALCITE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17624802#comment-17624802 ] 

Julian Hyde commented on CALCITE-5268:
--------------------------------------

I just saw the PR come in. Exciting to have a major feature under way! I assigned the issue to you.

The main thing left is to bulk out the testing:
 * I would like to see a few tests in {{{}SqlParserTest{}}}. Not many required. But let's make sure that {{QUALIFY}} works without and without preceding {{{}HAVING{}}}, or followed by {{{}ORDER BY{}}}, or followed by {{{}UNION{}}}, or in a subquery.
 * We need lots of validator tests, especially negative ones. For example, reference columns from {{QUALIFY}} that you are not allowed to reference (e.g. in a GROUP BY or HAVING query, you cannot reference a column or expression that is not in the {{GROUP BY}} clause). And check that the error position is correct.
 ** In some modes, you can reference a column via its alias or ordinal in a {{{}GROUP BY{}}}, {{HAVING}} or {{ORDER BY}} clause. Are you allowed to do the same in {{{}QUALIFY{}}}?
 ** If I write 'select ... from emp natural join dept', I am allowed to reference the common column 'deptno' without qualifying it as 'emp.deptno' or 'dept.deptno'.
 ** What happens if the {{QUALIFY}} expression is not {{{}BOOLEAN{}}}?
 ** Can I put (non-windowed) aggregate functions in {{{}QUALIFY{}}}?
 ** Can I apply windowed aggregate functions to aggregate functions?
 ** If my query does not contain {{{}GROUP BY{}}}, if I add {{SUM}} to the {{SELECT}} clause it nevertheless becomes an aggregate query. Same if I add a {{HAVING}} clause. Does the same thing happen if I have a (non-windowed) aggregate function in a {{QUALIFY}} clause?
 * Correlated subqueries. If my {{QUALIFY}} clause is inside a {{WHERE EXISTS}} subquery, can I reference columns from the enclosing query?
 * Create a {{qualify.iq}} file (copying say {{{}match.iq{}}}) and add some basic queries that run end-to-end. Use 'if (false)' around queries that don't work yet.

For the testing, try running some tricky queries on other engines (Snowflake, Teradata, BigQuery, DataBricks) and make sure that Calcite's error message at least as helpful as theirs. :)

Add the syntax to {{{}reference.md{}}}.

Please do all these things as separate commits, and don't squash or rebase, so that it's easier to review incrementally.

It was a pleasant surprise to get this PR. The code looks good quality, you probably have most of the implementation written already, and the testing will just make it robust for users and against future changes. Let's continue the momentum and get this done. Please nag me (and others on the dev list) for further reviews.

> Add QUALIFY clause
> ------------------
>
>                 Key: CALCITE-5268
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5268
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: Brandon Chong
>            Assignee: Brandon Chong
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{QUALIFY}} is to Window functions what {{HAVING}} is to {{GROUP BY}}.
> For example:
> {code}
> SELECT a, b, c, ROW_NUMBER() OVER (PARTITION BY b ORDER BY c) as row_num
> FROM t1
> QUALIFY row_num = 1
> {code}
> Is the same as:
> {code}
> SELECT *
> FROM (
>     SELECT a, b, c, ROW_NUMBER() OVER (PARTITION BY b ORDER BY c) as row_num
>     FROM t1)
> WHERE row_num = 1
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)