You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Rajeshbabu Chintaguntla (JIRA)" <ji...@apache.org> on 2017/05/02 16:46:04 UTC

[jira] [Commented] (CALCITE-1306) Allow GROUP BY and HAVING to reference SELECT expressions by ordinal and alias

    [ https://issues.apache.org/jira/browse/CALCITE-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993242#comment-15993242 ] 

Rajeshbabu Chintaguntla commented on CALCITE-1306:
--------------------------------------------------

[~julianhyde] Thanks for review. Committed the changes handling review comments to pull request at https://github.com/apache/calcite/pull/413.
bq. The query SELECT a, COUNT(*) AS c FROM t GROUP BY a, c is cyclic and must fail in validation.
It fails with aggregate functions not in group by in both calcite and Postgres. Added test as well.
bq. Are large literals interpreted as column references (and therefore fail because out of range) or are they treated as integers? E.g. SELECT deptno FROM emp GROUP BY deptno, 100. See what Postgres does and make sure we do the same.
Postgres fails with ordinal position not in range same in calcite after my changes. Added test as well.
bq. Are literals inside expressions interpreted as column references? E.g. SELECT COUNT(*) FROM (SELECT 1 FROM emp GROUP BY substring(name FROM 2 FOR 3)). See what Postgres does and make sure we do the same.
This is valid in Postgres and same in calcite now. 2 and 3 are considered as liternals not an ordinal. Added test as well.
bq. Check that if the conformance disables them, we get the old behavior for aliases and ordinals.
Added the tests which fail without conformance. Added test as well.
bq.Should not be able to use alias in an aggregate function in the HAVING clause. E.g. SELECT deptno AS x FROM emp HAVING min(x) < 20 is an error.
Not allowing alias in having aggregate expression now with my changes. Added test as well.
bq.If an alias matches a column, the alias wins. E.g. SELECT COUNT(*) FROM (SELECT gender AS deptno FROM emp GROUP BY deptno) should return 2. Please see what Postgres does. Also test a HAVING query.
Giving priority to alias than column when there is conflict. But Postgres gives priority to column name and fail with ambiguous column. Added test as well.
bq.If an alias matches two columns, the alias wins, and the query is not ambiguous. E.g. SELECT COUNT(*) FROM (SELECT gender AS deptno FROM emp, dept GROUP BY deptno). Also test a HAVING query.
Giving priority to alias than column name. Added test as well.
bq.Matching is according to the case of the session. Thus SELECT x + y AS "z" FROM t GROUP BY "Z" is valid if case sensitivity is off.
It works if sensitivity is off added test as well.
bq.Expressions involving aliases, e.g. SELECT a + b AS c, a + b + d, COUNT(*) FROM t GROUP BY c, d should be valid, because even though a + b + d is not grouped, it matches (a + b) + d, which combines two grouped expressions. Check what Postgres does.
This works as well and added test cases for the same. Postgres allows the same.
bq. Referencing aliases in the SELECT clause is not valid, SELECT a + b AS c, c + d FROM t GROUP BY c, d. Again, see what Postgres does.
This is not valid in calcite and Postgres.
bq. We need to test sql-to-rel conversion for some of the cases that do not fail but have different behavior, e.g. large literals, alias same name as columns.
Added some sql-to-rel tests for corner cases.
bq.e.getCause().getMessage().equals(RESOURCE.columnNotFound(id.names.get(0)).str())
Removed this.
bq. if (root.getKind().equals(SqlKind.CUBE)
Changed to switch case.

Please review the changes. 

> Allow GROUP BY and HAVING to reference SELECT expressions by ordinal and alias
> ------------------------------------------------------------------------------
>
>                 Key: CALCITE-1306
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1306
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Rajeshbabu Chintaguntla
>              Labels: dialect, phoenix
>
> Allow GROUP BY and HAVING to reference SELECT expressions by ordinal and alias. It is not standard SQL, but MySQL and PostgreSQL allow it.
> See [Stack Overflow: SQL - using alias in Group By|http://stackoverflow.com/questions/3841295/sql-using-alias-in-group-by].
> It would be enabled only by new methods {{isGroupByOrdinal}} and {{isGroupByAlias}} in SqlConformance.
> We might allow alias in the HAVING clause (as described in HIVE-10557) but ordinal does not make sense.
> Expressions that are not available before grouping would be illegal; for instance:
> {code}
> select count(*) as c
> from t
> group by c
> {code}
> We'd also need rules to resolve ambiguous expressions. For instance, in
> {code}
> select e.empno as deptno
> from emp as e join dept as d
> where e.deptno = d.deptno
> group by deptno
> {code}
> does {{deptno}} refer to {{e.deptno}}, {{d.deptno}}, or {{e.empno}}?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)