You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Julian Hyde (Jira)" <ji...@apache.org> on 2021/06/21 16:49:00 UTC

[jira] [Commented] (CALCITE-4636) Wrong RelCollation if a field in GROUP BY appears in ORDER BY of WITHIN GROUP clause

    [ https://issues.apache.org/jira/browse/CALCITE-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17366726#comment-17366726 ] 

Julian Hyde commented on CALCITE-4636:
--------------------------------------

The commit message "Switch out of agg mode when constructing RelCollation for aggregate functions" describe the fix, not the problem. We prefer that the commit message is the JIRA subject, i.e. a description of the bug. That is what is most meaningful for readers of the release notes, and future developers.

> Wrong RelCollation if a field in GROUP BY appears in ORDER BY of WITHIN GROUP clause
> ------------------------------------------------------------------------------------
>
>                 Key: CALCITE-4636
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4636
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Rafay A
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.28.0
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> If we run this query:
> {code:java}
> select regionkey, collect(regionkey)
> within group (order by regionkey)
> from (SELECT * FROM (VALUES('blah','blah','blah',1,1)) as TBL(f1,f2,f3,nationkey,regionkey))
> group by regionkey
> {code}
>  
> We see that we are ordering by the same field as the group by field (regionkey). Now the query may be non-sensicle as there is no point of collecting the same field with in the same group, but syntactically its a valid SQL and should produce correct plan, but it generates this plan instead:
> {code:java}
> LogicalAggregate(group=[{0}], EXPR$1=[COLLECT($0) WITHIN GROUP ([1])])
>   LogicalProject(REGIONKEY=[$4], F1=[$0])
>     LogicalProject(F1=[$0], F2=[$1], F3=[$2], NATIONKEY=[$3], REGIONKEY=[$4])
>       LogicalValues(tuples=[[{ 'blah', 'blah', 'blah', 1, 1 }]])
> {code}
>  
> If we run the following *good* query, it generates correct plan:
>  
> {code:java}
> select nationkey, collect(regionkey)
> within group (order by regionkey)
> from (SELECT * FROM (VALUES('blah','blah','blah',1,1)) as TBL(f1,f2,f3,nationkey,regionkey))
> group by regionkey{code}
>  
> Generated plan:
> {noformat}
> LogicalAggregate(group=[{0}], EXPR$1=[COLLECT($1) WITHIN GROUP ([1])])
>   LogicalProject(NATIONKEY=[$3], REGIONKEY=[$4])
>     LogicalProject(F1=[$0], F2=[$1], F3=[$2], NATIONKEY=[$3], REGIONKEY=[$4])
>       LogicalValues(tuples=[[{ 'blah', 'blah', 'blah', 1, 1 }]]){noformat}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)