You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "KrishnaKant Agrawal (JIRA)" <ji...@apache.org> on 2019/01/09 15:57:00 UTC

[jira] [Commented] (CALCITE-2757) DISTINCT not being handled correctly in RelToSqlConverter

    [ https://issues.apache.org/jira/browse/CALCITE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16738352#comment-16738352 ] 

KrishnaKant Agrawal commented on CALCITE-2757:
----------------------------------------------

Hi,

Thanks in advance for the patience required in reading this long comment. 

When I set the needNew flag as true in this case, something weird happens:- 
{code:java}
@Test public void testSelectWithDistinct() {
  String query = "select distinct \"product_id\","
      + " sum(\"product_class_id\") over (partition by \"product_id\") "
      + " from \"product\" ";
  final String expected = "SELECT \"product_id\", \"EXPR$1\"\n"
      + "FROM (SELECT \"product_id\", SUM(\"product_class_id\") OVER "
      + "(PARTITION BY \"product_id\" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)\n"
      + "FROM \"foodmart\".\"product\") AS \"t\"\n"
      + "GROUP BY \"product_id\", \"EXPR$1\"";
  sql(query).withHive().ok(expected);
{code}
 
Actual(Removed default window bounds because of visual clarity but they are printed):-
{code:sql}
SELECT \"product_id\", SUM(\"product_class_id\") OVER (PARTITION BY \"product_id\" )
FROM (SELECT \"product_id\", SUM(\"product_class_id\") OVER (PARTITION BY \"product_id\" )
FROM \"foodmart\".\"product\") AS \"t\"
GROUP BY \"product_id\", SUM(\"product_class_id\") OVER (PARTITION BY \"product_id\" )
{code}
Now when we see the EXPR$1, it is the correct thing as the SUM() expression wasn't aliased in the original query.

When I set the needNew true, a subquery was formed so the printing the expressions of the inner query at the parent level when input columns needed to construct that expression in the parent query may not be available anymore, seems like a bug.

 

This is not a problem if the SUM() expression was aliased in the original query.

 

This is being done in AliasContext.field() method where we store expressions in SqlImplementor.ordinalMap and return them as is from AliasContext.field().

Can somebody shed light on this? 

 

> DISTINCT not being handled correctly in RelToSqlConverter
> ---------------------------------------------------------
>
>                 Key: CALCITE-2757
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2757
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: KrishnaKant Agrawal
>            Assignee: Julian Hyde
>            Priority: Major
>
>   SELECT DISTINCT sum( x ) OVER (PARTITION BY y) FROM t
> is valid (per SQL standard) but
>   SELECT sum( x ) OVER (PARTITION BY y)
>   FROM t
>   GROUP BY sum( x ) OVER (PARTITION BY y)
> is not. For example, given the query
>   select sum(deptno) over (partition by loc)
>   from dept
>   group by  sum(deptno) over (partition by loc);
> Oracle gives
>   ORA-00934: group function is not allowed here
> Therefore we should generate a sub-query, something like this:
>   SELECT c1
>   FROM (
>     SELECT sum(deptno) OVER (PARTITION BY loc)
>     FROM dept) AS t
>   GROUP BY c1;
>  
> This will be achieved by Adding a new condition for setting the needNew Flag in SqlImplemontor.Builder.builder() as true in case there are Aggregate Expressions being passed as Group By Keys.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)