You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Hongze Zhang (JIRA)" <ji...@apache.org> on 2019/02/15 14:36:00 UTC

[jira] [Commented] (CALCITE-2757) DISTINCT not being handled correctly in RelToSqlConverter

    [ https://issues.apache.org/jira/browse/CALCITE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769367#comment-16769367 ] 

Hongze Zhang commented on CALCITE-2757:
---------------------------------------

Hi [~Krishnakant A], I'm wondering if the generated SQL like [following|https://github.com/apache/calcite/pull/1000/files#diff-f985d2e1a0804e0c7250da49b53b9d85R278] could work correctly on different SQL databases.

{code:sql}
SELECT \"product_id\", \"EXPR$1\"
FROM (SELECT \"product_id\", SUM(\"product_class_id\") OVER (PARTITION BY \"product_id\" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM \"foodmart\".\"product\") AS \"t\"
GROUP BY \"product_id\", \"EXPR$1\"
{code}

This requires the expression "SUM(\"product_class_id\") OVER (...)" generate alias "EXPR$1" implicitly, I don't think it is a very common behavior for popular SQL databases.
For example, If I run following SQL on MySQL database
{code:sql}
SELECT EXPR$1 FROM (SELECT id, MAX(rev)
FROM `docs`
GROUP BY id) t
{code}
, I'll get an error with message "Unknown column 'EXPR$1' in 'field list'".

> DISTINCT not being handled correctly in RelToSqlConverter
> ---------------------------------------------------------
>
>                 Key: CALCITE-2757
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2757
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: KrishnaKant Agrawal
>            Assignee: Julian Hyde
>            Priority: Major
>              Labels: pull-request-available
>
>   SELECT DISTINCT sum( x ) OVER (PARTITION BY y) FROM t
> is valid (per SQL standard) but
>   SELECT sum( x ) OVER (PARTITION BY y)
>   FROM t
>   GROUP BY sum( x ) OVER (PARTITION BY y)
> is not. For example, given the query
>   select sum(deptno) over (partition by loc)
>   from dept
>   group by  sum(deptno) over (partition by loc);
> Oracle gives
>   ORA-00934: group function is not allowed here
> Therefore we should generate a sub-query, something like this:
>   SELECT c1
>   FROM (
>     SELECT sum(deptno) OVER (PARTITION BY loc)
>     FROM dept) AS t
>   GROUP BY c1;
>  
> This will be achieved by Adding a new condition for setting the needNew Flag in SqlImplemontor.Builder.builder() as true in case there are Aggregate Expressions being passed as Group By Keys.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)