You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Julian Hyde (Jira)" <ji...@apache.org> on 2020/01/14 02:05:00 UTC

[jira] [Resolved] (CALCITE-3593) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate

     [ https://issues.apache.org/jira/browse/CALCITE-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Hyde resolved CALCITE-3593.
----------------------------------
    Resolution: Fixed

Fixed in [3621b61e|https://github.com/apache/calcite/commit/3621b61ea0bca953652a2e6d1838e5a296c6a007]; thanks for the PR!

Also added SqlConformanceEnum.BIG_QUERY. It is similar to PRAGMATIC_2003 except that isGroupByAlias() and isHavingAlias() return true.

> RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
> -------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-3593
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3593
>             Project: Calcite
>          Issue Type: Bug
>          Components: jdbc-adapter
>            Reporter: Steven Talbot
>            Assignee: Jin Xing
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.22.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Best shown with the shell of a test in RelToSqlConverter test.
>  The following SQL on BigQuery 
> {code:java}
> select product_id - 1000 as product_id
> from (
>  select product_id, avg(gross_weight) as agw
>  from (SELECT 1 as product_id, 70 as net_weight, 170 as gross_weight) as product
>  where net_weight < 100
>  group by product_id having product_id > 0){code}
> produces one result, because the having filter applies to the product id before subtraction, of course.
> Running it through the machinery in that test (`sql(query).withBigQuery().ok(expected)`) translates it to:
> {noformat}
> SELECT product_id - 1000 AS product_id
> from (SELECT 1 as product_id, 70 as net_weight, 170 as gross_weight) as product
> WHERE net_weight < 100
> GROUP BY product_id
> HAVING product_id > 0{noformat}
> This changes the meaning of the query: now the HAVING is on the after-subtraction product_id and you get no results, rather than the one result.
> Note that this is _not_ true in HyperSQL, as it has different semantics around the HAVING namespace.
> {noformat}
> select "product_id" - 1000 as "product_id"
> from (
>  select "product_id", avg("gross_weight") as agw
>  from (SELECT 1 as "product_id", 70 as "net_weight", 170 as "gross_weight" FROM (VALUES(0))) as product
>  where "net_weight" < 100
>  group by "product_id" having "product_id" > 0){noformat}
> becomes  
> {noformat}
> SELECT "product_id" - 1000 AS "product_id"
> from (SELECT 1 as "product_id", 70 as "net_weight", 170 as "gross_weight" FROM (VALUES(0))) as product
> WHERE "net_weight" < 100
> GROUP BY "product_id"
> HAVING "product_id" > 0{noformat}
> But the meaning is preserved: both return a row. 
> I'm not enough of a SQL standards expert to know which one is being more compliant, but presumably both would have to be supported via some sort of flag?
> I think the fix would be to force the subselect on dialects such as BigQuery that have this behavior. Probably something that looks a lot like [https://github.com/apache/calcite/blob/3530daaa8cad43aad6845b6c79e4bc1ca0e72f5f/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1043-L1047]
> The test, of course, looks like pretty silly SQL no one would ever write, but the point is this is what's generated when you have 
> {noformat}
> Project f(x) as x
>   Filter g(x)
>     Aggregate {<x>}, ...{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)