You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Julian Hyde (Jira)" <ji...@apache.org> on 2023/06/10 19:55:00 UTC

[jira] [Resolved] (CALCITE-5767) JDBC adapter for MSSQL adds GROUPING to ORDER BY clause twice when emulating NULLS LAST

     [ https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Hyde resolved CALCITE-5767.
----------------------------------
    Fix Version/s: 1.35.0
       Resolution: Fixed

Fixed in [0f0288f4|https://github.com/apache/calcite/commit/0f0288f4e93476823329a5ec59b471fd186bd16d]; thanks for the PR, [~wnoble]!

> JDBC adapter for MSSQL adds GROUPING to ORDER BY clause twice when emulating NULLS LAST
> ---------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5767
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5767
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Will Noble
>            Assignee: Will Noble
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.35.0
>
>
> JDBC adapter for MSSQL adds a call to the {{GROUPING}} function to the {{ORDER BY}} clause twice when emulating {{NULLS LAST}}. This is a problem because MSSQL disallows duplicate sort keys; it is caused because the MSSQL dialect wrongly gives the {{GROUPING}} function special treatment when emulating NULL direction.
> {{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}} calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95]. This seems to be an optimization attempt since {{GROUPING}} is known to never return {{{}NULL{}}}, and therefore needs no null direction emulation. However, this causes problems because:
> 1. MSSQL does not have the same default null direction as Calcite's {{RelBuilder}} (and most other dialects).
> 2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax.
> 3. MSSQL does not allow sorting on the same field twice, even though there is no theoretical issue with this.
> Each of these properties must be present for the problem to occur, so it's a bit niche and specific to MSSQL. Seems like the best solution is to simply eliminate the special-case treatment for {{GROUPING}} in {{{}emulateNullDirection{}}}, which is currently creating problems due to property #3.
> {*}More in-depth explanation{*}:
> In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert rex nodes as sorting expressions, but this is only the default null direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has special-case logic for emulating null direction of GROUPING calls, whereby it effectively duplicates the expression. Really, {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning {{null}} instead, signalling to callers that no null-direction emulation is necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes another problem due to property #2 above when the null direction is non-default as is caused simply by using {{RelBuilder.collation}} as described above (it should be noted that this method takes rex nodes instead of {{RelFieldCollation}} object, so there is no way to specify null direction) because the non-default null direction is not expanded into a {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax).
> Here's a test illustrating the problem:
> Input SQL (default dialect)
> {code:xml}
> select "product_class_id", "brand_name", GROUPING("brand_name")
> from "product"
> group by GROUPING SETS (("product_class_id", "brand_name"), ("product_class_id"))
> order by 3, 2, 1
> {code}
> Current behavior for unparsing as MSSQL (incorrect because it orders by the same column twice; {{GROUPING([brand_name])}} and {{{}3{}}}, which will fail if you try to actually run this against a real MSSQL database, even though it seems like it shouldn't):
> {code:xml}
> SELECT [product_class_id], [brand_name], GROUPING([brand_name])
> FROM [foodmart].[product]
> GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id])
> ORDER BY
>          GROUPING([brand_name]),
>          3,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}
> Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS LAST}} syntax):
> {code:xml}
> ...
> ORDER BY
>          3 NULLS LAST,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}
> Acceptable behavior (although the first {{{}ORDER BY{}}}-clause is effectively ordering by a constant, this will at least run and produce the correct results):
> {code:xml}
> ...
> ORDER BY
>          CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END,
>          3,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)