You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Julian Hyde (Jira)" <ji...@apache.org> on 2023/06/07 23:16:00 UTC

[jira] [Comment Edited] (CALCITE-5767) MSSQL fails to unparse properly when sorting by GROUPING expression

    [ https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730306#comment-17730306 ] 

Julian Hyde edited comment on CALCITE-5767 at 6/7/23 11:15 PM:
---------------------------------------------------------------

Makes sense, and your fix looks good. But can you revise your summary and description? Currently they follow your stream of consciousness as you investigated the bug; what we need for the release notes is a concise description of the problem and the fix.

The problem is that the JDBC adapter for MSSQL is not handling GROUPING SETS and NULLS LAST correctly. (Your current summary makes it sounds like it's MSSQL's fault.) MSSQL is a particular problem because it doesn't support NULLS LAST, so we have to emulate it.

In your description you make it sound like the problem is in {{RelBuilder}}. It isn't, because {{RelBuilder}} is not targeting a particular dialect. So please de-emphasize that in the description. (As your comment in RelBuilder correctly implies, it is implicitly working in Calcite's dialect, and Calcite's null ordering, and that creates a problem that the dialect will need to solve.) 

In your PR, in the javadoc of the test method, please enclose the Jira URL in {{<a>}}, and include the bug summary, like other test methods.


was (Author: julianhyde):
Makes sense, and your fix looks good. But can you revise your summary and description? Currently they follow your stream of consciousness as you investigated the bug; what we need for the release notes is a concise description of the problem and the fix.

The problem is that the JDBC adapter for MSSQL is not handling GROUPING SETS and NULLS LAST correctly. (Your current summary makes it sounds like it's MSSQL's fault.) MSSQL is a particular problem because it doesn't support NULLS LAST, so we have to emulate it.

In your PR, in the javadoc of the test method, please enclose the Jira URL in `<a>`, like other test methods.

In your description you make it sound like the problem is in {{RelBuilder}}. So please de-emphasize that.

> MSSQL fails to unparse properly when sorting by GROUPING expression
> -------------------------------------------------------------------
>
>                 Key: CALCITE-5767
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5767
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Will Noble
>            Assignee: Will Noble
>            Priority: Minor
>              Labels: pull-request-available
>
> In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert rex nodes as sorting expressions, but this is only the default null direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has special-case logic for emulating null direction of GROUPING calls, whereby it effectively duplicates the expression. Really, {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning {{null}} instead, signalling to callers that no null-direction emulation is necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes another problem when the null direction is non-default as is caused simply by using {{RelBuilder.collation}} as described above (it should be noted that this method takes rex nodes instead of {{RelFieldCollation}} object, so there is no way to specify null direction) because the non-default null direction is not expanded into a {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax).
> Here's a test illustrating the problem:
> Input SQL (default dialect)
> {code:xml}
> select "product_class_id", "brand_name", GROUPING("brand_name")
> from "product"
> group by GROUPING SETS (("product_class_id", "brand_name"), ("product_class_id"))
> order by 3, 2, 1
> {code}
> Current behavior for unparsing as MSSQL (incorrect because it orders by the same column twice; {{GROUPING([brand_name])}} and {{3}}, which will fail if you try to actually run this against a real MSSQL database, even though it seems like it shouldn't):
> {code:xml}
> SELECT [product_class_id], [brand_name], GROUPING([brand_name])
> FROM [foodmart].[product]
> GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id])
> ORDER BY
>          GROUPING([brand_name]),
>          3,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}
> Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS LAST}} syntax):
> {code:xml}
> ...
> ORDER BY
>          3 NULLS LAST,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}
> Acceptable behavior (although the first {{ORDER BY}}-clause is effectively ordering by a constant, this will at least run and produce the correct results):
> {code:xml}
> ...
> ORDER BY
>          CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END,
>          3,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)