Posted to issues@flink.apache.org by "luoyuxia (Jira)" <ji...@apache.org> on 2022/06/24 12:54:00 UTC
[jira] [Commented] (FLINK-28247) Exception will be thrown when over window contains grouping in Hive Dialect
[ https://issues.apache.org/jira/browse/FLINK-28247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558479#comment-17558479 ]
luoyuxia commented on FLINK-28247:
----------------------------------
The issue was introduced in this [change|https://github.com/apache/flink/pull/15983/files#diff-ea0bbb90359ad01a52dbb18628362f6714a6c7817ca2367e7fd2c4257e153d1dR734]: it seems `current.getChildCount() == 2` was changed to `current.getChildCount() >= 2` by mistake.
The simplest fix is to revert that one line. That is fine for Hive 2.x, where the grouping function accepts only one parameter.
But Hive 3.x supports a multi-argument grouping function ([HIVE-15996|https://issues.apache.org/jira/browse/HIVE-15996]), so the reverted check would not work when the grouping function has more than one parameter.
For better compatibility, we may need to follow the behavior in Hive 3.x.
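To make the difference between the two checks concrete, here is a toy model of the double rewrite described in the quoted issue. It is only a sketch, not the Flink code: the class `GroupingRewriteSketch`, the helpers `node` and `rewrite`, the list-of-strings node representation, and the error message are all hypothetical. It assumes that the child count includes the function name, and that the rewrite replaces the argument with the constant 0 and appends an index chosen so that `grouping(category)` becomes `grouping(0, 1)`, as in the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GroupingRewriteSketch {

    // Toy stand-in for a TOK_FUNCTION node (hypothetical representation):
    // element 0 is the function name, the remaining elements are its arguments.
    public static List<String> node(String... children) {
        return new ArrayList<>(Arrays.asList(children));
    }

    // Simplified, hypothetical rewrite. With exactlyTwoChildren == true the guard
    // is "child count == 2"; otherwise it is "child count >= 2".
    public static void rewrite(List<String> fn, List<String> groupBy, boolean exactlyTwoChildren) {
        boolean countMatches = exactlyTwoChildren ? fn.size() == 2 : fn.size() >= 2;
        if (!countMatches || !fn.get(0).equals("grouping")) {
            return; // not a grouping call this guard accepts; leave it untouched
        }
        int idx = groupBy.indexOf(fn.get(1));
        if (idx < 0) {
            // Illustrative message only, not the text of the real exception.
            throw new IllegalStateException("grouping argument not found in GROUP BY: " + fn.get(1));
        }
        // Assumed output shape: constant 0 plus an index, so that
        // grouping(category) with GROUP BY category, live becomes grouping(0, 1).
        fn.set(1, "0");
        fn.add(String.valueOf(groupBy.size() - idx - 1));
    }

    public static void main(String[] args) {
        List<String> groupBy = Arrays.asList("category", "live");

        // "== 2": the second pass sees three children and skips the node.
        List<String> strict = node("grouping", "category");
        rewrite(strict, groupBy, true);
        rewrite(strict, groupBy, true);
        System.out.println(strict);

        // ">= 2": the second pass tries to look up "0" in the GROUP BY and fails.
        List<String> loose = node("grouping", "category");
        rewrite(loose, groupBy, false);
        try {
            rewrite(loose, groupBy, false);
        } catch (IllegalStateException e) {
            System.out.println("second pass failed: " + e.getMessage());
        }
    }
}
```

In this model the `== 2` guard makes the rewrite safe to apply twice, but it would also skip a call such as `grouping(category, live)`, which is the Hive 3.x limitation mentioned above.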
> Exception will be thrown when over window contains grouping in Hive Dialect
> ---------------------------------------------------------------------------
>
> Key: FLINK-28247
> URL: https://issues.apache.org/jira/browse/FLINK-28247
> Project: Flink
> Issue Type: Sub-task
> Reporter: luoyuxia
> Priority: Major
>
> The exception can be reproduced with the following SQL when using the Hive dialect:
> {code:java}
> create table t(category int, live int, comments int);
> SELECT grouping(category), lag(live) over(partition by grouping(category)) FROM t GROUP BY category, live; {code}
> The reason is that it first calls `HiveParserCalcitePlanner#genSelectForWindowing` to generate the window, which then calls `HiveParserUtils#rewriteGroupingFunctionAST` to rewrite the grouping function in the over window:
>
> {code:java}
> // rewrite grouping function
> if (current.getType() == HiveASTParser.TOK_FUNCTION
>         && current.getChildCount() >= 2) {
>     HiveParserASTNode func = (HiveParserASTNode) current.getChild(0);
>     if (func.getText().equals("grouping")) {
>         visited.setValue(true);
>         convertGrouping(
>                 current, grpByAstExprs, noneSet, legacyGrouping, found);
>     }
> }
> {code}
>
> So `grouping(category)` will be converted to `grouping(0, 1)`.
> After `HiveParserCalcitePlanner#genSelectForWindowing`, it will try to rewrite it again:
>
> {code:java}
> if (!qbp.getDestToGroupBy().isEmpty()) {
>     // Special handling of grouping function
>     expr =
>             rewriteGroupingFunctionAST(
>                     getGroupByForClause(qbp, selClauseName),
>                     expr,
>                     !cubeRollupGrpSetPresent);
> } {code}
> And it will enter `convertGrouping` again, since `current.getChildCount() >= 2` is still true. But this time it can't find any field
> present in the group by, because the function is now `grouping(0, 1)`.
>