Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2020/12/18 07:09:00 UTC

[jira] [Assigned] (SPARK-33791) grouping__id() result does not consistent with hive's version < 2.3

     [ https://issues.apache.org/jira/browse/SPARK-33791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-33791:
------------------------------------

    Assignee:     (was: Apache Spark)

> grouping__id() result does not consistent with hive's version < 2.3
> -------------------------------------------------------------------
>
>                 Key: SPARK-33791
>                 URL: https://issues.apache.org/jira/browse/SPARK-33791
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.3, 3.0.1
>            Reporter: Su Qilong
>            Priority: Minor
>
> See this [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup]
> The behavior of Hive's grouping__id changed in Hive 2.3.0. Spark does not currently document this inconsistency with earlier Hive versions, so users may believe it is safe to migrate their queries from Hive 1.x to Spark when it is not.
> I think we should note this difference in the Hive migration guide and add a configuration that lets grouping__id use the Hive 1.x-compatible algorithm.


