Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2018/10/17 06:09:04 UTC
[jira] [Updated] (SPARK-24424) Support ANSI-SQL compliant syntax for GROUPING SET
[ https://issues.apache.org/jira/browse/SPARK-24424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li updated SPARK-24424:
----------------------------
Description:
Currently, our GROUP BY clause follows Hive [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup].
However, this syntax is not ANSI SQL compliant. The proposal is to update our parser and analyzer for ANSI compliance.
For example, the existing Hive-style syntax is:
{code:sql}
GROUP BY col1, col2 GROUPING SETS ...
{code}
It would be nice to support the ANSI SQL syntax at the same time:
{code:sql}
GROUP BY GROUPING SETS(...)
{code}
Note: we only need to support one-level grouping sets at this stage; nested grouping sets are not supported.
Note: we should not break the existing syntax. The parser changes should look like this:
{code:sql}
group-by-expressions

>>-GROUP BY----+-hive-sql-group-by-expressions-----+---><
               '-ansi-sql-grouping-set-expressions-'

hive-sql-group-by-expressions

                        '--GROUPING SETS--(--grouping-set-expressions--)--'
   .-,--------------.   +--WITH CUBE--------------------------------------+
   V                |   +--WITH ROLLUP------------------------------------+
>>---+-expression-+-+---+-------------------------------------------------+-><

grouping-expression-list

   .-,--------------.
   V                |
>>---+-expression-+-+--><

grouping-set-expressions

    .-,----------------------------.
    |      .-,--------------.      |
    |      V                |      |
    V '-(------expression---+-)-'  |
>>----+-expression--------------+--+-><

ansi-sql-grouping-set-expressions

>>-+-ROLLUP--(--grouping-expression-list--)---------+--><
   +-CUBE--(--grouping-expression-list--)-----------+
   '-GROUPING SETS--(--grouping-set-expressions--)--'
{code}
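As a rough illustration of the two top-level branches in the diagram above, here is a minimal Python sketch that labels the text following GROUP BY as Hive-style or ANSI-style. This is not Spark's parser: `classify_group_by`, `ANSI_FORM` and `HIVE_SUFFIX` are hypothetical names made up for this example, and the check only looks at keywords without validating the expressions themselves.

```python
import re

# Hypothetical helper for illustration only; this is not Spark's parser.
# ANSI branch: the clause itself starts with ROLLUP(, CUBE( or GROUPING SETS(.
ANSI_FORM = re.compile(r"\s*(ROLLUP|CUBE|GROUPING\s+SETS)\s*\(", re.IGNORECASE)

# Hive branch: an expression list followed by WITH CUBE, WITH ROLLUP,
# or GROUPING SETS(...).
HIVE_SUFFIX = re.compile(
    r"\s(WITH\s+CUBE\s*$|WITH\s+ROLLUP\s*$|GROUPING\s+SETS\s*\()",
    re.IGNORECASE,
)


def classify_group_by(clause: str) -> str:
    """Classify the text that follows GROUP BY as 'ansi', 'hive', or 'plain'."""
    if ANSI_FORM.match(clause):
        return "ansi"
    if HIVE_SUFFIX.search(clause):
        return "hive"
    # A bare expression list: in the diagram this is the Hive branch
    # with no trailing suffix.
    return "plain"


if __name__ == "__main__":
    samples = (
        "col1, col2 WITH ROLLUP",
        "ROLLUP(col1, col2)",
        "col1, col2 GROUPING SETS ((col1), (col2))",
        "GROUPING SETS ((col1), (col2))",
        "col1, col2",
    )
    for text in samples:
        print(f"{text!r}: {classify_group_by(text)}")
```

Both spellings are recognized side by side, which is the intent of the proposal: the ANSI branch is added without removing the Hive branch.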
was:
Currently, our GROUP BY clause follows Hive [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup].
However, this syntax is not ANSI SQL compliant. The proposal is to update our parser and analyzer for ANSI compliance.
For example, the existing Hive-style syntax is:
{code:sql}
GROUP BY col1, col2 WITH ROLLUP
GROUP BY col1, col2 WITH CUBE
GROUP BY col1, col2 GROUPING SETS ...
{code}
It would be nice to support the ANSI SQL syntax at the same time:
{code:sql}
GROUP BY ROLLUP(col1, col2)
GROUP BY CUBE(col1, col2)
GROUP BY GROUPING SETS(...)
{code}
Note: we only need to support one-level grouping sets at this stage; nested grouping sets are not supported.
Note: we should not break the existing syntax. The parser changes should look like this:
{code:sql}
group-by-expressions

>>-GROUP BY----+-hive-sql-group-by-expressions-----+---><
               '-ansi-sql-grouping-set-expressions-'

hive-sql-group-by-expressions

                        '--GROUPING SETS--(--grouping-set-expressions--)--'
   .-,--------------.   +--WITH CUBE--------------------------------------+
   V                |   +--WITH ROLLUP------------------------------------+
>>---+-expression-+-+---+-------------------------------------------------+-><

grouping-expression-list

   .-,--------------.
   V                |
>>---+-expression-+-+--><

grouping-set-expressions

    .-,----------------------------.
    |      .-,--------------.      |
    |      V                |      |
    V '-(------expression---+-)-'  |
>>----+-expression--------------+--+-><

ansi-sql-grouping-set-expressions

>>-+-ROLLUP--(--grouping-expression-list--)---------+--><
   +-CUBE--(--grouping-expression-list--)-----------+
   '-GROUPING SETS--(--grouping-set-expressions--)--'
{code}
> Support ANSI-SQL compliant syntax for GROUPING SET
> ---------------------------------------------------
>
> Key: SPARK-24424
> URL: https://issues.apache.org/jira/browse/SPARK-24424
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Xiao Li
> Assignee: Dilip Biswal
> Priority: Major
> Fix For: 2.4.0