Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2018/10/17 06:09:04 UTC

[jira] [Updated] (SPARK-24424) Support ANSI-SQL compliant syntax for GROUPING SET

     [ https://issues.apache.org/jira/browse/SPARK-24424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-24424:
----------------------------
    Description: 
Currently, our GROUP BY clause follows the Hive syntax [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup].
However, this syntax is not ANSI SQL compliant. The proposal is to update our parser and analyzer for ANSI compliance.
For example, the Hive-style syntax we support today is:
{code:sql}
GROUP BY col1, col2 GROUPING SETS ...
{code}
It would be nice to also support the ANSI SQL syntax:
{code:sql}
GROUP BY GROUPING SETS (...)
{code}
Note: at this stage we only need to support one-level grouping sets. Nested grouping sets are not supported.
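
To make the one-level restriction concrete, here is a minimal Python sketch. It is not Spark's implementation. It assumes a hypothetical in-memory representation in which a grouping-set list is a Python list whose elements are either a column name (a string) or a tuple of column names, and `is_one_level` is a made-up helper name.

```python
from typing import Sequence, Union

# Hypothetical representation: each element is a column name or a tuple of
# column names, mirroring GROUPING SETS ((a, b), a).
GroupingElement = Union[str, tuple]


def is_one_level(grouping_sets: Sequence[GroupingElement]) -> bool:
    """Return True if no grouping set contains another grouping set."""
    for element in grouping_sets:
        if isinstance(element, str):
            continue  # bare expression, e.g. GROUPING SETS (a)
        # A parenthesized set may only contain plain expressions.
        if not all(isinstance(item, str) for item in element):
            return False
    return True


# GROUPING SETS ((a, b), a) is one-level.
print(is_one_level([("a", "b"), "a"]))      # True
# GROUPING SETS ((a, (b, c))) nests a set inside a set.
print(is_one_level([("a", ("b", "c"))]))    # False
```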

Note: we should not break the existing syntax. The parser changes should look like this:
{code:sql}
group-by-expressions

>>-GROUP BY----+-hive-sql-group-by-expressions-----+---><
               '-ansi-sql-grouping-set-expressions-'    

hive-sql-group-by-expressions

                        '--GROUPING SETS--(--grouping-set-expressions--)--'
   .-,--------------.   +--WITH CUBE--------------------------------------+
   V                |   +--WITH ROLLUP------------------------------------+
>>---+-expression-+-+---+-------------------------------------------------+-><

grouping-expression-list

   .-,--------------.  
   V                |  
>>---+-expression-+-+--><


grouping-set-expressions

    .-,----------------------------.
    |      .-,--------------.      |
    |      V                |      |
    V '-(------expression---+-)-'  |
>>----+-expression--------------+--+-><


ansi-sql-grouping-set-expressions

>>-+-ROLLUP--(--grouping-expression-list--)---------+--><
   +-CUBE--(--grouping-expression-list--)-----------+   
   '-GROUPING SETS--(--grouping-set-expressions--)--'  
{code}
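
The grammar above can be sketched as a small recursive-descent parser. This is an illustration only, not the Spark parser (which is generated from a grammar file). The names `GroupByParser`, `tokenize`, `rollup`, `cube` and `EXPAND` are hypothetical, expressions are assumed to be bare identifiers, and the sketch relies on the standard equivalences ROLLUP(a, b) = GROUPING SETS ((a, b), (a), ()) and CUBE(a, b) = all subsets of (a, b). Both the Hive-style and the ANSI-style forms are normalized to the same list of grouping sets, and a nested grouping set is rejected.

```python
import re
from itertools import combinations

def tokenize(text):
    # Assumption: an "expression" is a bare identifier; real SQL is richer.
    if re.sub(r"\w+|[(),]|\s+", "", text):
        raise ValueError("unsupported character in input")
    return re.findall(r"\w+|[(),]", text)

def rollup(cols):
    # ROLLUP(a, b) -> (a, b), (a), ()
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]

def cube(cols):
    # CUBE(a, b) -> every subset of (a, b), largest first
    return [c for n in range(len(cols), -1, -1) for c in combinations(cols, n)]

EXPAND = {"ROLLUP": rollup, "CUBE": cube}

class GroupByParser:
    def __init__(self, text):
        self.tokens, self.pos = tokenize(text), 0

    def peek(self, offset=0):
        i = self.pos + offset
        return self.tokens[i].upper() if i < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {token}")
        self.pos += 1
        return self.tokens[self.pos - 1]

    def expression(self):
        token = self.take()
        if token in ("(", ")", ","):
            raise ValueError(f"expected an expression, got {token}")
        return token

    def expression_list(self):
        items = [self.expression()]
        while self.peek() == ",":
            self.take(",")
            items.append(self.expression())
        return items

    def paren_list(self):
        self.take("(")
        items = self.expression_list()
        self.take(")")
        return items

    def grouping_sets(self):
        # GROUPING SETS ( element, ... ); an element is `expr` or `(expr, ...)`.
        # A "(" nested inside an element is rejected by expression().
        for word in ("GROUPING", "SETS", "("):
            self.take(word)
        sets = []
        while True:
            if self.peek() == "(":
                sets.append(tuple(self.paren_list()))
            else:
                sets.append((self.expression(),))
            if self.peek() != ",":
                break
            self.take(",")
        self.take(")")
        return sets

    def parse(self):
        for word in ("GROUP", "BY"):
            self.take(word)
        if self.peek() in EXPAND and self.peek(1) == "(":  # ANSI ROLLUP/CUBE
            sets = EXPAND[self.take().upper()](self.paren_list())
        elif self.peek() == "GROUPING":  # ANSI GROUPING SETS
            sets = self.grouping_sets()
        else:  # Hive style: column list, then an optional suffix
            cols = self.expression_list()
            if self.peek() == "WITH":
                self.take("WITH")
                if self.peek() not in EXPAND:
                    raise ValueError("expected ROLLUP or CUBE after WITH")
                sets = EXPAND[self.take().upper()](cols)
            elif self.peek() == "GROUPING":
                sets = self.grouping_sets()  # column list ignored in this sketch
            else:
                sets = [tuple(cols)]
        if self.peek() is not None:
            raise ValueError(f"unexpected trailing token {self.peek()}")
        return sets

hive = GroupByParser("GROUP BY col1, col2 GROUPING SETS ((col1, col2), col1)").parse()
ansi = GroupByParser("GROUP BY GROUPING SETS ((col1, col2), col1)").parse()
print(hive == ansi)  # True
print(GroupByParser("GROUP BY ROLLUP(col1, col2)").parse())
# [('col1', 'col2'), ('col1',), ()]
```

Keeping the Hive-style branch as the fallback means every query that parses today still parses, which is the compatibility requirement stated above. The sketch drops the leading column list in the Hive-style GROUPING SETS form. A real analyzer would still need it to resolve the grouping expressions.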
 

  was:
Currently, our GROUP BY clause follows the Hive syntax [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup].
However, this syntax is not ANSI SQL compliant. The proposal is to update our parser and analyzer for ANSI compliance.
For example, the Hive-style syntax we support today is:
{code:sql}
GROUP BY col1, col2 WITH ROLLUP

GROUP BY col1, col2 WITH CUBE

GROUP BY col1, col2 GROUPING SETS ...
{code}
It would be nice to also support the ANSI SQL syntax:
{code:sql}
GROUP BY ROLLUP(col1, col2)

GROUP BY CUBE(col1, col2)

GROUP BY GROUPING SETS (...)
{code}
Note: at this stage we only need to support one-level grouping sets. Nested grouping sets are not supported.

Note: we should not break the existing syntax. The parser changes should look like this:
{code:sql}
group-by-expressions

>>-GROUP BY----+-hive-sql-group-by-expressions-----+---><
               '-ansi-sql-grouping-set-expressions-'    

hive-sql-group-by-expressions

                        '--GROUPING SETS--(--grouping-set-expressions--)--'
   .-,--------------.   +--WITH CUBE--------------------------------------+
   V                |   +--WITH ROLLUP------------------------------------+
>>---+-expression-+-+---+-------------------------------------------------+-><

grouping-expression-list

   .-,--------------.  
   V                |  
>>---+-expression-+-+--><


grouping-set-expressions

    .-,----------------------------.
    |      .-,--------------.      |
    |      V                |      |
    V '-(------expression---+-)-'  |
>>----+-expression--------------+--+-><


ansi-sql-grouping-set-expressions

>>-+-ROLLUP--(--grouping-expression-list--)---------+--><
   +-CUBE--(--grouping-expression-list--)-----------+   
   '-GROUPING SETS--(--grouping-set-expressions--)--'  
{code}
 


> Support ANSI-SQL compliant syntax for GROUPING SET
> --------------------------------------------------
>
>                 Key: SPARK-24424
>                 URL: https://issues.apache.org/jira/browse/SPARK-24424
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Assignee: Dilip Biswal
>            Priority: Major
>             Fix For: 2.4.0
>


