Posted to issues@spark.apache.org by "Priyanka Garg (JIRA)" <ji...@apache.org> on 2016/06/07 04:34:20 UTC

[jira] [Commented] (SPARK-15797) To expose groupingSets for DataFrame

    [ https://issues.apache.org/jira/browse/SPARK-15797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15317829#comment-15317829 ] 

Priyanka Garg commented on SPARK-15797:
---------------------------------------

I am working on this.

> To expose groupingSets for DataFrame
> ------------------------------------
>
>                 Key: SPARK-15797
>                 URL: https://issues.apache.org/jira/browse/SPARK-15797
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 1.5.1
>            Reporter: Priyanka Garg
>
> Currently, the cube and rollup functions are exposed on DataFrame, but grouping sets are not.
> For example:
> df.rollup($"department", $"group", $"designation").avg() results in (see the sketch after this list):
> a. All combinations of department, group, and designation
> b. All combinations of department and group, taking designation as null
> c. All departments, taking group and designation as null
> d. Department, group, and designation all null (i.e., aggregating over the complete data)
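> For reference, a minimal sketch of the existing rollup API (the "salary" aggregate column is hypothetical; the $"..." syntax requires import sqlContext.implicits._):
>
>   // rollup produces the hierarchical grouping sets listed above:
>   // (department, group, designation), (department, group), (department), ()
>   val rolled = df.rollup($"department", $"group", $"designation").avg("salary")
>   rolled.show()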
> Along the same lines, there should be a groupingSets function in which custom groupings can be specified (an equivalent SQL formulation is sketched after the list below).
> For example:
> df.groupingSets(($"department", $"group", $"designation"), ($"group"), ($"designation"), ()).avg()
> This should result in:
> 1. All combinations of department, group, and designation
> 2. All values of group, taking department and designation as null
> 3. All values of designation, taking department and group as null
> 4. Aggregation over the complete data, i.e., taking department, group, and designation as null
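> A hedged sketch of how the requested result can already be expressed in SQL today, as a point of comparison for the proposed API (the table and "salary" column are hypothetical; GROUPING SETS in SQL may require HiveContext on the affected 1.5.x line):
>
>   df.registerTempTable("employees")
>   sqlContext.sql("""
>     SELECT department, `group`, designation, AVG(salary)
>     FROM employees
>     GROUP BY department, `group`, designation
>     GROUPING SETS ((department, `group`, designation), (`group`), (designation), ())
>   """)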


