Posted to issues@spark.apache.org by "Priyanka Garg (JIRA)" <ji...@apache.org> on 2016/06/07 04:34:20 UTC
[jira] [Commented] (SPARK-15797) To expose groupingSets for DataFrame
[ https://issues.apache.org/jira/browse/SPARK-15797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15317829#comment-15317829 ]
Priyanka Garg commented on SPARK-15797:
---------------------------------------
I am working on this.
> To expose groupingSets for DataFrame
> ------------------------------------
>
> Key: SPARK-15797
> URL: https://issues.apache.org/jira/browse/SPARK-15797
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 1.5.1
> Reporter: Priyanka Garg
>
> Currently, the cube and rollup functions are exposed on DataFrame, but grouping sets are not.
> For example,
> df.rollup($"department", $"group", $"designation").avg() produces:
> a. All combinations of department, group and designation
> b. All combinations of department and group, with designation as null
> c. All departments, with group and designation as null
> d. Department, group and designation all null (i.e. aggregation over the complete data)
> Along the same lines, there should be a groupingSets function in which custom groupings can be specified.
> For example,
> df.groupingSets(($"department", $"group", $"designation"), ($"group"), ($"designation"), ()).avg()
> This should produce:
> 1. All combinations of department, group and designation
> 2. All values of group, with department and designation as null
> 3. All values of designation, with department and group as null
> 4. Aggregation over the complete data, i.e. with department, group and designation all null
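
The semantics requested above can be sketched without Spark, using plain Scala collections. This is only an illustrative model, not the proposed DataFrame API: the Employee case class, its salary field, the accessor table and the helper names rollupSets and aggregate are all hypothetical, and a null grouping column is represented as None.

```scala
// Plain-Scala model of grouping-set semantics; no Spark required.
// Everything here (Employee, salary, helper names) is hypothetical.
object GroupingSetsSketch {
  final case class Employee(
      department: String,
      group: String,
      designation: String,
      salary: Double)

  // Grouping column names paired with accessors, in output order.
  val accessors: Seq[(String, Employee => String)] = Seq(
    "department"  -> (_.department),
    "group"       -> (_.group),
    "designation" -> (_.designation))

  val columns: Seq[String] = accessors.map(_._1)

  // rollup(c1, c2, c3) corresponds to the grouping sets
  // (c1, c2, c3), (c1, c2), (c1), ().
  def rollupSets(cols: Seq[String]): Seq[Seq[String]] =
    cols.inits.toList

  // For each grouping set, group the rows by the columns in that set and
  // average the salary. Columns outside the set are reported as None (null).
  def aggregate(
      rows: Seq[Employee],
      sets: Seq[Seq[String]]): Seq[(Seq[Option[String]], Double)] =
    sets.flatMap { set =>
      rows
        .groupBy { e =>
          accessors.map { case (name, get) =>
            if (set.contains(name)) Some(get(e)) else None
          }
        }
        .toSeq
        .map { case (key, grp) => key -> (grp.map(_.salary).sum / grp.size) }
    }

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      Employee("Eng", "A", "Dev", 100.0),
      Employee("Eng", "A", "Lead", 200.0),
      Employee("Ops", "B", "Dev", 300.0))

    // The custom grouping sets from the example in the issue description.
    val custom = Seq(columns, Seq("group"), Seq("designation"), Seq.empty[String])
    aggregate(rows, custom).foreach(println)
  }
}
```

Passing rollupSets(columns) instead of the custom list reproduces cases a. to d. of the rollup example, which shows that rollup is one fixed choice of grouping sets and that the requested function would let the caller supply the list directly.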