Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2018/08/01 01:39:55 UTC
[GitHub] spark pull request #21699: [SPARK-24722][SQL] pivot() with Column type argum...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21699#discussion_r206732345
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -339,29 +400,30 @@ class RelationalGroupedDataset protected[sql](
/**
* Pivots a column of the current `DataFrame` and performs the specified aggregation.
- * There are two versions of pivot function: one that requires the caller to specify the list
- * of distinct values to pivot on, and one that does not. The latter is more concise but less
- * efficient, because Spark needs to first compute the list of distinct values internally.
+ * This is an overloaded version of the `pivot` method with `pivotColumn` of the `String` type.
*
* {{{
* // Compute the sum of earnings for each year by course with each course as a separate column
- * df.groupBy("year").pivot("course", Seq("dotNET", "Java")).sum("earnings")
- *
- * // Or without specifying column values (less efficient)
- * df.groupBy("year").pivot("course").sum("earnings")
+ * df.groupBy($"year").pivot($"course", Seq("dotNET", "Java")).sum($"earnings")
* }}}
*
- * @param pivotColumn Name of the column to pivot.
+ * @param pivotColumn The column to pivot.
* @param values List of values that will be translated to columns in the output DataFrame.
- * @since 1.6.0
+ * @since 2.4.0
*/
- def pivot(pivotColumn: String, values: Seq[Any]): RelationalGroupedDataset = {
+ def pivot(pivotColumn: Column, values: Seq[Any]): RelationalGroupedDataset = {
+ import org.apache.spark.sql.functions.struct
groupType match {
case RelationalGroupedDataset.GroupByType =>
+ val pivotValues = values.map {
--- End diff --
Hm? Wait, @maryannxue, I think we shouldn't do this, at least not here.
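For context, the pivot-and-aggregate semantics the Scaladoc above describes (group by "year", turn each distinct "course" value into its own column, sum "earnings" per cell) can be sketched with plain Scala collections. This is only a toy illustration of the semantics, not Spark's implementation; the `PivotSketch` object and `Row` case class are hypothetical names, and unlike Spark the sketch yields 0 rather than null for a missing (year, course) pair.

```scala
// Toy sketch of:
//   df.groupBy($"year").pivot($"course", Seq("dotNET", "Java")).sum($"earnings")
// using ordinary Scala collections. Not Spark's implementation.
object PivotSketch {
  case class Row(year: Int, course: String, earnings: Long)

  // Group rows by year, then for each requested pivot value produce a
  // "column" holding the sum of earnings for that (year, course) cell.
  def pivotSum(rows: Seq[Row], pivotValues: Seq[String]): Map[Int, Map[String, Long]] =
    rows.groupBy(_.year).map { case (year, group) =>
      year -> pivotValues.map { course =>
        // Sum of an empty group is 0L here, where Spark would emit null.
        course -> group.filter(_.course == course).map(_.earnings).sum
      }.toMap
    }

  def main(args: Array[String]): Unit = {
    val data = Seq(
      Row(2012, "dotNET", 10000L), Row(2012, "Java", 20000L),
      Row(2013, "dotNET", 5000L),  Row(2013, "Java", 30000L))
    // One row per year, one column per pivot value.
    println(pivotSum(data, Seq("dotNET", "Java")))
  }
}
```

Specifying the pivot values up front (`Seq("dotNET", "Java")`) is what lets Spark skip the extra pass that computes distinct values, which is the efficiency point the original Scaladoc made.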
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org