Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2018/08/01 01:39:55 UTC

[GitHub] spark pull request #21699: [SPARK-24722][SQL] pivot() with Column type argum...

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21699#discussion_r206732345
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
    @@ -339,29 +400,30 @@ class RelationalGroupedDataset protected[sql](
     
       /**
        * Pivots a column of the current `DataFrame` and performs the specified aggregation.
    -   * There are two versions of pivot function: one that requires the caller to specify the list
    -   * of distinct values to pivot on, and one that does not. The latter is more concise but less
    -   * efficient, because Spark needs to first compute the list of distinct values internally.
    +   * This is an overloaded version of the `pivot` method with `pivotColumn` of the `String` type.
        *
        * {{{
        *   // Compute the sum of earnings for each year by course with each course as a separate column
    -   *   df.groupBy("year").pivot("course", Seq("dotNET", "Java")).sum("earnings")
    -   *
    -   *   // Or without specifying column values (less efficient)
    -   *   df.groupBy("year").pivot("course").sum("earnings")
    +   *   df.groupBy($"year").pivot($"course", Seq("dotNET", "Java")).sum($"earnings")
        * }}}
        *
    -   * @param pivotColumn Name of the column to pivot.
    +   * @param pivotColumn the column to pivot.
        * @param values List of values that will be translated to columns in the output DataFrame.
    -   * @since 1.6.0
    +   * @since 2.4.0
        */
    -  def pivot(pivotColumn: String, values: Seq[Any]): RelationalGroupedDataset = {
    +  def pivot(pivotColumn: Column, values: Seq[Any]): RelationalGroupedDataset = {
    +    import org.apache.spark.sql.functions.struct
         groupType match {
           case RelationalGroupedDataset.GroupByType =>
    +        val pivotValues = values.map {
    --- End diff --
    
    Hm, wait @maryannxue, I think we shouldn't do this, at least not here.
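
    For readers following along, the behavior the quoted Javadoc describes (group by one column, turn each listed pivot value into its own output column, and sum an aggregation column) can be sketched in plain Python without a Spark runtime. This is only an illustrative sketch: the sample rows, column names, and the `pivot_sum` helper are assumptions for the example, not part of the Spark API under review.

    ```python
    # Plain-Python sketch of the semantics of
    # df.groupBy($"year").pivot($"course", Seq("dotNET", "Java")).sum($"earnings")
    # from the quoted diff. The input rows below are made-up sample data.
    from collections import defaultdict

    rows = [
        {"year": 2012, "course": "dotNET", "earnings": 10000},
        {"year": 2012, "course": "Java",   "earnings": 20000},
        {"year": 2013, "course": "dotNET", "earnings": 48000},
        {"year": 2013, "course": "Java",   "earnings": 30000},
    ]

    def pivot_sum(rows, group_key, pivot_key, pivot_values, agg_key):
        """Group rows by group_key; each value in pivot_values becomes an
        output column holding the sum of agg_key for matching rows."""
        out = defaultdict(lambda: {v: 0 for v in pivot_values})
        for r in rows:
            # Values outside the explicit pivot list are dropped, mirroring
            # the explicit-values form of pivot() in the diff.
            if r[pivot_key] in pivot_values:
                out[r[group_key]][r[pivot_key]] += r[agg_key]
        return dict(out)

    result = pivot_sum(rows, "year", "course", ["dotNET", "Java"], "earnings")
    ```

    Passing the distinct values explicitly, as in the quoted example, lets the engine skip a pass over the data to discover them, which is why the original doc called the values-free form less efficient.
    
    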


---
