
Posted to reviews@spark.apache.org by MaxGekk <gi...@git.apache.org> on 2018/10/04 09:41:43 UTC

[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

GitHub user MaxGekk opened a pull request:

    https://github.com/apache/spark/pull/22626

    [SPARK-25638][SQL] Adding new function - to_csv()

    ## What changes were proposed in this pull request?
    
    The new function takes a struct and converts it to a CSV string using the passed CSV options. It accepts the same CSV options as the CSV data source does.
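
As an illustration of the behaviour described above, here is a minimal sketch of calling the function from the DataFrame API. It assumes a Spark 3.0+ build on the classpath in which `functions.to_csv` is exposed with a one-argument form and a `java.util.Map` options overload; this thread does not confirm the final signatures. The object name, column names and the `sep` option value are purely illustrative.

```scala
import java.util.Collections

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, struct, to_csv}

// Illustrative object name; not part of the pull request.
object ToCsvSketch {
  def main(args: Array[String]): Unit = {
    // A local session is enough for a single-row demonstration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("to_csv-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, 2)).toDF("a", "b")
    // Pack the two columns into one struct column, the input type to_csv expects.
    val row = struct(col("a"), col("b"))

    df.select(
      // Default CSV options.
      to_csv(row).as("default_options"),
      // Assumed overload: options passed as a java.util.Map, here a custom separator.
      to_csv(row, Collections.singletonMap("sep", ";")).as("custom_sep")
    ).show(truncate = false)

    spark.stop()
  }
}
```

With default options this corresponds to the `named_struct('a', 1, 'b', 2)` SQL example in the expression description, which is documented to produce `1,2`.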
    
    ## How was this patch tested?
    
    Added `CsvExpressionsSuite` and `CsvFunctionsSuite`, as well as R, Python and SQL tests similar to the tests for `to_json()`.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 to_csv

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22626.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22626
    
----
commit 357140bccbc827ed9057c8f315901c8f8aad2b89
Author: Maxim Gekk <ma...@...>
Date:   2018-09-30T13:50:18Z

    First prototype

commit 19f58a403b248f936e30091fedd5eac17eb24f25
Author: Maxim Gekk <ma...@...>
Date:   2018-09-30T15:03:04Z

    CSV Expressions tests

commit 209a95119186a55d37f2c338129682c22e1e2518
Author: Maxim Gekk <ma...@...>
Date:   2018-10-03T19:08:19Z

    Merge remote-tracking branch 'origin/master' into to_csv

commit f7f82e9aa4653fd722447640a4642edff5e11acc
Author: Maxim Gekk <ma...@...>
Date:   2018-10-03T20:45:41Z

    Adding to_csv and tests

commit 7928dea941eab9e08d31079e2fea5b338cd5c6c4
Author: Maxim Gekk <ma...@...>
Date:   2018-10-03T21:05:15Z

    SQL tests

commit 124dcbc31f5500d0dd6286439b5d5a846a24aea3
Author: Maxim Gekk <ma...@...>
Date:   2018-10-03T21:29:36Z

    Adding to_csv to PySpark

commit 73b4a22050a03d96db920b12d8c60ceec2f97d63
Author: Maxim Gekk <ma...@...>
Date:   2018-10-04T09:22:24Z

    Support R

commit 91512d790123a6d32a8dc8a3961349ba6a3e01df
Author: Maxim Gekk <ma...@...>
Date:   2018-10-04T09:38:38Z

    2.5 -> 3.0

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230544717
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
    @@ -174,3 +176,66 @@ case class SchemaOfCsv(
     
       override def prettyName: String = "schema_of_csv"
     }
    +
    +/**
    + * Converts a [[StructType]] to a CSV output string.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
    +       1,2
    +      > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
    +       "26/08/2015"
    +  """,
    +  since = "3.0.0")
    +// scalastyle:on line.size.limit
    +case class StructsToCsv(
    +     options: Map[String, String],
    +     child: Expression,
    +     timeZoneId: Option[String] = None)
    +  extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes {
    +  override def nullable: Boolean = true
    +
    +  def this(options: Map[String, String], child: Expression) = this(options, child, None)
    +
    +  // Used in `FunctionRegistry`
    +  def this(child: Expression) = this(Map.empty, child, None)
    +
    +  def this(child: Expression, options: Expression) =
    +    this(
    +      options = ExprUtils.convertToMapData(options),
    +      child = child,
    +      timeZoneId = None)
    +
    +  @transient
    +  lazy val writer = new CharArrayWriter()
    +
    +  @transient
    +  lazy val inputSchema: StructType = child.dataType match {
    +    case st: StructType => st
    +    case other =>
    +      throw new IllegalArgumentException(s"Unsupported input type ${other.catalogString}")
    +  }
    +
    +  @transient
    +  lazy val gen = new UnivocityGenerator(
    +    inputSchema, writer, new CSVOptions(options, columnPruning = true, timeZoneId.get))
    +
    +  // This converts rows to the CSV output according to the given schema.
    +  @transient
    +  lazy val converter: Any => UTF8String = {
    +    (row: Any) => UTF8String.fromString(gen.writeToString(row.asInstanceOf[InternalRow]))
    +  }
    +
    +  override def dataType: DataType = StringType
    +
    +  override def withTimeZone(timeZoneId: String): TimeZoneAwareExpression =
    +    copy(timeZoneId = Option(timeZoneId))
    +
    +  override def nullSafeEval(value: Any): Any = converter(value)
    +
    +  override def inputTypes: Seq[AbstractDataType] = TypeCollection(StructType) :: Nil
    --- End diff --
    
    I think we can use `StructType :: Nil`.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #96933 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96933/testReport)** for PR 22626 at commit [`91512d7`](https://github.com/apache/spark/commit/91512d790123a6d32a8dc8a3961349ba6a3e01df).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Merged build finished. Test FAILed.


---


[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r229981416
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala ---
    @@ -45,7 +45,6 @@ class CsvFunctionsSuite extends QueryTest with SharedSQLContext {
           Row(Row(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0"))))
       }
     
    -
    --- End diff --
    
    This seems like a mistake.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #96933 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96933/testReport)** for PR 22626 at commit [`91512d7`](https://github.com/apache/spark/commit/91512d790123a6d32a8dc8a3961349ba6a3e01df).


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    @MaxGekk, BTW, I added `schema_of_csv` on the R side. You can add `schema_of_json` on the R side in a similar way.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98405 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98405/testReport)** for PR 22626 at commit [`230f789`](https://github.com/apache/spark/commit/230f7890d75ab9ed6041eb7d9be1aa01c9f82968).


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98405 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98405/testReport)** for PR 22626 at commit [`230f789`](https://github.com/apache/spark/commit/230f7890d75ab9ed6041eb7d9be1aa01c9f82968).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98336 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98336/testReport)** for PR 22626 at commit [`39f6899`](https://github.com/apache/spark/commit/39f689932ee2df194420fc63c7c5d9e351b09b86).


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by MaxGekk <gi...@git.apache.org>.

Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    @HyukjinKwon Could you look at this PR one more time, please?


---


[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230544667
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
    @@ -174,3 +176,66 @@ case class SchemaOfCsv(
     
       override def prettyName: String = "schema_of_csv"
     }
    +
    +/**
    + * Converts a [[StructType]] to a CSV output string.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
    +       1,2
    +      > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
    +       "26/08/2015"
    +  """,
    +  since = "3.0.0")
    +// scalastyle:on line.size.limit
    +case class StructsToCsv(
    +     options: Map[String, String],
    +     child: Expression,
    +     timeZoneId: Option[String] = None)
    +  extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes {
    +  override def nullable: Boolean = true
    +
    +  def this(options: Map[String, String], child: Expression) = this(options, child, None)
    +
    +  // Used in `FunctionRegistry`
    +  def this(child: Expression) = this(Map.empty, child, None)
    +
    +  def this(child: Expression, options: Expression) =
    +    this(
    +      options = ExprUtils.convertToMapData(options),
    +      child = child,
    +      timeZoneId = None)
    +
    +  @transient
    +  lazy val writer = new CharArrayWriter()
    +
    +  @transient
    +  lazy val inputSchema: StructType = child.dataType match {
    +    case st: StructType => st
    +    case other =>
    +      throw new IllegalArgumentException(s"Unsupported input type ${other.catalogString}")
    +  }
    +
    +  @transient
    +  lazy val gen = new UnivocityGenerator(
    +    inputSchema, writer, new CSVOptions(options, columnPruning = true, timeZoneId.get))
    +
    +  // This converts rows to the CSV output according to the given schema.
    +  @transient
    +  lazy val converter: Any => UTF8String = {
    +    (row: Any) => UTF8String.fromString(gen.writeToString(row.asInstanceOf[InternalRow]))
    --- End diff --
    
    @MaxGekk, can we use the data from `writer` via `writer.toString` and `writer.reset()`, as `to_json` does? It looks like we are going to avoid the header (which is fine). If we explicitly set `header` to `false` in this expression, it looks like we don't need to add `writeToString` in `UnivocityGenerator`.
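
The buffer-reuse pattern suggested in this comment can be sketched with only the JDK, as follows. `BufferedRowWriter` and its `render` method are hypothetical stand-ins for a generator that writes into a shared `CharArrayWriter`; they are not the actual `UnivocityGenerator` or `to_json` code, and the row is joined naively without quoting or escaping.

```scala
import java.io.CharArrayWriter

// Hypothetical stand-in for a CSV generator that writes into a shared buffer.
final class BufferedRowWriter(sep: String = ",") {
  private val writer = new CharArrayWriter()

  // Writes one row into the buffer; no header line is ever emitted.
  private def write(fields: Seq[String]): Unit =
    writer.write(fields.mkString(sep))

  // Renders one row: write it, read the buffer back, then clear it for the next row.
  def render(fields: Seq[String]): String = {
    write(fields)
    val out = writer.toString
    writer.reset()
    out
  }
}

object BufferedRowWriterDemo {
  def main(args: Array[String]): Unit = {
    val w = new BufferedRowWriter()
    println(w.render(Seq("1", "2")))     // prints 1,2
    println(w.render(Seq("26/08/2015"))) // prints 26/08/2015
  }
}
```

Without the `reset()` call the second row would be appended to the first, so the caller would see the accumulated buffer rather than a single row.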


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98428 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98428/testReport)** for PR 22626 at commit [`6969b49`](https://github.com/apache/spark/commit/6969b49812acd2664bca724378a3739cb7846a6a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---


[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r229981271
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
    @@ -174,3 +176,66 @@ case class SchemaOfCsv(
     
       override def prettyName: String = "schema_of_csv"
     }
    +
    +/**
    + * Converts a [[StructType]] to a CSV output string.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
    +       1,2
    +      > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
    +       "26/08/2015"
    +  """,
    +  since = "3.0.0")
    +// scalastyle:on line.size.limit
    +case class StructsToCsv(
    +                         options: Map[String, String],
    +                         child: Expression,
    +                         timeZoneId: Option[String] = None)
    --- End diff --
    
    This seems like an indentation mistake.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Merged build finished. Test FAILed.


---


[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230544492
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityGenerator.scala ---
    @@ -15,18 +15,17 @@
      * limitations under the License.
      */
     
    -package org.apache.spark.sql.execution.datasources.csv
    +package org.apache.spark.sql.catalyst.csv
     
     import java.io.Writer
     
     import com.univocity.parsers.csv.CsvWriter
     
     import org.apache.spark.sql.catalyst.InternalRow
    -import org.apache.spark.sql.catalyst.csv.CSVOptions
     import org.apache.spark.sql.catalyst.util.DateTimeUtils
     import org.apache.spark.sql.types._
     
    -private[csv] class UnivocityGenerator(
    +private[sql] class UnivocityGenerator(
    --- End diff --
    
    Let's remove `private[sql]`. We are already in an internal package `catalyst`.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98431 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98431/testReport)** for PR 22626 at commit [`1895cdc`](https://github.com/apache/spark/commit/1895cdc3540f67ad562e10488ac7ffe7012d9ccc).


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Merged build finished. Test PASSed.


---


[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by MaxGekk <gi...@git.apache.org>.

Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230027771
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala ---
    @@ -45,7 +45,6 @@ class CsvFunctionsSuite extends QueryTest with SharedSQLContext {
           Row(Row(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0"))))
       }
     
    -
    --- End diff --
    
    I explicitly removed the blank line between the tests. Do you think we need it?


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    add to whitelist


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Can one of the admins verify this patch?


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98352/
    Test FAILed.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #96931 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96931/testReport)** for PR 22626 at commit [`91512d7`](https://github.com/apache/spark/commit/91512d790123a6d32a8dc8a3961349ba6a3e01df).


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Merged to master.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98336 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98336/testReport)** for PR 22626 at commit [`39f6899`](https://github.com/apache/spark/commit/39f689932ee2df194420fc63c7c5d9e351b09b86).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98349/testReport)** for PR 22626 at commit [`142e8a2`](https://github.com/apache/spark/commit/142e8a26b28170250476f46a9109c4f98c2f18ee).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98352/testReport)** for PR 22626 at commit [`ee330ba`](https://github.com/apache/spark/commit/ee330ba7628ff69c4ad29685d32fbba49846d019).


---


[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r229981376
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -3905,6 +3905,47 @@ object functions {
         withExpr(SchemaOfCsv(csv.expr, options.asScala.toMap))
       }
     
    +  /**
    +   * (Scala-specific) Converts a column containing a `StructType` into a CSV string
    +   * with the specified schema. Throws an exception, in the case of an unsupported type.
    +   *
    +   * @param e a column containing a struct.
    +   * @param options options to control how the struct column is converted into a CSV string.
    +   *                It accepts the same options as the CSV data source.
    +   *
    +   * @group collection_funcs
    +   * @since 3.0.0
    +   */
    +  def to_csv(e: Column, options: Map[String, String]): Column = withExpr {
    --- End diff --
    
    Let's get rid of this Scala version for now.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98431/
    Test PASSed.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98349 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98349/testReport)** for PR 22626 at commit [`142e8a2`](https://github.com/apache/spark/commit/142e8a26b28170250476f46a9109c4f98c2f18ee).


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Merged build finished. Test PASSed.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98428/
    Test PASSed.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98336/
    Test PASSed.


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Can one of the admins verify this patch?


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96933/
    Test FAILed.


---


[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by MaxGekk <gi...@git.apache.org>.

Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230559020
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityGenerator.scala ---
    @@ -15,18 +15,17 @@
      * limitations under the License.
      */
     
    -package org.apache.spark.sql.execution.datasources.csv
    +package org.apache.spark.sql.catalyst.csv
     
     import java.io.Writer
     
     import com.univocity.parsers.csv.CsvWriter
     
     import org.apache.spark.sql.catalyst.InternalRow
    -import org.apache.spark.sql.catalyst.csv.CSVOptions
     import org.apache.spark.sql.catalyst.util.DateTimeUtils
     import org.apache.spark.sql.types._
     
    -private[csv] class UnivocityGenerator(
    +private[sql] class UnivocityGenerator(
    --- End diff --
    
    removed


---


[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98428/testReport)** for PR 22626 at commit [`6969b49`](https://github.com/apache/spark/commit/6969b49812acd2664bca724378a3739cb7846a6a).


---

[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230577431
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
    @@ -174,3 +176,68 @@ case class SchemaOfCsv(
     
       override def prettyName: String = "schema_of_csv"
     }
    +
    +/**
    + * Converts a [[StructType]] to a CSV output string.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
    +       1,2
    +      > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
    +       "26/08/2015"
    +  """,
    +  since = "3.0.0")
    +// scalastyle:on line.size.limit
    +case class StructsToCsv(
    +     options: Map[String, String],
    +     child: Expression,
    +     timeZoneId: Option[String] = None)
    +  extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes {
    +  override def nullable: Boolean = true
    +
    +  def this(options: Map[String, String], child: Expression) = this(options, child, None)
    +
    +  // Used in `FunctionRegistry`
    +  def this(child: Expression) = this(Map.empty, child, None)
    +
    +  def this(child: Expression, options: Expression) =
    +    this(
    +      options = ExprUtils.convertToMapData(options),
    +      child = child,
    +      timeZoneId = None)
    +
    +  @transient
    +  lazy val writer = new CharArrayWriter()
    +
    +  @transient
    +  lazy val inputSchema: StructType = child.dataType match {
    +    case st: StructType => st
    +    case other =>
    +      throw new IllegalArgumentException(s"Unsupported input type ${other.catalogString}")
    +  }
    +
    +  @transient
    +  lazy val gen = new UnivocityGenerator(
    +    inputSchema, writer, new CSVOptions(options, columnPruning = true, timeZoneId.get))
    --- End diff --
    
    nit: Then we wouldn't need `lazy val writer`; we could just pass `new CharArrayWriter()` here.
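
A minimal sketch of the shape this nit points at, assuming the expression only ever calls a string-returning method on the generator and never reads the writer itself. `SketchGenerator` and `InlineWriterDemo` are hypothetical stand-ins for illustration, not Spark's `UnivocityGenerator` or `StructsToCsv`:

```scala
import java.io.{CharArrayWriter, Writer}

// Hypothetical stand-in for the generator; NOT Spark's UnivocityGenerator.
final class SketchGenerator(out: Writer) {
  // Streaming path: writes a newline-terminated record to `out`.
  def write(fields: Seq[String]): Unit = out.write(fields.mkString(",") + "\n")

  // String path: returns the record and never touches `out`.
  def writeToString(fields: Seq[String]): String = fields.mkString(",")
}

object InlineWriterDemo {
  // No separate `lazy val writer`: the writer is created at the call site
  // because nothing else in this object needs a reference to it.
  lazy val gen = new SketchGenerator(new CharArrayWriter())

  // The converter only uses the string path, so the writer stays untouched.
  val converter: Seq[String] => String = fields => gen.writeToString(fields)
}
```

If some other member needed to read or reset the buffer, keeping a named `writer` field would still be the better choice.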


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    This needs to be rebased.


---

[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by MaxGekk <gi...@git.apache.org>.

Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230559006
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
    @@ -15,3 +15,10 @@ CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a
     SELECT schema_of_csv(csvField) FROM csvTable;
     -- Clean up
     DROP VIEW IF EXISTS csvTable;
    +-- to_csv
    +select to_csv(named_struct('a', 1, 'b', 2));
    +select to_csv(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
    +-- Check if errors handled
    +select to_csv(named_struct('a', 1, 'b', 2), named_struct('mode', 'PERMISSIVE'));
    +select to_csv(named_struct('a', 1, 'b', 2), map('mode', 1));
    --- End diff --
    
    I removed `select to_csv()`


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Merged build finished. Test FAILed.


---

[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230544775
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
    @@ -15,3 +15,10 @@ CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a
     SELECT schema_of_csv(csvField) FROM csvTable;
     -- Clean up
     DROP VIEW IF EXISTS csvTable;
    +-- to_csv
    +select to_csv(named_struct('a', 1, 'b', 2));
    +select to_csv(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
    +-- Check if errors handled
    +select to_csv(named_struct('a', 1, 'b', 2), named_struct('mode', 'PERMISSIVE'));
    +select to_csv(named_struct('a', 1, 'b', 2), map('mode', 1));
    --- End diff --
    
    This one too, since the exception comes from `convertToMapData`. We only need one of these tests: this one or the one right above. The other can be removed.
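
To illustrate why the two error tests overlap, here is a simplified model, assuming both invalid inputs are rejected inside one options-conversion helper before any CSV logic runs. `OptionsArg`, `MapArg`, `StructArg` and `OptionsCheck` are hypothetical names for this sketch; the real `ExprUtils.convertToMapData` operates on Catalyst expressions and may differ in detail:

```scala
import scala.util.Try

// Hypothetical stand-ins for the two kinds of `options` arguments in the SQL tests.
sealed trait OptionsArg
final case class MapArg(entries: Map[String, Any]) extends OptionsArg
final case class StructArg(fields: Map[String, Any]) extends OptionsArg

object OptionsCheck {
  // Simplified model: only a map with string values is accepted.
  def convertToMapData(arg: OptionsArg): Map[String, String] = arg match {
    case MapArg(entries) if entries.values.forall(_.isInstanceOf[String]) =>
      entries.map { case (key, value) => key -> value.toString }
    case MapArg(_) =>
      throw new IllegalArgumentException("map values must be strings")
    case StructArg(_) =>
      throw new IllegalArgumentException("options must be passed as a map")
  }

  // True when the helper refuses the argument.
  def isRejected(arg: OptionsArg): Boolean = Try(convertToMapData(arg)).isFailure
}
```

In this model, `named_struct('mode', 'PERMISSIVE')` and `map('mode', 1)` both fail in the same helper, so either test exercises the error path that `to_csv` relies on.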


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98349/
    Test FAILed.


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Merged build finished. Test PASSed.


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96931/
    Test PASSed.


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98352 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98352/testReport)** for PR 22626 at commit [`ee330ba`](https://github.com/apache/spark/commit/ee330ba7628ff69c4ad29685d32fbba49846d019).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Merged build finished. Test PASSed.


---

[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by MaxGekk <gi...@git.apache.org>.

Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230555774
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
    @@ -174,3 +176,66 @@ case class SchemaOfCsv(
     
       override def prettyName: String = "schema_of_csv"
     }
    +
    +/**
    + * Converts a [[StructType]] to a CSV output string.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
    +       1,2
    +      > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
    +       "26/08/2015"
    +  """,
    +  since = "3.0.0")
    +// scalastyle:on line.size.limit
    +case class StructsToCsv(
    +     options: Map[String, String],
    +     child: Expression,
    +     timeZoneId: Option[String] = None)
    +  extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes {
    +  override def nullable: Boolean = true
    +
    +  def this(options: Map[String, String], child: Expression) = this(options, child, None)
    +
    +  // Used in `FunctionRegistry`
    +  def this(child: Expression) = this(Map.empty, child, None)
    +
    +  def this(child: Expression, options: Expression) =
    +    this(
    +      options = ExprUtils.convertToMapData(options),
    +      child = child,
    +      timeZoneId = None)
    +
    +  @transient
    +  lazy val writer = new CharArrayWriter()
    +
    +  @transient
    +  lazy val inputSchema: StructType = child.dataType match {
    +    case st: StructType => st
    +    case other =>
    +      throw new IllegalArgumentException(s"Unsupported input type ${other.catalogString}")
    +  }
    +
    +  @transient
    +  lazy val gen = new UnivocityGenerator(
    +    inputSchema, writer, new CSVOptions(options, columnPruning = true, timeZoneId.get))
    +
    +  // This converts rows to the CSV output according to the given schema.
    +  @transient
    +  lazy val converter: Any => UTF8String = {
    +    (row: Any) => UTF8String.fromString(gen.writeToString(row.asInstanceOf[InternalRow]))
    --- End diff --
    
    I initially tried to use the functions, but had to add `writeToString` in commit https://github.com/apache/spark/pull/22626/commits/19f58a403b248f936e30091fedd5eac17eb24f25 because `uniVocity` appends `\n` to the end of each string.
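
The trailing-newline problem can be sketched without the real library. `RowWriter` below is a hypothetical stand-in, not the uniVocity API; it only assumes that a stream-oriented write terminates each record with a line separator while a string-returning variant does not:

```scala
import java.io.CharArrayWriter

// Hypothetical stand-in for a CSV row writer; this is NOT the uniVocity API.
final class RowWriter(out: CharArrayWriter, lineSeparator: String = "\n") {
  private def format(fields: Seq[String]): String = fields.mkString(",")

  // Stream-oriented path: every record is followed by the line separator.
  def writeRow(fields: Seq[String]): Unit = {
    out.write(format(fields))
    out.write(lineSeparator)
  }

  // Expression-oriented path: return the bare record with no terminator.
  def writeRowToString(fields: Seq[String]): String = format(fields)
}

object TrailingNewlineDemo {
  private val buffer = new CharArrayWriter()
  private val rowWriter = new RowWriter(buffer)

  // Reading the buffer back after a streamed write keeps the separator.
  val viaStream: String = {
    rowWriter.writeRow(Seq("1", "2"))
    buffer.toString
  }

  // The string-returning path yields just the record.
  val viaString: String = rowWriter.writeRowToString(Seq("1", "2"))
}
```

A SQL function that returns one value per row wants the second form, since a stray line separator would become part of the column value.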


---

[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230544760
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
    @@ -15,3 +15,10 @@ CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a
     SELECT schema_of_csv(csvField) FROM csvTable;
     -- Clean up
     DROP VIEW IF EXISTS csvTable;
    +-- to_csv
    +select to_csv(named_struct('a', 1, 'b', 2));
    +select to_csv(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
    +-- Check if errors handled
    +select to_csv(named_struct('a', 1, 'b', 2), named_struct('mode', 'PERMISSIVE'));
    +select to_csv(named_struct('a', 1, 'b', 2), map('mode', 1));
    +select to_csv();
    --- End diff --
    
    I don't think we have to test this, since it's not specific to this expression.


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by MaxGekk <gi...@git.apache.org>.

Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    @cloud-fan @gatorsmile Could you look at the PR, please?


---

[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by MaxGekk <gi...@git.apache.org>.

Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230028296
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
    @@ -174,3 +176,66 @@ case class SchemaOfCsv(
     
       override def prettyName: String = "schema_of_csv"
     }
    +
    +/**
    + * Converts a [[StructType]] to a CSV output string.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
    +       1,2
    +      > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
    +       "26/08/2015"
    +  """,
    +  since = "3.0.0")
    +// scalastyle:on line.size.limit
    +case class StructsToCsv(
    +                         options: Map[String, String],
    +                         child: Expression,
    +                         timeZoneId: Option[String] = None)
    --- End diff --
    
    I just hoped `scalastyle` would flag the indentation mistakes introduced by IntelliJ IDEA.


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #98431 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98431/testReport)** for PR 22626 at commit [`1895cdc`](https://github.com/apache/spark/commit/1895cdc3540f67ad562e10488ac7ffe7012d9ccc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98405/
    Test PASSed.


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    **[Test build #96931 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96931/testReport)** for PR 22626 at commit [`91512d7`](https://github.com/apache/spark/commit/91512d790123a6d32a8dc8a3961349ba6a3e01df).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    ok to test


---

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22626
  
    Merged build finished. Test PASSed.


---

[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22626#discussion_r230544556
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala ---
    @@ -45,7 +45,6 @@ class CsvFunctionsSuite extends QueryTest with SharedSQLContext {
           Row(Row(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0"))))
       }
     
    -
    --- End diff --
    
    Ah, I prefer not to include unrelated changes, but it's okay.


---

[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22626


---
