Posted to reviews@spark.apache.org by MaxGekk <gi...@git.apache.org> on 2018/10/04 09:41:43 UTC
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/22626
[SPARK-25638][SQL] Adding new function - to_csv()
## What changes were proposed in this pull request?
The new function takes a struct and converts it to a CSV string using the passed CSV options. It accepts the same CSV options as the CSV data source does.
## How was this patch tested?
Added `CsvExpressionsSuite` and `CsvFunctionsSuite`, as well as R, Python and SQL tests similar to the tests for `to_json()`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MaxGekk/spark-1 to_csv
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22626.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22626
----
commit 357140bccbc827ed9057c8f315901c8f8aad2b89
Author: Maxim Gekk <ma...@...>
Date: 2018-09-30T13:50:18Z
First prototype
commit 19f58a403b248f936e30091fedd5eac17eb24f25
Author: Maxim Gekk <ma...@...>
Date: 2018-09-30T15:03:04Z
CSV Expressions tests
commit 209a95119186a55d37f2c338129682c22e1e2518
Author: Maxim Gekk <ma...@...>
Date: 2018-10-03T19:08:19Z
Merge remote-tracking branch 'origin/master' into to_csv
commit f7f82e9aa4653fd722447640a4642edff5e11acc
Author: Maxim Gekk <ma...@...>
Date: 2018-10-03T20:45:41Z
Adding to_csv and tests
commit 7928dea941eab9e08d31079e2fea5b338cd5c6c4
Author: Maxim Gekk <ma...@...>
Date: 2018-10-03T21:05:15Z
SQL tests
commit 124dcbc31f5500d0dd6286439b5d5a846a24aea3
Author: Maxim Gekk <ma...@...>
Date: 2018-10-03T21:29:36Z
Adding to_csv to PySpark
commit 73b4a22050a03d96db920b12d8c60ceec2f97d63
Author: Maxim Gekk <ma...@...>
Date: 2018-10-04T09:22:24Z
Support R
commit 91512d790123a6d32a8dc8a3961349ba6a3e01df
Author: Maxim Gekk <ma...@...>
Date: 2018-10-04T09:38:38Z
2.5 -> 3.0
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230544717
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
@@ -174,3 +176,66 @@ case class SchemaOfCsv(
override def prettyName: String = "schema_of_csv"
}
+
+/**
+ * Converts a [[StructType]] to a CSV output string.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+ usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
+ examples = """
+ Examples:
+ > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
+ 1,2
+ > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+ "26/08/2015"
+ """,
+ since = "3.0.0")
+// scalastyle:on line.size.limit
+case class StructsToCsv(
+ options: Map[String, String],
+ child: Expression,
+ timeZoneId: Option[String] = None)
+ extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes {
+ override def nullable: Boolean = true
+
+ def this(options: Map[String, String], child: Expression) = this(options, child, None)
+
+ // Used in `FunctionRegistry`
+ def this(child: Expression) = this(Map.empty, child, None)
+
+ def this(child: Expression, options: Expression) =
+ this(
+ options = ExprUtils.convertToMapData(options),
+ child = child,
+ timeZoneId = None)
+
+ @transient
+ lazy val writer = new CharArrayWriter()
+
+ @transient
+ lazy val inputSchema: StructType = child.dataType match {
+ case st: StructType => st
+ case other =>
+ throw new IllegalArgumentException(s"Unsupported input type ${other.catalogString}")
+ }
+
+ @transient
+ lazy val gen = new UnivocityGenerator(
+ inputSchema, writer, new CSVOptions(options, columnPruning = true, timeZoneId.get))
+
+ // This converts rows to the CSV output according to the given schema.
+ @transient
+ lazy val converter: Any => UTF8String = {
+ (row: Any) => UTF8String.fromString(gen.writeToString(row.asInstanceOf[InternalRow]))
+ }
+
+ override def dataType: DataType = StringType
+
+ override def withTimeZone(timeZoneId: String): TimeZoneAwareExpression =
+ copy(timeZoneId = Option(timeZoneId))
+
+ override def nullSafeEval(value: Any): Any = converter(value)
+
+ override def inputTypes: Seq[AbstractDataType] = TypeCollection(StructType) :: Nil
--- End diff --
I think we can use `StructType :: Nil` here.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #96933 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96933/testReport)** for PR 22626 at commit [`91512d7`](https://github.com/apache/spark/commit/91512d790123a6d32a8dc8a3961349ba6a3e01df).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Merged build finished. Test FAILed.
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r229981416
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala ---
@@ -45,7 +45,6 @@ class CsvFunctionsSuite extends QueryTest with SharedSQLContext {
Row(Row(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0"))))
}
-
--- End diff --
This seems to be a mistake.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #96933 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96933/testReport)** for PR 22626 at commit [`91512d7`](https://github.com/apache/spark/commit/91512d790123a6d32a8dc8a3961349ba6a3e01df).
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22626
@MaxGekk, BTW, I added `schema_of_csv` on the R side. You can add `schema_of_json` on the R side in a similar way.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98405 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98405/testReport)** for PR 22626 at commit [`230f789`](https://github.com/apache/spark/commit/230f7890d75ab9ed6041eb7d9be1aa01c9f82968).
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98405 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98405/testReport)** for PR 22626 at commit [`230f789`](https://github.com/apache/spark/commit/230f7890d75ab9ed6041eb7d9be1aa01c9f82968).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98336 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98336/testReport)** for PR 22626 at commit [`39f6899`](https://github.com/apache/spark/commit/39f689932ee2df194420fc63c7c5d9e351b09b86).
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/22626
@HyukjinKwon Could you look at this PR one more time, please?
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230544667
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
@@ -174,3 +176,66 @@ case class SchemaOfCsv(
override def prettyName: String = "schema_of_csv"
}
+
+/**
+ * Converts a [[StructType]] to a CSV output string.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+ usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
+ examples = """
+ Examples:
+ > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
+ 1,2
+ > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+ "26/08/2015"
+ """,
+ since = "3.0.0")
+// scalastyle:on line.size.limit
+case class StructsToCsv(
+ options: Map[String, String],
+ child: Expression,
+ timeZoneId: Option[String] = None)
+ extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes {
+ override def nullable: Boolean = true
+
+ def this(options: Map[String, String], child: Expression) = this(options, child, None)
+
+ // Used in `FunctionRegistry`
+ def this(child: Expression) = this(Map.empty, child, None)
+
+ def this(child: Expression, options: Expression) =
+ this(
+ options = ExprUtils.convertToMapData(options),
+ child = child,
+ timeZoneId = None)
+
+ @transient
+ lazy val writer = new CharArrayWriter()
+
+ @transient
+ lazy val inputSchema: StructType = child.dataType match {
+ case st: StructType => st
+ case other =>
+ throw new IllegalArgumentException(s"Unsupported input type ${other.catalogString}")
+ }
+
+ @transient
+ lazy val gen = new UnivocityGenerator(
+ inputSchema, writer, new CSVOptions(options, columnPruning = true, timeZoneId.get))
+
+ // This converts rows to the CSV output according to the given schema.
+ @transient
+ lazy val converter: Any => UTF8String = {
+ (row: Any) => UTF8String.fromString(gen.writeToString(row.asInstanceOf[InternalRow]))
--- End diff --
@MaxGekk, can we read the data from `writer` via `writer.toString` and then call `writer.reset()`, as `to_json` does? It looks like we are going to skip the header (which is fine). If we explicitly set `header` to `false` in this expression, it looks like we don't need to add `writeToString` to `UnivocityGenerator`.
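For illustration, the write / `toString` / `reset()` pattern suggested in this comment could look roughly like the sketch below. It uses only `java.io.CharArrayWriter`; `SimpleCsvGenerator`, `WriterReuseSketch` and `rowToCsv` are hypothetical names invented for this example and are not Spark's `UnivocityGenerator` or `StructsToCsv`. The quoting rule is a simplifying assumption, not the univocity behavior.

```scala
import java.io.CharArrayWriter

// Hypothetical stand-in for a row generator; NOT Spark's UnivocityGenerator.
final class SimpleCsvGenerator(writer: CharArrayWriter, delimiter: String = ",") {
  // Assumed quoting rule: quote a field only if it contains the delimiter,
  // a double quote, or a line break; double any embedded quotes.
  private def escape(field: String): String =
    if (field.contains(delimiter) || field.contains("\"") ||
        field.contains("\n") || field.contains("\r")) {
      "\"" + field.replace("\"", "\"\"") + "\""
    } else {
      field
    }

  // Appends one row to the shared writer: no header, no trailing newline.
  // Assumes the row holds no null values.
  def write(row: Seq[Any]): Unit =
    writer.write(row.map(v => escape(v.toString)).mkString(delimiter))
}

object WriterReuseSketch {
  private val writer = new CharArrayWriter()
  private val gen = new SimpleCsvGenerator(writer)

  // Write the row, take the buffered text, then reset the buffer
  // so that the next row starts from an empty writer.
  def rowToCsv(row: Seq[Any]): String = {
    gen.write(row)
    val csv = writer.toString
    writer.reset()
    csv
  }

  def main(args: Array[String]): Unit = {
    println(rowToCsv(Seq(1, 2)))       // prints 1,2
    println(rowToCsv(Seq("a,b", "c"))) // prints "a,b",c
  }
}
```

Because the buffer is reset after every row, the generator itself needs no string-returning method; the caller reads the result from the writer it already owns.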
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98428 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98428/testReport)** for PR 22626 at commit [`6969b49`](https://github.com/apache/spark/commit/6969b49812acd2664bca724378a3739cb7846a6a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r229981271
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
@@ -174,3 +176,66 @@ case class SchemaOfCsv(
override def prettyName: String = "schema_of_csv"
}
+
+/**
+ * Converts a [[StructType]] to a CSV output string.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+ usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
+ examples = """
+ Examples:
+ > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
+ 1,2
+ > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+ "26/08/2015"
+ """,
+ since = "3.0.0")
+// scalastyle:on line.size.limit
+case class StructsToCsv(
+ options: Map[String, String],
+ child: Expression,
+ timeZoneId: Option[String] = None)
--- End diff --
This seems to be an indentation mistake.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Merged build finished. Test FAILed.
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230544492
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityGenerator.scala ---
@@ -15,18 +15,17 @@
* limitations under the License.
*/
-package org.apache.spark.sql.execution.datasources.csv
+package org.apache.spark.sql.catalyst.csv
import java.io.Writer
import com.univocity.parsers.csv.CsvWriter
import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.catalyst.csv.CSVOptions
import org.apache.spark.sql.catalyst.util.DateTimeUtils
import org.apache.spark.sql.types._
-private[csv] class UnivocityGenerator(
+private[sql] class UnivocityGenerator(
--- End diff --
Let's remove `private[sql]`. We are already in an internal package `catalyst`.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98431 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98431/testReport)** for PR 22626 at commit [`1895cdc`](https://github.com/apache/spark/commit/1895cdc3540f67ad562e10488ac7ffe7012d9ccc).
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230027771
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala ---
@@ -45,7 +45,6 @@ class CsvFunctionsSuite extends QueryTest with SharedSQLContext {
Row(Row(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0"))))
}
-
--- End diff --
I explicitly removed the blank line between the tests. Do you think we need it?
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22626
add to whitelist
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Can one of the admins verify this patch?
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98352/
Test FAILed.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #96931 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96931/testReport)** for PR 22626 at commit [`91512d7`](https://github.com/apache/spark/commit/91512d790123a6d32a8dc8a3961349ba6a3e01df).
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22626
Merged to master.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98336 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98336/testReport)** for PR 22626 at commit [`39f6899`](https://github.com/apache/spark/commit/39f689932ee2df194420fc63c7c5d9e351b09b86).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98349/testReport)** for PR 22626 at commit [`142e8a2`](https://github.com/apache/spark/commit/142e8a26b28170250476f46a9109c4f98c2f18ee).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98352/testReport)** for PR 22626 at commit [`ee330ba`](https://github.com/apache/spark/commit/ee330ba7628ff69c4ad29685d32fbba49846d019).
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r229981376
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -3905,6 +3905,47 @@ object functions {
withExpr(SchemaOfCsv(csv.expr, options.asScala.toMap))
}
+ /**
+ * (Scala-specific) Converts a column containing a `StructType` into a CSV string
+ * with the specified schema. Throws an exception, in the case of an unsupported type.
+ *
+ * @param e a column containing a struct.
+ * @param options options to control how the struct column is converted into a CSV string.
+ * It accepts the same options as the CSV data source.
+ *
+ * @group collection_funcs
+ * @since 3.0.0
+ */
+ def to_csv(e: Column, options: Map[String, String]): Column = withExpr {
--- End diff --
Let's get rid of this Scala version for now.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98431/
Test PASSed.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98349 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98349/testReport)** for PR 22626 at commit [`142e8a2`](https://github.com/apache/spark/commit/142e8a26b28170250476f46a9109c4f98c2f18ee).
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98428/
Test PASSed.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98336/
Test PASSed.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Can one of the admins verify this patch?
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96933/
Test FAILed.
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230559020
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityGenerator.scala ---
@@ -15,18 +15,17 @@
* limitations under the License.
*/
-package org.apache.spark.sql.execution.datasources.csv
+package org.apache.spark.sql.catalyst.csv
import java.io.Writer
import com.univocity.parsers.csv.CsvWriter
import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.catalyst.csv.CSVOptions
import org.apache.spark.sql.catalyst.util.DateTimeUtils
import org.apache.spark.sql.types._
-private[csv] class UnivocityGenerator(
+private[sql] class UnivocityGenerator(
--- End diff --
removed
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98428/testReport)** for PR 22626 at commit [`6969b49`](https://github.com/apache/spark/commit/6969b49812acd2664bca724378a3739cb7846a6a).
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230577431
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
@@ -174,3 +176,68 @@ case class SchemaOfCsv(
override def prettyName: String = "schema_of_csv"
}
+
+/**
+ * Converts a [[StructType]] to a CSV output string.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+ usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
+ examples = """
+ Examples:
+ > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
+ 1,2
+ > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+ "26/08/2015"
+ """,
+ since = "3.0.0")
+// scalastyle:on line.size.limit
+case class StructsToCsv(
+ options: Map[String, String],
+ child: Expression,
+ timeZoneId: Option[String] = None)
+ extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes {
+ override def nullable: Boolean = true
+
+ def this(options: Map[String, String], child: Expression) = this(options, child, None)
+
+ // Used in `FunctionRegistry`
+ def this(child: Expression) = this(Map.empty, child, None)
+
+ def this(child: Expression, options: Expression) =
+ this(
+ options = ExprUtils.convertToMapData(options),
+ child = child,
+ timeZoneId = None)
+
+ @transient
+ lazy val writer = new CharArrayWriter()
+
+ @transient
+ lazy val inputSchema: StructType = child.dataType match {
+ case st: StructType => st
+ case other =>
+ throw new IllegalArgumentException(s"Unsupported input type ${other.catalogString}")
+ }
+
+ @transient
+ lazy val gen = new UnivocityGenerator(
+ inputSchema, writer, new CSVOptions(options, columnPruning = true, timeZoneId.get))
--- End diff --
nit: In that case we wouldn't need `lazy val writer`; we could just pass `new CharArrayWriter()` here.
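A minimal sketch of what this nit suggests, under the assumption that nothing outside the generator reads the writer once the generator returns the row string itself. `RowGenerator` and `InlineWriterSketch` are hypothetical stand-ins and do not reproduce `UnivocityGenerator` or `StructsToCsv`:

```scala
import java.io.CharArrayWriter

// Hypothetical stand-in for a generator that owns its writer and
// hands back each row as a string.
final class RowGenerator(writer: CharArrayWriter) {
  def writeToString(fields: Seq[String]): String = {
    writer.reset()                      // drop whatever the previous row left behind
    writer.write(fields.mkString(","))  // naive join, no quoting or escaping
    writer.toString
  }
}

object InlineWriterSketch {
  // The writer is constructed at the call site, so no separate
  // `lazy val writer` field is needed.
  lazy val gen = new RowGenerator(new CharArrayWriter())
}
```

For example, `InlineWriterSketch.gen.writeToString(Seq("1", "2"))` yields `1,2`.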
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22626
This needs to be rebased.
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230559006
--- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
@@ -15,3 +15,10 @@ CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a
SELECT schema_of_csv(csvField) FROM csvTable;
-- Clean up
DROP VIEW IF EXISTS csvTable;
+-- to_csv
+select to_csv(named_struct('a', 1, 'b', 2));
+select to_csv(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+-- Check if errors handled
+select to_csv(named_struct('a', 1, 'b', 2), named_struct('mode', 'PERMISSIVE'));
+select to_csv(named_struct('a', 1, 'b', 2), map('mode', 1));
--- End diff --
I removed `select to_csv()`.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Merged build finished. Test FAILed.
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230544775
--- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
@@ -15,3 +15,10 @@ CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a
SELECT schema_of_csv(csvField) FROM csvTable;
-- Clean up
DROP VIEW IF EXISTS csvTable;
+-- to_csv
+select to_csv(named_struct('a', 1, 'b', 2));
+select to_csv(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+-- Check if errors handled
+select to_csv(named_struct('a', 1, 'b', 2), named_struct('mode', 'PERMISSIVE'));
+select to_csv(named_struct('a', 1, 'b', 2), map('mode', 1));
--- End diff --
This one too, since the exception comes from `convertToMapData`. We only need one test: this one or the one right above. One of them can be removed.
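To illustrate why the two error cases are redundant, here is a rough sketch in which a single conversion step rejects both a non-map options argument and a map with non-string values. All names (`OptionsArg`, `MapArg`, `StructArg`, `OptionsSketch.toStringMap`) and the error message are hypothetical; this is not the implementation of `ExprUtils.convertToMapData`:

```scala
// Hypothetical model of the two shapes the options argument can take in the tests.
sealed trait OptionsArg
final case class MapArg(entries: Map[String, Any]) extends OptionsArg
final case class StructArg(fields: Seq[(String, Any)]) extends OptionsArg

object OptionsSketch {
  // Single validation point: only a map whose values are all strings is accepted.
  def toStringMap(arg: OptionsArg): Map[String, String] = arg match {
    case MapArg(entries) if entries.values.forall(_.isInstanceOf[String]) =>
      entries.map { case (k, v) => k -> v.asInstanceOf[String] }
    case _ =>
      // Both `named_struct('mode', 'PERMISSIVE')` and `map('mode', 1)` end up here.
      throw new IllegalArgumentException(
        "options must be a map with string keys and string values")
  }
}
```

Because both invalid inputs fail in the same place, one test is enough to cover that code path.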
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98349/
Test FAILed.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96931/
Test PASSed.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98352 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98352/testReport)** for PR 22626 at commit [`ee330ba`](https://github.com/apache/spark/commit/ee330ba7628ff69c4ad29685d32fbba49846d019).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230555774
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
@@ -174,3 +176,66 @@ case class SchemaOfCsv(
override def prettyName: String = "schema_of_csv"
}
+
+/**
+ * Converts a [[StructType]] to a CSV output string.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+ usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
+ examples = """
+ Examples:
+ > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
+ 1,2
+ > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+ "26/08/2015"
+ """,
+ since = "3.0.0")
+// scalastyle:on line.size.limit
+case class StructsToCsv(
+ options: Map[String, String],
+ child: Expression,
+ timeZoneId: Option[String] = None)
+ extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes {
+ override def nullable: Boolean = true
+
+ def this(options: Map[String, String], child: Expression) = this(options, child, None)
+
+ // Used in `FunctionRegistry`
+ def this(child: Expression) = this(Map.empty, child, None)
+
+ def this(child: Expression, options: Expression) =
+ this(
+ options = ExprUtils.convertToMapData(options),
+ child = child,
+ timeZoneId = None)
+
+ @transient
+ lazy val writer = new CharArrayWriter()
+
+ @transient
+ lazy val inputSchema: StructType = child.dataType match {
+ case st: StructType => st
+ case other =>
+ throw new IllegalArgumentException(s"Unsupported input type ${other.catalogString}")
+ }
+
+ @transient
+ lazy val gen = new UnivocityGenerator(
+ inputSchema, writer, new CSVOptions(options, columnPruning = true, timeZoneId.get))
+
+ // This converts rows to the CSV output according to the given schema.
+ @transient
+ lazy val converter: Any => UTF8String = {
+ (row: Any) => UTF8String.fromString(gen.writeToString(row.asInstanceOf[InternalRow]))
--- End diff --
I initially tried to use those functions, but had to add `writeToString` in commit https://github.com/apache/spark/pull/22626/commits/19f58a403b248f936e30091fedd5eac17eb24f25 because `uniVocity` appended `\n` to the end of the strings.
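The trailing-newline problem can be sketched as follows. This assumes a record writer that terminates every record with `\n`; `writeRecord` and `rowToString` are hypothetical helpers and do not show how `writeToString` is actually implemented in the PR:

```scala
import java.io.CharArrayWriter

object TrailingNewlineSketch {
  // Hypothetical record writer that, like a file-oriented CSV writer,
  // ends each record with a line terminator.
  def writeRecord(out: CharArrayWriter, fields: Seq[String]): Unit = {
    out.write(fields.mkString(","))
    out.write("\n")
  }

  // A per-row string function must not carry that terminator,
  // so it is stripped before the value is returned.
  def rowToString(fields: Seq[String]): String = {
    val out = new CharArrayWriter()
    writeRecord(out, fields)
    out.toString.stripLineEnd
  }
}
```

Without the final strip, `to_csv`-style output for `named_struct('a', 1, 'b', 2)` would be `1,2\n` rather than `1,2`.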
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230544760
--- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
@@ -15,3 +15,10 @@ CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a
SELECT schema_of_csv(csvField) FROM csvTable;
-- Clean up
DROP VIEW IF EXISTS csvTable;
+-- to_csv
+select to_csv(named_struct('a', 1, 'b', 2));
+select to_csv(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+-- Check if errors handled
+select to_csv(named_struct('a', 1, 'b', 2), named_struct('mode', 'PERMISSIVE'));
+select to_csv(named_struct('a', 1, 'b', 2), map('mode', 1));
+select to_csv();
--- End diff --
I don't think we need to test this, since it's not specific to this expression.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/22626
@cloud-fan @gatorsmile Could you take a look at this PR, please?
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230028296
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala ---
@@ -174,3 +176,66 @@ case class SchemaOfCsv(
override def prettyName: String = "schema_of_csv"
}
+
+/**
+ * Converts a [[StructType]] to a CSV output string.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+ usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value",
+ examples = """
+ Examples:
+ > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
+ 1,2
+ > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+ "26/08/2015"
+ """,
+ since = "3.0.0")
+// scalastyle:on line.size.limit
+case class StructsToCsv(
+ options: Map[String, String],
+ child: Expression,
+ timeZoneId: Option[String] = None)
--- End diff --
I had just hoped `scalastyle` would catch the mistakes made by IntelliJ IDEA.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #98431 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98431/testReport)** for PR 22626 at commit [`1895cdc`](https://github.com/apache/spark/commit/1895cdc3540f67ad562e10488ac7ffe7012d9ccc).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98405/
Test PASSed.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22626
**[Test build #96931 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96931/testReport)** for PR 22626 at commit [`91512d7`](https://github.com/apache/spark/commit/91512d790123a6d32a8dc8a3961349ba6a3e01df).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22626
ok to test
---
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22626
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22626#discussion_r230544556
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala ---
@@ -45,7 +45,6 @@ class CsvFunctionsSuite extends QueryTest with SharedSQLContext {
Row(Row(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0"))))
}
-
--- End diff --
Ah, I'd prefer not to include unrelated changes, but it's okay.
---
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/22626
---