You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by henryr <gi...@git.apache.org> on 2018/01/23 01:26:48 UTC
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
GitHub user henryr opened a pull request:
https://github.com/apache/spark/pull/20355
SPARK-23148: [SQL] Allow pathnames with special characters for CSV / …
…JSON / text
## What changes were proposed in this pull request?
Fix for JSON and CSV data sources when file names include characters
that would be changed by URL encoding.
## How was this patch tested?
New unit tests for JSON, CSV and text suites
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/henryr/spark spark-23148
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20355.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20355
----
commit 98da4cd4b865ef8944a10a24ee709ed9ee0a4721
Author: Henry Robinson <he...@...>
Date: 2018-01-19T22:55:53Z
SPARK-23148: [SQL] Allow pathnames with special characters for CSV / JSON / text
## What changes were proposed in this pull request?
Fix for JSON and CSV data sources when file names include characters
that would be changed by URL encoding.
## How was this patch tested?
New unit tests for JSON, CSV and text suites
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86554/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86577 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86577/testReport)** for PR 20355 at commit [`51b0db5`](https://github.com/apache/spark/commit/51b0db5272858995a3b3e389a851f4e4aa04c651).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86502/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86552/testReport)** for PR 20355 at commit [`740def4`](https://github.com/apache/spark/commit/740def4c9a96a7dba5a8f57c49042dee661608b6).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163137474
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/text/TextSuite.scala ---
@@ -172,6 +172,14 @@ class TextSuite extends QueryTest with SharedSQLContext {
}
}
+ test("SPARK-23148: test for spaces in file names") {
--- End diff --
Shall we remove this test case and only leave test CSV / JSON with `multiLine` enabled? I think other tests are basically duplicates of https://github.com/apache/spark/blob/a0aedb0ded4183cc33b27e369df1cbf862779e26/sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala#L71.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86503/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20355
Merged to master and branch-2.3.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20355
Let's don't forget to fix the PR title to `[SPARK-23148][SQL] ... `. This style is actually documented to be encouraged in this project contributing guide.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163434591
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala ---
@@ -23,6 +23,7 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext {
import testImplicits._
private val allFileBasedDataSources = Seq("orc", "parquet", "csv", "json", "text")
+ private val nameWithSpecialChars = s"sp&cial%c hars"
--- End diff --
nit: seems prefix `s` can be removed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86552/testReport)** for PR 20355 at commit [`740def4`](https://github.com/apache/spark/commit/740def4c9a96a7dba5a8f57c49042dee661608b6).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/117/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86554 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86554/testReport)** for PR 20355 at commit [`1b9420a`](https://github.com/apache/spark/commit/1b9420a4d6fd2b202d6b076405621773d8f9e80d).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86505 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86505/testReport)** for PR 20355 at commit [`9c56736`](https://github.com/apache/spark/commit/9c567368105522537977437532c3e45c6fd07296).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163129347
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala ---
@@ -2063,4 +2063,12 @@ class JsonSuite extends QueryTest with SharedSQLContext with TestJsonData {
)
}
}
+
+ test("SPARK-23148: test for spaces in file names") {
+ withTempPath { dir =>
--- End diff --
Could we test `multiLine` here too?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by henryr <gi...@git.apache.org>.
Github user henryr commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163455727
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala ---
@@ -78,4 +78,20 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext {
}
}
}
+
+ // Separate test case for text-based formats that support multiLine as an option.
+ Seq("json", "csv", "text").foreach { format =>
--- End diff --
That sounds good - my main concern is to make sure that both `multiLine=true` _and_ `multiLine=false` have coverage with a space in the name, since they are such different paths. I'll keep the change that adds a space to `nameWithSpecialChars`, but otherwise have the tests as you suggest - let me know what you think of the next patch!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86502 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86502/testReport)** for PR 20355 at commit [`98da4cd`](https://github.com/apache/spark/commit/98da4cd4b865ef8944a10a24ee709ed9ee0a4721).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by henryr <gi...@git.apache.org>.
Github user henryr commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163411267
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/text/TextSuite.scala ---
@@ -172,6 +172,14 @@ class TextSuite extends QueryTest with SharedSQLContext {
}
}
+ test("SPARK-23148: test for spaces in file names") {
--- End diff --
In the end, to reduce code duplication, I made it so that orc and parquet run multiline as well (I tried to find a neat way to only run multiline if the format was csv, text or json without having a separate test case but it just complicated things). Let me know if you'd rather I have two separate test cases to avoid running the two redundant cases with orc / parquet.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: [SPARK-23148][SQL] Allow pathnames with special c...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/20355
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86577/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86577 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86577/testReport)** for PR 20355 at commit [`51b0db5`](https://github.com/apache/spark/commit/51b0db5272858995a3b3e389a851f4e4aa04c651).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86503 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86503/testReport)** for PR 20355 at commit [`535234c`](https://github.com/apache/spark/commit/535234c57a9b2633c8dbd061577766c24bdbebca).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by henryr <gi...@git.apache.org>.
Github user henryr commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163146077
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/text/TextSuite.scala ---
@@ -172,6 +172,14 @@ class TextSuite extends QueryTest with SharedSQLContext {
}
}
+ test("SPARK-23148: test for spaces in file names") {
--- End diff --
@HyukjinKwon good idea, thanks for pointing out that test - how about I just add a space to the special characters string, and have the existing test also test multiline on/off for text, csv and json?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86564 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86564/testReport)** for PR 20355 at commit [`51b0db5`](https://github.com/apache/spark/commit/51b0db5272858995a3b3e389a851f4e4aa04c651).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86505/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86552/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/171/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20355
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/116/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86554 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86554/testReport)** for PR 20355 at commit [`1b9420a`](https://github.com/apache/spark/commit/1b9420a4d6fd2b202d6b076405621773d8f9e80d).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/183/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86564/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163414704
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala ---
@@ -68,13 +68,16 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext {
}
allFileBasedDataSources.foreach { format =>
- test(s"SPARK-22146 read files containing special characters using $format") {
- val nameWithSpecialChars = s"sp&cial%chars"
- withTempDir { dir =>
- val tmpFile = s"$dir/$nameWithSpecialChars"
- spark.createDataset(Seq("a", "b")).write.format(format).save(tmpFile)
- val fileContent = spark.read.format(format).load(tmpFile)
- checkAnswer(fileContent, Seq(Row("a"), Row("b")))
+ test(s"SPARK-22146 / SPARK-23148 read files containing special characters using $format") {
+ val nameWithSpecialChars = s"sp&cial%c hars"
+ Seq(true, false).foreach { multiline =>
--- End diff --
Less dup is fine but this case slightly confuses like orc and parquet support multiline, and runs duplicated tests as you pointed out if I should nitpick. I think I prefer a separate test.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86503 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86503/testReport)** for PR 20355 at commit [`535234c`](https://github.com/apache/spark/commit/535234c57a9b2633c8dbd061577766c24bdbebca).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163152164
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/text/TextSuite.scala ---
@@ -172,6 +172,14 @@ class TextSuite extends QueryTest with SharedSQLContext {
}
}
+ test("SPARK-23148: test for spaces in file names") {
--- End diff --
Alternatively, adding the test like below in `FileBasedDataSourceSuite.scala`, instead of `CSVSuite` and `JsonSuite`:
```scala
// Only CSV/JSON supports multiLine option and the code paths are different blabla ..
Seq("csv", "json").foreach { format =>
test(s"SPARK-23148 read files containing special characters using $format - multLine enabled") {
val nameWithSpecialChars = s"sp&ci al%chars"
withTempDir { dir =>
... spark.read.option("multuLine", true).format(format)...
}
}
}
```
could be also fine if possible. Either way is fine to me.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163149134
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/text/TextSuite.scala ---
@@ -172,6 +172,14 @@ class TextSuite extends QueryTest with SharedSQLContext {
}
}
+ test("SPARK-23148: test for spaces in file names") {
--- End diff --
Let's just add both `multiLine` specific tests for `CSVSuite` and `JsonSuite` and add a space in "SPARK-22146 read files containing special characters using ..." if you prefer this way.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163434524
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala ---
@@ -78,4 +78,20 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext {
}
}
}
+
+ // Separate test case for text-based formats that support multiLine as an option.
+ Seq("json", "csv", "text").foreach { format =>
--- End diff --
Actually "text" doesn't support `multiLine` but `wholetext` which runs another code path. Let's take this out.
How about leaving out the tests for `SPARK-22146` as was, and just adding a test dedicated for `multiline` here like:
```scala
Seq("json", "csv").foreach { format =>
test("SPARK-23148 read files containing special characters " +
s"using $format multiline enabled") {
withTempDir { dir =>
val tmpFile = s"$dir/$nameWithSpecialChars"
spark.createDataset(Seq("a", "b")).write.format(format).save(tmpFile)
val reader = spark.read.format(format).option("multiLine", true)
val fileContent = reader.load(tmpFile)
checkAnswer(fileContent, Seq(Row("a"), Row("b")))
}
}
}
```
This PR changes really fix the code paths for both CSV / JSON when `multiLine` is enabled only .. I am not sure why you don't like this suggestion .. one test less invasive and more targeted for one JIRA ...
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/160/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20355: SPARK-23148: [SQL] Allow pathnames with special c...
Posted by henryr <gi...@git.apache.org>.
Github user henryr commented on a diff in the pull request:
https://github.com/apache/spark/pull/20355#discussion_r163422934
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala ---
@@ -68,13 +68,16 @@ class FileBasedDataSourceSuite extends QueryTest with SharedSQLContext {
}
allFileBasedDataSources.foreach { format =>
- test(s"SPARK-22146 read files containing special characters using $format") {
- val nameWithSpecialChars = s"sp&cial%chars"
- withTempDir { dir =>
- val tmpFile = s"$dir/$nameWithSpecialChars"
- spark.createDataset(Seq("a", "b")).write.format(format).save(tmpFile)
- val fileContent = spark.read.format(format).load(tmpFile)
- checkAnswer(fileContent, Seq(Row("a"), Row("b")))
+ test(s"SPARK-22146 / SPARK-23148 read files containing special characters using $format") {
+ val nameWithSpecialChars = s"sp&cial%c hars"
+ Seq(true, false).foreach { multiline =>
--- End diff --
Sounds good to me.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86564 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86564/testReport)** for PR 20355 at commit [`51b0db5`](https://github.com/apache/spark/commit/51b0db5272858995a3b3e389a851f4e4aa04c651).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20355
Please fix the PR title.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: [SPARK-23148][SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86505 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86505/testReport)** for PR 20355 at commit [`9c56736`](https://github.com/apache/spark/commit/9c567368105522537977437532c3e45c6fd07296).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/119/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20355
**[Test build #86502 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86502/testReport)** for PR 20355 at commit [`98da4cd`](https://github.com/apache/spark/commit/98da4cd4b865ef8944a10a24ee709ed9ee0a4721).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20355: SPARK-23148: [SQL] Allow pathnames with special characte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20355
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/162/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org