Posted to reviews@spark.apache.org by MaxGekk <gi...@git.apache.org> on 2018/12/05 12:16:27 UTC
[GitHub] spark pull request #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial resul...
GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/23235
[SPARK-26151][SQL][FOLLOWUP] Return partial results for bad CSV records
## What changes were proposed in this pull request?
Updated SQL migration guide according to changes in https://github.com/apache/spark/pull/23120
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MaxGekk/spark-1 failuresafe-partial-result-followup
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23235.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23235
----
commit 8c115f7871d4db66b13ee21ea3a1231f7153791e
Author: Maxim Gekk <ma...@...>
Date: 2018-12-05T12:13:26Z
Updating the migration guide
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial resul...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/23235#discussion_r239055208
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -35,6 +35,8 @@ displayTitle: Spark SQL Upgrading Guide
- Since Spark 3.0, CSV datasource uses java.time API for parsing and generating CSV content. New formatting implementation supports date/timestamp patterns conformed to ISO 8601. To switch back to the implementation used in Spark 2.4 and earlier, set `spark.sql.legacy.timeParser.enabled` to `true`.
+ - In Spark version 2.4 and earlier, CSV datasource converts a malformed CSV string to a row with all `null`s in the PERMISSIVE mode if specified schema is `StructType`. Since Spark 3.0, returned row can contain non-`null` fields if some of CSV column values were parsed and converted to desired types successfully.
--- End diff --
You are right. I will remove the part about `StructType`.
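To make the behavior change in the migration note above concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's parser: the names `SCHEMA`, `parse_permissive_legacy`, and `parse_permissive_partial` are hypothetical, the comma split ignores quoting, and converting each field independently is an assumption, since the thread does not say how Spark treats fields that follow the first failing one.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

# Hypothetical schema: (column name, converter) pairs standing in for a Spark schema.
Schema = Sequence[Tuple[str, Callable[[str], Any]]]
SCHEMA: Schema = [("a", int), ("b", float)]


def parse_permissive_legacy(line: str, schema: Schema) -> List[Optional[Any]]:
    """Toy model of Spark 2.4 and earlier: one bad field nulls the whole row."""
    tokens = line.split(",")
    try:
        if len(tokens) != len(schema):
            raise ValueError("unexpected number of columns")
        return [convert(token) for (_, convert), token in zip(schema, tokens)]
    except ValueError:
        return [None] * len(schema)


def parse_permissive_partial(line: str, schema: Schema) -> List[Optional[Any]]:
    """Toy model of Spark 3.0: successfully converted fields are kept."""
    tokens = line.split(",")
    row: List[Optional[Any]] = []
    for index, (_, convert) in enumerate(schema):
        try:
            row.append(convert(tokens[index]))
        except (ValueError, IndexError):
            # Assumption: a failed or missing field becomes None on its own.
            row.append(None)
    return row


malformed = "1,abc"  # "abc" cannot be converted to float
print(parse_permissive_legacy(malformed, SCHEMA))   # [None, None]
print(parse_permissive_partial(malformed, SCHEMA))  # [1, None]
```

For a well-formed line such as `2,3.5`, both functions return the same row, so the difference only shows up for records that PERMISSIVE mode treats as malformed.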
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23235
Thanks, merging to master!
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23235
**[Test build #99720 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99720/testReport)** for PR 23235 at commit [`8c115f7`](https://github.com/apache/spark/commit/8c115f7871d4db66b13ee21ea3a1231f7153791e).
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23235
**[Test build #99728 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99728/testReport)** for PR 23235 at commit [`463f9e1`](https://github.com/apache/spark/commit/463f9e16ead2291a7f0f3893e485a56b77da2f06).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial resul...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/23235
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23235
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5771/
Test PASSed.
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23235
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial resul...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23235#discussion_r239049825
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -35,6 +35,8 @@ displayTitle: Spark SQL Upgrading Guide
- Since Spark 3.0, CSV datasource uses java.time API for parsing and generating CSV content. New formatting implementation supports date/timestamp patterns conformed to ISO 8601. To switch back to the implementation used in Spark 2.4 and earlier, set `spark.sql.legacy.timeParser.enabled` to `true`.
+ - In Spark version 2.4 and earlier, CSV datasource converts a malformed CSV string to a row with all `null`s in the PERMISSIVE mode if specified schema is `StructType`. Since Spark 3.0, returned row can contain non-`null` fields if some of CSV column values were parsed and converted to desired types successfully.
--- End diff --
Ah, `from_csv` and `to_csv` are added in 3.0, so they are intentionally not mentioned. BTW, I think CSV functionality can only have `StructType`, so maybe we don't have to mention it.
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23235
**[Test build #99720 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99720/testReport)** for PR 23235 at commit [`8c115f7`](https://github.com/apache/spark/commit/8c115f7871d4db66b13ee21ea3a1231f7153791e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23235
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99728/
Test PASSed.
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23235
Merged build finished. Test PASSed.
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23235
@cloud-fan Please have a look at the PR.
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23235
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5765/
Test PASSed.
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23235
Can one of the admins verify this patch?
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23235
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99720/
Test PASSed.
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23235
Merged build finished. Test PASSed.
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23235
Merged build finished. Test PASSed.
---
[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23235
**[Test build #99728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99728/testReport)** for PR 23235 at commit [`463f9e1`](https://github.com/apache/spark/commit/463f9e16ead2291a7f0f3893e485a56b77da2f06).
---