Posted to reviews@spark.apache.org by j-baker <gi...@git.apache.org> on 2018/11/05 10:42:25 UTC
[GitHub] spark pull request #22946: [SPARK-25943][SQL] Fail if mismatching nested str...
GitHub user j-baker opened a pull request:
https://github.com/apache/spark/pull/22946
[SPARK-25943][SQL] Fail if mismatching nested struct fields when writing to datasource
At present, Spark reorders mismatched columns when writing to
a datasource, but does not reorder the fields of nested structs.
Such a write currently fails if the nested field types do not match,
but not if only the field names do not match, so structs get
silently mangled.
It's not obvious to me where I should add tests, so I would
appreciate guidance on that!
## What changes were proposed in this pull request?
Unify DatasourcesV1 behaviour with DatasourcesV2, by throwing if the names are mismatched.
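To make the proposed check concrete, here is a minimal toy sketch in Python. It is not Spark code and not the patch itself: the function `check_struct` and the list-of-tuples schema representation are hypothetical, chosen only to illustrate a recursive, positional comparison that throws when nested field names differ instead of silently accepting them.

```python
def check_struct(expected, actual, path="<root>"):
    """Compare two schemas position by position and raise on any mismatch.

    A schema is a list of (name, type) pairs; a type is either a string
    such as "int" or a nested list of pairs representing a struct.
    This representation is an assumption made for illustration only.
    """
    if len(expected) != len(actual):
        raise ValueError(f"{path}: expected {len(expected)} fields, got {len(actual)}")
    for (exp_name, exp_type), (act_name, act_type) in zip(expected, actual):
        field_path = f"{path}.{exp_name}"
        # A name mismatch is an error, rather than being accepted by position.
        if exp_name != act_name:
            raise ValueError(f"{field_path}: name mismatch, got '{act_name}'")
        exp_nested = isinstance(exp_type, list)
        if exp_nested != isinstance(act_type, list):
            raise ValueError(f"{field_path}: struct/non-struct mismatch")
        if exp_nested:
            # Recurse so that nested structs get the same check as top-level columns.
            check_struct(exp_type, act_type, field_path)
        elif exp_type != act_type:
            raise ValueError(f"{field_path}: type mismatch, expected {exp_type}, got {act_type}")


table_schema = [("id", "int"), ("point", [("x", "int"), ("y", "int")])]
# Same types, but the nested fields arrive in the opposite order.
swapped_query = [("id", "int"), ("point", [("y", "int"), ("x", "int")])]

check_struct(table_schema, table_schema)  # identical schemas pass silently
try:
    check_struct(table_schema, swapped_query)
except ValueError as err:
    print(err)
```

In this toy model, a types-only check would accept `swapped_query`, because both nested fields are `int`. That is the situation described above, where `x` and `y` would be written into each other's slots without any error.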
## How was this patch tested?
It has not been tested yet. I'd really appreciate a suggestion of where to put tests for this; I'm unfamiliar with the codebase and I couldn't find any obvious-looking tests of that codepath.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/j-baker/spark jbaker/nested_struct
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22946.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22946
----
commit 63dd40f47ab8e8e9c120a9801b2f037336001ea6
Author: James Baker <jb...@...>
Date: 2018-11-05T10:35:02Z
[SPARK-25943][SQL] Fail if mismatching nested struct fields when writing to datasource
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98851/
Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/22946
Ok to test
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98505/
Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by j-baker <gi...@git.apache.org>.
Github user j-baker commented on the issue:
https://github.com/apache/spark/pull/22946
I have added a test and have hopefully fixed the existing ones!
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Merged build finished. Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98500/testReport)** for PR 22946 at commit [`63dd40f`](https://github.com/apache/spark/commit/63dd40f47ab8e8e9c120a9801b2f037336001ea6).
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98851 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98851/testReport)** for PR 22946 at commit [`0a558c4`](https://github.com/apache/spark/commit/0a558c46b328e0db2048c5c39baf483b37d89a2a).
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98847 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98847/testReport)** for PR 22946 at commit [`b6a191a`](https://github.com/apache/spark/commit/b6a191a2c250db89f579c52229cd0044e7464284).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Merged build finished. Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Merged build finished. Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98499/testReport)** for PR 22946 at commit [`63dd40f`](https://github.com/apache/spark/commit/63dd40f47ab8e8e9c120a9801b2f037336001ea6).
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98500/testReport)** for PR 22946 at commit [`63dd40f`](https://github.com/apache/spark/commit/63dd40f47ab8e8e9c120a9801b2f037336001ea6).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98500/
Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98505 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98505/testReport)** for PR 22946 at commit [`e6f56d3`](https://github.com/apache/spark/commit/e6f56d3974fd7026519d84f446cde507bfd72419).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Can one of the admins verify this patch?
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98505 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98505/testReport)** for PR 22946 at commit [`e6f56d3`](https://github.com/apache/spark/commit/e6f56d3974fd7026519d84f446cde507bfd72419).
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Can one of the admins verify this patch?
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by j-baker <gi...@git.apache.org>.
Github user j-baker commented on the issue:
https://github.com/apache/spark/pull/22946
Hm, it seems that we consider this behaviour to be a feature, and there is a test for this specific thing. This PR would additionally break the following statements:
```sql
CREATE TABLE t5(i1 INT, t5 STRUCT<i1:INT, i2:INT>) USING parquet;
INSERT INTO t5 VALUES(1, (2, 3));
```
Not really sure how to proceed.
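One way to see why a strict name check could reject the `INSERT` above is sketched below in Python. It is a toy illustration, not Spark code. It assumes that the unnamed struct literal `(2, 3)` is given positional field names such as `col1` and `col2`; that naming is an assumption here, and `mismatched_names` is a hypothetical helper.

```python
# Field names declared for the nested column in the CREATE TABLE statement.
table_struct_fields = ["i1", "i2"]

# Assumption: an unnamed struct literal such as (2, 3) carries generated,
# positional field names rather than the names declared on the table.
literal_struct_fields = ["col1", "col2"]


def mismatched_names(expected, actual):
    """Return the (expected, actual) name pairs that differ at the same position."""
    return [(e, a) for e, a in zip(expected, actual) if e != a]


# Every nested field name differs, so a check that throws on any name
# mismatch would reject this insert even though the types line up.
print(mismatched_names(table_struct_fields, literal_struct_fields))
```

Under that assumption, throwing on every nested name mismatch would also reject positional struct literals, which is the tension described in this comment.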
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98852 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98852/testReport)** for PR 22946 at commit [`417582d`](https://github.com/apache/spark/commit/417582d52af49d4a59e1085f205f2ddf99b900f6).
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98499/
Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98852 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98852/testReport)** for PR 22946 at commit [`417582d`](https://github.com/apache/spark/commit/417582d52af49d4a59e1085f205f2ddf99b900f6).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98852/
Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98851 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98851/testReport)** for PR 22946 at commit [`0a558c4`](https://github.com/apache/spark/commit/0a558c46b328e0db2048c5c39baf483b37d89a2a).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/22946
@cloud-fan for review.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98847/
Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22946
Ah, good catch! Can you also add a test?
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22946
ok to test
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Merged build finished. Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Can one of the admins verify this patch?
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98499 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98499/testReport)** for PR 22946 at commit [`63dd40f`](https://github.com/apache/spark/commit/63dd40f47ab8e8e9c120a9801b2f037336001ea6).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Merged build finished. Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22946
Merged build finished. Test FAILed.
---
[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22946
**[Test build #98847 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98847/testReport)** for PR 22946 at commit [`b6a191a`](https://github.com/apache/spark/commit/b6a191a2c250db89f579c52229cd0044e7464284).
---