Posted to reviews@spark.apache.org by j-baker <gi...@git.apache.org> on 2018/11/05 10:42:25 UTC

[GitHub] spark pull request #22946: [SPARK-25943][SQL] Fail if mismatching nested str...

GitHub user j-baker opened a pull request:

    https://github.com/apache/spark/pull/22946

    [SPARK-25943][SQL] Fail if mismatching nested struct fields when writing to datasource

    At present, Spark reorders mismatched columns when writing to
    a datasource, but does not reorder the fields of nested structs.
    
    As a result, a write fails if the nested field types do not
    match, but succeeds if only the field names do not match, so
    structs get silently mangled.
    
    It's not obvious to me where I should add tests, so I would
    appreciate guidance on that!
    
    ## What changes were proposed in this pull request?
    
    Unify DatasourcesV1 behaviour with DatasourcesV2, by throwing if the names are mismatched.
    
    ## How was this patch tested?
    
    It has not been tested yet. I'd really appreciate a suggestion of where to put tests for this; I'm unfamiliar with the codebase and couldn't find any obvious-looking tests of that code path.
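
The mismatch described above can be illustrated with a minimal Python sketch. It is not Spark's implementation: schemas are modelled as plain lists of `(name, type)` pairs, and `positional_write`, `nested_name_mismatches` and `check_nested_write` are hypothetical helper names invented for this illustration. The first helper shows how a purely positional write of a nested struct can silently swap values when only the field names differ; the last one shows the kind of name check the pull request proposes, raising instead of writing.

```python
def positional_write(source_values, target_fields):
    """Assign values to target fields purely by position, ignoring source names."""
    return {name: value for (name, _), value in zip(target_fields, source_values)}


def nested_name_mismatches(source_fields, target_fields, path=""):
    """Collect (target_path, source_name) pairs where names differ at the same position."""
    mismatches = []
    for (src_name, src_type), (dst_name, dst_type) in zip(source_fields, target_fields):
        here = f"{path}.{dst_name}" if path else dst_name
        if src_name != dst_name:
            mismatches.append((here, src_name))
        # Recurse only when both sides are structs (modelled here as lists of fields).
        if isinstance(src_type, list) and isinstance(dst_type, list):
            mismatches.extend(nested_name_mismatches(src_type, dst_type, here))
    return mismatches


def check_nested_write(source_fields, target_fields):
    """Raise if any field name differs from the target at the same position."""
    bad = nested_name_mismatches(source_fields, target_fields)
    if bad:
        details = ", ".join(f"{dst} (got {src})" for dst, src in bad)
        raise ValueError(f"Mismatched nested struct field names: {details}")


# Hypothetical table struct<a:int, b:int> and incoming struct<b:int, a:int>.
target = [("a", "int"), ("b", "int")]
source = [("b", "int"), ("a", "int")]

# The incoming value means b=1, a=2, but a positional write stores a=1, b=2.
mangled = positional_write((1, 2), target)
print(mangled)

try:
    check_nested_write(source, target)
except ValueError as err:
    print(err)
```

Because the types on both sides are identical, a type-only check would accept this write; only a name comparison catches the swap.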

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/j-baker/spark jbaker/nested_struct

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22946.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22946
    
----
commit 63dd40f47ab8e8e9c120a9801b2f037336001ea6
Author: James Baker <jb...@...>
Date:   2018-11-05T10:35:02Z

    [SPARK-25943][SQL] Fail if mismatching nested struct fields when writing to datasource
    
    At present, Spark reorders mismatched columns when writing to
    a datasource, but does not reorder nested structs.
    
    This causes failure at present if the types do not match, but
    not if the names do not match; this causes structs to get
    silently mangled.
    
    It's not obvious to me where I should add tests, so would
    appreciate guidance on that!

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98851/
    Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Ok to test


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98505/
    Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by j-baker <gi...@git.apache.org>.
Github user j-baker commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    I have added a test and have hopefully fixed the existing ones!


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98500/testReport)** for PR 22946 at commit [`63dd40f`](https://github.com/apache/spark/commit/63dd40f47ab8e8e9c120a9801b2f037336001ea6).


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98851 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98851/testReport)** for PR 22946 at commit [`0a558c4`](https://github.com/apache/spark/commit/0a558c46b328e0db2048c5c39baf483b37d89a2a).


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98847 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98847/testReport)** for PR 22946 at commit [`b6a191a`](https://github.com/apache/spark/commit/b6a191a2c250db89f579c52229cd0044e7464284).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98499/testReport)** for PR 22946 at commit [`63dd40f`](https://github.com/apache/spark/commit/63dd40f47ab8e8e9c120a9801b2f037336001ea6).


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98500/testReport)** for PR 22946 at commit [`63dd40f`](https://github.com/apache/spark/commit/63dd40f47ab8e8e9c120a9801b2f037336001ea6).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98500/
    Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98505 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98505/testReport)** for PR 22946 at commit [`e6f56d3`](https://github.com/apache/spark/commit/e6f56d3974fd7026519d84f446cde507bfd72419).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98505 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98505/testReport)** for PR 22946 at commit [`e6f56d3`](https://github.com/apache/spark/commit/e6f56d3974fd7026519d84f446cde507bfd72419).


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by j-baker <gi...@git.apache.org>.
Github user j-baker commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Hm, it seems that this behaviour is considered a feature, and there is a test for exactly this case. This PR would additionally break the following statements:
    
    ```sql
    CREATE TABLE t5(i1 INT, t5 STRUCT<i1:INT, i2:INT>) USING parquet;
    INSERT INTO t5 VALUES(1, (2, 3));
    ```
    
    Not really sure how to proceed.
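
Why this statement would start failing can be sketched in Python under one assumption that the thread does not confirm: that the struct literal `(2, 3)` in `VALUES` carries generated field names such as `col1` and `col2` rather than `i1` and `i2`. The helper names and the list-of-pairs schema model below are hypothetical and only illustrate the comparison.

```python
def field_names(fields):
    """Return the field names of a struct modelled as (name, type) pairs."""
    return [name for name, _ in fields]


def field_types(fields):
    """Return the field types of a struct modelled as (name, type) pairs."""
    return [dtype for _, dtype in fields]


# Nested struct of column t5 in the CREATE TABLE statement.
target_struct = [("i1", "int"), ("i2", "int")]

# Assumed shape of the literal (2, 3): the generated names are an assumption.
literal_struct = [("col1", "int"), ("col2", "int")]

types_match = field_types(literal_struct) == field_types(target_struct)
names_match = field_names(literal_struct) == field_names(target_struct)

# A positional rule only needs types_match; a strict name rule also needs names_match.
print(types_match, names_match)
```

Under that assumption, a positional rule accepts the insert because the types line up, while a strict name rule rejects it even though the literal has no user-chosen names to get wrong.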


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98852 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98852/testReport)** for PR 22946 at commit [`417582d`](https://github.com/apache/spark/commit/417582d52af49d4a59e1085f205f2ddf99b900f6).


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98499/
    Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98852 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98852/testReport)** for PR 22946 at commit [`417582d`](https://github.com/apache/spark/commit/417582d52af49d4a59e1085f205f2ddf99b900f6).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98852/
    Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98851 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98851/testReport)** for PR 22946 at commit [`0a558c4`](https://github.com/apache/spark/commit/0a558c46b328e0db2048c5c39baf483b37d89a2a).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    @cloud-fan for review.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98847/
    Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Ah, good catch! Can you also add a test?


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    ok to test


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98499 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98499/testReport)** for PR 22946 at commit [`63dd40f`](https://github.com/apache/spark/commit/63dd40f47ab8e8e9c120a9801b2f037336001ea6).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22946
  
    **[Test build #98847 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98847/testReport)** for PR 22946 at commit [`b6a191a`](https://github.com/apache/spark/commit/b6a191a2c250db89f579c52229cd0044e7464284).


---
