Posted to reviews@spark.apache.org by MaxGekk <gi...@git.apache.org> on 2018/05/23 12:31:32 UTC

[GitHub] spark pull request #21410: [SPARK-24366][SQL] Improving of error messages fo...

GitHub user MaxGekk opened a pull request:

    https://github.com/apache/spark/pull/21410

    [SPARK-24366][SQL] Improving of error messages for type converting

    ## What changes were proposed in this pull request?
    
    Currently, users get the following error message when a type conversion fails:
    
    ```
    scala.MatchError: test (of class java.lang.String)
    ```
    
    The message gives users no clue about where in the schema the error occurred. This PR improves the error message to something like:
    
    ```
    The value (test) of the type (java.lang.String) cannot be converted to struct<f1:int>
    ```
    
    ## How was this patch tested?
    
    Added tests for converting invalid values to `struct`, `map`, `array`, `string` and `decimal`.
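
A minimal, self-contained sketch of the behavior change described above. It uses toy stand-ins, not Spark's classes: `DataType`, `IntType`, `StructType`, `Converters`, `toStructOld` and `toStruct` are hypothetical names chosen for illustration. It shows how a non-exhaustive match yields the bare `scala.MatchError`, and how a catch-all case can report the value, its runtime class and the target type instead.

```scala
// Toy stand-ins for Catalyst data types (assumption: NOT Spark's real classes).
sealed trait DataType { def catalogString: String }
case object IntType extends DataType { val catalogString = "int" }
final case class StructType(fields: Seq[(String, DataType)]) extends DataType {
  def catalogString: String =
    fields.map { case (n, t) => s"$n:${t.catalogString}" }.mkString("struct<", ",", ">")
}

object Converters {
  // Before: the match is not exhaustive, so any unexpected input
  // surfaces as a scala.MatchError with no hint about the target type.
  def toStructOld(value: Any): Seq[Any] = value match {
    case s: Seq[_] => s
  }

  // After: a catch-all case names the value, its runtime class and the target type.
  // Null handling is left out of this sketch; a null input would fail on `toString`.
  def toStruct(value: Any, dataType: StructType): Seq[Any] = value match {
    case s: Seq[_] => s
    case other => throw new IllegalArgumentException(
      s"The value (${other.toString}) of the type (${other.getClass.getCanonicalName}) " +
        s"cannot be converted to ${dataType.catalogString}")
  }
}
```

With these definitions, `Converters.toStructOld("test")` fails with a `scala.MatchError`, whereas `Converters.toStruct("test", StructType(Seq("f1" -> IntType)))` fails with an `IllegalArgumentException` carrying the improved message quoted in the description.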

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 type-conv-error

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21410.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21410
    
----
commit 26cc2f84ee6324db23936e20816d240031211311
Author: Maxim Gekk <ma...@...>
Date:   2018-05-23T12:22:33Z

    Improving of error messages for type conversions

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91151/


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    @gatorsmile Could you take a look at the PR, please? The changes should help us troubleshoot customers' issues.


---



[GitHub] spark pull request #21410: [SPARK-24366][SQL] Improving of error messages fo...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21410


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    **[Test build #91151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91151/testReport)** for PR 21410 at commit [`ac76544`](https://github.com/apache/spark/commit/ac7654415a3ec82c6cf3306e664cf09018c66db6).


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91062/


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    > Is there a way to identify where in the schema the issue is occurring?
    
    We can catch the exceptions at each level of the schema tree traversal and show the sub-tree in each catch. For example, for `array<map<..., array<struct<f2:int>>>>`, the first exception will point to `struct<f2:int>`, the second to `array<struct<f2:int>>`, and so on up to the "root" schema.
    
    > e.g., a.b.c where this is happening, is required to easily isolate the issue in the input data and resolve it.
    
    I guess in the case of arrays and maps, you want to see indexes and keys. Could you provide a concrete example with values and a schema (array, struct, map), and describe what kind of info the error should contain?
    
    Just in case, I would propose making such improvements in a separate PR.
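
One possible reading of the idea discussed in this comment, sketched under stated assumptions: each level of the traversal catches a failure from the level below and rethrows it with its own path segment prepended, so the final message carries a breadcrumb such as `a[1].c`. This is not what the PR implements. All names here (`DataType`, `ArrayType`, `StructType`, `ConversionException`, `PathAwareConverter`) are hypothetical toy stand-ins, not Spark's classes, and structs are modeled as `Map[String, Any]` purely for brevity.

```scala
// Toy stand-ins for Catalyst data types (assumption: NOT Spark's real classes).
sealed trait DataType { def catalogString: String }
case object IntType extends DataType { val catalogString = "int" }
final case class ArrayType(element: DataType) extends DataType {
  def catalogString: String = s"array<${element.catalogString}>"
}
final case class StructType(fields: Seq[(String, DataType)]) extends DataType {
  def catalogString: String =
    fields.map { case (n, t) => s"$n:${t.catalogString}" }.mkString("struct<", ",", ">")
}

object ConversionException {
  // Render path segments as a breadcrumb: field names joined by dots,
  // index segments such as "[1]" attached directly to the previous segment.
  def render(path: List[String], detail: String): String = {
    val crumb = path.foldLeft("") { (acc, s) =>
      if (acc.isEmpty || s.startsWith("[")) acc + s else acc + "." + s
    }
    if (crumb.isEmpty) detail else s"$detail at $crumb"
  }
}

// Hypothetical exception type that carries the path collected so far.
final class ConversionException(val path: List[String], val detail: String)
  extends IllegalArgumentException(ConversionException.render(path, detail))

object PathAwareConverter {
  // Evaluate `body`; if a nested conversion fails, rethrow with `segment` prepended.
  private def at[A](segment: String)(body: => A): A =
    try body catch {
      case e: ConversionException => throw new ConversionException(segment :: e.path, e.detail)
    }

  def convert(value: Any, dataType: DataType): Any = (dataType, value) match {
    case (_, null) => null
    case (IntType, i: Int) => i
    case (ArrayType(elem), xs: Seq[_]) =>
      xs.zipWithIndex.map { case (x, i) => at(s"[$i]")(convert(x, elem)) }
    case (StructType(fields), m: Map[_, _]) =>
      val values = m.asInstanceOf[Map[String, Any]]
      fields.map { case (name, t) =>
        name -> at(name)(convert(values.getOrElse(name, null), t))
      }.toMap
    case (t, other) => throw new ConversionException(Nil,
      s"The value (${other.toString}) of the type (${other.getClass.getCanonicalName}) " +
        s"cannot be converted to ${t.catalogString}")
  }
}
```

For a schema `struct<a:array<struct<c:int>>>` and the value `Map("a" -> Seq(Map("c" -> 1), Map("c" -> "test")))`, this sketch reports the failing value together with the location `a[1].c`. Map keys are not handled; supporting them would need an additional case and a key segment in the path.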


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91034/


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    **[Test build #91034 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91034/testReport)** for PR 21410 at commit [`26cc2f8`](https://github.com/apache/spark/commit/26cc2f84ee6324db23936e20816d240031211311).


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    Thanks! Merged to master.


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    **[Test build #91034 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91034/testReport)** for PR 21410 at commit [`26cc2f8`](https://github.com/apache/spark/commit/26cc2f84ee6324db23936e20816d240031211311).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    **[Test build #91062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91062/testReport)** for PR 21410 at commit [`26cc2f8`](https://github.com/apache/spark/commit/26cc2f84ee6324db23936e20816d240031211311).


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    **[Test build #91062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91062/testReport)** for PR 21410 at commit [`26cc2f8`](https://github.com/apache/spark/commit/26cc2f84ee6324db23936e20816d240031211311).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21410: [SPARK-24366][SQL] Improving of error messages fo...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21410#discussion_r190823171
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala ---
    @@ -309,6 +322,9 @@ object CatalystTypeConverters {
             case d: JavaBigDecimal => Decimal(d)
             case d: JavaBigInteger => Decimal(d)
             case d: Decimal => d
    +        case other => throw new IllegalArgumentException(
    +          s"The value (${other.toString}) of the type (${other.getClass.getCanonicalName}) "
    +            + s"cannot be converted to ${dataType.simpleString}")
    --- End diff --
    
    All `simpleString`s have been replaced by `catalogString`s.


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    **[Test build #91151 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91151/testReport)** for PR 21410 at commit [`ac76544`](https://github.com/apache/spark/commit/ac7654415a3ec82c6cf3306e664cf09018c66db6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21410: [SPARK-24366][SQL] Improving of error messages fo...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21410#discussion_r190790254
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala ---
    @@ -309,6 +322,9 @@ object CatalystTypeConverters {
             case d: JavaBigDecimal => Decimal(d)
             case d: JavaBigInteger => Decimal(d)
             case d: Decimal => d
    +        case other => throw new IllegalArgumentException(
    +          s"The value (${other.toString}) of the type (${other.getClass.getCanonicalName}) "
    +            + s"cannot be converted to ${dataType.simpleString}")
    --- End diff --
    
    Let us use `catalogString` here?


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    jenkins, retest this, please


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by ssimeonov <gi...@git.apache.org>.
Github user ssimeonov commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    This is an excellent start and a worthy improvement. 
    
    Is there a way to identify where in the schema the issue is occurring? For example, when you have a schema with many nested fields, the failing value is helpful but the breadcrumb trail, e.g., `a.b.c` where this is happening, is required to easily isolate the issue in the input data and resolve it. 


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    LGTM except for one minor comment.


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #21410: [SPARK-24366][SQL] Improving of error messages for type ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21410
  
    Merged build finished. Test FAILed.


---
