You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by atallahhezbor <gi...@git.apache.org> on 2018/01/24 16:08:40 UTC

[GitHub] spark pull request #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are...

GitHub user atallahhezbor opened a pull request:

    https://github.com/apache/spark/pull/20385

    [SPARK-21396][SQL] Fixes MatchError when UDTs are passed through Hive Thriftserver

    Signed-off-by: Atallah Hezbor <at...@gmail.com>
    
    ## What changes were proposed in this pull request?
    
    This PR proposes modifying the match statement that gets the columns of a row in HiveThriftServer. There was previously no case for `UserDefinedType`, so querying a table that contained them would throw a match error. The changes catch that case and return the string representation.
    
    ## How was this patch tested?
    
    While I would have liked to add a unit test, I couldn't easily incorporate UDTs into the ``HiveThriftServer2Suites`` pipeline. With some guidance I would be happy to push a commit with tests.
    
    Instead I did a manual test by loading a `DataFrame` with Point UDT in a spark shell with a HiveThriftServer. Then in beeline, connecting to the server and querying that table. 
    
    Here is the result before the change
    ```
    0: jdbc:hive2://localhost:10000> select * from chicago;
    Error: scala.MatchError: org.apache.spark.sql.PointUDT@2d980dc3 (of class org.apache.spark.sql.PointUDT) (state=,code=0)
    
    ```
    
    
    And after the change:
    ```
    0: jdbc:hive2://localhost:10000> select * from chicago;
    +---------------------------------------+--------------+------------------------+---------------------+--+
    |                __fid__                | case_number  |          dtg           |        geom         |
    +---------------------------------------+--------------+------------------------+---------------------+--+
    | 109602f9-54f8-414b-8c6f-42b1a337643e  | 2            | 2016-01-01 19:00:00.0  | POINT (-77 38)      |
    | 709602f9-fcff-4429-8027-55649b6fd7ed  | 1            | 2015-12-31 19:00:00.0  | POINT (-76.5 38.5)  |
    | 009602f9-fcb5-45b1-a867-eb8ba10cab40  | 3            | 2016-01-02 19:00:00.0  | POINT (-78 39)      |
    +---------------------------------------+--------------+------------------------+---------------------+--+
    ```
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/atallahhezbor/spark udts_over_hive

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20385.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20385
    
----
commit c8fb4366cc1b7dbb3000d0d5261ed532df14961b
Author: Atallah Hezbor <at...@...>
Date:   2018-01-24T15:35:39Z

    Fixes MatchError when UDTs are passed through Hive Thriftserver
    
    Signed-off-by: Atallah Hezbor <at...@gmail.com>

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86887/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    cc @liufengdb 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    **[Test build #86597 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86597/testReport)** for PR 20385 at commit [`c8fb436`](https://github.com/apache/spark/commit/c8fb4366cc1b7dbb3000d0d5261ed532df14961b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    Jenkins test this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by liufengdb <gi...@git.apache.org>.
Github user liufengdb commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    LGTM!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86597/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are...

Posted by liufengdb <gi...@git.apache.org>.
Github user liufengdb commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20385#discussion_r163648777
  
    --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala ---
    @@ -102,6 +102,8 @@ private[hive] class SparkExecuteStatementOperation(
             to += from.getAs[Timestamp](ordinal)
           case BinaryType =>
             to += from.getAs[Array[Byte]](ordinal)
    +      case udt: UserDefinedType[_] =>
    +        to += from.get(ordinal).toString
    --- End diff --
    
    It is possible `from.get(ordinal)` returns `null`, then a null pointer exception. I think a better way to add this case is by the method `HiveUtils.toHiveString` method.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by liufengdb <gi...@git.apache.org>.
Github user liufengdb commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    Actually, one more thing, do you need to consider the UDT as one attribute of a structured type? https://github.com/apache/spark/pull/20385/files#diff-842e3447fc453de26c706db1cac8f2c4L467


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    The unit test cases are needed for `HiveUtils.toHiveString`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    @atallahhezbor Yeah! Please help us improve the test coverage. We do not have a clear way to test the functionality in `SparkExecuteStatementOperation`
    
    Adding unit test cases for `HiveUtils.toHiveString` is enough if we move the code changes to `HiveUtils.toHiveString`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by atallahhezbor <gi...@git.apache.org>.
Github user atallahhezbor commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    @liufengdb @gatorsmile I'm happy to write a unit test if you require it. Though as I mentioned before, I did not see a clear way of testing the functionality in `SparkExecuteStatementOperation`. It looks to me that a lot of those functions are not tested directly.
    
    Also @gatorsmile, I'm not sure I understand you. If I move the functionality to `HiveUtils.toHiveString`, are you saying I should write a unit test of that whole function?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    **[Test build #86887 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86887/testReport)** for PR 20385 at commit [`e05041f`](https://github.com/apache/spark/commit/e05041f5d3fcfc43686d7915eaf34ff39af96219).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    **[Test build #86597 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86597/testReport)** for PR 20385 at commit [`c8fb436`](https://github.com/apache/spark/commit/c8fb4366cc1b7dbb3000d0d5261ed532df14961b).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    **[Test build #86887 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86887/testReport)** for PR 20385 at commit [`e05041f`](https://github.com/apache/spark/commit/e05041f5d3fcfc43686d7915eaf34ff39af96219).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20385
  
    LGTM
    
    @atallahhezbor Could you submit another PR to address the comment from @liufengdb ?
    
    This fix is nice to have in Spark 2.3. Let merge this now. 
    
    Thanks! Merged to master/2.3


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20385


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org