You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by atallahhezbor <gi...@git.apache.org> on 2018/01/24 16:08:40 UTC
[GitHub] spark pull request #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are...
GitHub user atallahhezbor opened a pull request:
https://github.com/apache/spark/pull/20385
[SPARK-21396][SQL] Fixes MatchError when UDTs are passed through Hive Thriftserver
Signed-off-by: Atallah Hezbor <at...@gmail.com>
## What changes were proposed in this pull request?
This PR proposes modifying the match statement that gets the columns of a row in HiveThriftServer. There was previously no case for `UserDefinedType`, so querying a table that contained them would throw a match error. The changes catch that case and return the string representation.
## How was this patch tested?
While I would have liked to add a unit test, I couldn't easily incorporate UDTs into the ``HiveThriftServer2Suites`` pipeline. With some guidance I would be happy to push a commit with tests.
Instead I did a manual test by loading a `DataFrame` with Point UDT in a spark shell with a HiveThriftServer. Then in beeline, connecting to the server and querying that table.
Here is the result before the change
```
0: jdbc:hive2://localhost:10000> select * from chicago;
Error: scala.MatchError: org.apache.spark.sql.PointUDT@2d980dc3 (of class org.apache.spark.sql.PointUDT) (state=,code=0)
```
And after the change:
```
0: jdbc:hive2://localhost:10000> select * from chicago;
+---------------------------------------+--------------+------------------------+---------------------+--+
| __fid__ | case_number | dtg | geom |
+---------------------------------------+--------------+------------------------+---------------------+--+
| 109602f9-54f8-414b-8c6f-42b1a337643e | 2 | 2016-01-01 19:00:00.0 | POINT (-77 38) |
| 709602f9-fcff-4429-8027-55649b6fd7ed | 1 | 2015-12-31 19:00:00.0 | POINT (-76.5 38.5) |
| 009602f9-fcb5-45b1-a867-eb8ba10cab40 | 3 | 2016-01-02 19:00:00.0 | POINT (-78 39) |
+---------------------------------------+--------------+------------------------+---------------------+--+
```
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/atallahhezbor/spark udts_over_hive
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20385.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20385
----
commit c8fb4366cc1b7dbb3000d0d5261ed532df14961b
Author: Atallah Hezbor <at...@...>
Date: 2018-01-24T15:35:39Z
Fixes MatchError when UDTs are passed through Hive Thriftserver
Signed-off-by: Atallah Hezbor <at...@gmail.com>
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20385
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86887/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20385
cc @liufengdb
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20385
ok to test
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20385
**[Test build #86597 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86597/testReport)** for PR 20385 at commit [`c8fb436`](https://github.com/apache/spark/commit/c8fb4366cc1b7dbb3000d0d5261ed532df14961b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/20385
Jenkins test this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20385
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by liufengdb <gi...@git.apache.org>.
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20385
LGTM!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20385
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20385
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86597/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are...
Posted by liufengdb <gi...@git.apache.org>.
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20385#discussion_r163648777
--- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala ---
@@ -102,6 +102,8 @@ private[hive] class SparkExecuteStatementOperation(
to += from.getAs[Timestamp](ordinal)
case BinaryType =>
to += from.getAs[Array[Byte]](ordinal)
+ case udt: UserDefinedType[_] =>
+ to += from.get(ordinal).toString
--- End diff --
It is possible `from.get(ordinal)` returns `null`, then a null pointer exception. I think a better way to add this case is by the method `HiveUtils.toHiveString` method.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by liufengdb <gi...@git.apache.org>.
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20385
Actually, one more thing, do you need to consider the UDT as one attribute of a structured type? https://github.com/apache/spark/pull/20385/files#diff-842e3447fc453de26c706db1cac8f2c4L467
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20385
The unit test cases are needed for `HiveUtils.toHiveString`.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20385
@atallahhezbor Yeah! Please help us improve the test coverage. We do not have a clear way to test the functionality in `SparkExecuteStatementOperation`
Adding unit test cases for `HiveUtils.toHiveString` is enough if we move the code changes to `HiveUtils.toHiveString`
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20385
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by atallahhezbor <gi...@git.apache.org>.
Github user atallahhezbor commented on the issue:
https://github.com/apache/spark/pull/20385
@liufengdb @gatorsmile I'm happy to write a unit test if you require it. Though as I mentioned before, I did not see a clear way of testing the functionality in `SparkExecuteStatementOperation`. It looks to me that a lot of those functions are not tested directly.
Also @gatorsmile, I'm not sure I understand you. If I move the functionality to `HiveUtils.toHiveString`, are you saying I should write a unit test of that whole function?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20385
**[Test build #86887 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86887/testReport)** for PR 20385 at commit [`e05041f`](https://github.com/apache/spark/commit/e05041f5d3fcfc43686d7915eaf34ff39af96219).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20385
**[Test build #86597 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86597/testReport)** for PR 20385 at commit [`c8fb436`](https://github.com/apache/spark/commit/c8fb4366cc1b7dbb3000d0d5261ed532df14961b).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20385
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20385
**[Test build #86887 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86887/testReport)** for PR 20385 at commit [`e05041f`](https://github.com/apache/spark/commit/e05041f5d3fcfc43686d7915eaf34ff39af96219).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are passed...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20385
LGTM
@atallahhezbor Could you submit another PR to address the comment from @liufengdb ?
This fix is nice to have in Spark 2.3. Let merge this now.
Thanks! Merged to master/2.3
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20385: [SPARK-21396][SQL] Fixes MatchError when UDTs are...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/20385
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org