You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jerryshao <gi...@git.apache.org> on 2018/01/18 03:40:24 UTC
[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/20305
[SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Session state's planner
## What changes were proposed in this pull request?
`DataSourceV2Strategy` is missing in `HiveSessionStateBuilder`'s planner, which will throw exception as described in [SPARK-23140](https://issues.apache.org/jira/browse/SPARK-23140).
## How was this patch tested?
Manual test.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jerryshao/apache-spark SPARK-23140
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20305.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20305
----
commit fc5d3ce47fad2e5fd3a2c0e5c94002189fffc462
Author: jerryshao <ss...@...>
Date: 2018-01-18T03:37:54Z
Add DataSourceV2Strategy to Hive Session state's planner
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20305
**[Test build #86328 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86328/testReport)** for PR 20305 at commit [`a978dcc`](https://github.com/apache/spark/commit/a978dcc3052f0df4485594c6cd4a944d8b6dab5e).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/20305#discussion_r162274134
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala ---
@@ -98,20 +98,7 @@ class HiveSessionStateBuilder(session: SparkSession, parentState: Option[Session
override def extraPlanningStrategies: Seq[Strategy] =
super.extraPlanningStrategies ++ customPlanningStrategies
- override def strategies: Seq[Strategy] = {
- experimentalMethods.extraStrategies ++
- extraPlanningStrategies ++ Seq(
- FileSourceStrategy,
- DataSourceStrategy(conf),
- SpecialLimits,
- InMemoryScans,
- HiveTableScans,
- Scripts,
- Aggregation,
- JoinSelection,
- BasicOperators
- )
- }
+ override def strategies: Seq[Strategy] = Seq(HiveTableScans, Scripts) ++ super.strategies
--- End diff --
This breaks the assumption that `experimentalMethods.extraStrategies` should always run first.
I think we can just do:
```
override def extraPlanningStrategies: Seq[Strategy] =
super.extraPlanningStrategies ++ customPlanningStrategies ++ Seq(HiveTableScans, Scripts)
```
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...
Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20305#discussion_r162249755
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala ---
@@ -101,6 +102,7 @@ class HiveSessionStateBuilder(session: SparkSession, parentState: Option[Session
override def strategies: Seq[Strategy] = {
experimentalMethods.extraStrategies ++
extraPlanningStrategies ++ Seq(
--- End diff --
Sure, I will change it.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20305
LGTM
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20305
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20305
**[Test build #86311 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86311/testReport)** for PR 20305 at commit [`fc5d3ce`](https://github.com/apache/spark/commit/fc5d3ce47fad2e5fd3a2c0e5c94002189fffc462).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20305
**[Test build #86313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86313/testReport)** for PR 20305 at commit [`b721eb6`](https://github.com/apache/spark/commit/b721eb62baceb778636b6d243a0e4da77e89b894).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20305
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20305
thanks, merging to master/2.3!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/20305
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20305
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...
Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20305#discussion_r162270415
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala ---
@@ -101,6 +102,7 @@ class HiveSessionStateBuilder(session: SparkSession, parentState: Option[Session
override def strategies: Seq[Strategy] = {
experimentalMethods.extraStrategies ++
extraPlanningStrategies ++ Seq(
--- End diff --
Looks like the ordering matters, If I put Hive related strategies in the end, some unit tests will be failed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20305
**[Test build #86311 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86311/testReport)** for PR 20305 at commit [`fc5d3ce`](https://github.com/apache/spark/commit/fc5d3ce47fad2e5fd3a2c0e5c94002189fffc462).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20305
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20305
**[Test build #86328 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86328/testReport)** for PR 20305 at commit [`a978dcc`](https://github.com/apache/spark/commit/a978dcc3052f0df4485594c6cd4a944d8b6dab5e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20305
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86313/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/20305
@cloud-fan , please help to review, thanks!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20305
**[Test build #86326 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86326/testReport)** for PR 20305 at commit [`094b7eb`](https://github.com/apache/spark/commit/094b7ebbaf7bfe75e706cf42565f0c077938e821).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/20305#discussion_r162248162
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala ---
@@ -101,6 +102,7 @@ class HiveSessionStateBuilder(session: SparkSession, parentState: Option[Session
override def strategies: Seq[Strategy] = {
experimentalMethods.extraStrategies ++
extraPlanningStrategies ++ Seq(
--- End diff --
can we change it to `super. strategies ++ Seq(HiveTableScans, Scripts)`?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20305
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86318/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...
Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20305#discussion_r162274760
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala ---
@@ -98,20 +98,7 @@ class HiveSessionStateBuilder(session: SparkSession, parentState: Option[Session
override def extraPlanningStrategies: Seq[Strategy] =
super.extraPlanningStrategies ++ customPlanningStrategies
- override def strategies: Seq[Strategy] = {
- experimentalMethods.extraStrategies ++
- extraPlanningStrategies ++ Seq(
- FileSourceStrategy,
- DataSourceStrategy(conf),
- SpecialLimits,
- InMemoryScans,
- HiveTableScans,
- Scripts,
- Aggregation,
- JoinSelection,
- BasicOperators
- )
- }
+ override def strategies: Seq[Strategy] = Seq(HiveTableScans, Scripts) ++ super.strategies
--- End diff --
OK, let me update it.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20305
**[Test build #86313 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86313/testReport)** for PR 20305 at commit [`b721eb6`](https://github.com/apache/spark/commit/b721eb62baceb778636b6d243a0e4da77e89b894).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20305
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86328/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20305
**[Test build #86318 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86318/testReport)** for PR 20305 at commit [`f17b44d`](https://github.com/apache/spark/commit/f17b44de6e4d2ece008d3856fdcc037cce7dd147).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20305
thanks, good catch!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20305
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86326/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20305
**[Test build #86318 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86318/testReport)** for PR 20305 at commit [`f17b44d`](https://github.com/apache/spark/commit/f17b44de6e4d2ece008d3856fdcc037cce7dd147).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20305
**[Test build #86326 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86326/testReport)** for PR 20305 at commit [`094b7eb`](https://github.com/apache/spark/commit/094b7ebbaf7bfe75e706cf42565f0c077938e821).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20305
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20305
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86311/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org