You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jerryshao <gi...@git.apache.org> on 2018/01/18 03:40:24 UTC

[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...

GitHub user jerryshao opened a pull request:

    https://github.com/apache/spark/pull/20305

    [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Session state's planner

    ## What changes were proposed in this pull request?
    
    `DataSourceV2Strategy` is missing in `HiveSessionStateBuilder`'s planner, which will throw exception as described in [SPARK-23140](https://issues.apache.org/jira/browse/SPARK-23140).
    
    ## How was this patch tested?
    
    Manual test.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-23140

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20305.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20305
    
----
commit fc5d3ce47fad2e5fd3a2c0e5c94002189fffc462
Author: jerryshao <ss...@...>
Date:   2018-01-18T03:37:54Z

    Add DataSourceV2Strategy to Hive Session state's planner

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    **[Test build #86328 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86328/testReport)** for PR 20305 at commit [`a978dcc`](https://github.com/apache/spark/commit/a978dcc3052f0df4485594c6cd4a944d8b6dab5e).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20305#discussion_r162274134
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala ---
    @@ -98,20 +98,7 @@ class HiveSessionStateBuilder(session: SparkSession, parentState: Option[Session
           override def extraPlanningStrategies: Seq[Strategy] =
             super.extraPlanningStrategies ++ customPlanningStrategies
     
    -      override def strategies: Seq[Strategy] = {
    -        experimentalMethods.extraStrategies ++
    -          extraPlanningStrategies ++ Seq(
    -          FileSourceStrategy,
    -          DataSourceStrategy(conf),
    -          SpecialLimits,
    -          InMemoryScans,
    -          HiveTableScans,
    -          Scripts,
    -          Aggregation,
    -          JoinSelection,
    -          BasicOperators
    -        )
    -      }
    +      override def strategies: Seq[Strategy] = Seq(HiveTableScans, Scripts) ++ super.strategies
    --- End diff --
    
    This breaks the assumption that `experimentalMethods.extraStrategies` should always run first.
    
    I think we can just do:
    ```
    override def extraPlanningStrategies: Seq[Strategy] =
      super.extraPlanningStrategies ++ customPlanningStrategies ++ Seq(HiveTableScans, Scripts)
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20305#discussion_r162249755
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala ---
    @@ -101,6 +102,7 @@ class HiveSessionStateBuilder(session: SparkSession, parentState: Option[Session
           override def strategies: Seq[Strategy] = {
             experimentalMethods.extraStrategies ++
               extraPlanningStrategies ++ Seq(
    --- End diff --
    
    Sure, I will change it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    **[Test build #86311 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86311/testReport)** for PR 20305 at commit [`fc5d3ce`](https://github.com/apache/spark/commit/fc5d3ce47fad2e5fd3a2c0e5c94002189fffc462).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    **[Test build #86313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86313/testReport)** for PR 20305 at commit [`b721eb6`](https://github.com/apache/spark/commit/b721eb62baceb778636b6d243a0e4da77e89b894).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    thanks, merging to master/2.3!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20305


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20305#discussion_r162270415
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala ---
    @@ -101,6 +102,7 @@ class HiveSessionStateBuilder(session: SparkSession, parentState: Option[Session
           override def strategies: Seq[Strategy] = {
             experimentalMethods.extraStrategies ++
               extraPlanningStrategies ++ Seq(
    --- End diff --
    
    Looks like the ordering matters, If I put Hive related strategies in the end, some unit tests will be failed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    **[Test build #86311 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86311/testReport)** for PR 20305 at commit [`fc5d3ce`](https://github.com/apache/spark/commit/fc5d3ce47fad2e5fd3a2c0e5c94002189fffc462).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    **[Test build #86328 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86328/testReport)** for PR 20305 at commit [`a978dcc`](https://github.com/apache/spark/commit/a978dcc3052f0df4485594c6cd4a944d8b6dab5e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86313/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    @cloud-fan , please help to review, thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    **[Test build #86326 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86326/testReport)** for PR 20305 at commit [`094b7eb`](https://github.com/apache/spark/commit/094b7ebbaf7bfe75e706cf42565f0c077938e821).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20305#discussion_r162248162
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala ---
    @@ -101,6 +102,7 @@ class HiveSessionStateBuilder(session: SparkSession, parentState: Option[Session
           override def strategies: Seq[Strategy] = {
             experimentalMethods.extraStrategies ++
               extraPlanningStrategies ++ Seq(
    --- End diff --
    
    can we change it to `super. strategies ++ Seq(HiveTableScans, Scripts)`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86318/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hi...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20305#discussion_r162274760
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala ---
    @@ -98,20 +98,7 @@ class HiveSessionStateBuilder(session: SparkSession, parentState: Option[Session
           override def extraPlanningStrategies: Seq[Strategy] =
             super.extraPlanningStrategies ++ customPlanningStrategies
     
    -      override def strategies: Seq[Strategy] = {
    -        experimentalMethods.extraStrategies ++
    -          extraPlanningStrategies ++ Seq(
    -          FileSourceStrategy,
    -          DataSourceStrategy(conf),
    -          SpecialLimits,
    -          InMemoryScans,
    -          HiveTableScans,
    -          Scripts,
    -          Aggregation,
    -          JoinSelection,
    -          BasicOperators
    -        )
    -      }
    +      override def strategies: Seq[Strategy] = Seq(HiveTableScans, Scripts) ++ super.strategies
    --- End diff --
    
    OK, let me update it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    **[Test build #86313 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86313/testReport)** for PR 20305 at commit [`b721eb6`](https://github.com/apache/spark/commit/b721eb62baceb778636b6d243a0e4da77e89b894).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86328/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    **[Test build #86318 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86318/testReport)** for PR 20305 at commit [`f17b44d`](https://github.com/apache/spark/commit/f17b44de6e4d2ece008d3856fdcc037cce7dd147).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    thanks, good catch!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86326/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    **[Test build #86318 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86318/testReport)** for PR 20305 at commit [`f17b44d`](https://github.com/apache/spark/commit/f17b44de6e4d2ece008d3856fdcc037cce7dd147).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    **[Test build #86326 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86326/testReport)** for PR 20305 at commit [`094b7eb`](https://github.com/apache/spark/commit/094b7ebbaf7bfe75e706cf42565f0c077938e821).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20305: [SPARK-23140][SQL] Add DataSourceV2Strategy to Hive Sess...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20305
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86311/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org