You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by maryannxue <gi...@git.apache.org> on 2018/11/14 19:07:34 UTC

[GitHub] spark pull request #23036: [SPARK-26065][SQL] Change query hint from a `Logi...

GitHub user maryannxue opened a pull request:

    https://github.com/apache/spark/pull/23036

    [SPARK-26065][SQL] Change query hint from a `LogicalPlan` to a field

    ## What changes were proposed in this pull request?
    
    The existing query hint implementation relies on a logical plan node `ResolvedHint` to store query hints in logical plans, and on `Statistics` in physical plans. Since `ResolvedHint` is not really a logical operator and can break the pattern matching for existing and future optimization rules, it is a issue to the Optimizer as the old `AnalysisBarrier` was to the Analyzer.
    
    Given the fact that all our query hints are either 1) a join hint, i.e., broadcast hint; or 2) a re-partition hint, which is indeed an operator, we only need to add a hint field on the Join plan and that will be a good enough solution for the current hint usage.
    
    This PR is to let `Join` node have a hint for its left sub-tree and another hint for its right sub-tree and each hint is a merged result of all the effective hints specified in the corresponding sub-tree. The "effectiveness" of a hint, i.e., whether that hint should be propagated to the `Join` node, is currently consistent with the hint propagation rules originally implemented in the `Statistics` approach. Note that the `ResolvedHint` node still has to live through the analysis stage because of the `Dataset` interface, but it will be got rid of and moved to the `Join` node in the "pre-optimization" stage.
    
    ## How was this patch tested?
    
    Added a `JoinHintSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maryannxue/spark query-hint

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23036.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23036
    
----
commit fce106da44310eea8ede8a74634abca002015d4f
Author: maryannxue <ma...@...>
Date:   2018-11-14T17:02:17Z

    [SPARK-26065][SQL] Change query hint from a LogicalPlan to a field

commit 785a4235a5026a777819faea06066dbe041bf819
Author: maryannxue <ma...@...>
Date:   2018-11-14T18:55:02Z

    [SPARK-26065][SQL] Change query hint from a LogicalPlan to a field

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23036: [SPARK-26065][SQL] Change query hint from a `LogicalPlan...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23036
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5031/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23036: [SPARK-26065][SQL] Change query hint from a `LogicalPlan...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23036
  
    **[Test build #98836 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98836/testReport)** for PR 23036 at commit [`785a423`](https://github.com/apache/spark/commit/785a4235a5026a777819faea06066dbe041bf819).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23036: [SPARK-26065][SQL] Change query hint from a `LogicalPlan...

Posted by maryannxue <gi...@git.apache.org>.
Github user maryannxue commented on the issue:

    https://github.com/apache/spark/pull/23036
  
    @gatorsmile @cloud-fan @rxin @juliuszsompolski 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23036: [SPARK-26065][SQL] Change query hint from a `LogicalPlan...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23036
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98836/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23036: [SPARK-26065][SQL] Change query hint from a `LogicalPlan...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23036
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23036: [SPARK-26065][SQL] Change query hint from a `LogicalPlan...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23036
  
    **[Test build #98836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98836/testReport)** for PR 23036 at commit [`785a423`](https://github.com/apache/spark/commit/785a4235a5026a777819faea06066dbe041bf819).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23036: [SPARK-26065][SQL] Change query hint from a `LogicalPlan...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23036
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org