Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/12/09 08:03:09 UTC

[GitHub] [spark] beliefer opened a new pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

beliefer opened a new pull request #34844:
URL: https://github.com/apache/spark/pull/34844


   ### What changes were proposed in this pull request?
   While reading the AQE implementation, I found that the code path that selects a join when a hint is present contains a lot of cumbersome code.
   
   
   ### Why are the changes needed?
   Improve performance of `JoinSelection`
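
A minimal sketch of the idea, assuming the shape of the `SparkStrategies.scala` diff quoted in the review comments of this thread: when no hint is given, skip the hint-driven `orElse` chain and go straight to the no-hint path. Every name below (`Plan`, `Hint`, `fromHint`, `select`, and the hint strings) is a hypothetical stand-in for illustration, not Spark's actual API; only `hint.isEmpty` and `createJoinWithoutHint()` echo names that appear in the diff.

```scala
// Toy model of hint-based join selection. These are NOT Spark's real classes.
object JoinSelectionSketch {
  sealed trait Plan
  case object BroadcastHashJoin extends Plan
  case object SortMergeJoin extends Plan
  case object ShuffledHashJoin extends Plan
  case object CartesianProduct extends Plan

  // Hypothetical hint: at most one strategy name requested by the user.
  final case class Hint(strategy: Option[String]) {
    def isEmpty: Boolean = strategy.isEmpty
  }

  // Returns the plan only if the hint asks for exactly this strategy.
  private def fromHint(hint: Hint, name: String, plan: Plan): Option[Plan] =
    if (hint.strategy.contains(name)) Some(plan) else None

  // Fallback when no hint applies (an arbitrary choice for this sketch).
  def createJoinWithoutHint(): Plan = SortMergeJoin

  def select(hint: Hint): Plan =
    if (hint.isEmpty) {
      // Fast path: no hint, so none of the hint checks below can match.
      createJoinWithoutHint()
    } else {
      fromHint(hint, "broadcast", BroadcastHashJoin)
        .orElse(fromHint(hint, "merge", SortMergeJoin))
        .orElse(fromHint(hint, "shuffle_hash", ShuffledHashJoin))
        .orElse(fromHint(hint, "shuffle_replicate_nl", CartesianProduct))
        .getOrElse(createJoinWithoutHint())
    }
}
```

In this toy model both branches return the same plan for an empty hint; the early return only avoids evaluating the chain of hint checks, which is presumably where the claimed saving comes from.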
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   This only changes the internal implementation.
   
   
   ### How was this patch tested?
   Jenkins test.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992492529


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50611/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992208817


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50595/
   




[GitHub] [spark] beliefer commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992375020


   retest this please




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992520175


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50611/
   




[GitHub] [spark] cloud-fan commented on a change in pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34844:
URL: https://github.com/apache/spark/pull/34844#discussion_r765795115



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala
##########
@@ -548,11 +548,9 @@ case class ApplyColumnarRulesAndInsertTransitions(
 
   def apply(plan: SparkPlan): SparkPlan = {
     var preInsertPlan: SparkPlan = plan
-    columnarRules.foreach((r : ColumnarRule) =>
-      preInsertPlan = r.preColumnarTransitions(preInsertPlan))
+    columnarRules.foreach( r => preInsertPlan = r.preColumnarTransitions(preInsertPlan))

Review comment:
       how is this related to `JoinSelection`?






[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992166601


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50595/
   




[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992205957


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50595/
   




[GitHub] [spark] cloud-fan commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-993621186


   thanks, merging to master!




[GitHub] [spark] dongjoon-hyun commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-993933528


   +1, LGTM.




[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992142522


   **[Test build #146120 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146120/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).




[GitHub] [spark] cloud-fan commented on a change in pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34844:
URL: https://github.com/apache/spark/pull/34844#discussion_r767426034



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala
##########
@@ -548,11 +548,9 @@ case class ApplyColumnarRulesAndInsertTransitions(
 
   def apply(plan: SparkPlan): SparkPlan = {
     var preInsertPlan: SparkPlan = plan
-    columnarRules.foreach((r : ColumnarRule) =>
-      preInsertPlan = r.preColumnarTransitions(preInsertPlan))
+    columnarRules.foreach( r => preInsertPlan = r.preColumnarTransitions(preInsertPlan))

Review comment:
       ```suggestion
       columnarRules.foreach(r => preInsertPlan = r.preColumnarTransitions(preInsertPlan))
   ```

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala
##########
@@ -548,11 +548,9 @@ case class ApplyColumnarRulesAndInsertTransitions(
 
   def apply(plan: SparkPlan): SparkPlan = {
     var preInsertPlan: SparkPlan = plan
-    columnarRules.foreach((r : ColumnarRule) =>
-      preInsertPlan = r.preColumnarTransitions(preInsertPlan))
+    columnarRules.foreach( r => preInsertPlan = r.preColumnarTransitions(preInsertPlan))
     var postInsertPlan = insertTransitions(preInsertPlan, outputsColumnar)
-    columnarRules.reverse.foreach((r : ColumnarRule) =>
-      postInsertPlan = r.postColumnarTransitions(postInsertPlan))
+    columnarRules.reverse.foreach( r => postInsertPlan = r.postColumnarTransitions(postInsertPlan))

Review comment:
       ```suggestion
       columnarRules.reverse.foreach(r => postInsertPlan = r.postColumnarTransitions(postInsertPlan))
   ```






[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989605791


   **[Test build #146031 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146031/testReport)** for PR 34844 at commit [`aa49f15`](https://github.com/apache/spark/commit/aa49f15b819f07508853f486b5359515f2734f5a).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992355596


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146120/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992691666


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146137/
   




[GitHub] [spark] cloud-fan commented on a change in pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34844:
URL: https://github.com/apache/spark/pull/34844#discussion_r765796956



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
##########
@@ -266,11 +266,15 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
             }
         }
 
-        createBroadcastHashJoin(true)
-          .orElse { if (hintToSortMergeJoin(hint)) createSortMergeJoin() else None }
-          .orElse(createShuffleHashJoin(true))
-          .orElse { if (hintToShuffleReplicateNL(hint)) createCartesianProduct() else None }
-          .getOrElse(createJoinWithoutHint())
+        if (hint.isEmpty) {
+          createJoinWithoutHint()

Review comment:
       Can we do this in `case logical.Join(left, right, joinType, condition, hint) ...` as well?






[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992341368


   **[Test build #146120 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146120/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).
    * This patch **fails SparkR unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992355596


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146120/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992520175


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50611/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989689729


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50507/
   




[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989637768


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50507/
   




[GitHub] [spark] beliefer commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-993650575


   @cloud-fan Thanks a lot!




[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992419038


   **[Test build #146137 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146137/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).




[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992687741


   **[Test build #146137 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146137/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989822090


   **[Test build #146031 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146031/testReport)** for PR 34844 at commit [`aa49f15`](https://github.com/apache/spark/commit/aa49f15b819f07508853f486b5359515f2734f5a).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989605791


   **[Test build #146031 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146031/testReport)** for PR 34844 at commit [`aa49f15`](https://github.com/apache/spark/commit/aa49f15b819f07508853f486b5359515f2734f5a).




[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989674597


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50507/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989689729


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50507/
   




[GitHub] [spark] cloud-fan commented on a change in pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34844:
URL: https://github.com/apache/spark/pull/34844#discussion_r765796382



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
##########
@@ -266,11 +266,15 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
             }
         }
 
-        createBroadcastHashJoin(true)
-          .orElse { if (hintToSortMergeJoin(hint)) createSortMergeJoin() else None }
-          .orElse(createShuffleHashJoin(true))
-          .orElse { if (hintToShuffleReplicateNL(hint)) createCartesianProduct() else None }
-          .getOrElse(createJoinWithoutHint())
+        if (hint.isEmpty) {
+          createJoinWithoutHint()

Review comment:
       this change LGTM






[GitHub] [spark] SparkQA removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992419038


   **[Test build #146137 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146137/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).




[GitHub] [spark] SparkQA removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992142522


   **[Test build #146120 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146120/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).




[GitHub] [spark] beliefer commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989842515


   ping @cloud-fan 




[GitHub] [spark] cloud-fan closed pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #34844:
URL: https://github.com/apache/spark/pull/34844


   




[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992691666


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146137/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992208817


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50595/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989827522


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146031/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989827522


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146031/
   




[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992455141


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50611/
   

