Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/12/09 08:03:09 UTC
[GitHub] [spark] beliefer opened a new pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
beliefer opened a new pull request #34844:
URL: https://github.com/apache/spark/pull/34844
### What changes were proposed in this pull request?
While reading the AQE implementation, I found that the code path that selects a join when a hint exists contains a lot of cumbersome code.
### Why are the changes needed?
Improve performance of `JoinSelection`
### Does this PR introduce _any_ user-facing change?
'No'.
This only changes the internal implementation.
### How was this patch tested?
Jenkins test.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992492529
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50611/
[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992208817
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50595/
[GitHub] [spark] beliefer commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992375020
retest this please
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992520175
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50611/
[GitHub] [spark] cloud-fan commented on a change in pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34844:
URL: https://github.com/apache/spark/pull/34844#discussion_r765795115
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala
##########
@@ -548,11 +548,9 @@ case class ApplyColumnarRulesAndInsertTransitions(
def apply(plan: SparkPlan): SparkPlan = {
var preInsertPlan: SparkPlan = plan
- columnarRules.foreach((r : ColumnarRule) =>
- preInsertPlan = r.preColumnarTransitions(preInsertPlan))
+ columnarRules.foreach( r => preInsertPlan = r.preColumnarTransitions(preInsertPlan))
Review comment:
how is this related to `JoinSelection`?
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992166601
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50595/
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992205957
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50595/
[GitHub] [spark] cloud-fan commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-993621186
thanks, merging to master!
[GitHub] [spark] dongjoon-hyun commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-993933528
+1, LGTM.
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992142522
**[Test build #146120 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146120/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).
[GitHub] [spark] cloud-fan commented on a change in pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34844:
URL: https://github.com/apache/spark/pull/34844#discussion_r767426034
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala
##########
@@ -548,11 +548,9 @@ case class ApplyColumnarRulesAndInsertTransitions(
def apply(plan: SparkPlan): SparkPlan = {
var preInsertPlan: SparkPlan = plan
- columnarRules.foreach((r : ColumnarRule) =>
- preInsertPlan = r.preColumnarTransitions(preInsertPlan))
+ columnarRules.foreach( r => preInsertPlan = r.preColumnarTransitions(preInsertPlan))
Review comment:
```suggestion
columnarRules.foreach(r => preInsertPlan = r.preColumnarTransitions(preInsertPlan))
```
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala
##########
@@ -548,11 +548,9 @@ case class ApplyColumnarRulesAndInsertTransitions(
def apply(plan: SparkPlan): SparkPlan = {
var preInsertPlan: SparkPlan = plan
- columnarRules.foreach((r : ColumnarRule) =>
- preInsertPlan = r.preColumnarTransitions(preInsertPlan))
+ columnarRules.foreach( r => preInsertPlan = r.preColumnarTransitions(preInsertPlan))
var postInsertPlan = insertTransitions(preInsertPlan, outputsColumnar)
- columnarRules.reverse.foreach((r : ColumnarRule) =>
- postInsertPlan = r.postColumnarTransitions(postInsertPlan))
+ columnarRules.reverse.foreach( r => postInsertPlan = r.postColumnarTransitions(postInsertPlan))
Review comment:
```suggestion
columnarRules.reverse.foreach(r => postInsertPlan = r.postColumnarTransitions(postInsertPlan))
```
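The two loops in the hunk above apply the pre-transition rules in order and the post-transition rules in reverse order. The following is a minimal sketch of that pattern, not the Spark code itself: `Rule`, `pre`, `post`, and `applyRules` are hypothetical stand-ins, plans are modelled as strings, and the `insertTransitions` step between the two loops is left out. It illustrates that the lambda parameter type is inferred from the collection, so dropping the explicit `(r : ColumnarRule)` annotation should not change behavior.

```scala
object ColumnarRulesSketch {
  // Hypothetical stand-in for a columnar rule; plans are modelled as Strings.
  final case class Rule(name: String) {
    def pre(plan: String): String = s"$name.pre($plan)"
    def post(plan: String): String = s"$name.post($plan)"
  }

  def applyRules(rules: Seq[Rule], plan: String): String = {
    var prePlan = plan
    // The type of `r` is inferred from Seq[Rule]; no annotation is needed.
    rules.foreach(r => prePlan = r.pre(prePlan))
    var postPlan = prePlan
    // Post rules run in reverse registration order.
    rules.reverse.foreach(r => postPlan = r.post(postPlan))
    postPlan
  }
}
```

Calling `ColumnarRulesSketch.applyRules(Seq(Rule("a"), Rule("b")), "p")` wraps the plan with `a.pre`, then `b.pre`, then `b.post`, then `a.post`.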
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989605791
**[Test build #146031 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146031/testReport)** for PR 34844 at commit [`aa49f15`](https://github.com/apache/spark/commit/aa49f15b819f07508853f486b5359515f2734f5a).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992355596
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146120/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992691666
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146137/
[GitHub] [spark] cloud-fan commented on a change in pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34844:
URL: https://github.com/apache/spark/pull/34844#discussion_r765796956
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
##########
@@ -266,11 +266,15 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
}
}
- createBroadcastHashJoin(true)
- .orElse { if (hintToSortMergeJoin(hint)) createSortMergeJoin() else None }
- .orElse(createShuffleHashJoin(true))
- .orElse { if (hintToShuffleReplicateNL(hint)) createCartesianProduct() else None }
- .getOrElse(createJoinWithoutHint())
+ if (hint.isEmpty) {
+ createJoinWithoutHint()
Review comment:
Can we do this in `case logical.Join(left, right, joinType, condition, hint) ...` as well?
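The idea behind the hunk above can be sketched as follows. This is an illustration under stated assumptions, not the actual `SparkStrategies` code: hints are modelled as a `Set[String]`, plans as strings, and `tryJoin`, `selectOld`, `selectNew`, and the `attempts` counter are hypothetical names. The sketch also assumes the non-empty branch keeps the original `orElse` chain, which the quoted diff does not show.

```scala
object JoinSelectionSketch {
  type Plan = String

  // Counts how many hint-specific attempts were evaluated (illustration only).
  var attempts = 0

  // Hypothetical helper: yields a plan only if the matching hint is present.
  private def tryJoin(name: String, hints: Set[String]): Option[Plan] = {
    attempts += 1
    if (hints.contains(name)) Some(name) else None
  }

  private def joinWithoutHint(): Plan = "default"

  // Shape of the removed code: every hint-specific branch is tried first,
  // even when no hint was given.
  def selectOld(hints: Set[String]): Plan =
    tryJoin("broadcast", hints)
      .orElse(tryJoin("merge", hints))
      .orElse(tryJoin("shuffle_hash", hints))
      .orElse(tryJoin("shuffle_replicate_nl", hints))
      .getOrElse(joinWithoutHint())

  // Shape of the added code: an empty hint goes straight to the default path.
  def selectNew(hints: Set[String]): Plan =
    if (hints.isEmpty) joinWithoutHint() else selectOld(hints)
}
```

With an empty hint set, `selectOld` evaluates all four hint-specific attempts before falling back, while `selectNew` evaluates none; both return the same plan.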
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992341368
**[Test build #146120 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146120/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992355596
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146120/
[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992520175
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50611/
[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989689729
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50507/
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989637768
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50507/
[GitHub] [spark] beliefer commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-993650575
@cloud-fan Thanks a lot!
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992419038
**[Test build #146137 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146137/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992687741
**[Test build #146137 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146137/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989822090
**[Test build #146031 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146031/testReport)** for PR 34844 at commit [`aa49f15`](https://github.com/apache/spark/commit/aa49f15b819f07508853f486b5359515f2734f5a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989605791
**[Test build #146031 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146031/testReport)** for PR 34844 at commit [`aa49f15`](https://github.com/apache/spark/commit/aa49f15b819f07508853f486b5359515f2734f5a).
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989674597
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50507/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989689729
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50507/
[GitHub] [spark] cloud-fan commented on a change in pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34844:
URL: https://github.com/apache/spark/pull/34844#discussion_r765796382
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
##########
@@ -266,11 +266,15 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
}
}
- createBroadcastHashJoin(true)
- .orElse { if (hintToSortMergeJoin(hint)) createSortMergeJoin() else None }
- .orElse(createShuffleHashJoin(true))
- .orElse { if (hintToShuffleReplicateNL(hint)) createCartesianProduct() else None }
- .getOrElse(createJoinWithoutHint())
+ if (hint.isEmpty) {
+ createJoinWithoutHint()
Review comment:
this change LGTM
[GitHub] [spark] SparkQA removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992419038
**[Test build #146137 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146137/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).
[GitHub] [spark] SparkQA removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992142522
**[Test build #146120 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146120/testReport)** for PR 34844 at commit [`3ce77ee`](https://github.com/apache/spark/commit/3ce77ee19851d3b721d849314effa826cf6251d8).
[GitHub] [spark] beliefer commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989842515
ping @cloud-fan
[GitHub] [spark] cloud-fan closed pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #34844:
URL: https://github.com/apache/spark/pull/34844
[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992691666
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146137/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992208817
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50595/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989827522
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146031/
[GitHub] [spark] AmplabJenkins commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-989827522
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146031/
[GitHub] [spark] SparkQA commented on pull request #34844: [SPARK-37592][SQL] Improve performance of `JoinSelection`
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34844:
URL: https://github.com/apache/spark/pull/34844#issuecomment-992455141
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50611/