Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/06 18:27:17 UTC

[GitHub] [spark] c21 commented on a change in pull request #30280: [SPARK-32577][TEST] Fix wrong config for shuffled hash join in test in-joins.sql

c21 commented on a change in pull request #30280:
URL: https://github.com/apache/spark/pull/30280#discussion_r518928747



##########
File path: sql/core/src/test/resources/sql-tests/inputs/subquery/in-subquery/in-joins.sql
##########
@@ -6,8 +6,8 @@
 --  2. run with whole-stage-codegen, operator codegen or no codegen.
 
 --CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=10485760
---CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true
---CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=false
+--CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=10485760,spark.sql.join.preferSortMergeJoin=true
+--CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=10485760,spark.sql.join.preferSortMergeJoin=false

Review comment:
       @warrenzhu25 - Shuffled hash join is only selected when `spark.sql.autoBroadcastJoinThreshold` and `spark.sql.shuffle.partitions` are set to suitable values, and when one side is at least 3x smaller than the other ([code](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala#L368-L381)). I don't think the test cases here satisfy the second condition (one side 3x smaller than the other). Can you double-check the query plan? Thanks.
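
For illustration, the two conditions can be modeled roughly as follows. This is a simplified Python sketch, not Spark's actual code: the function names are hypothetical, and the size bound (`autoBroadcastJoinThreshold * shuffle.partitions`) is my reading of the linked code, so treat it as an assumption and confirm it against the source.

```python
# Simplified model of when shuffled hash join is eligible.
# All names are hypothetical; sizes are estimated plan sizes in bytes.

def can_build_local_hash_map(size_in_bytes, broadcast_threshold, shuffle_partitions):
    # Assumed size bound: each partition of the build side should fit in memory.
    return size_in_bytes < broadcast_threshold * shuffle_partitions


def much_smaller(a_size, b_size):
    # "3x smaller": side a is at most one third the size of side b.
    return a_size * 3 <= b_size


def shuffled_hash_join_eligible(left_size, right_size, broadcast_threshold,
                                shuffle_partitions, prefer_sort_merge_join):
    # With preferSortMergeJoin=true, sort merge join is chosen instead.
    if prefer_sort_merge_join:
        return False
    build_left = (can_build_local_hash_map(left_size, broadcast_threshold, shuffle_partitions)
                  and much_smaller(left_size, right_size))
    build_right = (can_build_local_hash_map(right_size, broadcast_threshold, shuffle_partitions)
                   and much_smaller(right_size, left_size))
    return build_left or build_right
```

Under this model, a threshold of `-1` makes the size bound negative, so no side ever qualifies regardless of `spark.sql.join.preferSortMergeJoin`. Even with a positive threshold, two sides of similar size fail the 3x check, which is why the plan needs to be verified.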



