
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/07/29 07:54:25 UTC

[GitHub] [spark] wangshisan opened a new pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

wangshisan opened a new pull request #29266:
URL: https://github.com/apache/spark/pull/29266


   ### What changes were proposed in this pull request?
   Support skew handling on a join that has one side with no query stage.
   
   
   ### Why are the changes needed?
   In our production environment, there are many bucketed tables that are joined with other tables, and some of these joins are skewed from time to time. However, in the current implementation, skew join handling can only be applied when both sides of a SMJ are QueryStages, so it cannot deal with such cases.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   Unit tests.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] wangshisan commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

wangshisan commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-668926319

> Yea I'm also wondering the approach here. The skew join handling needs to split the skew side, and repeat the other side. I don't think we can split the buckets of bucketed table, and I'm not sure how we are going to read buckets repeatedly from a bucketed table.

Yeah, that's right, we cannot split the bucket table side. But we can duplicate the bucket side by leveraging the RDD mechanism: the child RDD reads some parent partitions more than once.
For instance, say we have an RDD A with partitions (0, 1, 2, 3), and we need to duplicate the second partition (partition 1). We can create a new RDD, B for example, with partitions (0, 1, 2, 3, 4), and guarantee the mapping relationship:
- RDD B partition 0 <- RDD A partition 0
- RDD B partition 1 <- RDD A partition 1
- RDD B partition 2 <- RDD A partition 1
- RDD B partition 3 <- RDD A partition 2
- RDD B partition 4 <- RDD A partition 3

And this is what the new class RecombinationedRDD is designed for.
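
The mapping listed above can be sketched in plain Scala, without any Spark dependency. This is only an illustration of how the child-to-parent index list could be derived; `PartitionMapping`, `childToParent` and the `copies` parameter are hypothetical names and are not taken from the PR.

```scala
// Hypothetical helper, not part of the PR: derives the child-to-parent
// partition mapping when some parent partitions are read more than once.
object PartitionMapping {

  /**
   * @param numParentPartitions number of partitions in the parent RDD
   * @param copies              parent partition index -> number of child
   *                            partitions that read it (defaults to 1)
   * @return a vector whose position is the child partition index and whose
   *         value is the parent partition index that child reads
   */
  def childToParent(numParentPartitions: Int, copies: Map[Int, Int]): Vector[Int] =
    (0 until numParentPartitions).toVector.flatMap { parent =>
      Vector.fill(copies.getOrElse(parent, 1))(parent)
    }
}
```

For the example above, `PartitionMapping.childToParent(4, Map(1 -> 2))` yields the parent indexes 0, 1, 1, 2, 3 for child partitions 0 to 4. In an RDD, a list like this is presumably what a narrow dependency would return from its parent lookup for each child partition.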


[GitHub] [spark] wangshisan commented on a change in pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

wangshisan commented on a change in pull request #29266:
URL: https://github.com/apache/spark/pull/29266#discussion_r465422329



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
##########
@@ -250,6 +251,85 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
       }
   }
 
+  def optimizeSingleStageSkewJoin(plan: SparkPlan): SparkPlan = plan.transformUp {

Review comment:
       That's a good idea. I will have a try.





[GitHub] [spark] maryannxue commented on a change in pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

maryannxue commented on a change in pull request #29266:
URL: https://github.com/apache/spark/pull/29266#discussion_r465149665



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
##########
@@ -250,6 +251,85 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
       }
   }
 
+  def optimizeSingleStageSkewJoin(plan: SparkPlan): SparkPlan = plan.transformUp {

Review comment:
       Is it possible to integrate this logic into the two-query-stage skew processing?
   We can just treat the non-query-stage side as "canSplitXXXSide(...) = false", right?
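
A minimal sketch of that suggestion, written as a self-contained model rather than against Spark's classes. The `JoinType` objects, the `Side` case class and the `isQueryStage` flag are hypothetical stand-ins; the join-type rule encodes the assumption that a side may only be split when the join does not emit unmatched rows of the other (duplicated) side.

```scala
// Hypothetical model, not Spark's actual API: a side that is not a query
// stage is simply reported as "cannot split", so one code path covers both
// the two-stage and the single-stage case.
object SplitEligibility {
  sealed trait JoinType
  case object Inner extends JoinType
  case object LeftOuter extends JoinType
  case object RightOuter extends JoinType
  case object LeftSemi extends JoinType
  case object LeftAnti extends JoinType
  case object FullOuter extends JoinType

  // Hypothetical description of one join side.
  final case class Side(isQueryStage: Boolean)

  // The left side may be split only if it is a query stage and the join
  // never emits unmatched rows of the right side.
  def canSplitLeft(joinType: JoinType, left: Side): Boolean =
    left.isQueryStage && (joinType match {
      case Inner | LeftOuter | LeftSemi | LeftAnti => true
      case _ => false
    })

  // Symmetric rule for the right side.
  def canSplitRight(joinType: JoinType, right: Side): Boolean =
    right.isQueryStage && (joinType match {
      case Inner | RightOuter => true
      case _ => false
    })
}
```

With this shape, a bucketed-table side (no query stage) is never split and can only be repeated, while the shuffled side keeps the existing join-type checks.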





[GitHub] [spark] maryannxue commented on a change in pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

maryannxue commented on a change in pull request #29266:
URL: https://github.com/apache/spark/pull/29266#discussion_r465148553



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
##########
@@ -250,6 +251,85 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
       }
   }
 
+  def optimizeSingleStageSkewJoin(plan: SparkPlan): SparkPlan = plan.transformUp {
+    case smj @ SortMergeJoinExec(_, _, joinType, _,
+        sort @ SortExec(_, _, ShuffleStage(qs: ShuffleStageInfo), _), right, _) =>
+      if (qs.shuffleStage.shuffle.canChangeNumPartitions && canSplitLeftSide(joinType)) {
+        val (numSkewed, splitPartitions, partitionIndexes) = handleSkewed(qs)
+        if (numSkewed > 0) {
+          logInfo(s"number of skewed partitions $numSkewed")
+          smj.copy(
+            left = sort.copy(child = CustomShuffleReaderExec(qs.shuffleStage, splitPartitions)),
+            right = PartitionRecombinationExec(_ => partitionIndexes,
+              partitionIndexes.length, right),
+            isSkewJoin = true)
+        } else smj
+      } else {
+        smj
+      }
+
+    case smj @ SortMergeJoinExec(_, _, _, _, left,
+        sort @ SortExec(_, _, ShuffleStage(qs: ShuffleStageInfo), _), _) =>
+      val (numSkewed, splitPartitions, partitionIndexes) = handleSkewed(qs)

Review comment:
       Why is it that we don't need `if (qs.shuffleStage.shuffle.canChangeNumPartitions && canSplitLeftSide(joinType))` here??





[GitHub] [spark] SparkQA commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-664737759







[GitHub] [spark] cloud-fan edited a comment on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

cloud-fan edited a comment on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-668623798


   Yea I'm also wondering the approach here. The skew join handling needs to split the skew side, and repeat the other side. I don't think we can split the buckets of bucketed table, and I'm not sure how we are going to read buckets repeatedly from a bucketed table.



[GitHub] [spark] github-actions[bot] commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-727100443


   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!



[GitHub] [spark] github-actions[bot] closed pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

github-actions[bot] closed pull request #29266:
URL: https://github.com/apache/spark/pull/29266


   



[GitHub] [spark] maryannxue commented on a change in pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

maryannxue commented on a change in pull request #29266:
URL: https://github.com/apache/spark/pull/29266#discussion_r465144166



##########
File path: core/src/main/scala/org/apache/spark/rdd/PartitionRecombinationedRDD.scala
##########
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.rdd
+
+import scala.reflect.ClassTag
+
+import org.apache.spark.{Dependency, NarrowDependency, Partition, TaskContext}
+
+/**
+ * A RDD that just redistribute the dependency RDD's partitions.
+ * It provides the capability to reorder, remove, duplicate... any RDD partition level operation.
+ */
+class RecombinationedRDD[T: ClassTag](

Review comment:
       `RedistributedRDD` ? (btw, combine is the verb form of combination)





[GitHub] [spark] wangshisan commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

wangshisan commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-668932395


   > this is with AQE? if so can we please add that to description and it might be nice to describe approach taken to handle it in description as well.
   
   Added.



[GitHub] [spark] wangshisan commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

wangshisan commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-667772074


   @cloud-fan @JkSelf  Could you have a look?



[GitHub] [spark] cloud-fan commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

cloud-fan commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-668623798


   Yea I'm also wondering the approach here. The skew join handling needs to split the skew side, and repeat the other side. I don't think we can split the buckets of bucketed table, and I'm not sure how we are going to read buckets repeated from a bucketed table.



[GitHub] [spark] wangyum commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

wangyum commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-664736537







[GitHub] [spark] tgravescs commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

tgravescs commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-668602891


   this is with AQE?  if so can we please add that to description and it might be nice to describe approach taken to handle it in description as well.



[GitHub] [spark] JkSelf commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

JkSelf commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-668989319


   Can you show the plan changes in the UI? And does changing the partition number on the bucket side introduce an additional shuffle or not?



[GitHub] [spark] wangshisan commented on a change in pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

wangshisan commented on a change in pull request #29266:
URL: https://github.com/apache/spark/pull/29266#discussion_r465428028



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
##########
@@ -250,6 +251,85 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
       }
   }
 
+  def optimizeSingleStageSkewJoin(plan: SparkPlan): SparkPlan = plan.transformUp {
+    case smj @ SortMergeJoinExec(_, _, joinType, _,
+        sort @ SortExec(_, _, ShuffleStage(qs: ShuffleStageInfo), _), right, _) =>
+      if (qs.shuffleStage.shuffle.canChangeNumPartitions && canSplitLeftSide(joinType)) {
+        val (numSkewed, splitPartitions, partitionIndexes) = handleSkewed(qs)
+        if (numSkewed > 0) {
+          logInfo(s"number of skewed partitions $numSkewed")
+          smj.copy(
+            left = sort.copy(child = CustomShuffleReaderExec(qs.shuffleStage, splitPartitions)),
+            right = PartitionRecombinationExec(_ => partitionIndexes,
+              partitionIndexes.length, right),
+            isSkewJoin = true)
+        } else smj
+      } else {
+        smj
+      }
+
+    case smj @ SortMergeJoinExec(_, _, _, _, left,
+        sort @ SortExec(_, _, ShuffleStage(qs: ShuffleStageInfo), _), _) =>
+      val (numSkewed, splitPartitions, partitionIndexes) = handleSkewed(qs)

Review comment:
       My fault, missed some code..





[GitHub] [spark] wangshisan edited a comment on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

wangshisan edited a comment on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-668926319

Yeah, that's right, we cannot split the bucket table side. But we can duplicate the bucket side by leveraging the RDD mechanism: the child RDD reads some parent partitions more than once.
For instance, say we have an RDD A with partitions (0, 1, 2, 3), and we need to duplicate the second partition (partition 1). We can create a new RDD, B for example, with partitions (0, 1, 2, 3, 4), and guarantee the dependency relationship:
- RDD B partition 0 <- RDD A partition 0
- RDD B partition 1 <- RDD A partition 1
- RDD B partition 2 <- RDD A partition 1
- RDD B partition 3 <- RDD A partition 2
- RDD B partition 4 <- RDD A partition 3

And this is what the new class RecombinationedRDD is designed for.


[GitHub] [spark] SparkQA removed a comment on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-664737759







[GitHub] [spark] maryannxue commented on a change in pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

maryannxue commented on a change in pull request #29266:
URL: https://github.com/apache/spark/pull/29266#discussion_r465144822



##########
File path: core/src/main/scala/org/apache/spark/rdd/PartitionRecombinationedRDD.scala
##########
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.rdd
+
+import scala.reflect.ClassTag
+
+import org.apache.spark.{Dependency, NarrowDependency, Partition, TaskContext}
+
+/**
+ * A RDD that just redistribute the dependency RDD's partitions.
+ * It provides the capability to reorder, remove, duplicate... any RDD partition level operation.
+ */
+class RecombinationedRDD[T: ClassTag](
+    prev: RDD[T],
+    f: (Seq[Int] => Seq[Int])) extends RDD[T](prev) {

Review comment:
       Can we use `Array[CoalescedPartitionSpec]` instead?





[GitHub] [spark] AmplabJenkins removed a comment on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-664729541







[GitHub] [spark] AmplabJenkins commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-664729541







[GitHub] [spark] gengliangwang commented on pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

Posted by GitBox <gi...@apache.org>.

gengliangwang commented on pull request #29266:
URL: https://github.com/apache/spark/pull/29266#issuecomment-668432042


   cc @maryannxue as well

