Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/07/31 03:19:44 UTC

[GitHub] [spark] cloud-fan commented on a change in pull request #29307: [SPARK-32083][SQL] AQE coalesce should at least return one partition

cloud-fan commented on a change in pull request #29307:
URL: https://github.com/apache/spark/pull/29307#discussion_r463384641



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala
##########
@@ -83,7 +83,12 @@ trait ShuffleExchangeLike extends Exchange {
 case class ShuffleExchangeExec(
     override val outputPartitioning: Partitioning,
     child: SparkPlan,
-    canChangeNumPartitions: Boolean = true) extends ShuffleExchangeLike {
+    isUserSpecifiedNumPartitions: Boolean = false) extends ShuffleExchangeLike {
+
+  // If users specify the num partitions via APIs like `repartition`, we shouldn't change it.
+  // For `SinglePartition`, it requires exactly one partition and we can't change it either.
+  override def canChangeNumPartitions: Boolean =
+    !isUserSpecifiedNumPartitions && outputPartitioning != SinglePartition

Review comment:
       This change is for future-proofing; it doesn't change any current behavior. When there is a global aggregate, there will always be data in the final partition, so we can't coalesce to 0 partitions anyway.
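
       To make the new predicate easier to check in isolation, here is a minimal sketch of the same logic. The `Partitioning` types and the `ExchangeSketch` class below are simplified, hypothetical stand-ins, not the real Spark classes; only the boolean expression is taken from the diff above.

```scala
// Simplified, hypothetical stand-ins for Spark's partitioning types.
sealed trait Partitioning
case object SinglePartition extends Partitioning
final case class HashPartitioning(numPartitions: Int) extends Partitioning

// Hypothetical model of the exchange node, keeping only the fields used by the predicate.
final case class ExchangeSketch(
    outputPartitioning: Partitioning,
    isUserSpecifiedNumPartitions: Boolean = false) {

  // Same expression as in the diff: the partition count may change only if the
  // user did not request a specific count and the node is not a single partition.
  def canChangeNumPartitions: Boolean =
    !isUserSpecifiedNumPartitions && outputPartitioning != SinglePartition
}

object ExchangeSketchDemo {
  def main(args: Array[String]): Unit = {
    // Planner-inserted shuffle: eligible for coalescing.
    println(ExchangeSketch(HashPartitioning(200)).canChangeNumPartitions)
    // Shuffle from a user call such as `repartition`: count must be preserved.
    println(ExchangeSketch(HashPartitioning(200), isUserSpecifiedNumPartitions = true)
      .canChangeNumPartitions)
    // SinglePartition: exactly one partition is required, so it cannot change.
    println(ExchangeSketch(SinglePartition).canChangeNumPartitions)
  }
}
```

       Deriving `canChangeNumPartitions` from the two inputs, rather than passing it in as a constructor flag, means a `SinglePartition` exchange can never be marked as changeable by mistake.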






