Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/25 05:07:47 UTC

[GitHub] [spark] c21 commented on a change in pull request #35574: [SPARK-38237][SQL][SS] Allow `ClusteredDistribution` to require full clustering keys

c21 commented on a change in pull request #35574:
URL: https://github.com/apache/spark/pull/35574#discussion_r814478424



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
##########
@@ -72,9 +72,14 @@ case object AllTuples extends Distribution {
 /**
  * Represents data where tuples that share the same values for the `clustering`
  * [[Expression Expressions]] will be co-located in the same partition.
+ *
+ * @param requireAllClusterKeys When true, a `Partitioning` that satisfies this distribution
+ *                              must match all `clustering` expressions in the same ordering.
  */
 case class ClusteredDistribution(
     clustering: Seq[Expression],
+    requireAllClusterKeys: Boolean = SQLConf.get.getConf(

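For intuition, here is a minimal sketch of the check the new flag implies, assuming a simplified, hypothetical helper `satisfiesClustered` (this is illustration only, not the actual `Partitioning.satisfies` implementation in Spark):

```scala
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.plans.physical.ClusteredDistribution

// Hypothetical helper: how a partitioning's expressions could be
// checked against a ClusteredDistribution under the two modes.
def satisfiesClustered(
    partitioningExprs: Seq[Expression],
    distribution: ClusteredDistribution): Boolean = {
  if (distribution.requireAllClusterKeys) {
    // Strict mode: must match all clustering expressions, in the same order.
    partitioningExprs.length == distribution.clustering.length &&
      partitioningExprs.zip(distribution.clustering).forall {
        case (l, r) => l.semanticEquals(r)
      }
  } else {
    // Relaxed mode: partitioning on any subset of the clustering keys suffices.
    partitioningExprs.forall(e => distribution.clustering.exists(_.semanticEquals(e)))
  }
}
```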
Review comment:
       @cloud-fan - I agree with the point about keeping the caller-side code unchanged. It just feels more coherent for readers when `clustering` and `requireAllClusterKeys` sit next to each other; @HeartSaVioR raised the same point in https://github.com/apache/spark/pull/35574#discussion_r813499279. I am curious: would adding the field in the middle here break external libraries that depend on Spark? If not, reviewers have already paid the cost of reviewing this PR as written, so I am not sure how important it is to change the caller-side code back. Just want to understand more here.
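
       On the compatibility question, a standalone, hypothetical sketch (invented `ClusteredDist` class, not Spark code) of what inserting a parameter in the middle of a case class does to callers:

```scala
// Old shape: ClusteredDist(clustering, requiredNumPartitions)
// New shape: a field inserted in the middle, with a default value.
case class ClusteredDist(
    clustering: Seq[String],
    requireAllClusterKeys: Boolean = false, // newly inserted field
    requiredNumPartitions: Option[Int] = None)

object CompatDemo {
  // Source-compatible: callers using named arguments still compile unchanged.
  val named = ClusteredDist(clustering = Seq("a"), requiredNumPartitions = Some(10))

  // Source-incompatible: a positional caller written against the old shape,
  //   ClusteredDist(Seq("a"), Some(10))
  // no longer type-checks, because the second positional slot is now a Boolean.
  // Binary compatibility breaks regardless of call style: the constructor,
  // apply, copy, and unapply signatures all change, so external libraries
  // compiled against the old class would need to be rebuilt.
}
```

       So, if this sketch holds, the concern is mainly positional callers and binary compatibility, which break either way once the field is added.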




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org