Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/02/15 20:10:23 UTC

[GitHub] [spark] HeartSaVioR commented on a change in pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

HeartSaVioR commented on a change in pull request #31355:
URL: https://github.com/apache/spark/pull/31355#discussion_r576402899



##########
File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/RequiresDistributionAndOrdering.java
##########
@@ -42,6 +42,19 @@
    */
   Distribution requiredDistribution();
 
+  /**
+   * Returns the number of partitions required by this write if a specific distribution is required.
+   * <p>
+   * Implementations may want to override this if they require a specific number of partitions
+   * for the distribution.
+   * <p>
+   * {@link UnspecifiedDistribution} is not affected by this method, as it doesn't require a
+   * specific distribution.
+   *
+   * @return the required number of partitions; non-positive values mean no requirement.
+   */
+  default int requiredNumPartitionsOnDistribution() { return 0; }

Review comment:
       I'm actually more familiar with the word "parallelism", but that word seems to be used less often in Spark - "partition" is used almost everywhere. I'm OK with calling it "parallelism", but let's hear more voices on this.
   
   The name comes from the fact that the number is only effective when a distribution is specified - the longer name is meant to avoid the misunderstanding that it also takes effect on the sorting requirement, which it does not. Perhaps we could discuss the impact first and revisit the name.
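   
   For context, here's a minimal sketch (not part of this PR) of how a connector's write implementation might use the proposed method. The class name and the "bucket" column are hypothetical, and the Distributions/Expressions factories are assumed from the existing connector API; only requiredDistribution(), requiredOrdering(), and the proposed requiredNumPartitionsOnDistribution() come from RequiresDistributionAndOrdering itself. The class is kept abstract so any other methods inherited from the write interface can stay out of the sketch.
   
       import org.apache.spark.sql.connector.distributions.Distribution;
       import org.apache.spark.sql.connector.distributions.Distributions;
       import org.apache.spark.sql.connector.expressions.Expression;
       import org.apache.spark.sql.connector.expressions.Expressions;
       import org.apache.spark.sql.connector.expressions.SortOrder;
       import org.apache.spark.sql.connector.write.RequiresDistributionAndOrdering;
   
       // Hypothetical write that clusters rows by a "bucket" column and asks Spark for a
       // fixed number of partitions on that distribution.
       abstract class ExampleWrite implements RequiresDistributionAndOrdering {
   
         @Override
         public Distribution requiredDistribution() {
           // Require incoming rows to be clustered by the "bucket" column before writing.
           return Distributions.clustered(new Expression[] { Expressions.column("bucket") });
         }
   
         @Override
         public int requiredNumPartitionsOnDistribution() {
           // Ask Spark to repartition into exactly 10 partitions for the distribution above;
           // the default of 0 means "no requirement" and leaves the number up to Spark.
           return 10;
         }
   
         @Override
         public SortOrder[] requiredOrdering() {
           // No sort requirement in this sketch; the static partition number only applies to
           // the distribution, not to sorting.
           return new SortOrder[0];
         }
       }
   
   With this sketch in mind, the number 10 only matters because requiredDistribution() returns a clustered distribution; if it returned Distributions.unspecified(), the value would have no effect, which is what the longer method name is trying to convey.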




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org