Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/02/25 00:19:00 UTC

[jira] [Assigned] (SPARK-37377) Refactor V2 Partitioning interface and remove deprecated usage of Distribution

     [ https://issues.apache.org/jira/browse/SPARK-37377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-37377:
------------------------------------

    Assignee:     (was: Apache Spark)

> Refactor V2 Partitioning interface and remove deprecated usage of Distribution
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-37377
>                 URL: https://issues.apache.org/jira/browse/SPARK-37377
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Chao Sun
>            Priority: Major
>
> Currently {{Partitioning}} is defined as follows:
> {code:java}
> @Evolving
> public interface Partitioning {
>   int numPartitions();
>   boolean satisfy(Distribution distribution);
> }
> {code}
> There are two issues with the interface: 1) it uses the deprecated {{Distribution}} interface and should switch to {{org.apache.spark.sql.connector.distributions.Distribution}}; 2) there is currently no way to use it in a join, where we want to compare the partitionings reported by both sides and decide whether they are "compatible" (and thus allow Spark to eliminate the shuffle).
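
To make the second issue concrete, here is a minimal sketch of what a join-side compatibility check might look like. Everything in it is hypothetical and only for illustration: {{ReportedPartitioning}}, its fields, and {{compatible}} are not part of Spark's API, and the rule used (same partition count, same partitioning function, clustering keys matching each side's join keys) is an assumed simplification, not the design proposed in this ticket.

```java
import java.util.Arrays;
import java.util.List;

class PartitioningSketch {

    // Hypothetical description of how one side of a join reports its layout.
    // This is NOT a Spark interface; it only illustrates the idea.
    static final class ReportedPartitioning {
        final String function;             // assumed label for the partitioning function, e.g. "bucket"
        final List<String> clusteringKeys; // columns the rows are clustered by
        final int numPartitions;

        ReportedPartitioning(String function, List<String> clusteringKeys, int numPartitions) {
            this.function = function;
            this.clusteringKeys = clusteringKeys;
            this.numPartitions = numPartitions;
        }
    }

    // Assumed rule: the two sides are "compatible" when they use the same
    // partitioning function, have the same number of partitions, and each
    // side is clustered by exactly its own join keys, in the same order.
    static boolean compatible(ReportedPartitioning left, List<String> leftJoinKeys,
                              ReportedPartitioning right, List<String> rightJoinKeys) {
        return left.function.equals(right.function)
            && left.numPartitions == right.numPartitions
            && left.clusteringKeys.equals(leftJoinKeys)
            && right.clusteringKeys.equals(rightJoinKeys);
    }

    public static void main(String[] args) {
        ReportedPartitioning orders =
            new ReportedPartitioning("bucket", Arrays.asList("customer_id"), 8);
        ReportedPartitioning customers =
            new ReportedPartitioning("bucket", Arrays.asList("id"), 8);

        // Join on orders.customer_id = customers.id
        boolean canSkipShuffle = compatible(
            orders, Arrays.asList("customer_id"),
            customers, Arrays.asList("id"));
        System.out.println("shuffle can be skipped: " + canSkipShuffle);
    }
}
```

The current {{satisfy(Distribution)}} method only answers whether one side meets a required distribution on its own; a check like the one sketched above needs to see both sides at once, which is the gap the description points out.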


