Posted to issues@spark.apache.org by "Chao Sun (Jira)" <ji...@apache.org> on 2022/12/22 19:58:00 UTC

[jira] [Assigned] (SPARK-41413) SPJ: Avoid shuffle when partition keys mismatch, but join expressions are compatible

     [ https://issues.apache.org/jira/browse/SPARK-41413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Sun reassigned SPARK-41413:
--------------------------------

    Assignee: Chao Sun

> SPJ: Avoid shuffle when partition keys mismatch, but join expressions are compatible
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-41413
>                 URL: https://issues.apache.org/jira/browse/SPARK-41413
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.3.1
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>            Priority: Major
>
> Currently, when checking whether the two sides of a Storage Partitioned Join are compatible, we require both the partition expressions and the partition keys to be compatible. However, this condition could be relaxed so that we only require the former. When the latter are not compatible, we can calculate a common superset of keys, push this information down to both sides of the join, and use empty partitions for the missing keys.
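
The idea in the description can be illustrated with a small, self-contained sketch. This is not Spark's implementation and uses no Spark API: the function names (common_keys, align, partitionwise_inner_join) and the dict-of-lists representation of a partitioned table are assumptions made purely for illustration. It also assumes the partition key is the join key itself and that keys are sortable.

```python
from typing import Dict, Hashable, List, Tuple

# Hypothetical model: a partitioned table is a mapping from
# partition key to the list of rows stored in that partition.
Partitions = Dict[Hashable, List[str]]


def common_keys(left: Partitions, right: Partitions) -> List[Hashable]:
    """Compute the common superset (sorted union) of partition keys."""
    return sorted(set(left) | set(right))


def align(side: Partitions, keys: List[Hashable]) -> List[Tuple[Hashable, List[str]]]:
    """Lay out one side in the shared key order, using an empty
    partition for every key that this side does not have."""
    return [(k, side.get(k, [])) for k in keys]


def partitionwise_inner_join(left: Partitions, right: Partitions):
    """Join matching partitions pairwise; no rows move between partitions."""
    keys = common_keys(left, right)
    joined = []
    for (k, lrows), (_, rrows) in zip(align(left, keys), align(right, keys)):
        # An empty partition on either side simply contributes no output rows.
        for l in lrows:
            for r in rrows:
                joined.append((k, l, r))
    return joined


left = {1: ["a"], 2: ["b"]}
right = {2: ["x"], 3: ["y"]}

keys = common_keys(left, right)
aligned_left = align(left, keys)
aligned_right = align(right, keys)
result = partitionwise_inner_join(left, right)
```

In this example both sides end up with three partitions in the same key order (1, 2, 3), so they can be zipped together partition by partition. Only key 2 has rows on both sides, so the inner join produces a single row for it.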



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
