Posted to issues@spark.apache.org by "Chao Sun (Jira)" <ji...@apache.org> on 2022/12/06 20:43:00 UTC

[jira] [Created] (SPARK-41413) Storage-Partitioned Join should avoid shuffle when partition keys mismatch, but join expressions are compatible

Chao Sun created SPARK-41413:
--------------------------------

             Summary: Storage-Partitioned Join should avoid shuffle when partition keys mismatch, but join expressions are compatible
                 Key: SPARK-41413
                 URL: https://issues.apache.org/jira/browse/SPARK-41413
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.3.1
            Reporter: Chao Sun


Currently, when checking whether the two sides of a Storage-Partitioned Join are compatible, we require that both the partition expressions and the partition keys be compatible. However, this condition could be relaxed so that only the former is required. When the partition keys are not compatible, we can calculate a common superset of keys, push that information down to both sides of the join, and use empty partitions for the missing keys.
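
To make the proposal concrete, here is a minimal sketch in plain Python of the key-alignment idea. It is not Spark code and does not use any Spark API: the names align_partitions and partition_wise_inner_join are hypothetical, each side is modelled as a dict from partition key to a list of rows, and the join column is assumed to be the first field of each row.

```python
from typing import Dict, List, Tuple

Row = Tuple
Partitions = Dict[int, List[Row]]  # partition key -> rows in that partition


def align_partitions(left: Partitions, right: Partitions) -> Tuple[Partitions, Partitions]:
    """Pad both sides to a common superset of partition keys.

    Keys missing on one side are filled with an empty partition, so both
    sides end up with the same keys in the same order.
    """
    all_keys = sorted(set(left) | set(right))
    left_aligned = {k: left.get(k, []) for k in all_keys}
    right_aligned = {k: right.get(k, []) for k in all_keys}
    return left_aligned, right_aligned


def partition_wise_inner_join(left: Partitions, right: Partitions) -> List[Row]:
    """Join matching partitions pairwise, without moving rows between partitions."""
    left_aligned, right_aligned = align_partitions(left, right)
    joined = []
    for key in left_aligned:
        for l_row in left_aligned[key]:
            for r_row in right_aligned[key]:
                # Assumption: the join column is the first field of each row.
                if l_row[0] == r_row[0]:
                    joined.append(l_row + r_row)
    return joined


# The two sides share the partition expression but not the same set of keys.
left = {1: [(1, "a")], 2: [(2, "b")]}
right = {2: [(2, "x")], 3: [(3, "y")]}
left_aligned, right_aligned = align_partitions(left, right)
result = partition_wise_inner_join(left, right)
```

In this sketch both sides end up with keys 1, 2 and 3; the left side gets an empty partition for key 3 and the right side for key 1, so an inner join produces rows only for key 2. How the superset is actually computed and pushed down to the data sources is left to the implementation of this ticket.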


