Posted to issues@spark.apache.org by "Cheng Su (Jira)" <ji...@apache.org> on 2021/03/02 09:47:00 UTC

[jira] [Updated] (SPARK-34593) Preserve broadcast nested loop join output partitioning and ordering

     [ https://issues.apache.org/jira/browse/SPARK-34593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Su updated SPARK-34593:
-----------------------------
    Description: `BroadcastNestedLoopJoinExec` does not currently preserve `outputPartitioning` and `outputOrdering`. However, it can preserve the streamed side's partitioning and ordering when possible. This can help avoid a shuffle and a sort in later stages when the query contains a join or aggregation.  (was: `BroadcastNestedLoopJoinExec` does not propagate `outputPartitioning` and `outputOrdering` right now. But it can propagate the streamed side partitioning and ordering when possible. This can help avoid shuffle and sort in later stage, if there's join and aggregation in the query.)

> Preserve broadcast nested loop join output partitioning and ordering
> --------------------------------------------------------------------
>
>                 Key: SPARK-34593
>                 URL: https://issues.apache.org/jira/browse/SPARK-34593
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Cheng Su
>            Priority: Minor
>
> `BroadcastNestedLoopJoinExec` does not currently preserve `outputPartitioning` and `outputOrdering`. However, it can preserve the streamed side's partitioning and ordering when possible. This can help avoid a shuffle and a sort in later stages when the query contains a join or aggregation.
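
To illustrate what "when possible" could mean, here is a minimal, self-contained sketch in plain Scala. It is a toy model, not Spark's implementation: the types and the `preservesStreamedSide` helper are hypothetical stand-ins, and the set of qualifying join type and build side combinations is an assumption based on general join semantics, not something stated in this issue. The idea is that the streamed side's partitioning and ordering can only carry over when the output is produced by iterating over streamed rows in order, with no extra pass that emits unmatched rows from the broadcast side.

```scala
// Toy model only: hypothetical stand-ins, not Spark's own classes.
sealed trait JoinType
case object Inner extends JoinType
case object LeftOuter extends JoinType
case object RightOuter extends JoinType
case object FullOuter extends JoinType
case object LeftSemi extends JoinType
case object LeftAnti extends JoinType

// Which side of the join is broadcast (built); the other side is streamed.
sealed trait BuildSide
case object BuildLeft extends BuildSide
case object BuildRight extends BuildSide

object StreamedSidePreservation {
  // Returns true when, under the assumptions above, the join output can keep
  // the streamed side's partitioning and ordering.
  def preservesStreamedSide(joinType: JoinType, buildSide: BuildSide): Boolean =
    (joinType, buildSide) match {
      // Inner join: every output row comes from a streamed row, in streamed order.
      case (Inner, _) => true
      // The left side is streamed, and the join only keeps, drops, or pads left rows.
      case (LeftOuter | LeftSemi | LeftAnti, BuildRight) => true
      // Mirror case: the right side is streamed and is the outer side.
      case (RightOuter, BuildLeft) => true
      // Otherwise unmatched broadcast rows may need to be emitted separately,
      // so the streamed side's layout is not assumed to carry over.
      case _ => false
    }
}
```

Under this model, `StreamedSidePreservation.preservesStreamedSide(LeftOuter, BuildRight)` is `true`, while `StreamedSidePreservation.preservesStreamedSide(FullOuter, BuildRight)` is `false`. In a real query, the practical check is whether the physical plan still contains an extra exchange or sort after the join.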


