Posted to issues@spark.apache.org by "Zhenhua Wang (JIRA)" <ji...@apache.org> on 2017/03/18 06:56:41 UTC

[jira] [Updated] (SPARK-19915) Improve join reorder: Exclude cartesian product candidates to reduce the search space

     [ https://issues.apache.org/jira/browse/SPARK-19915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhenhua Wang updated SPARK-19915:
---------------------------------
    Summary: Improve join reorder: Exclude cartesian product candidates to reduce the search space  (was: Improve join reorder: simplify cost evaluation, postpone column pruning, exclude cartesian product)

> Improve join reorder: Exclude cartesian product candidates to reduce the search space
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-19915
>                 URL: https://issues.apache.org/jira/browse/SPARK-19915
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Zhenhua Wang
>            Assignee: Zhenhua Wang
>             Fix For: 2.2.0
>
>
> 1. Cardinality is usually more important than size, so we can simplify cost evaluation by using only cardinality. This also means we no longer need to care about column pruning during reordering; otherwise, projects would influence the output size of intermediate joins.
> 2. Doing column pruning during reordering is troublesome. Given the first change, we can do it right after reordering, and the logic for adding projects on intermediate joins can be removed. This makes the code simpler and more reliable.
> 3. Exclude cartesian products from the "memo". This significantly reduces the search space and the memory overhead of the memo; otherwise every combination of items would exist in it. We can find the unjoinable items after reordering is finished and put them at the end.
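
The cardinality-only cost and the cartesian-product exclusion described above can be illustrated with a small dynamic-programming sketch. This is not Spark's implementation: the function name `reorder_joins`, the fixed per-condition `selectivity`, and the tuple-based plan representation are all assumptions made for illustration.

```python
def reorder_joins(cardinalities, edges, selectivity=0.1):
    """Toy DP join reorder (hypothetical helper, not Spark's code).

    cardinalities: dict mapping item name -> estimated row count
    edges: iterable of (a, b) pairs that share a join condition
    selectivity: assumed fixed selectivity applied per join condition
    Returns (plans, memo_size): the joinable plan(s) first, unjoinable
    items last, plus the number of memo entries that were created.
    """
    edge_set = {frozenset(e) for e in edges}
    # memo: set of items -> (cost, rows, plan). Cost uses cardinality only.
    memo = {frozenset([i]): (0.0, float(n), i)
            for i, n in sorted(cardinalities.items())}

    def crossing(left, right):
        # Number of join conditions connecting the two sides.
        return sum(1 for a in left for b in right
                   if frozenset((a, b)) in edge_set)

    for size in range(2, len(cardinalities) + 1):
        entries = list(memo.items())  # snapshot: only smaller sets
        for ls, (lcost, lrows, lplan) in entries:
            for rs, (rcost, rrows, rplan) in entries:
                if len(ls) + len(rs) != size or ls & rs:
                    continue
                conds = crossing(ls, rs)
                if conds == 0:
                    continue  # cartesian product: never enters the memo
                rows = lrows * rrows * selectivity ** conds
                cost = lcost + rcost + rows  # sum of intermediate cardinalities
                key = ls | rs
                if key not in memo or cost < memo[key][0]:
                    memo[key] = (cost, rows, (lplan, rplan))

    # After reordering: take the largest joined sets first, then append
    # whatever could not be joined (these need a cartesian product).
    plans, placed = [], set()
    for key in sorted(memo, key=lambda k: (-len(k), sorted(k))):
        if not key & placed:
            plans.append(memo[key][2])
            placed |= key
    return plans, len(memo)
```

For four items where only a-b and b-c have join conditions, this memo holds 7 entries (the connected subsets) instead of the 15 non-empty subsets that would exist if cartesian products were kept, and the unjoinable item d is returned last.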


