Posted to issues@spark.apache.org by "Yang Jie (Jira)" <ji...@apache.org> on 2020/09/10 16:26:00 UTC

[jira] [Updated] (SPARK-32848) Let CostBasedJoinReorder produce same result in Scala 2.12 and 2.13

     [ https://issues.apache.org/jira/browse/SPARK-32848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Jie updated SPARK-32848:
-----------------------------
    Summary: Let CostBasedJoinReorder produce same result in Scala 2.12 and 2.13  (was: Let CostBasedJoinReorder produce deterministic result with Scala 2.12 and 2.13)

> Let CostBasedJoinReorder produce same result in Scala 2.12 and 2.13
> -------------------------------------------------------------------
>
>                 Key: SPARK-32848
>                 URL: https://issues.apache.org/jira/browse/SPARK-32848
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Yang Jie
>            Priority: Major
>
> The optimization result of {{CostBasedJoinReorder}} may differ between Scala 2.12 and Scala 2.13 for the same input when more than one candidate plan has the same cost.
> The test case named "Test 4: Star with several branches" in StarJoinCostBasedReorderSuite is a typical case.
>  
> If the input is 
> {code:java}
> d1.join(t3).join(t4).join(f1).join(d2).join(t5).join(t6).join(d3).join(t1).join(t2)
>   .where((nameToAttr("d1_c2") === nameToAttr("t3_c1")) &&
>     (nameToAttr("t3_c2") === nameToAttr("t4_c2")) &&
>     (nameToAttr("d1_pk") === nameToAttr("f1_fk1")) &&
>     (nameToAttr("f1_fk2") === nameToAttr("d2_pk")) &&
>     (nameToAttr("d2_c2") === nameToAttr("t5_c1")) &&
>     (nameToAttr("t5_c2") === nameToAttr("t6_c2")) &&
>     (nameToAttr("f1_fk3") === nameToAttr("d3_pk")) &&
>     (nameToAttr("d3_c2") === nameToAttr("t1_c1")) &&
>     (nameToAttr("t1_c2") === nameToAttr("t2_c2")))
> {code}
> the optimization result in Scala 2.12 is
> {code:java}
> f1.join(d3, Inner, Some(nameToAttr("f1_fk3") === nameToAttr("d3_pk")))
>     .join(d1, Inner, Some(nameToAttr("f1_fk1") === nameToAttr("d1_pk")))
>     .join(d2, Inner, Some(nameToAttr("f1_fk2") === nameToAttr("d2_pk")))
>     .
>     .
>     .
> {code}
> and the optimization result in Scala 2.13 is
> {code:java}
> f1.join(d3, Inner, Some(nameToAttr("f1_fk3") === nameToAttr("d3_pk")))
>     .join(d2, Inner, Some(nameToAttr("f1_fk2") === nameToAttr("d2_pk")))
>     .join(d1, Inner, Some(nameToAttr("f1_fk1") === nameToAttr("d1_pk")))
>     .
>     .
>     .
> {code}
>  
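The report does not say why the two Scala versions disagree. One plausible reading, offered here only as an assumption, is that equal-cost candidates are visited in a different order, so a "first cheapest wins" rule keeps a different plan. The Scala sketch below illustrates that effect and one possible remedy, an explicit tie-break on equal cost. Candidate, pickFirstCheapest and pickDeterministic are hypothetical names for illustration; they are not Spark APIs and this is not the fix adopted for SPARK-32848.

```scala
// Hypothetical stand-in for a candidate join plan: the ids of the joined items plus a cost.
final case class Candidate(itemIds: Set[Int], cost: BigInt)

object TieBreakSketch {
  // Order-sensitive: when costs are equal, the candidate seen first is kept.
  def pickFirstCheapest(candidates: Iterable[Candidate]): Candidate =
    candidates.reduceLeft((best, next) => if (next.cost < best.cost) next else best)

  // Order-insensitive: equal costs are resolved by comparing the sorted item ids,
  // so the winner does not depend on the order in which candidates are visited.
  def pickDeterministic(candidates: Iterable[Candidate]): Candidate = {
    def key(c: Candidate): String = c.itemIds.toSeq.sorted.mkString(",")
    candidates.reduceLeft { (best, next) =>
      val byCost = next.cost.compare(best.cost)
      if (byCost < 0 || (byCost == 0 && key(next).compareTo(key(best)) < 0)) next else best
    }
  }

  def main(args: Array[String]): Unit = {
    val a = Candidate(Set(0, 1), BigInt(100))
    val b = Candidate(Set(0, 2), BigInt(100)) // same cost as a
    val c = Candidate(Set(0, 3), BigInt(250))

    // The same candidates, visited in two different orders.
    val oneOrder = Seq(a, b, c)
    val otherOrder = Seq(b, a, c)

    println(pickFirstCheapest(oneOrder) == pickFirstCheapest(otherOrder))
    println(pickDeterministic(oneOrder) == pickDeterministic(otherOrder))
  }
}
```

In this sketch the two visiting orders stand in for whatever ordering difference exists between the Scala versions. Any total order on the tie-break key would serve; sorted item ids are used only because they are easy to compare.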


