Posted to issues@spark.apache.org by "Lai Zhou (Jira)" <ji...@apache.org> on 2019/08/23 07:41:00 UTC

[jira] [Created] (SPARK-28860) Using ColumnStats of join key to get TableAccessCardinality when finding star joins in ReorderJoinRule

Lai Zhou created SPARK-28860:
--------------------------------

             Summary:  Using ColumnStats of join key to get TableAccessCardinality when finding star joins in ReorderJoinRule
                 Key: SPARK-28860
                 URL: https://issues.apache.org/jira/browse/SPARK-28860
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.3
            Reporter: Lai Zhou


Currently, star-schema detection uses TableAccessCardinality to reorder the dimension tables (dimTables) when the join is a selective star join (isSelectiveStarJoin).

[StarSchemaDetection.scala#L341|https://github.com/apache/spark/blob/98e1a4cea44d7cb2f6d502c0202ad3cac2a1ad8d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala#L341]

 
{code:scala}
if (isSelectiveStarJoin(dimTables, conditions)) {
  val reorderDimTables = dimTables.map { plan =>
    TableAccessCardinality(plan, getTableAccessCardinality(plan))
  }.sortBy(_.size).map {
    case TableAccessCardinality(p1, _) => p1
  }
  // ... (excerpt; the rest of the branch is omitted)
}
{code}

However, the getTableAccessCardinality method doesn't consider the ColumnStats of the equi-join key. I'm not sure whether we should compute the join cardinality for each dimTable based on its join key here.
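
To make the question concrete, here is a minimal, self-contained sketch of the difference between ordering dimension tables by their own row count and ordering them by an estimated join cardinality derived from join-key distinct counts. It does not use Spark's classes: DimJoin, JoinKeyCardinalitySketch and all of their members are hypothetical names, the statistics are made-up numbers, and the estimate is the textbook equi-join formula |fact| * |dim| / max(ndv(fact.key), ndv(dim.key)), which may differ from whatever Spark would actually compute.

```scala
// Hypothetical, simplified model; none of these names are Spark APIs.
// One dimension table in a star join, with the statistics the sketch needs.
final case class DimJoin(
    name: String,
    dimRows: BigInt,     // dimension rows left after local predicates
    dimKeyNdv: BigInt,   // distinct values of the join key on the dimension side
    factKeyNdv: BigInt)  // distinct values of the matching key on the fact side

object JoinKeyCardinalitySketch {

  // Ordering by the table's own row count, ignoring join-key statistics.
  def orderByRowCount(dims: Seq[DimJoin]): Seq[DimJoin] =
    dims.sortBy(_.dimRows)

  // Textbook equi-join estimate:
  //   |fact| * |dim| / max(ndv(fact.key), ndv(dim.key))
  def estimatedJoinRows(factRows: BigInt, dim: DimJoin): BigInt = {
    // Clamp to 1 so that missing or zero distinct counts cannot divide by zero.
    val maxNdv = dim.factKeyNdv.max(dim.dimKeyNdv).max(BigInt(1))
    (factRows * dim.dimRows) / maxNdv
  }

  // Ordering by the estimated size of (fact JOIN dim), smallest first.
  def orderByJoinEstimate(factRows: BigInt, dims: Seq[DimJoin]): Seq[DimJoin] =
    dims.sortBy(d => estimatedJoinRows(factRows, d))

  // Made-up statistics, chosen so that the two orderings disagree.
  val exampleFactRows: BigInt = BigInt(1000000)
  val exampleDims: Seq[DimJoin] = Seq(
    DimJoin("d1", dimRows = BigInt(100), dimKeyNdv = BigInt(100), factKeyNdv = BigInt(100)),
    DimJoin("d2", dimRows = BigInt(500), dimKeyNdv = BigInt(500), factKeyNdv = BigInt(10000)))

  def main(args: Array[String]): Unit = {
    println(orderByRowCount(exampleDims).map(_.name))
    println(orderByJoinEstimate(exampleFactRows, exampleDims).map(_.name))
  }
}
```

With these made-up numbers, d1 has fewer rows than d2, so the row-count ordering puts d1 first. The join-key estimate gives 1,000,000 rows for the join with d1 and 50,000 rows for the join with d2, so the join-estimate ordering puts d2 first.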

[~ioana-delaney]