Posted to issues@spark.apache.org by "Lai Zhou (Jira)" <ji...@apache.org> on 2019/08/23 07:41:00 UTC
[jira] [Created] (SPARK-28860) Using ColumnStats of join key to
get TableAccessCardinality when finding star joins in ReorderJoinRule
Lai Zhou created SPARK-28860:
--------------------------------
Summary: Using ColumnStats of join key to get TableAccessCardinality when finding star joins in ReorderJoinRule
Key: SPARK-28860
URL: https://issues.apache.org/jira/browse/SPARK-28860
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 2.4.3
Reporter: Lai Zhou
Currently, star-schema detection uses TableAccessCardinality to reorder the dimension tables when the star join is selective (isSelectiveStarJoin).
[StarSchemaDetection.scala#L341|https://github.com/apache/spark/blob/98e1a4cea44d7cb2f6d502c0202ad3cac2a1ad8d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala#L341]
{code:java}
if (isSelectiveStarJoin(dimTables, conditions)) {
  val reorderDimTables = dimTables.map { plan =>
    TableAccessCardinality(plan, getTableAccessCardinality(plan))
  }.sortBy(_.size).map {
    case TableAccessCardinality(p1, _) => p1
  }
{code}
However, the getTableAccessCardinality method doesn't consider the ColumnStats of the equi-join key. I'm not sure whether we should compute the join cardinality for each dimension table based on its
join key here.
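To illustrate the question, here is a minimal, self-contained sketch that contrasts ordering by row count with ordering by an estimated join cardinality. It does not use Spark's catalyst classes: Dim, factRows, byRowCount, joinCardinality and byJoinCardinality are hypothetical names, the statistics are made-up sample values, and the formula |F| * |D| / max(ndv(F.k), ndv(D.k)) is the textbook equi-join estimate, which is only my assumption of what "based on the join key" could mean here.

```scala
// Simplified stand-in for a dimension table and its statistics.
// These are NOT Spark's catalyst classes; all names are hypothetical.
final case class Dim(
    name: String,
    rowCount: BigInt,   // rows left after the table's local predicates
    keyNdv: BigInt,     // distinct values of the dimension's join key
    factKeyNdv: BigInt  // distinct values of the matching fact-table column
)

object JoinKeyCardinalitySketch {
  // Made-up sample statistics, for illustration only.
  val factRows: BigInt = BigInt(1000000)
  val dims: Seq[Dim] = Seq(
    Dim("dimA", rowCount = BigInt(100), keyNdv = BigInt(100), factKeyNdv = BigInt(10000)),
    Dim("dimB", rowCount = BigInt(500), keyNdv = BigInt(500), factKeyNdv = BigInt(100000))
  )

  // Ordering by table access cardinality alone (row count of each dimension).
  def byRowCount(ds: Seq[Dim]): Seq[String] =
    ds.sortBy(_.rowCount).map(_.name)

  // Textbook equi-join estimate: |F| * |D| / max(ndv(F.k), ndv(D.k)).
  // The divisor is clamped to at least 1 to avoid dividing by zero.
  def joinCardinality(fact: BigInt, d: Dim): BigInt = {
    val ndv = d.keyNdv.max(d.factKeyNdv).max(BigInt(1))
    fact * d.rowCount / ndv
  }

  // Ordering by the estimated size of the fact-to-dimension join.
  def byJoinCardinality(fact: BigInt, ds: Seq[Dim]): Seq[String] =
    ds.sortBy(d => joinCardinality(fact, d)).map(_.name)

  def main(args: Array[String]): Unit = {
    println(byRowCount(dims))
    println(byJoinCardinality(factRows, dims))
  }
}
```

With these sample numbers the two orderings disagree: dimA has fewer rows, but joining dimB first is estimated to produce fewer rows, because its join key is far more selective against the fact table.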
[~ioana-delaney]