Posted to issues@calcite.apache.org by "Ruben Q L (Jira)" <ji...@apache.org> on 2020/09/01 15:40:00 UTC

[jira] [Commented] (CALCITE-4208) Improve metadata row count for Join

    [ https://issues.apache.org/jira/browse/CALCITE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188575#comment-17188575 ] 

Ruben Q L commented on CALCITE-4208:
------------------------------------

Row count computation for SEMI join:
{code}
RexNode semiJoinSelectivity = RelMdUtil.makeSemiJoinSelectivityRexNode(mq, join);
return NumberUtil.multiply(
  mq.getSelectivity(join.getLeft(), semiJoinSelectivity),
  mq.getRowCount(join.getLeft()));
{code}

Proposed row count computation for ANTI join:
{code}
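RexNode semiJoinSelectivity = RelMdUtil.makeSemiJoinSelectivityRexNode(mq, join);
Double selectivity = mq.getSelectivity(join.getLeft(), semiJoinSelectivity);
if (selectivity == null) {
  return null; // selectivity unknown, so the row count is unknown as well
}
// ANTI keeps the left rows that SEMI would discard: (1 - selectivity) * leftRowCount
return NumberUtil.multiply(1D - selectivity, mq.getRowCount(join.getLeft()));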
{code}
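
To illustrate how the SEMI and ANTI estimates relate, here is a minimal standalone sketch that uses plain numbers instead of Calcite metadata calls. The class and method names are hypothetical and not part of Calcite; the only assumption carried over is that an unknown value is represented as null.

```java
/** Hypothetical illustration only; not Calcite code. */
final class SemiAntiRowCountSketch {
  private SemiAntiRowCountSketch() {}

  /** SEMI estimate: the fraction of left rows that find a match. */
  static Double semiRowCount(Double leftRowCount, Double semiJoinSelectivity) {
    if (leftRowCount == null || semiJoinSelectivity == null) {
      return null; // unknown input, unknown estimate
    }
    return semiJoinSelectivity * leftRowCount;
  }

  /** ANTI estimate: the complementary fraction of left rows, i.e. those without a match. */
  static Double antiRowCount(Double leftRowCount, Double semiJoinSelectivity) {
    if (leftRowCount == null || semiJoinSelectivity == null) {
      return null; // unknown input, unknown estimate
    }
    return (1D - semiJoinSelectivity) * leftRowCount;
  }
}
```

With 1000 left rows and a semi-join selectivity of 0.25, the sketch gives 250 rows for SEMI and 750 rows for ANTI, so the two estimates add up to the left row count.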

Row count computation for INNER join:
{code}
return leftRowCount * rightRowCount * mq.getSelectivity(join, condition);
{code}

Proposed row count computation for LEFT join:
{code}
return leftRowCount;
{code}

Proposed row count computation for RIGHT join:
{code}
return rightRowCount;
{code}

Proposed row count computation for FULL join:
{code}
return leftRowCount + rightRowCount - (leftRowCount * rightRowCount * mq.getSelectivity(join, condition));
{code}
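
The four formulas above can be compared side by side with a small standalone sketch. It works on plain doubles rather than on a RelMetadataQuery, and the enum and class names are hypothetical, chosen only for this illustration.

```java
/** Hypothetical illustration only; not Calcite code. */
final class JoinRowCountSketch {
  private JoinRowCountSketch() {}

  /** Join types covered by the formulas in this comment. */
  enum JoinKind { INNER, LEFT, RIGHT, FULL }

  /**
   * Applies the formula listed for each join type.
   *
   * @param selectivity fraction of the cross product that satisfies the join condition
   */
  static double estimate(JoinKind kind, double leftRowCount, double rightRowCount,
      double selectivity) {
    // INNER formula, also reused as the overlap term for FULL.
    final double innerRowCount = leftRowCount * rightRowCount * selectivity;
    switch (kind) {
    case INNER:
      return innerRowCount;
    case LEFT:
      return leftRowCount;
    case RIGHT:
      return rightRowCount;
    case FULL:
      return leftRowCount + rightRowCount - innerRowCount;
    default:
      throw new IllegalArgumentException("Unsupported join kind: " + kind);
    }
  }
}
```

For example, with 1000 left rows, 100 right rows and a selectivity of 0.001, the INNER estimate is about 100 rows, while the LEFT estimate stays at 1000 rows. Under the single shared formula, a LEFT join over the same inputs would also be estimated at about 100 rows, which is fewer rows than the left input, even though a LEFT join returns every left row at least once.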

> Improve metadata row count for Join
> -----------------------------------
>
>                 Key: CALCITE-4208
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4208
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Ruben Q L
>            Priority: Major
>
> Currently, the default metadata row count for join {{RelMdRowCount#getRowCount(Join rel, RelMetadataQuery mq)}} relies on {{RelMdUtil.getJoinRowCount}}. This method has several issues:
>  - In case of ANTI join, it returns the same estimation as a SEMI join
>  - In all other cases (INNER, LEFT, RIGHT, FULL), it always returns the same formula:
>  {{leftRowCount * rightRowCount * mq.getSelectivity(join, condition)}}
>  which seems valid for an INNER join, but not for LEFT / RIGHT / FULL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)