Posted to issues@calcite.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2021/04/28 10:56:01 UTC

[jira] [Resolved] (CALCITE-4574) Wrong/Invalid plans when using RelBuilder#join with correlations

     [ https://issues.apache.org/jira/browse/CALCITE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stamatis Zampetakis resolved CALCITE-4574.
------------------------------------------
    Resolution: Fixed

Fixed in [8c2228eaf8ccc05ae58778276e760092557f78cc|https://github.com/apache/calcite/commit/8c2228eaf8ccc05ae58778276e760092557f78cc]. Thanks for the review [~jamesstarr], [~rubenql], [~julianhyde]!

> Wrong/Invalid plans when using RelBuilder#join with correlations
> ----------------------------------------------------------------
>
>                 Key: CALCITE-4574
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4574
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 1.26.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.27.0
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> {{RelBuilder#join}} produces wrong or invalid relational expressions when correlation variables are passed as a parameter, depending on the join type, whenever the join condition is non-trivial (that is, not always true).
> *Wrong plans* already exist in the code base: in these plans the {{requiredColumns}} attribute of {{LogicalCorrelate}} is missing some columns.
> Consider for instance the middle plan in {{RelOptRulesTest#testSelectNotInCorrelated}}:
> {noformat}
> LogicalProject(SAL=[$5], EXPR$1=[IS NULL($10)])
>   LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{2}]) <-- PROBLEM
>     LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>     LogicalFilter(condition=[=($cor0.EMPNO, $0)]) <-- $cor0.EMPNO refers to column 0 in EMP relation
>       LogicalProject(DEPTNO=[$0], i=[true])
>         LogicalFilter(condition=[=($cor0.JOB, $1)]) <-- $cor0.JOB refers to column 2 in EMP relation 
>           LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> The {{EMPNO}} column (index 0), which is referenced through the correlation variable in the right input, is not present in the {{requiredColumns}} attribute.
> *Invalid plans* are created when the join type is SEMI or ANTI and the join condition uses columns from the right side. Currently, the join condition is added as a {{Filter}} on top of the {{Correlate}}, where the columns from the right side no longer exist, so the filter does not reference valid inputs.
> If we are lucky, we get an {{AssertionError}} when constructing the {{Filter}} operator:
> {noformat}
> RexInputRef index 8 out of range 0..7
> java.lang.AssertionError: RexInputRef index 8 out of range 0..7
> 	at org.apache.calcite.util.Litmus$1.fail(Litmus.java:32)
> 	at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:125)
> 	at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:61)
> 	at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:114)
> 	at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:144)
> 	at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:61)
> 	at org.apache.calcite.rex.RexCall.accept(RexCall.java:189)
> 	at org.apache.calcite.rel.core.Filter.isValid(Filter.java:127)
> 	at org.apache.calcite.rel.logical.LogicalFilter.<init>(LogicalFilter.java:72)
> 	at org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:116)
> 	at org.apache.calcite.rel.core.RelFactories$FilterFactoryImpl.createFilter(RelFactories.java:345)
> 	at org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:1349)
> 	at org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:1307)
> 	at org.apache.calcite.tools.RelBuilder.join(RelBuilder.java:2407)
> {noformat}
> Otherwise (with assertions disabled), we end up with an invalid plan.
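> {code:java}
> // Assumption: `builder` is a RelBuilder over the scott schema (EMP, DEPT).
> // Holder that receives the correlation variable created for the EMP scan.
> final Holder<RexCorrelVariable> v = Holder.of(null);
> // Assumption: SEMI is the join type used here; it matches the plans shown below.
> final JoinRelType type = JoinRelType.SEMI;
> RelNode root = builder
>         .scan("EMP")
>         .variable(v)
>         .scan("DEPT")
>         .join(type,
>             builder.equals(
>                 builder.field(2, 0, "DEPTNO"),
>                 builder.field(2, 1, "DEPTNO")),
>             ImmutableSet.of(v.get().id))
>         .build();
> {code}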
> +Actual plan+
> {noformat}
> LogicalFilter(condition=[=($7, $8)]) <- PROBLEM I
>   LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{}]) <- PROBLEM II
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
> {noformat}
> +Expected plan+
> {noformat}
> LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{7}])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalFilter(condition=[=($cor0.DEPTNO, $0)])
>     LogicalTableScan(table=[[scott, DEPT]])
> {noformat}

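The missing-column problem in the {{testSelectNotInCorrelated}} plan quoted above can be illustrated with a small stand-alone sketch. It does not use Calcite's API: PlanNode and RequiredColumns are hypothetical names, and the tree is a hand-built model of the right input of the correlate. The sketch assumes that requiredColumns should list every left-input column that is read through $cor0 anywhere in the right input.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy stand-in for a relational operator (hypothetical; not Calcite's RelNode).
final class PlanNode {
  final String description;
  // Indexes of left-input (EMP) columns that this node reads through $cor0.
  final List<Integer> correlatedFields;
  final List<PlanNode> inputs;

  PlanNode(String description, List<Integer> correlatedFields, List<PlanNode> inputs) {
    this.description = description;
    this.correlatedFields = correlatedFields;
    this.inputs = inputs;
  }
}

final class RequiredColumns {
  private RequiredColumns() {}

  // Collects every column referenced through the correlation variable
  // in the whole subtree, not only in the node closest to the scan.
  static SortedSet<Integer> collect(PlanNode node) {
    SortedSet<Integer> result = new TreeSet<>(node.correlatedFields);
    for (PlanNode input : node.inputs) {
      result.addAll(collect(input));
    }
    return result;
  }

  // Hand-built model of the right input of the correlate in the quoted plan.
  static PlanNode rightInputOfExample() {
    PlanNode dept = new PlanNode("TableScan(DEPT)", List.of(), List.of());
    PlanNode innerFilter =
        new PlanNode("Filter(=($cor0.JOB, $1))", List.of(2), List.of(dept));
    PlanNode project =
        new PlanNode("Project(DEPTNO, i)", List.of(), List.of(innerFilter));
    return new PlanNode("Filter(=($cor0.EMPNO, $0))", List.of(0), List.of(project));
  }

  public static void main(String[] args) {
    System.out.println(collect(rightInputOfExample())); // prints [0, 2]
  }
}
```

For the modelled plan the result contains both EMPNO (0) and JOB (2), whereas the plan above lists only {2}.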

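The difference between the actual and the expected SEMI plan can be sketched in the same stand-alone way. The class below is a hypothetical model, not the code of the fix: it assumes the join condition is a single equality over the concatenated row (left fields followed by right fields) and that scott.EMP has the eight columns listed in EMP_FIELDS. Left-side references become $cor0 field accesses and are recorded as required columns; right-side references are shifted so that they index the right input alone.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model of moving a join condition below a correlate.
final class CorrelateSketch {
  // Column names of scott.EMP, the left input (assumed for illustration).
  static final List<String> EMP_FIELDS =
      List.of("EMPNO", "ENAME", "JOB", "MGR", "HIREDATE", "SAL", "COMM", "DEPTNO");

  // Left-input columns that the rewritten condition reads through $cor0.
  final SortedSet<Integer> requiredColumns = new TreeSet<>();
  private final List<String> leftFields;

  CorrelateSketch(List<String> leftFields) {
    this.leftFields = leftFields;
  }

  // SEMI and ANTI correlates output only the left fields, so a filter placed
  // on top of them cannot reference any index >= leftCount.
  static int outputFieldCount(boolean semiOrAnti, int leftCount, int rightCount) {
    return semiOrAnti ? leftCount : leftCount + rightCount;
  }

  // Rewrites one operand of the condition for use inside the right input.
  String rewriteOperand(int index) {
    if (index < leftFields.size()) {
      requiredColumns.add(index);
      return "$cor0." + leftFields.get(index);
    }
    return "$" + (index - leftFields.size());
  }

  // Rewrites =($a, $b) over the joined row into a filter on the right input.
  String rewriteEquals(int a, int b) {
    return "=(" + rewriteOperand(a) + ", " + rewriteOperand(b) + ")";
  }
}
```

With these assumptions, new CorrelateSketch(CorrelateSketch.EMP_FIELDS).rewriteEquals(7, 8) returns =($cor0.DEPTNO, $0) and records column 7, which matches the expected plan. outputFieldCount(true, 8, 3) is 8, which is why the filter on $8 in the actual plan is out of range 0..7.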
