Posted to issues@calcite.apache.org by "Ruben Q L (Jira)" <ji...@apache.org> on 2020/11/20 14:17:00 UTC

[jira] [Comment Edited] (CALCITE-4414) RelMdSelectivity#getSelectivity for Calc can propagate a predicate with wrong references

    [ https://issues.apache.org/jira/browse/CALCITE-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236176#comment-17236176 ] 

Ruben Q L edited comment on CALCITE-4414 at 11/20/20, 2:16 PM:
---------------------------------------------------------------

Test:
{code}
  @Test void test() {
    final RelBuilder builder = RelBuilder.create(RelBuilderTest.config().build());
    final RelNode relNode = builder
        .scan("EMP")
        .project(
            builder.field("DEPTNO"))
        .scan("EMP")
        .project(
            builder.field("DEPTNO"))
        .union(true)
        .projectPlus(builder.field("DEPTNO"))
        .filter(
            builder.equals(
                builder.field(0),
                builder.literal(0)))
        .build();

    // Program to convert Project + Filter into a single Calc
    final HepProgram program = new HepProgramBuilder()
        .addRuleInstance(CoreRules.FILTER_TO_CALC)
        .addRuleInstance(CoreRules.PROJECT_TO_CALC)
        .addRuleInstance(CoreRules.CALC_MERGE)
        .build();
    final HepPlanner hepPlanner = new HepPlanner(program);
    hepPlanner.setRoot(relNode);
    RelNode output = hepPlanner.findBestExp();

    // Add filter on the extra field generated by projectPlus (now a Calc after hepPlanner)
    output = builder
        .push(output)
        .filter(
            builder.equals(
                builder.field(1),
                builder.literal(0)))
        .build();

    output.estimateRowCount(output.getCluster().getMetadataQuery());
  }
{code}

Fails with:
{noformat}
java.lang.ArrayIndexOutOfBoundsException: 1
	at org.apache.calcite.plan.RelOptUtil$RexInputConverter.visitInputRef(RelOptUtil.java:4452)
	at org.apache.calcite.plan.RelOptUtil$RexInputConverter.visitInputRef(RelOptUtil.java:4372)
	at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
	at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:158)
	at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:110)
	at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:33)
	at org.apache.calcite.rex.RexCall.accept(RexCall.java:175)
	at org.apache.calcite.rel.metadata.RelMdSelectivity.getSelectivity(RelMdSelectivity.java:80)
	at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source)
	at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source)
	at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426)
	at org.apache.calcite.rel.metadata.RelMdSelectivity.getSelectivity(RelMdSelectivity.java:131)
	at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source)
	at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source)
	at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426)
	at org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:767)
	at org.apache.calcite.rel.core.Filter.estimateRowCount(Filter.java:139)
	at org.apache.calcite.test.RelMdSelectivityTest.test(RelMdSelectivityTest.java:72)
{noformat}

The plan that we are dealing with at the end of the test:
{noformat}
LogicalFilter(condition=[=($1, 0)])
  LogicalCalc(expr#0=[{inputs}], expr#1=[0], expr#2=[=($t0, $t1)], DEPTNO=[$t0], DEPTNO0=[$t0], $condition=[$t2])
    LogicalUnion(all=[true])
      LogicalCalc(expr#0..7=[{inputs}], DEPTNO=[$t7])
        LogicalTableScan(table=[[scott, EMP]])
      LogicalCalc(expr#0..7=[{inputs}], DEPTNO=[$t7])
        LogicalTableScan(table=[[scott, EMP]])
{noformat}


> RelMdSelectivity#getSelectivity for Calc can propagate a predicate with wrong references
> ----------------------------------------------------------------------------------------
>
>                 Key: CALCITE-4414
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4414
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Ruben Q L
>            Assignee: Ruben Q L
>            Priority: Major
>             Fix For: 1.27.0
>
>
> {{RelMdSelectivity#getSelectivity(Calc rel, RelMetadataQuery mq, RexNode predicate)}} method:
> {code}
>   public Double getSelectivity(Calc rel, RelMetadataQuery mq, RexNode predicate) {
>     final RexProgram rexProgram = rel.getProgram();
>     final RexLocalRef programCondition = rexProgram.getCondition();
>     if (programCondition == null) {
>       return getSelectivity(rel.getInput(), mq, predicate); // [2]
>     } else {
>       // [1]
>       return mq.getSelectivity(rel.getInput(),
>           RelMdUtil.minusPreds(
>               rel.getCluster().getRexBuilder(),
>               predicate,
>               rexProgram.expandLocalRef(programCondition)));
>     }
>   }
> {code}
> currently passes the predicate down to its input [1] without translating it. The predicate is expressed in terms of the Calc's output fields, so it may reference expressions generated by the Calc's projections; when the Calc's input analyzes the predicate, it can end up trying to access fields that do not exist in its rowType.
> This can lead to unpredictable failures, as in the test attached in the first comment: after {{RelMdSelectivity#getSelectivity(Calc)}} we reach {{RelMdSelectivity#getSelectivity(Union)}}, which throws an {{ArrayIndexOutOfBoundsException}} because it tries to access a field ($1) that does not exist in its rowType (which only has $0). This $1 is actually projected by the Calc on top of the Union.
> Notice in the code snippet above that, in our test example, the issue only arises at line [1] and not at [2], because the "if" branch calls {{getSelectivity}} rather than {{mq.getSelectivity}}. I find this a bit questionable; perhaps {{mq.getSelectivity}} should be called there as well.
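
To make the reference mismatch concrete, here is a minimal, self-contained sketch of the translation step that the description says is missing. It is a toy model and not Calcite code: {{FieldEquals}}, {{pushPastProjection}} and {{isValidFor}} are hypothetical names invented for this illustration, and a Calc's projections are reduced to a plain list of input field indexes. It only shows the idea of rewriting a predicate from the Calc's output fields to its input fields before handing it to the input.

```java
import java.util.List;

class PushPastCalcSketch {
  /** Toy predicate "field[fieldIndex] = literal"; a stand-in for a RexNode, not a Calcite type. */
  static final class FieldEquals {
    final int fieldIndex;
    final int literal;

    FieldEquals(int fieldIndex, int literal) {
      this.fieldIndex = fieldIndex;
      this.literal = literal;
    }
  }

  /**
   * Rewrites a predicate written against a Calc's output fields so that it
   * refers to the Calc's input fields. Assumption of this sketch:
   * projections.get(i) is the input field index that output field i copies.
   */
  static FieldEquals pushPastProjection(FieldEquals predicate, List<Integer> projections) {
    int outputIndex = predicate.fieldIndex;
    if (outputIndex < 0 || outputIndex >= projections.size()) {
      throw new IllegalArgumentException("No output field $" + outputIndex);
    }
    return new FieldEquals(projections.get(outputIndex), predicate.literal);
  }

  /** Returns whether the predicate only references fields that exist in a row of the given width. */
  static boolean isValidFor(FieldEquals predicate, int fieldCount) {
    return predicate.fieldIndex >= 0 && predicate.fieldIndex < fieldCount;
  }

  public static void main(String[] args) {
    // Mirrors the plan from the comment: the Calc projects DEPTNO=[$t0], DEPTNO0=[$t0]
    // on top of a Union whose rowType has a single field ($0).
    List<Integer> projections = List.of(0, 0);
    int unionFieldCount = 1;

    // The filter on top of the Calc: =($1, 0).
    FieldEquals onCalcOutput = new FieldEquals(1, 0);

    // Passed down untranslated, $1 does not exist in the Union's rowType.
    System.out.println(isValidFor(onCalcOutput, unionFieldCount)); // prints "false"

    // Translated through the projections, $1 becomes $0, which the Union does have.
    FieldEquals onCalcInput = pushPastProjection(onCalcOutput, projections);
    System.out.println(onCalcInput.fieldIndex); // prints "0"
    System.out.println(isValidFor(onCalcInput, unionFieldCount)); // prints "true"
  }
}
```

In the real planner the predicate is an arbitrary RexNode and the projections are expressions of the Calc's RexProgram, so an actual fix would have to translate every input reference in the predicate, not just a single index as in this sketch.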


