Posted to issues@calcite.apache.org by "Ruben Q L (Jira)" <ji...@apache.org> on 2020/11/20 14:14:00 UTC

[jira] [Commented] (CALCITE-4414) RelMdSelectivity#getSelectivity for Calc can propagate a predicate with wrong references

    [ https://issues.apache.org/jira/browse/CALCITE-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236176#comment-17236176 ] 

Ruben Q L commented on CALCITE-4414:
------------------------------------

Test:
{code}
  @Test void test() {
    final RelBuilder builder = RelBuilder.create(RelBuilderTest.config().build());
    final RelNode relNode = builder
        .scan("EMP")
        .project(
            builder.field("DEPTNO"))
        .scan("EMP")
        .project(
            builder.field("DEPTNO"))
        .union(true)
        .projectPlus(builder.field("DEPTNO"))
        .filter(
            builder.equals(
                builder.field(0),
                builder.literal(0)))
        .build();

    // Program to convert Project + Filter into a single Calc
    final HepProgram program = new HepProgramBuilder()
        .addRuleInstance(CoreRules.FILTER_TO_CALC)
        .addRuleInstance(CoreRules.PROJECT_TO_CALC)
        .addRuleInstance(CoreRules.CALC_MERGE)
        .build();
    final HepPlanner hepPlanner = new HepPlanner(program);
    hepPlanner.setRoot(relNode);
    RelNode output = hepPlanner.findBestExp();

    // Add filter on the extra field generated by projectPlus (now a Calc after hepPlanner)
    output = builder
        .push(output)
        .filter(
            builder.equals(
                builder.field(1),
                builder.literal(0)))
        .build();

    output.estimateRowCount(output.getCluster().getMetadataQuery());
  }
{code}

Fails with:
{noformat}
java.lang.ArrayIndexOutOfBoundsException: 1
	at org.apache.calcite.plan.RelOptUtil$RexInputConverter.visitInputRef(RelOptUtil.java:4452)
	at org.apache.calcite.plan.RelOptUtil$RexInputConverter.visitInputRef(RelOptUtil.java:4372)
	at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
	at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:158)
	at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:110)
	at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:33)
	at org.apache.calcite.rex.RexCall.accept(RexCall.java:175)
	at org.apache.calcite.rel.metadata.RelMdSelectivity.getSelectivity(RelMdSelectivity.java:80)
	at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source)
	at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source)
	at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426)
	at org.apache.calcite.rel.metadata.RelMdSelectivity.getSelectivity(RelMdSelectivity.java:131)
	at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source)
	at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source)
	at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426)
	at org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:767)
	at org.apache.calcite.rel.core.Filter.estimateRowCount(Filter.java:139)
	at org.apache.calcite.test.RelMdSelectivityTest.test(RelMdSelectivityTest.java:72)
{noformat}

> RelMdSelectivity#getSelectivity for Calc can propagate a predicate with wrong references
> ----------------------------------------------------------------------------------------
>
>                 Key: CALCITE-4414
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4414
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Ruben Q L
>            Assignee: Ruben Q L
>            Priority: Major
>             Fix For: 1.27.0
>
>
> {{RelMdSelectivity#getSelectivity(Calc rel, RelMetadataQuery mq, RexNode predicate)}} method:
> {code}
>   public Double getSelectivity(Calc rel, RelMetadataQuery mq, RexNode predicate) {
>     final RexProgram rexProgram = rel.getProgram();
>     final RexLocalRef programCondition = rexProgram.getCondition();
>     if (programCondition == null) {
>       return getSelectivity(rel.getInput(), mq, predicate); // [2]
>     } else {
>       // [1]
>       return mq.getSelectivity(rel.getInput(),
>           RelMdUtil.minusPreds(
>               rel.getCluster().getRexBuilder(),
>               predicate,
>               rexProgram.expandLocalRef(programCondition)));
>     }
>   }
> {code}
> This method currently passes the predicate down to its input [1] without translating it. The predicate may reference expressions generated by the Calc's projections, so when the Calc's input analyzes the predicate it can end up trying to access fields that do not exist in its rowType.
> This can have unpredictable consequences, as in the test attached to the first comment: after {{RelMdSelectivity#getSelectivity(Calc)}} we reach {{RelMdSelectivity#getSelectivity(Union)}}, which fails with an {{ArrayIndexOutOfBoundsException}} because it tries to access a field ($1) that does not exist in its rowType (which only has $0). This $1 is actually projected by the Calc on top of the Union.
> Note that, in the code snippet above, the issue in our test example only happens at [1] and not at [2], because the "if" block calls {{getSelectivity}} instead of {{mq.getSelectivity}}. I find this a bit questionable; maybe {{mq.getSelectivity}} should be called here as well.
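
To make the missing translation step concrete, here is a minimal standalone sketch that does not depend on Calcite. All names in it (PredicateTranslationSketch, translateRefs, refsAreValid) are hypothetical and are not Calcite API. It models field references as plain integer indexes and assumes every projected field simply forwards one input field, as appears to be the case in the test above, where the Calc would project [$0, $0] on top of a single-field Union. It is not the actual fix, only an illustration of why a predicate on $1 has to be rewritten before it reaches the Calc's input.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; none of these names exist in Calcite. */
public class PredicateTranslationSketch {

  /**
   * Rewrites field references expressed against a projection's output
   * so that they refer to the projection's input instead.
   *
   * @param predicateRefs output-field indexes used by the predicate
   * @param projectionSources element i is the input-field index that
   *     output field i forwards (assumption: no computed expressions)
   */
  public static List<Integer> translateRefs(
      List<Integer> predicateRefs, List<Integer> projectionSources) {
    final List<Integer> translated = new ArrayList<>();
    for (int ref : predicateRefs) {
      // Look up which input field feeds this output field.
      translated.add(projectionSources.get(ref));
    }
    return translated;
  }

  /** Returns whether every reference fits in a row type with fieldCount fields. */
  public static boolean refsAreValid(List<Integer> refs, int fieldCount) {
    for (int ref : refs) {
      if (ref < 0 || ref >= fieldCount) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Assumed shape of the test: Union exposes one field ($0),
    // the Calc on top of it projects [$0, $0].
    final int unionFieldCount = 1;
    final List<Integer> calcProjection = List.of(0, 0);

    // The outer filter refers to the Calc's second output field ($1).
    final List<Integer> predicateRefs = List.of(1);

    // Passing the predicate down untouched leaves a dangling reference.
    System.out.println("untranslated valid on Union: "
        + refsAreValid(predicateRefs, unionFieldCount));

    // Translating through the projection first maps $1 back to $0.
    final List<Integer> translated = translateRefs(predicateRefs, calcProjection);
    System.out.println("translated refs: " + translated);
    System.out.println("translated valid on Union: "
        + refsAreValid(translated, unionFieldCount));
  }
}
```

In the real code the predicate is a RexNode and the projections can be arbitrary expressions, so a proper fix would need to substitute expressions rather than remap indexes; the sketch only covers the simple forwarding case.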


