Posted to issues@calcite.apache.org by "Ruben Q L (Jira)" <ji...@apache.org> on 2020/11/20 14:17:00 UTC
[jira] [Comment Edited] (CALCITE-4414)
RelMdSelectivity#getSelectivity for Calc can propagate a predicate with
wrong references
[ https://issues.apache.org/jira/browse/CALCITE-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236176#comment-17236176 ]
Ruben Q L edited comment on CALCITE-4414 at 11/20/20, 2:16 PM:
---------------------------------------------------------------
Test:
{code}
@Test void test() {
  final RelBuilder builder = RelBuilder.create(RelBuilderTest.config().build());
  final RelNode relNode = builder
      .scan("EMP")
      .project(
          builder.field("DEPTNO"))
      .scan("EMP")
      .project(
          builder.field("DEPTNO"))
      .union(true)
      .projectPlus(builder.field("DEPTNO"))
      .filter(
          builder.equals(
              builder.field(0),
              builder.literal(0)))
      .build();
  // Program to convert Project + Filter into a single Calc
  final HepProgram program = new HepProgramBuilder()
      .addRuleInstance(CoreRules.FILTER_TO_CALC)
      .addRuleInstance(CoreRules.PROJECT_TO_CALC)
      .addRuleInstance(CoreRules.CALC_MERGE)
      .build();
  final HepPlanner hepPlanner = new HepPlanner(program);
  hepPlanner.setRoot(relNode);
  RelNode output = hepPlanner.findBestExp();
  // Add filter on the extra field generated by projectPlus (now a Calc after hepPlanner)
  output = builder
      .push(output)
      .filter(
          builder.equals(
              builder.field(1),
              builder.literal(0)))
      .build();
  output.estimateRowCount(output.getCluster().getMetadataQuery());
}
{code}
Fails with:
{noformat}
java.lang.ArrayIndexOutOfBoundsException: 1
at org.apache.calcite.plan.RelOptUtil$RexInputConverter.visitInputRef(RelOptUtil.java:4452)
at org.apache.calcite.plan.RelOptUtil$RexInputConverter.visitInputRef(RelOptUtil.java:4372)
at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:158)
at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:110)
at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:33)
at org.apache.calcite.rex.RexCall.accept(RexCall.java:175)
at org.apache.calcite.rel.metadata.RelMdSelectivity.getSelectivity(RelMdSelectivity.java:80)
at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source)
at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426)
at org.apache.calcite.rel.metadata.RelMdSelectivity.getSelectivity(RelMdSelectivity.java:131)
at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source)
at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426)
at org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:767)
at org.apache.calcite.rel.core.Filter.estimateRowCount(Filter.java:139)
at org.apache.calcite.test.RelMdSelectivityTest.test(RelMdSelectivityTest.java:72)
{noformat}
The plan that we are dealing with at the end of the test:
{noformat}
LogicalFilter(condition=[=($1, 0)])
  LogicalCalc(expr#0=[{inputs}], expr#1=[0], expr#2=[=($t0, $t1)], DEPTNO=[$t0], DEPTNO0=[$t0], $condition=[$t2])
    LogicalUnion(all=[true])
      LogicalCalc(expr#0..7=[{inputs}], DEPTNO=[$t7])
        LogicalTableScan(table=[[scott, EMP]])
      LogicalCalc(expr#0..7=[{inputs}], DEPTNO=[$t7])
        LogicalTableScan(table=[[scott, EMP]])
{noformat}
> RelMdSelectivity#getSelectivity for Calc can propagate a predicate with wrong references
> ----------------------------------------------------------------------------------------
>
> Key: CALCITE-4414
> URL: https://issues.apache.org/jira/browse/CALCITE-4414
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Ruben Q L
> Assignee: Ruben Q L
> Priority: Major
> Fix For: 1.27.0
>
>
> {{RelMdSelectivity#getSelectivity(Calc rel, RelMetadataQuery mq, RexNode predicate)}} method:
> {code}
> public Double getSelectivity(Calc rel, RelMetadataQuery mq, RexNode predicate) {
>   final RexProgram rexProgram = rel.getProgram();
>   final RexLocalRef programCondition = rexProgram.getCondition();
>   if (programCondition == null) {
>     return getSelectivity(rel.getInput(), mq, predicate); // [2]
>   } else {
>     // [1]
>     return mq.getSelectivity(rel.getInput(),
>         RelMdUtil.minusPreds(
>             rel.getCluster().getRexBuilder(),
>             predicate,
>             rexProgram.expandLocalRef(programCondition)));
>   }
> }
> {code}
> currently passes the predicate down to its input [1] without translating it. The predicate may reference expressions generated by the Calc's projections, so when the Calc's input analyzes it, the input can end up trying to access fields that do not exist in its rowType.
> This can lead to unpredictable failures, as in the test attached to the first comment: after {{RelMdSelectivity#getSelectivity(Calc)}} we reach {{RelMdSelectivity#getSelectivity(Union)}}, which throws an {{ArrayIndexOutOfBoundsException}} because it tries to access a field ($1) that does not exist in its rowType (which only has $0). This $1 is actually projected by the Calc on top of the Union.
> Notice in the code snippet above that, in our test example, the issue only happens at [1] and not at [2], because the "if" branch calls {{getSelectivity}} instead of {{mq.getSelectivity}}. I find this a bit questionable; maybe {{mq.getSelectivity}} should be called there as well.
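
The reference mismatch described above can be illustrated with a small toy model that does not use Calcite at all. This is only a sketch under stated assumptions: the class {{CalcPredicateSketch}} and its methods {{checkRefs}} and {{translate}} are hypothetical names invented for illustration, a predicate is reduced to the list of field indexes it references, and the Calc's projection is reduced to plain field copies (as in the plan from the comment, where both DEPTNO and DEPTNO0 copy $t0). It is not the fix applied for CALCITE-4414, and real Calc projections can be arbitrary expressions that have no such simple mapping.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, independent of Calcite: a field is just a position in a row type. */
public class CalcPredicateSketch {

  /** Fails if the predicate references a field index outside a row type of the given width. */
  static void checkRefs(List<Integer> predicateRefs, int fieldCount) {
    for (int ref : predicateRefs) {
      if (ref < 0 || ref >= fieldCount) {
        throw new IndexOutOfBoundsException(
            "$" + ref + " is not a field of a row type with " + fieldCount + " field(s)");
      }
    }
  }

  /**
   * Rewrites references to the projection's output fields into references to its
   * input fields. Assumption: projection[i] is the input field that output i copies.
   */
  static List<Integer> translate(List<Integer> predicateRefs, int[] projection) {
    final List<Integer> translated = new ArrayList<>();
    for (int ref : predicateRefs) {
      translated.add(projection[ref]);
    }
    return translated;
  }

  public static void main(String[] args) {
    // Calc from the plan: DEPTNO=[$t0], DEPTNO0=[$t0], so both outputs copy input field 0.
    final int[] projection = {0, 0};
    // The Union below the Calc only has one field ($0).
    final int unionFieldCount = 1;
    // The top filter =($1, 0) references output field 1 of the Calc.
    final List<Integer> predicateRefs = List.of(1);

    // Passing the predicate down untranslated: $1 does not exist on the Union.
    try {
      checkRefs(predicateRefs, unionFieldCount);
    } catch (IndexOutOfBoundsException e) {
      System.out.println("untranslated: " + e.getMessage());
    }

    // Translating through the projection first: $1 becomes $0, which does exist.
    final List<Integer> translated = translate(predicateRefs, projection);
    checkRefs(translated, unionFieldCount);
    System.out.println("translated refs: " + translated);
  }
}
```

In this model the untranslated predicate fails for the same reason as in the stack trace (a reference to $1 against a one-field row type), while the translated one is valid for the input. When a projected field is not a plain copy of an input field, a translation like this is not always possible, so any real handling would need a fallback for that case.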
--
This message was sent by Atlassian Jira
(v8.3.4#803005)