You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "bingfeng.guo (Jira)" <ji...@apache.org> on 2021/11/14 08:58:00 UTC
[jira] [Commented] (CALCITE-4887) Filter subQuery remove tot fully optimized
[ https://issues.apache.org/jira/browse/CALCITE-4887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443264#comment-17443264 ]
bingfeng.guo commented on CALCITE-4887:
---------------------------------------
Whether the logic of the subquery should always be true,Like below
org.apache.calcite.rel.rules.SubQueryRemoveRule#matchFilter
{quote}private static void matchFilter(SubQueryRemoveRule rule, RelOptRuleCall call) {
final Filter filter = call.rel(0);
final RelBuilder builder = call.builder();
builder.push(filter.getInput());
int count = 0;
RexNode c = filter.getCondition();
while (true) {
final RexSubQuery e = RexUtil.SubQueryFinder.find(c);
if (e == null) {
assert count > 0;
break;
}
++count;
final Set<CorrelationId> variablesSet =
RelOptUtil.getVariablesUsed(e.rel);
final RexNode target = rule.apply(e, variablesSet, RelOptUtil.Logic.TRUE,
builder, 1, builder.peek().getRowType().getFieldCount());
final RexShuttle shuttle = new ReplaceSubQueryShuttle(e, target);
c = c.accept(shuttle);
}
builder.filter(c);
builder.project(fields(builder, filter.getRowType().getFieldCount()));
call.transformTo(builder.build());
}
{quote}
> Filter subQuery remove tot fully optimized
> ------------------------------------------
>
> Key: CALCITE-4887
> URL: https://issues.apache.org/jira/browse/CALCITE-4887
> Project: Calcite
> Issue Type: Improvement
> Reporter: bingfeng.guo
> Priority: Major
>
> for example:
> {quote}SELECT
> count(1)
> FROM
> SSB.P_LINEORDER as P_LINEORDER
> LEFT JOIN SSB.PART as PART ON P_LINEORDER.LO_PARTKEY = PART.P_PARTKEY
> LEFT JOIN SSB.DATES as DATES ON P_LINEORDER.LO_ORDERDATE = DATES.D_DATEKEY
> LEFT JOIN SSB.CUSTOMER as CUSTOMER ON P_LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
> LEFT JOIN SSB.SUPPLIER as SUPPLIER ON P_LINEORDER.LO_SUPPKEY = SUPPLIER.S_SUPPKEY
> WHERE
> P_LINEORDER.LO_ORDERDATE = '1992-01-01'
> or (
> P_LINEORDER.LO_ORDERDATE in (
> select
> D_DATEKEY
> from
> SSB.DATES
> where
> DATES.D_DATEKEY = '1992-01-01'
> )
> )
> LIMIT
> 500
> {quote}
>
> plan is
>
> {quote}LimitRel(ctx=[], fetch=[500])
> AggregateRel(group-set=[[]], groups=[null], EXPR$0=[COUNT()])
> ProjectRel(ctx=[])
> FilterRel(condition=[OR(=($5, 1992-01-01), CAST(AND(IS NOT NULL($65), <>($62, 0))):BOOLEAN)])
> JoinRel(condition=[=($5, $64)], joinType=[left])
> JoinRel(condition=[true], joinType=[inner])
> JoinRel(condition=[=($4, $55)], joinType=[left])
> JoinRel(condition=[=($2, $47)], joinType=[left])
> JoinRel(condition=[=($5, $30)], joinType=[left])
> JoinRel(condition=[=($3, $21)], joinType=[left])
> TableScan(table=[[SSB, P_LINEORDER]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
> TableScan(table=[[SSB, PART]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]])
> TableScan(table=[[SSB, DATES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]])
> TableScan(table=[[SSB, CUSTOMER]], fields=[[0, 1, 2, 3, 4, 5, 6, 7]])
> TableScan(table=[[SSB, SUPPLIER]], fields=[[0, 1, 2, 3, 4, 5, 6]])
> AggregateRel(group-set=[[]], groups=[null], c=[COUNT()], ck=[COUNT($0)])
> ProjectRel(D_DATEKEY=[$0])
> FilterRel(condition=[=($0, 1992-01-01)])
> TableScan(table=[[SSB, DATES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]])
> AggregateRel(group-set=[[0, 1]], groups=[null])
> ProjectRel(D_DATEKEY=[$0], i=[true])
> FilterRel(condition=[=($0, 1992-01-01)])
> TableScan(table=[[SSB, DATES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]])
> {quote}
>
>
> But the optimal plan should be
> {quote}
> LimitRel(fetch=[500])
> AggregateRel(group-set=[[]], groups=[null], EXPR$0=[COUNT()])
> ProjectRel(ctx=[])
> FilterRel(condition=[=($5, 1992-01-01)])
> JoinRel(condition=[=($5, $62)], joinType=[inner])
> JoinRel(condition=[=($4, $55)], joinType=[left])
> JoinRel(condition=[=($2, $47)], joinType=[left])
> JoinRel(condition=[=($5, $30)], joinType=[left])
> JoinRel(condition=[=($3, $21)], joinType=[left])
> TableScan(table=[[SSB, P_LINEORDER]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
> TableScan(table=[[SSB, PART]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]])
> TableScan(table=[[SSB, DATES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]])
> TableScan(table=[[SSB, CUSTOMER]], fields=[[0, 1, 2, 3, 4, 5, 6, 7]])
> TableScan(table=[[SSB, SUPPLIER]], fields=[[0, 1, 2, 3, 4, 5, 6]])
> AggregateRel(group-set=[[0]], groups=[null])
> ProjectRel(D_DATEKEY=[$0])
> FilterRel(condition=[=($0, 1992-01-01)])
> TableScan(table=[[SSB, DATES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]])
>
> {quote}
>
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)