Posted to issues@calcite.apache.org by "godfrey he (JIRA)" <ji...@apache.org> on 2018/01/07 06:16:02 UTC

[jira] [Updated] (CALCITE-2125) error plan result that using AggregateExpandDistinctAggregatesRule.INSTANCE and AggregateProjectMergeRule.INSTANCE

     [ https://issues.apache.org/jira/browse/CALCITE-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

godfrey he updated CALCITE-2125:
--------------------------------
    Description: 
Test case:
{code}
  @Test public void testDistinctCount0() {
    final HepProgram program = HepProgram.builder()
        .addRuleInstance(AggregateExpandDistinctAggregatesRule.INSTANCE)
        .addRuleInstance(AggregateProjectMergeRule.INSTANCE)
        .build();
    checkPlanning(program,
        "select type, count(distinct acctno), sum(distinct balance)"
            + " from customer.account group by type");
  }
{code}

Current result:
{code}
    <TestCase name="testDistinctCount0">
        <Resource name="sql">
            <![CDATA[select type, count(distinct acctno), sum(distinct balance) from customer.account group by type]]>
        </Resource>
        <Resource name="planBefore">
            <![CDATA[
LogicalAggregate(group=[{0}], EXPR$1=[COUNT(DISTINCT $1)], EXPR$2=[SUM(DISTINCT $2)])
  LogicalProject(TYPE=[$1], ACCTNO=[$0], BALANCE=[$2])
    LogicalTableScan(table=[[CATALOG, CUSTOMER, ACCOUNT]])
]]>
        </Resource>
        <Resource name="planAfter">
            <![CDATA[
LogicalProject(TYPE=[$0], EXPR$1=[$1], EXPR$2=[CAST($2):INTEGER NOT NULL])
  LogicalAggregate(group=[{0}], EXPR$1=[COUNT($1) FILTER $3], EXPR$2=[SUM($2) FILTER $4])
    LogicalProject(TYPE=[$0], ACCTNO=[$1], BALANCE=[$2], $g_1=[=($3, 1)], $g_2=[=($3, 2)])
      LogicalProject(TYPE=[$1], ACCTNO=[$0], BALANCE=[$2], $g=[$3])
        LogicalAggregate(group=[{0, 1, 2}], groups=[[{0, 1}, {1, 2}]], $g=[GROUPING($1, $0, $2)])
          LogicalTableScan(table=[[CATALOG, CUSTOMER, ACCOUNT]])
]]>
        </Resource>
    </TestCase>
{code}

However, the resulting plan is wrong.

First, if we use only *AggregateExpandDistinctAggregatesRule.INSTANCE* to optimize the query, the resulting plan is correct:
{code}
  @Test public void testDistinctCount0() {
    final HepProgram program = HepProgram.builder()
        .addRuleInstance(AggregateExpandDistinctAggregatesRule.INSTANCE)
        //.addRuleInstance(AggregateProjectMergeRule.INSTANCE)
        .build();
    checkPlanning(program,
        "select type, count(distinct acctno), sum(distinct balance)"
            + " from customer.account group by type");
  }

LogicalProject(TYPE=[$0], EXPR$1=[$1], EXPR$2=[CAST($2):INTEGER NOT NULL])
  LogicalAggregate(group=[{0}], EXPR$1=[COUNT($1) FILTER $3], EXPR$2=[SUM($2) FILTER $4])
    LogicalProject(TYPE=[$0], ACCTNO=[$1], BALANCE=[$2], $g_1=[=($3, 1)], $g_2=[=($3, 2)])
      LogicalAggregate(group=[{0, 1, 2}], groups=[[{0, 1}, {0, 2}]], $g=[GROUPING($0, $1, $2)])
        LogicalProject(TYPE=[$1], ACCTNO=[$0], BALANCE=[$2])
          LogicalTableScan(table=[[CATALOG, CUSTOMER, ACCOUNT]])
{code}

Then, when *AggregateProjectMergeRule.INSTANCE* is added, it changes the sub-plan from
{code}
LogicalAggregate(group=[{0, 1, 2}], groups=[[{0, 1}, {0, 2}]], $g=[GROUPING($0, $1, $2)])
  LogicalProject(TYPE=[$1], ACCTNO=[$0], BALANCE=[$2])
{code}
to 
{code}
LogicalAggregate(group=[{0, 1, 2}], groups=[[{0, 1}, {1, 2}]], $g=[GROUPING($1, $0, $2)])
{code}
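
The renumbering of the grouping sets can be reproduced by hand. The sketch below is a standalone illustration, not Calcite code: the class name GroupSetRemapSketch and the helper remap are hypothetical, and it assumes that merging the project simply replaces each aggregate column index with the input column that the project reads at that position (TYPE=[$1], ACCTNO=[$0], BALANCE=[$2]).

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class GroupSetRemapSketch {
  /**
   * Maps every column index of a grouping set through the project's
   * input references; projectInputs[i] is the input column read by
   * output column i of the project.
   */
  public static TreeSet<Integer> remap(List<Integer> groupSet, int[] projectInputs) {
    TreeSet<Integer> result = new TreeSet<>();
    for (int column : groupSet) {
      result.add(projectInputs[column]);
    }
    return result;
  }

  public static void main(String[] args) {
    // LogicalProject(TYPE=[$1], ACCTNO=[$0], BALANCE=[$2])
    int[] projectInputs = {1, 0, 2};
    System.out.println(remap(Arrays.asList(0, 1), projectInputs)); // prints [0, 1]
    System.out.println(remap(Arrays.asList(0, 2), projectInputs)); // prints [1, 2]
  }
}
```

Under that assumption, the first grouping set keeps the same members while the second one moves from {0, 2} to {1, 2}, which matches the sub-plan quoted above.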

Note that the groups were changed from {code} groups=[[{0, 1}, {0, 2}]] {code} to 
{code} groups=[[{0, 1}, {1, 2}]] {code}, but the filter values generated by *groupValue* in *AggregateExpandDistinctAggregatesRule* were not updated in this node:
{code}
LogicalProject(TYPE=[$0], ACCTNO=[$1], BALANCE=[$2], $g_1=[=($3, 1)], $g_2=[=($3, 2)])
{code}

{code}
// filter values before AggregateProjectMergeRule is added
AggregateExpandDistinctAggregatesRule.groupValue(ImmutableBitSet.of(0,1,2), ImmutableBitSet.of(0,1)) // returns 1
AggregateExpandDistinctAggregatesRule.groupValue(ImmutableBitSet.of(0,1,2), ImmutableBitSet.of(0,2)) // returns 2

// filter values after AggregateProjectMergeRule is added
AggregateExpandDistinctAggregatesRule.groupValue(ImmutableBitSet.of(1,0,2), ImmutableBitSet.of(0,1)) // returns 1
AggregateExpandDistinctAggregatesRule.groupValue(ImmutableBitSet.of(1,0,2), ImmutableBitSet.of(1,2)) // returns 4
{code}
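
To make the four values above easier to check, here is a minimal standalone sketch of the computation. It is an assumption about how *groupValue* behaves, not a copy of the Calcite source: it walks the full group set in ascending column order, most significant bit first, and sets a bit for every column that is absent from the grouping set. The class name GroupValueSketch is hypothetical, and plain lists stand in for ImmutableBitSet (a bit set has no ordering of its own, so of(1,0,2) is treated as {0, 1, 2}).

```java
import java.util.Arrays;
import java.util.List;

public class GroupValueSketch {
  /**
   * Hypothetical stand-in for groupValue: one bit per column of the
   * full group set, highest bit for the lowest column index; a bit is
   * set when that column is missing from the grouping set.
   */
  public static long groupValue(List<Integer> fullGroupSet, List<Integer> groupSet) {
    long value = 0;
    long bit = 1L << (fullGroupSet.size() - 1);
    for (int column : fullGroupSet) {
      if (!groupSet.contains(column)) {
        value |= bit;
      }
      bit >>= 1;
    }
    return value;
  }

  public static void main(String[] args) {
    List<Integer> full = Arrays.asList(0, 1, 2);
    System.out.println(groupValue(full, Arrays.asList(0, 1))); // prints 1
    System.out.println(groupValue(full, Arrays.asList(0, 2))); // prints 2
    System.out.println(groupValue(full, Arrays.asList(1, 2))); // prints 4
  }
}
```

Under this assumption the remapped grouping set {1, 2} yields 4, while the project node quoted above still filters on the values 1 and 2.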


> error plan result that using AggregateExpandDistinctAggregatesRule.INSTANCE and AggregateProjectMergeRule.INSTANCE 
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2125
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2125
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.15.0
>            Reporter: godfrey he
>            Assignee: Julian Hyde
>


