You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2017/06/29 17:22:00 UTC

[jira] [Comment Edited] (CALCITE-873) Prevent sort when ORDER BY not necessary due to equality constraints

    [ https://issues.apache.org/jira/browse/CALCITE-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067028#comment-16067028 ] 

Julian Hyde edited comment on CALCITE-873 at 6/29/17 5:21 PM:
--------------------------------------------------------------

[~julianhyde] Thanks for your comments!

1) For {{testSortConstantRemoval}}, I will change the expected output for the test in {{RelOptRulesTest.xml}} to point to the correct output.

I changed the output and saw that the output before the rule is:

{noformat}
LogicalSort(sort0=[$1], dir0=[ASC])
  LogicalProject(C=[$2], DEPTNO=[$0])
    LogicalProject(DEPTNO=[10], SAL=[$0], C=[$1])
      LogicalAggregate(group=[{1}], C=[COUNT()])
        LogicalProject(DEPTNO=[$7], SAL=[$5])
          LogicalFilter(condition=[=($7, 10)])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{noformat}

After the execution of the rule, the output is:

{noformat}
LogicalProject(C=[$2], DEPTNO=[$0])
  LogicalAggregate(group=[{0, 1}], C=[COUNT()])
    LogicalProject(DEPTNO=[$7], SAL=[$5])
      LogicalFilter(condition=[=($7, 10)])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{noformat}

For {{testSortConstantRemoval2}}, I think we are hitting the HepPlanner bug where the rules loop is not moving the iterator over the plan, hence is stuck in infinite loop. I have removed the test, and shall file it as a separate JIRA as a bug.


was (Author: atris):
[~julianhyde] Thanks for your comments!

1) For testSortConstantRemoval, I will change the expected output for the test in RelOptRulesTest.xml to point to the correct output.

I changed the output and saw that the output before the rule is:

LogicalSort(sort0=[$1], dir0=[ASC])
  LogicalProject(C=[$2], DEPTNO=[$0])
    LogicalProject(DEPTNO=[10], SAL=[$0], C=[$1])
      LogicalAggregate(group=[{1}], C=[COUNT()])
        LogicalProject(DEPTNO=[$7], SAL=[$5])
          LogicalFilter(condition=[=($7, 10)])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])

After the execution of the rule, the output is:

LogicalProject(C=[$2], DEPTNO=[$0])
  LogicalAggregate(group=[{0, 1}], C=[COUNT()])
    LogicalProject(DEPTNO=[$7], SAL=[$5])
      LogicalFilter(condition=[=($7, 10)])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])

For testSortConstantRemoval2, I think we are hitting the HepPlanner bug where the rules loop is not moving the iterator over the plan, hence is stuck in infinite loop. I have removed the test, and shall file it as a separate JIRA as a bug

> Prevent sort when ORDER BY not necessary due to equality constraints
> --------------------------------------------------------------------
>
>                 Key: CALCITE-873
>                 URL: https://issues.apache.org/jira/browse/CALCITE-873
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Atri Sharma
>
> We're working on an optimization in Phoenix to optimize away an ORDER BY when it is known based on equality expressions in the WHERE clause that it is not necessary (PHOENIX-2194). It'd be great if Calcite could do that as well.
> Here's a example, given the following schema:
> {code}
> CREATE TABLE T (
>   K1 VARCHAR,
>   K2 VARCHAR,
>   K3 VARCHAR,
>   CONSTRAINT pk PRIMARY KEY (K1, K2, K3));
> {code}
> In the following queries, no sort is necessary:
> {code}
> SELECT * FROM T WHERE K1='A' ORDER BY K2,K3;
> SELECT * FROM T WHERE K2='B' ORDER BY K1,K3;
> SELECT * FROM T WHERE K1='A' AND K2='B' ORDER BY K3;
> {code}
> There are also some edge cases where a function may be known to select a *prefix* of the column value where it's still ok to not sort:
> {code}
> SELECT * FROM T WHERE K1='A' AND SUBSTR(K2,1,3)='ABC' ORDER BY K2;
> {code}
> But if another column is included in the ORDER BY after the prefixing, a sort would still be necessary:
> {code}
> SELECT * FROM T WHERE K1='A' AND SUBSTR(K2,1,3)='ABC' ORDER BY K2,K3;
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)