You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Hongze Zhang (JIRA)" <ji...@apache.org> on 2019/01/31 07:38:00 UTC

[jira] [Comment Edited] (CALCITE-2798) Optimizer should remove ORDER BY in sub-query, provided it has no LIMIT or OFFSET

    [ https://issues.apache.org/jira/browse/CALCITE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756929#comment-16756929 ] 

Hongze Zhang edited comment on CALCITE-2798 at 1/31/19 7:37 AM:
----------------------------------------------------------------

Thanks [~julianhyde], I am with you that the LogicalSort cannot be removed unreasonably.

My imagination was to remove the collation guarantee for logical nodes except for LogicalSort, such as the LogicalProject and LogicalFilter and others (Sorry maybe a little far from this topic), to make Calcite generate the initial logical plan with "smallest collation requirement". For instance, if we use Calcite to build a Sort-Sort-Project-Sort relation, the initial plan registered to volcano planner will be:
{quote}LogicalSort[0] – input subset collation:[0]
 LogicalSort[0] – input subset collation:[0]
 LogicalProject[0] – input subset collation:[0]
 LogicalSort[0] – input subset collation:[]
{quote}
If we focus on a concept like "smallest collation requirement", maybe we can generate and register following rels instead:
{quote}LogicalSort[0] – input subset collation:[]
 LogicalSort[0] – input subset collation:[]
 LogicalProject[] – input subset collation:[]
 LogicalSort[0] – input subset collation:[]
{quote}
That is, there will be no rel node (on LOGICAL level) that really needs sorted input, and only the LogicalSort will guarante sorted output.

Do we lose optimizations if we were to do something like this?


was (Author: zhztheplayer):
Thanks [~julianhyde], I am with you that the LogicalSort cannot be removed unreasonably.

My imagination was to remove the collation guarantee for logical nodes except for LogicalSort, such as the LogicalProject and LogicalFilter and others, (Maybe a little far from this topic), to make Calcite generate the initial logical plan with "smallest collation requirement". For instance, if we use calcite to build a Sort-Sort-Project-Sort relation, the initial plan registered to volcano planner will be:
{quote}LogicalSort[0] – input subset collation:[0]
 LogicalSort[0] – input subset collation:[0]
 LogicalProject[0] – input subset collation:[0]
 LogicalSort[0] – input subset collation:[]
{quote}
If we focus on a concept like "smallest collation requirement", maybe we can generate and register following rels instead:
{quote}LogicalSort[0] – input subset collation:[]
 LogicalSort[0] – input subset collation:[]
 LogicalProject[] – input subset collation:[]
 LogicalSort[0] – input subset collation:[]
{quote}
That is, there will be no rel node (on LOGICAL level) that really needs sorted input, and only the LogicalSort will guarante sorted output.

Do we lose optimizations if we were to do something like this?

> Optimizer should remove ORDER BY in sub-query, provided it has no LIMIT or OFFSET
> ---------------------------------------------------------------------------------
>
>                 Key: CALCITE-2798
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2798
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.18.0
>            Reporter: Vladimir Sitnikov
>            Assignee: Julian Hyde
>            Priority: Major
>
> The following SQL performs sort twice, however inner sort can be eliminated
> {code}select * from (
>   select * from "emps" 
> order by "emps"."deptno"
> ) order by 1 desc{code}
> The same goes for (window calculation will sort on its own)
> {code}select row_number() over (order by "emps"."deptno")  from (
>   select * from "emps" 
> order by "emps"."deptno" desc
> ){code}
> The same goes for SetOp (union, minus):
> {code}select * from (
>   select * from "emps" 
> order by "emps"."deptno"
> ) union select * from (
>   select * from "emps" 
> order by "emps"."deptno" desc
> ){code}
> There might be other cases like that (e.g. Aggregate, Join, Exchange, SortExchange)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)