Posted to issues@calcite.apache.org by "Zhong Yu (JIRA)" <ji...@apache.org> on 2018/03/09 02:28:00 UTC

[jira] [Comment Edited] (CALCITE-2202) Aggregate Join Push-down on a Single Side

    [ https://issues.apache.org/jira/browse/CALCITE-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392289#comment-16392289 ] 

Zhong Yu edited comment on CALCITE-2202 at 3/9/18 2:27 AM:
-----------------------------------------------------------

Everything is moot if I cannot prove my formula. But suppose it is correct:

I do think that COVAR_POP can be pushed down; it can be calculated from SUM(x*y), SUM(x), SUM(y), and COUNT(x,y), all of which can be split through table union and cross product, and can therefore be pushed down over a join.
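
To make the decomposition concrete, here is a minimal sketch in Python (not Calcite code). It assumes x comes from the left input and y from the right input of an equi-join on a single key, and that the join result is non-empty. All function names are hypothetical, chosen only for this illustration. Each side is reduced to a per-key count and sum; a cross product combines the two sides by multiplication, and a union over keys combines them by addition.

```python
from collections import defaultdict
from fractions import Fraction


def join_pairs(left, right):
    """Plain equi-join: every (x, y) whose join keys match."""
    return [(x, y) for kl, x in left for kr, y in right if kl == kr]


def covar_pop_direct(pairs):
    """COVAR_POP over (x, y) pairs: mean(x*y) - mean(x) * mean(y)."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxy = sum(x * y for x, y in pairs)
    return Fraction(sxy, n) - Fraction(sx, n) * Fraction(sy, n)


def side_partials(rows):
    """Aggregate one join input per key: [row count, sum of value]."""
    acc = defaultdict(lambda: [0, 0])
    for key, value in rows:
        acc[key][0] += 1
        acc[key][1] += value
    return acc


def covar_pop_pushed(left, right):
    """COVAR_POP computed from per-side partial aggregates only."""
    lp, rp = side_partials(left), side_partials(right)
    n = sx = sy = sxy = 0
    for key in lp.keys() & rp.keys():
        (nl, sxl), (nr, syr) = lp[key], rp[key]
        # Cross product of the two groups for this key.
        n += nl * nr        # COUNT
        sx += sxl * nr      # SUM(x): each left row repeats nr times
        sy += syr * nl      # SUM(y): each right row repeats nl times
        sxy += sxl * syr    # SUM(x*y) factors into SUM(x) * SUM(y)
    # The additions above are the union over join keys.
    # Assumes n > 0; SQL would return NULL for an empty input.
    return Fraction(sxy, n) - Fraction(sx, n) * Fraction(sy, n)


left = [(1, 2), (1, 4), (2, 9)]              # (key, x)
right = [(1, 10), (1, 30), (2, 5), (3, 7)]   # (key, y)
assert covar_pop_pushed(left, right) == covar_pop_direct(join_pairs(left, right))
```

Fraction is used only so the two results can be compared exactly; it is not part of the technique.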

Producing more candidate plans may be bad for the CBO, but the extra rule (i.e. single-sided) could be opted into in cases where metadata is missing or the group columns are nearly unique.


> Aggregate Join Push-down on a Single Side
> -----------------------------------------
>
>                 Key: CALCITE-2202
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2202
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: next
>            Reporter: Zhong Yu
>            Assignee: Julian Hyde
>            Priority: Major
>             Fix For: next
>
>
> While investigating https://issues.apache.org/jira/browse/CALCITE-2195, it became apparent that aggregation can be pushed down on a single side (either side), leaving the other side non-aggregated, regardless of whether the grouping columns are unique on the other side. My analysis: [http://zhong-j-yu.github.io/aggregate-join-push-down.pdf].
> This may be useful when the metadata is insufficient; in any case, we could provide all 3 possible transformations (aggregate on left only, right only, or both sides) to the cost-based optimizer, so that the cheapest one can be chosen based on stats.
> Does this make any sense, anybody? If it sounds good, I'll implement it and offer a PR. 
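
The single-side idea in the description above can be illustrated with a small Python sketch for the SUM case only. This is an illustration under stated assumptions, not the analysis from the linked PDF and not Calcite's implementation: the grouping column g and the summed column v both live on the left input, the right input contributes only the join key k, and all names are hypothetical. The left side is pre-aggregated by (g, k), while the right side is joined as-is, duplicates included.

```python
from collections import defaultdict


def sum_direct(left, right):
    """SELECT g, SUM(v) FROM left JOIN right USING (k) GROUP BY g."""
    totals = defaultdict(int)
    for g, k, v in left:
        for rk in right:
            if k == rk:
                totals[g] += v
    return dict(totals)


def sum_left_pushed(left, right):
    """Same query, with the aggregate pushed to the left side only."""
    # Step 1: pre-aggregate the left input by (g, k).
    pre = defaultdict(int)
    for g, k, v in left:
        pre[(g, k)] += v
    # Step 2: join the partial sums with the raw right input.
    # Right-side duplicates are kept, so no COUNT is needed there.
    totals = defaultdict(int)
    for (g, k), s in pre.items():
        for rk in right:
            if k == rk:
                totals[g] += s
    return dict(totals)


left = [("a", 1, 10), ("a", 1, 5), ("a", 2, 7), ("b", 2, 1)]  # (g, k, v)
right = [1, 1, 2, 3]  # join keys only; key 1 is not unique
assert sum_direct(left, right) == sum_left_pushed(left, right)
```

Here key 1 appears twice on the right, so the pre-aggregated left row for ("a", 1) is simply joined twice; the top-level SUM still matches the direct plan without requiring uniqueness on the non-aggregated side.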


