Posted to issues@calcite.apache.org by "Julian Hyde (Jira)" <ji...@apache.org> on 2019/11/26 01:46:00 UTC

[jira] [Comment Edited] (CALCITE-3531) AggregateProjectPullUpConstantsRule should not remove deterministic function group key if the function is dynamic

    [ https://issues.apache.org/jira/browse/CALCITE-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982045#comment-16982045 ] 

Julian Hyde edited comment on CALCITE-3531 at 11/26/19 1:45 AM:
----------------------------------------------------------------

I think we got this wrong.

First of all, CURRENT_TIMESTAMP should return the same result for the duration of the query. That is what the SQL standard says. It should do this even for streaming queries.

Second, if the expression evaluates to the same thing for the duration of the query, then it is safe to remove it from the GROUP BY clause.

If you need a "wallclock time" for streaming/continuous queries, invent a new function, but don't redefine CURRENT_TIMESTAMP.
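
To make the distinction concrete, here is a minimal sketch, not Calcite code. The names QueryContext, current_timestamp, wallclock_timestamp and utc_now are hypothetical and only illustrate the idea: the standard function is frozen once at query start, while a separate function reads the clock on every call.

```python
from datetime import datetime, timezone


def utc_now():
    """Default clock: the real wall-clock time in UTC."""
    return datetime.now(timezone.utc)


class QueryContext:
    """Hypothetical per-query context (an assumption, not a Calcite API)."""

    def __init__(self, clock=utc_now):
        self._clock = clock
        # Captured once, when the query starts.
        self._query_start = clock()

    def current_timestamp(self):
        # Same value for the whole query, so it behaves like a constant.
        return self._query_start

    def wallclock_timestamp(self):
        # Separate, explicitly non-constant function for streaming use.
        return self._clock()
```

Under this model only wallclock_timestamp would have to stay in a GROUP BY key; current_timestamp can be treated like any other constant.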


> AggregateProjectPullUpConstantsRule should not remove deterministic function group key if the function is dynamic
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-3531
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3531
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Danny Chen
>            Assignee: Danny Chen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.22.0
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently, AggregateProjectPullUpConstantsRule simplifies the query:
> {code:sql}
> select hiredate
> from sales.emp
> where sal is null and hiredate = current_timestamp
> group by sal, hiredate
> having count(*) > 3
> {code}
> from the plan:
> {code:xml}
> LogicalProject(HIREDATE=[$1])
>   LogicalFilter(condition=[>($2, 3)])
>     LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])
>       LogicalProject(SAL=[$5], HIREDATE=[$4])
>         LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> to the plan:
> {code:xml}
> LogicalProject(HIREDATE=[$1])
>   LogicalFilter(condition=[>($2, 3)])
>     LogicalProject(SAL=[$0], HIREDATE=[CURRENT_TIMESTAMP], $f2=[$1])
>       LogicalAggregate(group=[{0}], agg#0=[COUNT()])
>         LogicalProject(SAL=[$5], HIREDATE=[$4])
>           LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
>             LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> This rewrite is unsafe: in streaming SQL we need to group data by the date-time value, and the result is also wrong if a batch job spans more than one day.
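
For comparison, the following sketch runs both plan shapes on SQLite through Python's sqlite3 module. It assumes the timestamp is fixed for the whole query and passes it as a bound parameter (query_ts) instead of calling CURRENT_TIMESTAMP. The helper name run_both_plans and the two-column emp table are made up for illustration; this is not how Calcite evaluates plans.

```python
import sqlite3


def run_both_plans(rows, query_ts, min_count):
    """Run the original and the rewritten query over (sal, hiredate) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (sal INTEGER, hiredate TEXT)")
    conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)

    # Original shape: hiredate is one of the group keys.
    original = conn.execute(
        "SELECT hiredate FROM emp "
        "WHERE sal IS NULL AND hiredate = ? "
        "GROUP BY sal, hiredate HAVING count(*) > ?",
        (query_ts, min_count),
    ).fetchall()

    # Rewritten shape: group by sal only and project the constant.
    rewritten = conn.execute(
        "SELECT ? FROM emp "
        "WHERE sal IS NULL AND hiredate = ? "
        "GROUP BY sal HAVING count(*) > ?",
        (query_ts, query_ts, min_count),
    ).fetchall()

    conn.close()
    return original, rewritten
```

When query_ts is a single value for the whole query, the two result lists should match. The concern in the description only arises if the value can change while the query is running.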


