Posted to issues@calcite.apache.org by "Haisheng Yuan (JIRA)" <ji...@apache.org> on 2019/05/22 19:54:00 UTC
[jira] [Commented] (CALCITE-2648) Output collation of EnumerableWindow is not consistent with its implementation
[ https://issues.apache.org/jira/browse/CALCITE-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16846193#comment-16846193 ]
Haisheng Yuan commented on CALCITE-2648:
----------------------------------------
Whether EnumerableWindow should preserve the order depends heavily on the runtime implementation. In Postgres/GPDB, the window operator is sort-based, so the optimizer assumes that it preserves the order. In Calcite, I suppose it uses a hash map for window partitioning?
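To illustrate the difference between the two strategies, here is a minimal sketch. It is not Calcite's actual code: the class and method names are hypothetical, and a plain java.util.HashMap stands in for whatever structure EnumerableWindow really uses. It partitions the two values from the issue description (20 and 35) once with a hash map and once with a sorted map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Calcite code: shows how the choice of
// partitioning structure decides whether input order survives.
class PartitionOrderSketch {
    /** Groups rows by their value into the supplied map, then emits them partition by partition. */
    static List<Integer> partitionAndEmit(List<Integer> rows,
                                          Map<Integer, List<Integer>> partitions) {
        for (Integer row : rows) {
            partitions.computeIfAbsent(row, k -> new ArrayList<>()).add(row);
        }
        List<Integer> out = new ArrayList<>();
        for (List<Integer> partition : partitions.values()) {
            out.addAll(partition);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(20, 35); // already sorted on x

        // Hash-based: iteration order is unspecified by the HashMap contract,
        // so the input collation is not guaranteed to survive.
        System.out.println(partitionAndEmit(input, new HashMap<>()));

        // Sort-based: partitions come out in key order, so the collation on x is kept.
        System.out.println(partitionAndEmit(input, new TreeMap<>()));
    }
}
```

With OpenJDK's HashMap and its default 16-slot table, the keys 20 and 35 land in buckets 4 and 3, so the hash-based variant happens to emit 35 before 20, which matches the output rows in the issue description. The sorted variant emits 20 and then 35.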
{quote}
We should not necessarily preserve order, if doing so would be expensive (and/or more complicated).
{quote}
I don't agree with this. If we don't preserve order, we lose a lot of optimization opportunities, e.g.:
select row_number() over (partition by a order by b, c), row_number() over (partition by a order by b) from foo;
A single sort on (a, b, c) is enough for both windows, but the Calcite plan evaluates each window separately, which is wasteful.
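The reason one sort suffices is that any row order satisfying (a, b, c) also satisfies (a, b), because the latter is a prefix of the former. The following is a small sketch of that argument only; the class, the int[] row layout {a, b, c}, and the helper names are hypothetical and not part of Calcite.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: rows are int[]{a, b, c}; none of these names exist in Calcite.
class SharedSortSketch {
    // Collation needed by: partition by a order by b
    static final Comparator<int[]> BY_A_B =
        Comparator.<int[]>comparingInt(r -> r[0]).thenComparingInt(r -> r[1]);

    // Collation needed by: partition by a order by b, c
    static final Comparator<int[]> BY_A_B_C = BY_A_B.thenComparingInt(r -> r[2]);

    /** Sorts a copy of the rows once, on the longer collation (a, b, c). */
    static List<int[]> sortOnce(List<int[]> rows) {
        List<int[]> sorted = new ArrayList<>(rows);
        sorted.sort(BY_A_B_C);
        return sorted;
    }

    /** Returns true if the rows are ordered according to cmp (ties allowed). */
    static boolean isSortedBy(List<int[]> rows, Comparator<int[]> cmp) {
        for (int i = 1; i < rows.size(); i++) {
            if (cmp.compare(rows.get(i - 1), rows.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }
}
```

After sortOnce, isSortedBy returns true for both BY_A_B_C and BY_A_B, so both row_number() calls could be computed in one pass over the same sorted input, provided the window operator keeps that order.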
> Output collation of EnumerableWindow is not consistent with its implementation
> ------------------------------------------------------------------------------
>
> Key: CALCITE-2648
> URL: https://issues.apache.org/jira/browse/CALCITE-2648
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.17.0
> Reporter: Hongze Zhang
> Priority: Major
> Labels: pull-request-available
> Attachments: postgresql_96_doesnt_care_to_keep_collation_for_project_over_expression.png
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Here is a case:
> {code:sql}
> select x, COUNT(*) OVER (PARTITION BY x) from (values (20), (35)) as t(x) ORDER BY x
> {code}
> Final plan:
> {code:java}
> EnumerableWindow(window#0=[window(partition {0} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [COUNT()])])
>   EnumerableValues(tuples=[[{ 20 }, { 35 }]])
> {code}
> Output rows:
> {code:java}
> X |EXPR$1 |
> ---|-------|
> 35 |1 |
> 20 |1 |
> {code}
> EnumerableWindow is supposed to preserve input collations, so the EnumerableSort is omitted from the final plan. However, the implementation of EnumerableWindow generates non-ordered output when a PARTITION BY clause is used.