You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Dmitry Sysolyatin (Jira)" <ji...@apache.org> on 2022/12/11 19:37:00 UTC

[jira] [Updated] (CALCITE-5388) tempList expression inside EnumerableWindow.getPartitionIterator should be unoptimized

     [ https://issues.apache.org/jira/browse/CALCITE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitry Sysolyatin updated CALCITE-5388:
---------------------------------------
    Summary: tempList expression inside EnumerableWindow.getPartitionIterator should be unoptimized  (was: Left value of EnumerableWindow.getPartitionIterator should be unoptimized expression)

> tempList expression inside EnumerableWindow.getPartitionIterator should be unoptimized
> --------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5388
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5388
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.32.0
>            Reporter: Magnus Mogren
>            Assignee: Dmitry Sysolyatin
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.33.0
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> EnumerableWindow.getPartitionIterator method creates expression for creating  'tempList' array [1] or 'multiMap' map [2] depends on use case. And EnumerableWindow.implement method creates expression for clearing this collection. Because state of created collection is mutable, the collection can not be reused in other instance of EnumerableWindow but it happens in the following use case:
> {code:java}
> with
>     CTE1(rownr1, val1) as ( select ROW_NUMBER() OVER(ORDER BY id ASC), id from (values (1), (2)) as Vals1(id) ),
>     CTE2(rownr2, val2) as ( select ROW_NUMBER() OVER(ORDER BY id ASC), id from (values (1), (2)) as Vals2(id) )
> select
>     CTE1.rownr1,
>     CTE1.val1,
>     CTE2.rownr2,
>     CTE2.val2 
> from
>     CTE1,
>     CTE2 
> where
>     CTE1.val1 = CTE2.val2{code}
> Generated plan:
> {code}
> EnumerableHashJoin(condition=[=($1, $3)], joinType=[inner])
>   EnumerableSort(sort0=[$1], dir0=[ASC])
>     EnumerableCalc(expr#0..1=[{inputs}], EXPR$0=[$t1], ID=[$t0])
>       EnumerableWindow(window#0=[window(order by [0] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
>         EnumerableValues(tuples=[[{ 1 }, { 2 }]])
>   EnumerableSort(sort0=[$1], dir0=[ASC])
>     EnumerableCalc(expr#0..1=[{inputs}], EXPR$0=[$t1], ID=[$t0])
>       EnumerableWindow(window#0=[window(order by [0] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
>         EnumerableValues(tuples=[[{ 1 }, { 2 }]])
> {code}
> Calcite expression optimizer tries to remove duplicate expressions and as a result the same 'tempList' instance is used for both EnumerableWindow and the query returns empty result instead of:
> |ROWNR1|VAL1|VAL2|
> |1|1|1|
> |2|2|2|
> [1] https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableWindow.java#L677
> [2] https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableWindow.java#L696
> [3] https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableWindow.java#L518
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)