You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ignite.apache.org by "Ignite TC Bot (Jira)" <ji...@apache.org> on 2022/12/12 14:59:00 UTC

[jira] [Commented] (IGNITE-18341) Calcite engine. Introduce correlate based distribution

    [ https://issues.apache.org/jira/browse/IGNITE-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646138#comment-17646138 ] 

Ignite TC Bot commented on IGNITE-18341:
----------------------------------------

{panel:title=Branch: [pull/10424/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/10424/head] Base: [master] : New Tests (2)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#00008b}Calcite SQL{color} [[tests 2|https://ci2.ignite.apache.org/viewLog.html?buildId=6950074]]
* {color:#013220}IgniteCalciteTestSuite: CorrelatedSubqueryPlannerTest.testCorrelatedDistribution - PASSED{color}
* {color:#013220}IgniteCalciteTestSuite: CorrelatesIntegrationTest.testCorrelatedDistribution - PASSED{color}

{panel}
[TeamCity *--&gt; Run :: All* Results|https://ci2.ignite.apache.org/viewLog.html?buildId=6950153&amp;buildTypeId=IgniteTests24Java8_RunAll]

> Calcite engine. Introduce correlate based distribution
> ------------------------------------------------------
>
>                 Key: IGNITE-18341
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18341
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Aleksey Plekhanov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: calcite, calcite2-required, calcite3-required
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> To propagate hash/affinity distribution in relation node all distribution keys should be contained in the node. It's impossible to pass hash distribution through the node if node knows nothing about hash distribution keys. For example, hash distribution can't bypass aggregate if one or more of distribution keys is not contained in grouped columns.
> Suppose, for example, we have two tables T1 and T2 colocated on fields T1.A and T2.A. The following query:
> {code:java}
> SELECT (SELECT sum(b) FROM t2 WHERE t2.a = t1.a) FROM t1 {code}
> Hash distribution can't be used on the right side of the correlated nested loop join, since aggregate doesn't have required columns, and plan for such a query looks very ineffective:
> {noformat}
> IgniteProject(EXPR$0=[$3]), id = 219
>   IgniteCorrelatedNestedLoopJoin(condition=[true], joinType=[left], variablesSet=[[$cor0]], correlationVariables=[[$cor0]]), id = 218
>     IgniteExchange(distribution=[single]), id = 213
>       IgniteTableScan(table=[[PUBLIC, T1]]), id = 84
>     IgniteColocatedHashAggregate(group=[{}], SUM(B)=[SUM($0)]), id = 217
>       IgniteProject(B=[$1]), id = 216
>         IgniteHashIndexSpool(readType=[LAZY], writeType=[EAGER], searchRow=[[$cor0.A, null]], condition=[=($0, $cor0.A)], allowNulls=[false]), id = 215
>           IgniteExchange(distribution=[single]), id = 214
>             IgniteTableScan(table=[[PUBLIC, T2]], requiredColumns=[{0, 1}]), id = 112{noformat}
> If we look closer to the query we can find that filter {{t2.a = t1.a}} makes this query colocated. If we run such a query on H2 engine with {{colocatedJoin=false}} flag it will return the correct result. 
> To workaround such a problem I propose to introduce some kind of artificial "correlated distribution". This distribution will be produced on the right side of correlated nested loop join, if left side of the join has hash distribution (will contain reference to correlate and distribution of this correlate), than passed through set of nodes without modification and finally on filter node will be restored as hash distribution remaped to input operator fields (if filter contains equality conditions input operator fields and correlated variable fields).
> After such a change plan should be looks like:
> {noformat}
> IgniteExchange(distribution=[single]), id = 283
>   IgniteProject(EXPR$0=[$3]), id = 282
>     IgniteCorrelatedNestedLoopJoin(condition=[true], joinType=[left], variablesSet=[[$cor0]], correlationVariables=[[$cor0]]), id = 281
>       IgniteTableScan(table=[[PUBLIC, T1]]), id = 84
>       IgniteColocatedHashAggregate(group=[{}], SUM(B)=[SUM($0)]), id = 280
>         IgniteProject(B=[$1]), id = 279
>           IgniteFilter(condition=[=($0, $cor0.A)]), id = 278
>             IgniteTableScan(table=[[PUBLIC, T2]], requiredColumns=[{0, 1}]), id = 112
> {noformat}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)