Posted to issues@calcite.apache.org by "jin xing (Jira)" <ji...@apache.org> on 2019/09/03 02:35:00 UTC
[jira] [Commented] (CALCITE-2592) EnumerableMergeJoin is never taken

    [ https://issues.apache.org/jira/browse/CALCITE-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921118#comment-16921118 ] 

jin xing commented on CALCITE-2592:
-----------------------------------

[~vladimirsitnikov] 
I made a fix in [https://github.com/apache/calcite/pull/1434]

This PR proposes constructing the {{EnumerableSort}} operator directly when creating {{EnumerableMergeJoin}}, so that no {{AbstractConverter}} is created, which saves extra rule-matching effort during optimization.
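
As a rough illustration of the idea (a toy model only, not the code in the PR): the join itself wraps any input that is not already sorted on its join key in an explicit sort, instead of leaving that to a later conversion step. The names {{Node}}, {{values}}, {{ensureSorted}} and {{mergeJoin}} are hypothetical and are not Calcite classes or methods. The collations mirror the failing plan quoted below, where the left input is sorted on column 1 and the right input on columns 0, 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: every type and method here is hypothetical,
// not part of Calcite's API.
public class MergeJoinSketch {

  /** A plan node: a name, the columns it is sorted on, and its inputs. */
  static final class Node {
    final String name;
    final List<Integer> sortedOn; // column indexes, leading key first
    final List<Node> inputs;

    Node(String name, List<Integer> sortedOn, List<Node> inputs) {
      this.name = name;
      this.sortedOn = sortedOn;
      this.inputs = inputs;
    }

    /** Renders the tree as Name(child, child). */
    String explain() {
      if (inputs.isEmpty()) {
        return name;
      }
      List<String> parts = new ArrayList<>();
      for (Node in : inputs) {
        parts.add(in.explain());
      }
      return name + "(" + String.join(", ", parts) + ")";
    }
  }

  /** A leaf node that claims to be sorted on the given columns. */
  static Node values(String name, Integer... sortedOn) {
    return new Node(name, Arrays.asList(sortedOn), Collections.<Node>emptyList());
  }

  /** Returns the input unchanged if its leading sort column is the key; otherwise adds a sort. */
  static Node ensureSorted(Node input, int key) {
    boolean alreadySorted = !input.sortedOn.isEmpty() && input.sortedOn.get(0) == key;
    if (alreadySorted) {
      return input;
    }
    return new Node("Sort[" + key + "]",
        Collections.singletonList(key), Collections.singletonList(input));
  }

  /** Builds the join and its sorted inputs in one step. */
  static Node mergeJoin(Node left, int leftKey, Node right, int rightKey) {
    return new Node("MergeJoin", Collections.singletonList(leftKey),
        Arrays.asList(ensureSorted(left, leftKey), ensureSorted(right, rightKey)));
  }

  public static void main(String[] args) {
    Node left = values("ValuesL", 1);     // sorted on column 1, join key is column 0
    Node right = values("ValuesR", 0, 1); // already sorted on the join key
    System.out.println(mergeJoin(left, 0, right, 0).explain());
    // prints: MergeJoin(Sort[0](ValuesL), ValuesR)
  }
}
```

Only the left input gets a sort here, because the right input's leading sort column already matches the join key.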

I enabled {{VolcanoPlannerTest#testMergeJoin}}, and the test now passes.

It would be great if you could give some comments when you have time.

Thanks!

> EnumerableMergeJoin is never taken
> ----------------------------------
>
>                 Key: CALCITE-2592
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2592
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.17.0
>            Reporter: Vladimir Sitnikov
>            Priority: Major
>         Attachments: calcite2592.png, calcite2592.svg
>
>
> I have added a test case {{org.apache.calcite.plan.volcano.VolcanoPlannerTest#testMergeJoin}}
> The test fails as follows:
> {noformat}org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not enough rules to produce a node with desired properties: convention=ENUMERABLE, sort=[].
> Missing conversion is LogicalValues[convention: NONE -> ENUMERABLE, sort: [1] -> [0]]
> There is 1 empty subset: rel#9:Subset#0.ENUMERABLE.[0], the relevant part of the original plan is as follows
> 0:LogicalValues(tuples=[[{ '2', 'a' }, { '1', 'b' }]])
> Root: rel#7:Subset#2.ENUMERABLE.[]
> Original rel:
> LogicalJoin(subset=[rel#7:Subset#2.ENUMERABLE.[]], condition=[=($0, $2)], joinType=[inner]): rowcount = 1.0, cumulative cost = {1.0 rows, 0.0 cpu, 0.0 io}, id = 5
>   LogicalValues(subset=[rel#3:Subset#0.NONE.[1]], tuples=[[{ '2', 'a' }, { '1', 'b' }]]): rowcount = 2.0, cumulative cost = {2.0 rows, 1.0 cpu, 0.0 io}, id = 0
>   LogicalValues(subset=[rel#4:Subset#1.NONE.[]], tuples=[[{ '1', 'x' }, { '2', 'y' }]]): rowcount = 2.0, cumulative cost = {2.0 rows, 1.0 cpu, 0.0 io}, id = 1
> Sets:
> Set#0, type: RecordType(CHAR(1) id, CHAR(1) name)
> 	rel#3:Subset#0.NONE.[1], best=null, importance=0.81
> 		rel#0:LogicalValues.NONE.[1](type=RecordType(CHAR(1) id, CHAR(1) name),tuples=[{ '2', 'a' }, { '1', 'b' }]), rowcount=2.0, cumulative cost={inf}
> 	rel#9:Subset#0.ENUMERABLE.[0], best=null, importance=0.9
> 	rel#24:Subset#0.ENUMERABLE.[1], best=rel#23, importance=0.45
> 		rel#23:EnumerableValues.ENUMERABLE.[1](type=RecordType(CHAR(1) id, CHAR(1) name),tuples=[{ '2', 'a' }, { '1', 'b' }]), rowcount=2.0, cumulative cost={2.0 rows, 1.0 cpu, 0.0 io}
> Set#1, type: RecordType(CHAR(1) id, CHAR(1) name)
> 	rel#4:Subset#1.NONE.[], best=null, importance=0.81
> 		rel#1:LogicalValues.NONE.[[0, 1], [1]](type=RecordType(CHAR(1) id, CHAR(1) name),tuples=[{ '1', 'x' }, { '2', 'y' }]), rowcount=2.0, cumulative cost={inf}
> 	rel#10:Subset#1.ENUMERABLE.[0], best=rel#21, importance=0.9
> 		rel#21:EnumerableValues.ENUMERABLE.[[0, 1], [1]](type=RecordType(CHAR(1) id, CHAR(1) name),tuples=[{ '1', 'x' }, { '2', 'y' }]), rowcount=2.0, cumulative cost={2.0 rows, 1.0 cpu, 0.0 io}
> 	rel#22:Subset#1.ENUMERABLE.[], best=rel#21, importance=0.45
> 		rel#21:EnumerableValues.ENUMERABLE.[[0, 1], [1]](type=RecordType(CHAR(1) id, CHAR(1) name),tuples=[{ '1', 'x' }, { '2', 'y' }]), rowcount=2.0, cumulative cost={2.0 rows, 1.0 cpu, 0.0 io}
> Set#2, type: RecordType(CHAR(1) id, CHAR(1) name, CHAR(1) id0, CHAR(1) name0)
> 	rel#6:Subset#2.NONE.[], best=null, importance=0.9
> 		rel#5:LogicalJoin.NONE.[](left=RelSubset#3,right=RelSubset#4,condition==($0, $2),joinType=inner), rowcount=1.0, cumulative cost={inf}
> 		rel#15:LogicalProject.NONE.[](input=RelSubset#14,id=$2,name=$3,id0=$0,name0=$1), rowcount=1.0, cumulative cost={inf}
> 	rel#7:Subset#2.ENUMERABLE.[], best=null, importance=1.0
> 		rel#8:AbstractConverter.ENUMERABLE.[](input=RelSubset#6,convention=ENUMERABLE,sort=[]), rowcount=1.0, cumulative cost={inf}
> 		rel#11:EnumerableMergeJoin.ENUMERABLE.[[0], [2]](left=RelSubset#9,right=RelSubset#10,condition==($0, $2),joinType=inner), rowcount=1.0, cumulative cost={inf}
> Set#3, type: RecordType(CHAR(1) id, CHAR(1) name, CHAR(1) id0, CHAR(1) name0)
> 	rel#14:Subset#3.NONE.[], best=null, importance=0.81
> 		rel#12:LogicalJoin.NONE.[](left=RelSubset#4,right=RelSubset#3,condition==($2, $0),joinType=inner), rowcount=1.0, cumulative cost={inf}
> 		rel#18:LogicalProject.NONE.[](input=RelSubset#6,id=$2,name=$3,id0=$0,name0=$1), rowcount=1.0, cumulative cost={inf}
> 	rel#20:Subset#3.ENUMERABLE.[], best=null, importance=0.405
> 		rel#19:EnumerableMergeJoin.ENUMERABLE.[[0], [2]](left=RelSubset#10,right=RelSubset#9,condition==($0, $2),joinType=inner), rowcount=1.0, cumulative cost={inf}
> Graphviz:
> digraph G {
> 	root [style=filled,label="Root"];
> 	subgraph cluster0{
> 		label="Set 0 RecordType(CHAR(1) id, CHAR(1) name)";
> 		rel0 [label="rel#0:LogicalValues(type=RecordType(CHAR(1) id, CHAR(1) name),tuples=[{ '2', 'a' }, { '1', 'b' }])\nrows=2.0, cost={inf}",shape=box]
> 		rel23 [label="rel#23:EnumerableValues(type=RecordType(CHAR(1) id, CHAR(1) name),tuples=[{ '2', 'a' }, { '1', 'b' }])\nrows=2.0, cost={2.0 rows, 1.0 cpu, 0.0 io}",color=blue,shape=box]
> 		subset3 [label="rel#3:Subset#0.NONE.[1]"]
> 		subset9 [label="rel#9:Subset#0.ENUMERABLE.[0]",color=red]
> 		subset24 [label="rel#24:Subset#0.ENUMERABLE.[1]"]
> 	}
> 	subgraph cluster1{
> 		label="Set 1 RecordType(CHAR(1) id, CHAR(1) name)";
> 		rel1 [label="rel#1:LogicalValues(type=RecordType(CHAR(1) id, CHAR(1) name),tuples=[{ '1', 'x' }, { '2', 'y' }])\nrows=2.0, cost={inf}",shape=box]
> 		rel21 [label="rel#21:EnumerableValues(type=RecordType(CHAR(1) id, CHAR(1) name),tuples=[{ '1', 'x' }, { '2', 'y' }])\nrows=2.0, cost={2.0 rows, 1.0 cpu, 0.0 io}",color=blue,shape=box]
> 		subset4 [label="rel#4:Subset#1.NONE.[]"]
> 		subset10 [label="rel#10:Subset#1.ENUMERABLE.[0]"]
> 		subset22 [label="rel#22:Subset#1.ENUMERABLE.[]"]
> 		subset22 -> subset10;	}
> 	subgraph cluster2{
> 		label="Set 2 RecordType(CHAR(1) id, CHAR(1) name, CHAR(1) id0, CHAR(1) name0)";
> 		rel5 [label="rel#5:LogicalJoin(left=RelSubset#3,right=RelSubset#4,condition==($0, $2),joinType=inner)\nrows=1.0, cost={inf}",shape=box]
> 		rel8 [label="rel#8:AbstractConverter(input=RelSubset#6,convention=ENUMERABLE,sort=[])\nrows=1.0, cost={inf}",shape=box]
> 		rel11 [label="rel#11:EnumerableMergeJoin(left=RelSubset#9,right=RelSubset#10,condition==($0, $2),joinType=inner)\nrows=1.0, cost={inf}",shape=box]
> 		rel15 [label="rel#15:LogicalProject(input=RelSubset#14,id=$2,name=$3,id0=$0,name0=$1)\nrows=1.0, cost={inf}",shape=box]
> 		subset6 [label="rel#6:Subset#2.NONE.[]"]
> 		subset7 [label="rel#7:Subset#2.ENUMERABLE.[]"]
> 	}
> 	subgraph cluster3{
> 		label="Set 3 RecordType(CHAR(1) id, CHAR(1) name, CHAR(1) id0, CHAR(1) name0)";
> 		rel12 [label="rel#12:LogicalJoin(left=RelSubset#4,right=RelSubset#3,condition==($2, $0),joinType=inner)\nrows=1.0, cost={inf}",shape=box]
> 		rel18 [label="rel#18:LogicalProject(input=RelSubset#6,id=$2,name=$3,id0=$0,name0=$1)\nrows=1.0, cost={inf}",shape=box]
> 		rel19 [label="rel#19:EnumerableMergeJoin(left=RelSubset#10,right=RelSubset#9,condition==($0, $2),joinType=inner)\nrows=1.0, cost={inf}",shape=box]
> 		subset14 [label="rel#14:Subset#3.NONE.[]"]
> 		subset20 [label="rel#20:Subset#3.ENUMERABLE.[]"]
> 	}
> 	root -> subset7;
> 	subset3 -> rel0;
> 	subset24 -> rel23[color=blue];
> 	subset4 -> rel1;
> 	subset22 -> rel21[color=blue];
> 	subset6 -> rel5; rel5 -> subset3[label="0"]; rel5 -> subset4[label="1"];
> 	subset7 -> rel8; rel8 -> subset6;
> 	subset7 -> rel11; rel11 -> subset9[label="0"]; rel11 -> subset10[label="1"];
> 	subset6 -> rel15; rel15 -> subset14;
> 	subset14 -> rel12; rel12 -> subset4[label="0"]; rel12 -> subset3[label="1"];
> 	subset14 -> rel18; rel18 -> subset6;
> 	subset20 -> rel19; rel19 -> subset10[label="0"]; rel19 -> subset9[label="1"];
> }
> 	at org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:587)
> 	at org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:304)
> 	at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:655){noformat}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)