You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by "Sharma, Ishan" <is...@brown.edu> on 2021/03/12 20:53:32 UTC
Issue with generating a basic physical plan
Hi Folks,
We are working on a dataflow system
<http://cs.brown.edu/people/malte/research/pbc/> and are using calcite to
parse the query and generate a plan. The generated plan is then later used
to build a dataflow graph in C++. We are facing issues when it comes to
generating a basic physical plan via the rules that calcite provides.
Our code is pretty much based on Milinda's tutorial
<https://github.com/milinda/calcite-tutorial>. I have it working in an
independent java project with the version of calcite that the tutorial uses
(1.6.0), but am unable to get it to work with 1.26.0.
*Following is the relevant subset of code:*
public class QueryPlanner {
private final DataFlowGraphLibrary.DataFlowGraphGenerator generator;
private final SchemaPlus schema;
private final Planner planner;
private final RuleSet rules;
final ArrayList<RelTraitDef> traitDefs;
// Construct a QueryPlanner given a C++ DataFlowGraphLibrary interface
exposing
// Pelton state.
private QueryPlanner(DataFlowGraphLibrary.DataFlowGraphGenerator
generator) {
// Store the c++ interface.
this.generator = generator;
// Rules used for optimizing the logical plan
this.rules = RuleSets.ofList(
// CoreRules.FILTER_INTO_JOIN,
Bindables.RULES);
// Initialize traitdefs
this.traitDefs = new ArrayList<RelTraitDef>();
this.traitDefs.add(ConventionTraitDef.INSTANCE);
this.traitDefs.add(RelCollationTraitDef.INSTANCE);
// Create a calcite-understandable representation of the schema of the
logical
// un-sharded database, as exposed from c++ via generator.
PeltonSchemaFactory schemaFactory = new PeltonSchemaFactory(generator);
this.schema = schemaFactory.createSchema();
FrameworkConfig calciteFrameworkConfig =
Frameworks.newConfigBuilder()
// MySQL dialect.
.parserConfig(SqlParser.configBuilder().setLex(Lex.MYSQL).build())
// Sets the schema to use by the planner
.defaultSchema(this.schema)
.traitDefs(this.traitDefs)
.context(Contexts.EMPTY_CONTEXT)
.ruleSets(this.rules)
.costFactory(null)
.build();
this.planner = Frameworks.getPlanner(calciteFrameworkConfig);
}
// Parse query and plan it into a physical plan.
private RelNode getPhysicalPlan(String query) throws ValidationException,
RelConversionException {
SqlNode sqlNode;
try {
sqlNode = this.planner.parse(query);
} catch (SqlParseException e) {
throw new RuntimeException("Query parsing error.", e);
}
SqlNode validatedSqlNode = this.planner.validate(sqlNode);
RelNode logicalPlan = this.planner.rel(validatedSqlNode).project();
System.out.println(RelOptUtil.toString(logicalPlan));
RelNode physicalPlan = this.planner.transform(0,
logicalPlan.getTraitSet().plus(BindableConvention.INSTANCE), logicalPlan);
// System.out.println(RelOptUtil.toString(physicalPlan));
return logicalPlan;
}
*Following is the corresponding error*, where the query being supplied is
SELECT * FROM test_table WHERE Col2 = 'hello!':
Exception in thread "main"
org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not
enough rules to produce a node with desired properties:
convention=BINDABLE, sort=[]. All the inputs have relevant nodes, however,
the cost is still infinite.
Root: rel#12:RelSubset#2.BINDABLE.[]
Original rel:
LogicalProject(subset=[rel#12:RelSubset#2.BINDABLE.[]], Col1=[$0],
Col2=[$1], Col3=[$2]): rowcount = 15.0, cumulative cost = {15.0 rows, 45.0
cpu, 0.0 io}, id = 10
LogicalFilter(subset=[rel#9:RelSubset#1.NONE.[]], condition=[=($1,
'hello!')]): rowcount = 15.0, cumulative cost = {15.0 rows, 100.0 cpu, 0.0
io}, id = 8
LogicalTableScan(subset=[rel#7:RelSubset#0.NONE.[]],
table=[[test_table]]): rowcount = 100.0, cumulative cost = {100.0 rows,
101.0 cpu, 0.0 io}, id = 4
Sets:
Set#0, type: RecordType(INTEGER Col1, VARCHAR Col2, INTEGER Col3)
rel#7:RelSubset#0.NONE.[], best=null
rel#4:LogicalTableScan.NONE.[](table=[test_table]),
rowcount=100.0, cumulative cost={inf}
rel#16:RelSubset#0.BINDABLE.[], best=null
rel#15:InterpretableConverter.BINDABLE.[](input=RelSubset#7),
rowcount=100.0, cumulative cost={inf}
Set#1, type: RecordType(INTEGER Col1, VARCHAR Col2, INTEGER Col3)
rel#9:RelSubset#1.NONE.[], best=null
rel#8:LogicalFilter.NONE.[](input=RelSubset#7,condition==($1, 'hello!')),
rowcount=15.0, cumulative cost={inf}
rel#19:RelSubset#1.BINDABLE.[], best=null
rel#18:InterpretableConverter.BINDABLE.[](input=RelSubset#9),
rowcount=15.0, cumulative cost={inf}
rel#20:BindableFilter.BINDABLE.[](input=RelSubset#16,condition==($1,
'hello!')), rowcount=15.0, cumulative cost={inf}
Set#2, type: RecordType(INTEGER Col1, VARCHAR Col2, INTEGER Col3)
rel#11:RelSubset#2.NONE.[], best=null
rel#10:LogicalProject.NONE.[](input=RelSubset#9,inputs=0..2),
rowcount=15.0, cumulative cost={inf}
rel#12:RelSubset#2.BINDABLE.[], best=null
rel#13:AbstractConverter.BINDABLE.[](input=RelSubset#11,convention=BINDABLE,sort=[]),
rowcount=15.0, cumulative cost={inf}
rel#22:InterpretableConverter.BINDABLE.[](input=RelSubset#11),
rowcount=15.0, cumulative cost={inf}
rel#23:BindableProject.BINDABLE.[](input=RelSubset#19,inputs=0..2),
rowcount=15.0, cumulative cost={inf}
*Graphviz:*
digraph G {
root [style=filled,label="Root"];
subgraph cluster0{
label="Set 0 RecordType(INTEGER Col1, VARCHAR Col2, INTEGER
Col3)";
rel4 [label="re
I
l#4:LogicalTableScan\ntable=[test_table]\nrows=100.0, cost={inf}",shape=box]
rel15
[label="rel#15:InterpretableConverter\ninput=RelSubset#7\nrows=100.0,
cost={inf}",shape=box]
subset7 [label="rel#7:RelSubset#0.NONE.[]"]
subset16 [label="rel#16:RelSubset#0.BINDABLE.[]"]
}
subgraph cluster1{
label="Set 1 RecordType(INTEGER Col1, VARCHAR Col2, INTEGER
Col3)";
rel8
[label="rel#8:LogicalFilter\ninput=RelSubset#7,condition==($1,
'hello!')\nrows=15.0, cost={inf}",shape=box]
rel18
[label="rel#18:InterpretableConverter\ninput=RelSubset#9\nrows=15.0,
cost={inf}",shape=box]
rel20
[label="rel#20:BindableFilter\ninput=RelSubset#16,condition==($1,
'hello!')\nrows=15.0, cost={inf}",shape=box]
subset9 [label="rel#9:RelSubset#1.NONE.[]"]
subset19 [label="rel#19:RelSubset#1.BINDABLE.[]"]
}
subgraph cluster2{
label="Set 2 RecordType(INTEGER Col1, VARCHAR Col2, INTEGER
Col3)";
rel10
[label="rel#10:LogicalProject\ninput=RelSubset#9,inputs=0..2\nrows=15.0,
cost={inf}",shape=box]
rel13
[label="rel#13:AbstractConverter\ninput=RelSubset#11,convention=BINDABLE,sort=[]\nrows=15.0,
cost={inf}",shape=box]
rel22
[label="rel#22:InterpretableConverter\ninput=RelSubset#11\nrows=15.0,
cost={inf}",shape=box]
rel23
[label="rel#23:BindableProject\ninput=RelSubset#19,inputs=0..2\nrows=15.0,
cost={inf}",shape=box]
subset11 [label="rel#11:RelSubset#2.NONE.[]"]
subset12 [label="rel#12:RelSubset#2.BINDABLE.[]"]
}
root -> subset12;
subset7 -> rel4;
subset16 -> rel15; rel15 -> subset7;
subset9 -> rel8; rel8 -> subset7;
subset19 -> rel18; rel18 -> subset9;
subset19 -> rel20; rel20 -> subset16;
subset11 -> rel10; rel10 -> subset9;
subset12 -> rel13; rel13 -> subset11;
subset12 -> rel22; rel22 -> subset11;
subset12 -> rel23; rel23 -> subset19;
}
at
org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:742)
at
org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:365)
at
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:520)
at
org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:312)
at
org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:356)
at
com.brownsys.pelton.QueryPlanner.getPhysicalPlan(QueryPlanner.java:87)
at com.brownsys.pelton.QueryPlanner.plan(QueryPlanner.java:118)
Based on the discussion here
<https://www.mail-archive.com/dev@calcite.apache.org/msg12131.html> and the
corresponding graphviz the issue seems to be related to the bindable
convention and/or the use of it. I am super new to working with calcite and
would appreciate your advice.
Thanks,
Ishan
Re: Issue with generating a basic physical plan
Posted by Chunwei Lei <ch...@gmail.com>.
Hi, Ishan.
From what I can see, that is because the table scan is not converted to
BindableTableScan.
Best,
Chunwei
On Sat, Mar 13, 2021 at 2:41 PM Sharma, Ishan <is...@brown.edu>
wrote:
> Hi Folks,
> We are working on a dataflow system
> <http://cs.brown.edu/people/malte/research/pbc/> and are using calcite to
> parse the query and generate a plan. The generated plan is then later used
> to build a dataflow graph in C++. We are facing issues when it comes to
> generating a basic physical plan via the rules that calcite provides.
>
> Our code is pretty much based on Milinda's tutorial
> <https://github.com/milinda/calcite-tutorial>. I have it working in an
> independent java project with the version of calcite that the tutorial uses
> (1.6.0), but am unable to get it to work with 1.26.0.
>
> *Following is the relevant subset of code:*
> public class QueryPlanner {
> private final DataFlowGraphLibrary.DataFlowGraphGenerator generator;
> private final SchemaPlus schema;
> private final Planner planner;
> private final RuleSet rules;
> final ArrayList<RelTraitDef> traitDefs;
>
> // Construct a QueryPlanner given a C++ DataFlowGraphLibrary interface
> exposing
> // Pelton state.
> private QueryPlanner(DataFlowGraphLibrary.DataFlowGraphGenerator
> generator) {
> // Store the c++ interface.
> this.generator = generator;
> // Rules used for optimizing the logical plan
> this.rules = RuleSets.ofList(
> // CoreRules.FILTER_INTO_JOIN,
> Bindables.RULES);
> // Initialize traitdefs
> this.traitDefs = new ArrayList<RelTraitDef>();
> this.traitDefs.add(ConventionTraitDef.INSTANCE);
> this.traitDefs.add(RelCollationTraitDef.INSTANCE);
>
> // Create a calcite-understandable representation of the schema of the
> logical
> // un-sharded database, as exposed from c++ via generator.
> PeltonSchemaFactory schemaFactory = new PeltonSchemaFactory(generator);
> this.schema = schemaFactory.createSchema();
>
> FrameworkConfig calciteFrameworkConfig =
> Frameworks.newConfigBuilder()
> // MySQL dialect.
>
> .parserConfig(SqlParser.configBuilder().setLex(Lex.MYSQL).build())
> // Sets the schema to use by the planner
> .defaultSchema(this.schema)
> .traitDefs(this.traitDefs)
> .context(Contexts.EMPTY_CONTEXT)
> .ruleSets(this.rules)
> .costFactory(null)
> .build();
>
> this.planner = Frameworks.getPlanner(calciteFrameworkConfig);
> }
>
> // Parse query and plan it into a physical plan.
> private RelNode getPhysicalPlan(String query) throws ValidationException,
> RelConversionException {
> SqlNode sqlNode;
>
> try {
> sqlNode = this.planner.parse(query);
> } catch (SqlParseException e) {
> throw new RuntimeException("Query parsing error.", e);
> }
>
> SqlNode validatedSqlNode = this.planner.validate(sqlNode);
> RelNode logicalPlan = this.planner.rel(validatedSqlNode).project();
> System.out.println(RelOptUtil.toString(logicalPlan));
> RelNode physicalPlan = this.planner.transform(0,
> logicalPlan.getTraitSet().plus(BindableConvention.INSTANCE), logicalPlan);
> // System.out.println(RelOptUtil.toString(physicalPlan));
> return logicalPlan;
> }
>
>
> *Following is the corresponding error*, where the query being supplied is
> SELECT * FROM test_table WHERE Col2 = 'hello!':
>
> Exception in thread "main"
> org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not
> enough rules to produce a node with desired properties:
> convention=BINDABLE, sort=[]. All the inputs have relevant nodes, however,
> the cost is still infinite.
> Root: rel#12:RelSubset#2.BINDABLE.[]
> Original rel:
> LogicalProject(subset=[rel#12:RelSubset#2.BINDABLE.[]], Col1=[$0],
> Col2=[$1], Col3=[$2]): rowcount = 15.0, cumulative cost = {15.0 rows, 45.0
> cpu, 0.0 io}, id = 10
> LogicalFilter(subset=[rel#9:RelSubset#1.NONE.[]], condition=[=($1,
> 'hello!')]): rowcount = 15.0, cumulative cost = {15.0 rows, 100.0 cpu, 0.0
> io}, id = 8
> LogicalTableScan(subset=[rel#7:RelSubset#0.NONE.[]],
> table=[[test_table]]): rowcount = 100.0, cumulative cost = {100.0 rows,
> 101.0 cpu, 0.0 io}, id = 4
>
> Sets:
> Set#0, type: RecordType(INTEGER Col1, VARCHAR Col2, INTEGER Col3)
> rel#7:RelSubset#0.NONE.[], best=null
> rel#4:LogicalTableScan.NONE.[](table=[test_table]),
> rowcount=100.0, cumulative cost={inf}
> rel#16:RelSubset#0.BINDABLE.[], best=null
>
> rel#15:InterpretableConverter.BINDABLE.[](input=RelSubset#7),
> rowcount=100.0, cumulative cost={inf}
> Set#1, type: RecordType(INTEGER Col1, VARCHAR Col2, INTEGER Col3)
> rel#9:RelSubset#1.NONE.[], best=null
>
> rel#8:LogicalFilter.NONE.[](input=RelSubset#7,condition==($1, 'hello!')),
> rowcount=15.0, cumulative cost={inf}
> rel#19:RelSubset#1.BINDABLE.[], best=null
>
> rel#18:InterpretableConverter.BINDABLE.[](input=RelSubset#9),
> rowcount=15.0, cumulative cost={inf}
>
> rel#20:BindableFilter.BINDABLE.[](input=RelSubset#16,condition==($1,
> 'hello!')), rowcount=15.0, cumulative cost={inf}
> Set#2, type: RecordType(INTEGER Col1, VARCHAR Col2, INTEGER Col3)
> rel#11:RelSubset#2.NONE.[], best=null
>
> rel#10:LogicalProject.NONE.[](input=RelSubset#9,inputs=0..2),
> rowcount=15.0, cumulative cost={inf}
> rel#12:RelSubset#2.BINDABLE.[], best=null
>
>
> rel#13:AbstractConverter.BINDABLE.[](input=RelSubset#11,convention=BINDABLE,sort=[]),
> rowcount=15.0, cumulative cost={inf}
>
> rel#22:InterpretableConverter.BINDABLE.[](input=RelSubset#11),
> rowcount=15.0, cumulative cost={inf}
>
> rel#23:BindableProject.BINDABLE.[](input=RelSubset#19,inputs=0..2),
> rowcount=15.0, cumulative cost={inf}
>
> *Graphviz:*
> digraph G {
> root [style=filled,label="Root"];
> subgraph cluster0{
> label="Set 0 RecordType(INTEGER Col1, VARCHAR Col2, INTEGER
> Col3)";
> rel4 [label="re
>
> I
> l#4:LogicalTableScan\ntable=[test_table]\nrows=100.0,
> cost={inf}",shape=box]
> rel15
> [label="rel#15:InterpretableConverter\ninput=RelSubset#7\nrows=100.0,
> cost={inf}",shape=box]
> subset7 [label="rel#7:RelSubset#0.NONE.[]"]
> subset16 [label="rel#16:RelSubset#0.BINDABLE.[]"]
> }
> subgraph cluster1{
> label="Set 1 RecordType(INTEGER Col1, VARCHAR Col2, INTEGER
> Col3)";
> rel8
> [label="rel#8:LogicalFilter\ninput=RelSubset#7,condition==($1,
> 'hello!')\nrows=15.0, cost={inf}",shape=box]
> rel18
> [label="rel#18:InterpretableConverter\ninput=RelSubset#9\nrows=15.0,
> cost={inf}",shape=box]
> rel20
> [label="rel#20:BindableFilter\ninput=RelSubset#16,condition==($1,
> 'hello!')\nrows=15.0, cost={inf}",shape=box]
> subset9 [label="rel#9:RelSubset#1.NONE.[]"]
> subset19 [label="rel#19:RelSubset#1.BINDABLE.[]"]
> }
> subgraph cluster2{
> label="Set 2 RecordType(INTEGER Col1, VARCHAR Col2, INTEGER
> Col3)";
> rel10
> [label="rel#10:LogicalProject\ninput=RelSubset#9,inputs=0..2\nrows=15.0,
> cost={inf}",shape=box]
> rel13
>
> [label="rel#13:AbstractConverter\ninput=RelSubset#11,convention=BINDABLE,sort=[]\nrows=15.0,
> cost={inf}",shape=box]
> rel22
> [label="rel#22:InterpretableConverter\ninput=RelSubset#11\nrows=15.0,
> cost={inf}",shape=box]
> rel23
> [label="rel#23:BindableProject\ninput=RelSubset#19,inputs=0..2\nrows=15.0,
> cost={inf}",shape=box]
> subset11 [label="rel#11:RelSubset#2.NONE.[]"]
> subset12 [label="rel#12:RelSubset#2.BINDABLE.[]"]
> }
> root -> subset12;
> subset7 -> rel4;
> subset16 -> rel15; rel15 -> subset7;
> subset9 -> rel8; rel8 -> subset7;
> subset19 -> rel18; rel18 -> subset9;
> subset19 -> rel20; rel20 -> subset16;
> subset11 -> rel10; rel10 -> subset9;
> subset12 -> rel13; rel13 -> subset11;
> subset12 -> rel22; rel22 -> subset11;
> subset12 -> rel23; rel23 -> subset19;
> }
> at
>
> org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:742)
> at
>
> org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:365)
> at
>
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:520)
> at
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:312)
> at
> org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:356)
> at
> com.brownsys.pelton.QueryPlanner.getPhysicalPlan(QueryPlanner.java:87)
> at com.brownsys.pelton.QueryPlanner.plan(QueryPlanner.java:118)
>
> Based on the discussion here
> <https://www.mail-archive.com/dev@calcite.apache.org/msg12131.html> and
> the
> corresponding graphviz the issue seems to be related to the bindable
> convention and/or the use of it. I am super new to working with calcite and
> would appreciate your advice.
>
> Thanks,
> Ishan
>