Posted to dev@calcite.apache.org by Γιώργος Θεοδωράκης <gi...@gmail.com> on 2016/10/03 15:51:36 UTC

How to use VolcanoPlanner

Hi,

I want to parse an SQL query and transform it into an optimized relational
plan (not convert it to a physical one!) using Calcite rules, based on my
database schema and metadata. Right now, the only helpful example I have
found for my purpose is
https://github.com/milinda/samza-sql/blob/master/samza-sql-planner/src/main/java/org/apache/samza/sql/planner/QueryPlanner.java
in which a simple Planner is used for parsing and validating SQL and a
HepPlanner is used for searching for an optimized plan based on imported
rules.

Is there any way to use the VolcanoPlanner in my case? The only examples I
have seen so far in the test classes suggest that it should be used for
converting relational expressions to physical ones. How can I make the
VolcanoPlanner "see" my SchemaPlus schema, when I can only define a
RelOptSchema? Can someone provide me with a complete example of using the
VolcanoPlanner and adding rules, such as FilterProjectTransposeRule.INSTANCE?

Thanks in advance,
George

Re: How to use VolcanoPlanner

Posted by Colm McHugh <co...@gmail.com>.
What errors did you get? It should be possible to use both Volcano and hep
when query planning (Drill does this, and possibly others).
It sounds like applying a heuristic pass that includes the project pushdown
(and any other rules you may want to apply) after the Volcano pass should
work.
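
A minimal sketch of that two-phase approach (assuming the Volcano pass has
already produced a plan bestExp; the rule choice is illustrative):

import org.apache.calcite.plan.hep.HepPlanner;
import org.apache.calcite.plan.hep.HepProgram;
import org.apache.calcite.plan.hep.HepProgramBuilder;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.rules.ProjectJoinTransposeRule;
import org.apache.calcite.rel.rules.ProjectMergeRule;

// Heuristic project-pushdown pass over the plan returned by the Volcano pass.
public static RelNode pushDownProjects(RelNode bestExp) {
  HepProgram program = new HepProgramBuilder()
      .addRuleInstance(ProjectJoinTransposeRule.INSTANCE)
      .addRuleInstance(ProjectMergeRule.INSTANCE)
      .build();
  HepPlanner hepPlanner = new HepPlanner(program);
  hepPlanner.setRoot(bestExp);
  return hepPlanner.findBestExp();
}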

On Tue, Nov 1, 2016 at 4:05 AM, Γιώργος Θεοδωράκης <gi...@gmail.com> wrote:

> I am wondering whether it is possible to push down projections in Volcano
> in general. With Volcano's cost model, a projection adds row and CPU cost,
> so it can never be chosen. For example, for the following query:
>
> "select s.products.productid "
> + "from s.products,s.orders "
> + "where s.orders.productid = s.products.productid  "
> + "and units>5 "
>
> the optimized plan I get from Volcano is:
>
> EnumerableProject(productid=[$0]): rowcount = 5.3999999999999995,
> cumulative cost = {51.555956815368326 rows, 37.4 cpu, 0.0 io}, id = 86
>   EnumerableJoin(condition=[=($0, $3)], joinType=[inner]): rowcount =
> 5.3999999999999995, cumulative cost = {46.15595681536833 rows, 32.0 cpu,
> 0.0 io}, id = 85
>     EnumerableTableScan(table=[[s, products]]): rowcount = 6.0, cumulative
> cost = {6.0 rows, 7.0 cpu, 0.0 io}, id = 40
>     EnumerableFilter(condition=[>($2, 5)]): rowcount = 6.0, cumulative
> cost
> = {18.0 rows, 25.0 cpu, 0.0 io}, id = 84
>       EnumerableTableScan(table=[[s, orders]]): rowcount = 12.0, cumulative
> cost = {12.0 rows, 13.0 cpu, 0.0 io}, id = 59
>
> However, the plan I would like to get is the one with the projections pushed
> down. Under Volcano's cost model it has a bigger cumulative cost, as we
> would have two more projections at the sources of the inputs (with
> additional rowCount and CPU cost), but in most cases it is cheaper because
> we reduce the memory needed (which is not taken into consideration here).
>
> Are there any suggestions? At first I used HepPlanner, but I didn't get
> plans with an optimized join order from the built-in rules (using
> JoinPushThroughJoinRule.RIGHT and JoinPushThroughJoinRule.LEFT never
> terminates, and using the MultiJoin rules does nothing). Then I tried
> VolcanoPlanner and got plans with an optimized join order but no
> pushed-down projections. I also tried a combination of both planners,
> but I got errors. What should I do?
>
> 2016-10-27 16:00 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
>
> > I fixed the second error by changing my Statistics to:
> >
> > public Statistic getStatistic() {
> >   int rowCount = rows.size();
> >   return Statistics.of(rowCount, ImmutableList.<ImmutableBitSet>of());
> > }
> >
> > Any suggestions of a better solution are welcome. The project push-down
> > problem still exists...
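
If a table does have unique columns, the key bitmaps can be supplied instead
of the empty list. A hypothetical sketch for a table whose column 0 is a key
(reusing the "rows" field from the snippet above):

import com.google.common.collect.ImmutableList;
import org.apache.calcite.schema.Statistic;
import org.apache.calcite.schema.Statistics;
import org.apache.calcite.util.ImmutableBitSet;

// Declares column 0 as a unique key, so RelMdColumnUniqueness can use it.
public Statistic getStatistic() {
  int rowCount = rows.size();
  return Statistics.of(rowCount, ImmutableList.of(ImmutableBitSet.of(0)));
}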
> >
> > 2016-10-27 15:42 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
> >
> >> Hi,
> >> I was missing the implementations of the operators, and to fix it I
> >> added the built-in EnumerableRules until I create my own. However, the
> >> plan I get from the Volcano optimizer is different from the one I get
> >> from HepPlanner, although I use the same rules. My problem is projection
> >> push-down: HepPlanner pushes Projections to the bottom of the RelNode
> >> tree, while VolcanoPlanner always keeps them at the top (it doesn't push
> >> them through joins). I use these rules in both:
> >>
> >>    ProjectRemoveRule.INSTANCE,
> >>    ProjectJoinTransposeRule.INSTANCE,
> >>    ProjectFilterTransposeRule.INSTANCE, /* it is better to run the
> >>    filter first and then project */
> >>    ProjectTableScanRule.INSTANCE,
> >>    ProjectWindowTransposeRule.INSTANCE,
> >>    ProjectMergeRule.INSTANCE
> >>
> >> and
> >>    EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE,
> >>    EnumerableRules.ENUMERABLE_PROJECT_RULE,
> >>    ...
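
A minimal sketch of combining both groups into the single program handed to
the planner (remaining rules elided):

import org.apache.calcite.adapter.enumerable.EnumerableRules;
import org.apache.calcite.rel.rules.ProjectJoinTransposeRule;
import org.apache.calcite.rel.rules.ProjectMergeRule;
import org.apache.calcite.tools.Program;
import org.apache.calcite.tools.Programs;

// One program carrying both the logical transposition rules and the
// Enumerable implementation rules.
Program program = Programs.ofRules(
    ProjectJoinTransposeRule.INSTANCE,
    ProjectMergeRule.INSTANCE,
    EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE,
    EnumerableRules.ENUMERABLE_PROJECT_RULE,
    EnumerableRules.ENUMERABLE_FILTER_RULE,
    EnumerableRules.ENUMERABLE_JOIN_RULE);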
> >>
> >> Finally, when trying to use an aggregate I get this error:
> >>
> >> Exception in thread "main" java.lang.AssertionError: Internal error: Error while applying rule EnumerableTableScanRule, args [rel#4:LogicalTableScan.NONE.[](table=[s, orders])]
> >> at org.apache.calcite.util.Util.newInternal(Util.java:792)
> >> at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:236)
> >> at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:819)
> >> at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:334)
> >> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:308)
> >> at calcite.VolcanoTester.main(VolcanoTester.java:106)
> >> Caused by: java.lang.AssertionError: Internal error: Error occurred while applying rule EnumerableTableScanRule
> >> at org.apache.calcite.util.Util.newInternal(Util.java:792)
> >> at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
> >> at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:225)
> >> at org.apache.calcite.rel.convert.ConverterRule.onMatch(ConverterRule.java:117)
> >> at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:213)
> >> ... 4 more
> >> Caused by: java.lang.NullPointerException
> >> at org.apache.calcite.schema.Statistics$2.isKey(Statistics.java:70)
> >> at org.apache.calcite.prepare.RelOptTableImpl.isKey(RelOptTableImpl.java:288)
> >> at org.apache.calcite.rel.metadata.RelMdColumnUniqueness.areColumnsUnique(RelMdColumnUniqueness.java:76)
> >> at GeneratedMetadataHandler_ColumnUniqueness.areColumnsUnique_$(Unknown Source)
> >> at GeneratedMetadataHandler_ColumnUniqueness.areColumnsUnique(Unknown Source)
> >> at org.apache.calcite.rel.metadata.RelMetadataQuery.areColumnsUnique(RelMetadataQuery.java:461)
> >> at org.apache.calcite.rel.metadata.RelMdUtil.areColumnsDefinitelyUnique(RelMdUtil.java:216)
> >> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDistinctRowCount(RelMdDistinctRowCount.java:75)
> >> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source)
> >> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source)
> >> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:700)
> >> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDistinctRowCount(RelMdDistinctRowCount.java:292)
> >> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source)
> >> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source)
> >> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:700)
> >> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDistinctRowCount(RelMdDistinctRowCount.java:138)
> >> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source)
> >> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source)
> >> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:700)
> >> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDistinctRowCount(RelMdDistinctRowCount.java:292)
> >> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source)
> >> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source)
> >> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:700)
> >> at org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:194)
> >> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
> >> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
> >> at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
> >> at org.apache.calcite.rel.core.Aggregate.computeSelfCost(Aggregate.java:304)
> >> at org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getNonCumulativeCost(RelMdPercentageOriginalRows.java:162)
> >> at GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost_$(Unknown Source)
> >> at GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost(Unknown Source)
> >> at org.apache.calcite.rel.metadata.RelMetadataQuery.getNonCumulativeCost(RelMetadataQuery.java:258)
> >> at org.apache.calcite.plan.volcano.VolcanoPlanner.getCost(VolcanoPlanner.java:1128)
> >> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:336)
> >> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:319)
> >> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:348)
> >> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:319)
> >> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:348)
> >> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:319)
> >> at org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1830)
> >> at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1766)
> >> at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:1032)
> >> at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1052)
> >> at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1942)
> >> at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:136)
> >> ... 7 more
> >>
> >> I define the Statistics in the tables I use like this:
> >> public Statistic getStatistic() {
> >>   int rowCount = rows.size();
> >>   // Passing null for the key list is what triggers the NPE in
> >>   // Statistics$2.isKey above; pass a List<ImmutableBitSet> instead.
> >>   return Statistics.of(rowCount, null);
> >> }
> >>
> >>
> >> Thanks,
> >> George
> >>
> >> 2016-10-16 7:28 GMT+03:00 Jungtaek Lim <ka...@gmail.com>:
> >>
> >>> Hi George,
> >>>
> >>> This patch is a ported version (with small fixes) of Milinda's
> >>> samza-sql implementation for Storm SQL:
> >>> https://github.com/apache/storm/pull/1736
> >>>
> >>> In this patch I removed the HepPlanner and just rely on the Volcano
> >>> planner (so the patch may be closer to what you want).
> >>> For now I also removed the code regarding metadata, since I'm not clear
> >>> on how it works and what it helps with, but I'll re-address it once I
> >>> can find its usage and benefits.
> >>>
> >>> Hope this helps.
> >>>
> >>> Thanks,
> >>> Jungtaek Lim (HeartSaVioR)
> >>>
> >>> On Tue, Oct 4, 2016 at 7:08 PM, Γιώργος Θεοδωράκης
> >>> <giwrgosrtheod@gmail.com> wrote:
> >>>
> >>> > I think I did as you said:
> >>> >
> >>> > https://github.com/giwrgostheod/Calcite-Saber/blob/master/src/main/java/calcite/VolcanoTester.java
> >>> >
> >>> > and I get for every query I use:
> >>> > Exception in thread "main"
> >>> > org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node
> >>> > [rel#10:Subset#2.NONE.[]] could not be implemented; planner state:
> >>> > Root: rel#10:Subset#2.NONE.[]
> >>> > Original rel:
> >>> > ....
> >>> > at org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:443)
> >>> > at org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:293)
> >>> > at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:835)
> >>> > at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:334)
> >>> > at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:308)
> >>> > at calcite.VolcanoTester.main(VolcanoTester.java:77)
> >>> >
> >>> > My table is defined here:
> >>> > https://github.com/giwrgostheod/Calcite-Saber/blob/master/src/main/java/calcite/utils/OrdersTableFactory.java
> >>> >
> >>> >
> >>> > Thank you for your time,
> >>> > George
> >>> >
> >>> >
> >>> > 2016-10-04 2:38 GMT+03:00 Jordan Halterman <jordan.halterman@gmail.com>:
> >>> >
> >>> > > The link you provided is a pretty good example. Build a
> >>> > > FrameworkConfig with your schema, parser config, and other
> >>> > > information, and use it to create a Planner. That Planner uses a
> >>> > > VolcanoPlanner internally. What's missing from that particular
> >>> > > example is just the addition of programs. Programs are effectively
> >>> > > sets of rules you will use to optimize your query. So, to add your
> >>> > > FilterProjectTransposeRule to the planner, call this when building
> >>> > > your FrameworkConfig:
> >>> > >
> >>> > > .programs(Programs.ofRules(FilterProjectTransposeRule.INSTANCE))
> >>> > >
> >>> > > That adds your program(s) to the set of programs in the planner,
> >>> > > and those programs can be accessed to optimize the query. Use the
> >>> > > planner to parse() your query, validate() your query, and then
> >>> > > convert() your query into a logical plan. Then call...
> >>> > >
> >>> > > RelTraitSet traitSet = planner.emptyTraitSet().replace(Convention.NONE);
> >>> > > planner.transform(0, traitSet, logicalPlan);
> >>> > >
> >>> > > to apply the rules you added to the configuration. That should use
> >>> > > the VolcanoPlanner to apply the rules you added in your Program.
> >>> > > The trait set that's passed to that method is the required output
> >>> > > trait set. So, if you wanted to convert the logical plan into some
> >>> > > physical convention, you'd pass your physical convention instead of
> >>> > > Convention.NONE. I can respond with a full example if you need it
> >>> > > in a bit. I just don't have the capacity to write it ATM.
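
A minimal end-to-end sketch of the flow described above, folding in the fix
found later in the thread (the Volcano pass needs implementation rules such
as the EnumerableRules, and a physical convention, to avoid the
CannotPlanException); the schema argument and rule choice are assumptions:

import org.apache.calcite.adapter.enumerable.EnumerableConvention;
import org.apache.calcite.adapter.enumerable.EnumerableRules;
import org.apache.calcite.plan.RelTraitSet;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.rules.FilterProjectTransposeRule;
import org.apache.calcite.schema.SchemaPlus;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.parser.SqlParser;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;
import org.apache.calcite.tools.Planner;
import org.apache.calcite.tools.Programs;

public static RelNode optimize(SchemaPlus schema, String sql) throws Exception {
  FrameworkConfig config = Frameworks.newConfigBuilder()
      .defaultSchema(schema)
      .parserConfig(SqlParser.Config.DEFAULT)
      .programs(Programs.ofRules(
          FilterProjectTransposeRule.INSTANCE,
          EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE,
          EnumerableRules.ENUMERABLE_PROJECT_RULE,
          EnumerableRules.ENUMERABLE_FILTER_RULE,
          EnumerableRules.ENUMERABLE_JOIN_RULE))
      .build();
  Planner planner = Frameworks.getPlanner(config);
  SqlNode parsed = planner.parse(sql);
  SqlNode validated = planner.validate(parsed);
  RelNode logicalPlan = planner.convert(validated);  // conversion method in this era; later versions use rel()
  RelTraitSet traits = planner.getEmptyTraitSet()
      .replace(EnumerableConvention.INSTANCE);
  return planner.transform(0, traits, logicalPlan);  // 0 = index of the program above
}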

Re: How to use VolcanoPlanner

Posted by Julian Hyde <jh...@apache.org>.
There is a simple form of projection push-down, namely column pruning aka field trimming. If you compute x + y from a table that has columns (x, y, z), then it will prune away z, and project only (x, y). Column pruning doesn’t fit into the Volcano model (or in fact into the general transformation rule model) because it returns RelNodes that have a different row-type.  So, use RelFieldTrimmer for that. Most production optimizers will trim early in the planning process and probably late in the planning process also.
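
A minimal sketch of invoking the trimmer (assuming a FrameworkConfig like the
ones built earlier in the thread; the validator argument may be null when
trimming an already-built RelNode tree):

import org.apache.calcite.rel.RelNode;
import org.apache.calcite.sql2rel.RelFieldTrimmer;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.RelBuilder;

// Prunes unused fields (e.g. drops z when only x + y is computed).
public static RelNode trim(FrameworkConfig config, RelNode plan) {
  RelBuilder relBuilder = RelBuilder.create(config);
  RelFieldTrimmer trimmer = new RelFieldTrimmer(null, relBuilder);
  return trimmer.trim(plan);
}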

Your point about cost-based projection push-down is well taken. The solution, I think, is to use more realistic cost functions. A Project that doesn't compute expressions, but just projects columns, should have virtually zero cost. (On top of a table scan, in a typical implementation, it actually has negative cost: it makes the scan cheaper by reading fewer disk blocks and returning less data.)

Other operators should also have a significantly lower cost if project has been pushed down. For example, the cost of a Sort implemented as an external merge sort should depend on log of the number of BYTES processed, not just log of the number of ROWS processed. It is the number of bytes that determines how many times the data needs to be partitioned before we get a sort run small enough to fit into available memory.
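
A hypothetical sketch of what a bytes-aware cost could look like for a custom
Sort (illustrative only, not Calcite's built-in formula; the 16-byte fallback
row size is an arbitrary assumption):

import org.apache.calcite.plan.RelOptCost;
import org.apache.calcite.plan.RelOptPlanner;
import org.apache.calcite.rel.metadata.RelMetadataQuery;

// Inside a custom Sort subclass: charge CPU by rows * log2(bytes), since the
// number of partitioning passes of an external merge sort depends on data
// volume, not row count alone.
@Override public RelOptCost computeSelfCost(RelOptPlanner planner,
    RelMetadataQuery mq) {
  double rows = mq.getRowCount(getInput());
  Double avgRowSize = mq.getAverageRowSize(getInput());
  double bytes = rows * (avgRowSize == null ? 16d : avgRowSize);
  double cpu = rows * Math.log(Math.max(bytes, 2d)) / Math.log(2d);
  return planner.getCostFactory().makeCost(rows, cpu, 0);
}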

Feel free to suggest improvements to the built-in (default) cost functions. But also don’t be afraid to use “no brainer” transformations such as RelFieldTrimmer at key points in the planning process.

Julian



Re: How to use VolcanoPlanner

Posted by Γιώργος Θεοδωράκης <gi...@gmail.com>.
I fixed the errors (they occurred because of the way I added the
EnumerableRules to Volcano).

I have implemented something like what you suggested:
1) I use VolcanoPlanner (with both the simple and the Enumerable rules) and
get a plan made of Enumerable operators. As I have found out the hard way, I
cannot get a plan of logical operators (which is what I would prefer) from
VolcanoPlanner; I have to use some convention rules and traits.
2) I use HepPlanner with two custom rules I created, which push down
EnumerableProjects through EnumerableJoins (this is what I wanted); a
skeleton of such a rule is sketched below. The problem is that my final plan
has every operator in Enumerable form except the LogicalProjects, because of
the way I implemented the custom rules. The result is fine for now, but I
will try to fix it.
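
A hypothetical skeleton of such a custom transpose rule (the class name and
the transposition logic are assumptions; only the rule-matching scaffolding
is shown):

import org.apache.calcite.adapter.enumerable.EnumerableJoin;
import org.apache.calcite.adapter.enumerable.EnumerableProject;
import org.apache.calcite.plan.RelOptRule;
import org.apache.calcite.plan.RelOptRuleCall;

// Fires on an EnumerableProject sitting directly on an EnumerableJoin.
public class EnumerableProjectJoinTransposeRule extends RelOptRule {
  public static final EnumerableProjectJoinTransposeRule INSTANCE =
      new EnumerableProjectJoinTransposeRule();

  private EnumerableProjectJoinTransposeRule() {
    super(operand(EnumerableProject.class,
        operand(EnumerableJoin.class, any())));
  }

  @Override public void onMatch(RelOptRuleCall call) {
    final EnumerableProject project = call.rel(0);
    final EnumerableJoin join = call.rel(1);
    // Build narrower projects over each join input, a new join over them,
    // and a top project that restores the original row type, then:
    // call.transformTo(newTopProject);
  }
}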

If I am missing something, I look forward to better suggestions.

Thanks,
George

> >> VolcanoPlanner.java:1766)
> >> at org.apache.calcite.plan.volcano.VolcanoPlanner.register(
> >> VolcanoPlanner.java:1032)
> >> at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegiste
> >> red(VolcanoPlanner.java:1052)
> >> at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegiste
> >> red(VolcanoPlanner.java:1942)
> >> at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(
> >> VolcanoRuleCall.java:136)
> >> ... 7 more
> >>
> >> I define the Statistics in the tables I use like this:
> >> public Statistic getStatistic() {
> >> int rowCount = rows.size();
> >> return Statistics.of(rowCount, null); //add List<ImmutableBitSet>
> >> }
> >>
> >>
> >> Thanks,
> >> George
> >>
> >> 2016-10-16 7:28 GMT+03:00 Jungtaek Lim <ka...@gmail.com>:
> >>
> >>> Hi George,
> >>>
> >>> This patch is a ported version (with small fixes) of Milinda's samza-sql
> >>> implementation for Storm SQL.
> >>> https://github.com/apache/storm/pull/1736
> >>>
> >>> In this patch I removed the HepPlanner and just rely on the Volcano
> >>> Planner (so the patch may be closer to what you want).
> >>> For now I also removed the code regarding metadata, since I'm not clear
> >>> on how it works and what it helps with, but I'll re-address it once I
> >>> can find its usage and benefits.
> >>>
> >>> Hope this helps.
> >>>
> >>> Thanks,
> >>> Jungtaek Lim (HeartSaVioR)
> >>>
> >>> On Tue, Oct 4, 2016 at 7:08 PM, Γιώργος Θεοδωράκης <giwrgosrtheod@gmail.com> wrote:
> >>>
> >>> > I think I did as you said:
> >>> >
> >>> > https://github.com/giwrgostheod/Calcite-Saber/blob/master/sr
> >>> c/main/java/calcite/VolcanoTester.java
> >>> >
> >>> > and I get for every query I use:
> >>> > Exception in thread "main"
> >>> > org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node
> >>> > [rel#10:Subset#2.NONE.[]] could not be implemented; planner state:
> >>> > Root: rel#10:Subset#2.NONE.[]
> >>> > Original rel:
> >>> > ....
> >>> > at
> >>> >
> >>> > org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplac
> >>> er.visit(RelSubset.java:443)
> >>> > at
> >>> >
> >>> > org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(
> >>> RelSubset.java:293)
> >>> > at
> >>> >
> >>> > org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(V
> >>> olcanoPlanner.java:835)
> >>> > at org.apache.calcite.tools.Programs$RuleSetProgram.run(Program
> >>> s.java:334)
> >>> > at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl
> >>> .java:308)
> >>> > at calcite.VolcanoTester.main(VolcanoTester.java:77)
> >>> >
> >>> > My table is defined here:
> >>> >
> >>> > https://github.com/giwrgostheod/Calcite-Saber/blob/master/sr
> >>> c/main/java/calcite/utils/OrdersTableFactory.java
> >>> >
> >>> >
> >>> > Thank you for your time,
> >>> > George
> >>> >
> >>> >
> >>> > 2016-10-04 2:38 GMT+03:00 Jordan Halterman <
> jordan.halterman@gmail.com
> >>> >:
> >>> >
> >>> > > The link you provided is a pretty good example. Build a
> >>> FrameworkConfig
> >>> > > with your schema, parser config, and other information, and use it
> to
> >>> > > create a Planner. That Planner uses a VolcanoPlanner internally.
> >>> What's
> >>> > > missing from that particular example is just the addition of
> >>> programs.
> >>> > > Programs are effectively sets of rules you will use to optimize
> your
> >>> > query.
> >>> > > So, to add your FilterProjectTransposeRule to the planner, call
> this
> >>> when
> >>> > > building your FrameworkConfig:
> >>> > >
> >>> > > .programs(Programs.ofRules(FilterProjectTransposeRule.INSTANCE))
> >>> > >
> >>> > > That adds your program(s) to the set of programs in the planner,
> and
> >>> > those
> >>> > > programs can be accessed to optimize the query. Use the planner to
> >>> > parse()
> >>> > > your query, validate() your query, and then convert() your query
> >>> into a
> >>> > > logical plan. Then call...
> >>> > >
> >>> > > RelTraitSet traitSet = planner.getEmptyTraitSet().replace(Convention.NONE);
> >>> > > planner.transform(0, traitSet, logicalPlan);
> >>> > >
> >>> > > to apply the rules you added to the configuration. That should use
> >>> the
> >>> > > VolcanoPlanner to apply the rules you added in your Program. The
> >>> trait
> >>> > set
> >>> > > that's passed to that method is the required output trait set. So,
> >>> if you
> >>> > > wanted to convert the logical plan into some physical convention,
> >>> you'd
> >>> > > pass your physical convention instead of Convention.NONE. I can
> >>> respond
> >>> > > with a full example if you need it in a bit. I just don't have the
> >>> > capacity
> >>> > > to write it ATM.
> >>> > >
> >>> > > On Mon, Oct 3, 2016 at 8:51 AM, Γιώργος Θεοδωράκης <
> >>> > > giwrgosrtheod@gmail.com>
> >>> > > wrote:
> >>> > >
> >>> > > > Hi,
> >>> > > >
> >>> > > > I want to parse an Sql query and transform it to an optimized
> >>> > relational
> >>> > > > plan (not convert it to physical !!) using calcite rules based on
> >>> my
> >>> > > > database schema and metadata. Right now, the only helpful
> example I
> >>> > have
> >>> > > > found for my purpose is taken from
> >>> > > > https://github.com/milinda/samza-sql/blob/master/samza-
> >>> > > >
> >>> > sql-planner/src/main/java/org/apache/samza/sql/planner/Query
> >>> Planner.java
> >>> > > > ,
> >>> > > > in which a simple Planner is used for parsing and validating Sql
> >>> and a
> >>> > > > HepPlanner is used for searching for an optimized plan based on
> >>> > imported
> >>> > > > rules.
> >>> > > >
> >>> > > > Is there any way to use in my case the VolcanoPlanner? The only
> >>> > examples
> >>> > > I
> >>> > > > have seen so far from the test classes suggest that it should be
> >>> used
> >>> > for
> >>> > > > converting relational expressions to physical ones. How can I
> make
> >>> the
> >>> > > > Volcano Planner "see" my SchemaPlus schema ,when I can only
> define
> >>> > > > RelOptSchema? Can someone provide me with a complete example of
> >>> using
> >>> > > > Volcano Planner and adding rules, such
> >>> > > > as FilterProjectTransposeRule.INSTANCE?
> >>> > > >
> >>> > > > Thanks in advance,
> >>> > > > George
> >>> > > >
> >>> > >
> >>> >
> >>>
> >>
> >>
> >
>

Re: How to use VolcanoPlanner

Posted by Γιώργος Θεοδωράκης <gi...@gmail.com>.
I am wondering if it is possible to push down projections in Volcano at
all. With Volcano's cost model, a projection adds rows and cpu cost, so it
can never be chosen. For example, for the following query:

"select s.products.productid "
+ "from s.products,s.orders "
+ "where s.orders.productid = s.products.productid  "
+ "and units>5 "

the optimized plan I get from Volcano is this:

EnumerableProject(productid=[$0]): rowcount = 5.3999999999999995,
cumulative cost = {51.555956815368326 rows, 37.4 cpu, 0.0 io}, id = 86
  EnumerableJoin(condition=[=($0, $3)], joinType=[inner]): rowcount =
5.3999999999999995, cumulative cost = {46.15595681536833 rows, 32.0 cpu,
0.0 io}, id = 85
    EnumerableTableScan(table=[[s, products]]): rowcount = 6.0, cumulative
cost = {6.0 rows, 7.0 cpu, 0.0 io}, id = 40
    EnumerableFilter(condition=[>($2, 5)]): rowcount = 6.0, cumulative cost
= {18.0 rows, 25.0 cpu, 0.0 io}, id = 84
      EnumerableTableScan(table=[[s, orders]]): rowcount = 12.0, cumulative
cost = {12.0 rows, 13.0 cpu, 0.0 io}, id = 59

However, the plan I would like to get is the one with the projections
pushed down. Under Volcano's cost model it has a bigger cumulative cost,
since we would have two more projections at the sources of the inputs (each
with additional rowCount and cpu cost), but in most cases it is actually
cheaper, since we reduce the memory needed (which is not taken into
consideration here).
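
(To put rough numbers on that, using the costs above: the top
EnumerableProject adds about {5.4 rows, 5.4 cpu} on top of the join, so a
projection pushed below the join over each input would add roughly
{6 rows, 6 cpu} apiece to the cumulative cost, assuming the same per-row
project cost. Under the default model the pushed-down plan can only look
more expensive, because nothing credits the narrower rows flowing into the
join.)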

Are there any suggestions? At first I used HepPlanner, but I didn't get
join orders optimized with the built-in rules (using
JoinPushThroughJoinRule.RIGHT and JoinPushThroughJoinRule.LEFT never
terminates, and using the MultiJoin rules doesn't do anything). Then I
tried VolcanoPlanner and got optimized join orders, but with no pushed-down
projections. I also tried to use a combination of both planners, but I got
errors. What should I do?
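
For illustration, a sketch of one way to combine the two planners without
interleaving them: run the Volcano pass first, then apply the projection
rules in a separate heuristic pass ("volcanoOutput" is a placeholder for
the RelNode returned by planner.transform(...); HepProgram and HepPlanner
are from org.apache.calcite.plan.hep):

HepProgram hepProgram = new HepProgramBuilder()
    .addRuleInstance(ProjectJoinTransposeRule.INSTANCE)
    .addRuleInstance(ProjectMergeRule.INSTANCE)
    .build();
HepPlanner hepPlanner = new HepPlanner(hepProgram);
hepPlanner.setRoot(volcanoOutput);   // placeholder: output of the Volcano pass
RelNode pushedDown = hepPlanner.findBestExp();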

2016-10-27 16:00 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:

> I fixed the second error by changing my Statistics to:
>
> public Statistic getStatistic() {
> int rowCount = rows.size();
> return Statistics.of(rowCount, ImmutableList.<ImmutableBitSet>of());
> }
>
> Any suggestions of a better solution are welcome. The project push-down
> problem still exists...
>
> 2016-10-27 15:42 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
>
>> Hi,
>> I was missing the implementations of operators, and I added the built-in
>> EnumerableRules until I create my own, in order to fix it. However, the
>> plan I get from Volcano Optimizer is different from the one I get from
>> HepPlanner, although I use the same rules. My problem is about Projection
>> push-down. The hepPlanner pushes Projections to the bottom of the RelNode
>> tree, and VolcanoPlanner keeps them always at the top (doesn't push them
>> through joins). I use these rules in both:
>>
>>        ProjectRemoveRule.INSTANCE,
>>        ProjectJoinTransposeRule.INSTANCE,
>>         ProjectFilterTransposeRule.INSTANCE, /*it is better to use
>> filter first and then project*/
>>        ProjectTableScanRule.INSTANCE,
>>        ProjectWindowTransposeRule.INSTANCE,
>>        ProjectMergeRule.INSTANCE
>>
>> and
>>                             EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE,
>>         EnumerableRules.ENUMERABLE_PROJECT_RULE,
>>                             ...
>>
>> Finally, when trying to use an aggregate I get this error:
>>
>> Exception in thread "main" java.lang.AssertionError: Internal error:
>> Error while applying rule EnumerableTableScanRule, args
>> [rel#4:LogicalTableScan.NONE.[](table=[s, orders])]
>> at org.apache.calcite.util.Util.newInternal(Util.java:792)
>> at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(
>> VolcanoRuleCall.java:236)
>> at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(
>> VolcanoPlanner.java:819)
>> at org.apache.calcite.tools.Programs$RuleSetProgram.run(Program
>> s.java:334)
>> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:308)
>> at calcite.VolcanoTester.main(VolcanoTester.java:106)
>> Caused by: java.lang.AssertionError: Internal error: Error occurred while
>> applying rule EnumerableTableScanRule
>> at org.apache.calcite.util.Util.newInternal(Util.java:792)
>> at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(
>> VolcanoRuleCall.java:148)
>> at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRul
>> eCall.java:225)
>> at org.apache.calcite.rel.convert.ConverterRule.onMatch(Convert
>> erRule.java:117)
>> at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(
>> VolcanoRuleCall.java:213)
>> ... 4 more
>> Caused by: java.lang.NullPointerException
>> at org.apache.calcite.schema.Statistics$2.isKey(Statistics.java:70)
>> at org.apache.calcite.prepare.RelOptTableImpl.isKey(RelOptTable
>> Impl.java:288)
>> at org.apache.calcite.rel.metadata.RelMdColumnUniqueness.areCol
>> umnsUnique(RelMdColumnUniqueness.java:76)
>> at GeneratedMetadataHandler_ColumnUniqueness.areColumnsUnique_$(Unknown
>> Source)
>> at GeneratedMetadataHandler_ColumnUniqueness.areColumnsUnique(Unknown
>> Source)
>> at org.apache.calcite.rel.metadata.RelMetadataQuery.areColumnsU
>> nique(RelMetadataQuery.java:461)
>> at org.apache.calcite.rel.metadata.RelMdUtil.areColumnsDefinite
>> lyUnique(RelMdUtil.java:216)
>> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDis
>> tinctRowCount(RelMdDistinctRowCount.java:75)
>> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
>> Source)
>> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
>> Source)
>> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinct
>> RowCount(RelMetadataQuery.java:700)
>> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDis
>> tinctRowCount(RelMdDistinctRowCount.java:292)
>> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
>> Source)
>> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
>> Source)
>> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinct
>> RowCount(RelMetadataQuery.java:700)
>> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDis
>> tinctRowCount(RelMdDistinctRowCount.java:138)
>> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
>> Source)
>> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
>> Source)
>> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinct
>> RowCount(RelMetadataQuery.java:700)
>> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDis
>> tinctRowCount(RelMdDistinctRowCount.java:292)
>> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
>> Source)
>> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
>> Source)
>> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinct
>> RowCount(RelMetadataQuery.java:700)
>> at org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(
>> RelMdRowCount.java:194)
>> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
>> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
>> at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount
>> (RelMetadataQuery.java:201)
>> at org.apache.calcite.rel.core.Aggregate.computeSelfCost(Aggreg
>> ate.java:304)
>> at org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.
>> getNonCumulativeCost(RelMdPercentageOriginalRows.java:162)
>> at GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost_$(Unknown
>> Source)
>> at GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost(Unknown
>> Source)
>> at org.apache.calcite.rel.metadata.RelMetadataQuery.getNonCumul
>> ativeCost(RelMetadataQuery.java:258)
>> at org.apache.calcite.plan.volcano.VolcanoPlanner.getCost(
>> VolcanoPlanner.java:1128)
>> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImpro
>> vements0(RelSubset.java:336)
>> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImpro
>> vements(RelSubset.java:319)
>> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImpro
>> vements0(RelSubset.java:348)
>> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImpro
>> vements(RelSubset.java:319)
>> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImpro
>> vements0(RelSubset.java:348)
>> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImpro
>> vements(RelSubset.java:319)
>> at org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(
>> VolcanoPlanner.java:1830)
>> at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(
>> VolcanoPlanner.java:1766)
>> at org.apache.calcite.plan.volcano.VolcanoPlanner.register(
>> VolcanoPlanner.java:1032)
>> at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegiste
>> red(VolcanoPlanner.java:1052)
>> at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegiste
>> red(VolcanoPlanner.java:1942)
>> at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(
>> VolcanoRuleCall.java:136)
>> ... 7 more
>>
>> I define the Statistics in the tables I use like this:
>> public Statistic getStatistic() {
>> int rowCount = rows.size();
>> return Statistics.of(rowCount, null); //add List<ImmutableBitSet>
>> }
>>
>>
>> Thanks,
>> George
>>
>> 2016-10-16 7:28 GMT+03:00 Jungtaek Lim <ka...@gmail.com>:
>>
>>> Hi George,
>>>
>>> This patch is a ported version (with small fixes) of Milinda's samza-sql
>>> implementation for Storm SQL.
>>> https://github.com/apache/storm/pull/1736
>>>
>>> In this patch I removed the HepPlanner and just rely on the Volcano
>>> Planner (so the patch may be closer to what you want).
>>> For now I also removed the code regarding metadata, since I'm not clear
>>> on how it works and what it helps with, but I'll re-address it once I
>>> can find its usage and benefits.
>>>
>>> Hope this helps.
>>>
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>> On Tue, Oct 4, 2016 at 7:08 PM, Γιώργος Θεοδωράκης <gi...@gmail.com> wrote:
>>>
>>> > I think I did as you said:
>>> >
>>> > https://github.com/giwrgostheod/Calcite-Saber/blob/master/sr
>>> c/main/java/calcite/VolcanoTester.java
>>> >
>>> > and I get for every query I use:
>>> > Exception in thread "main"
>>> > org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node
>>> > [rel#10:Subset#2.NONE.[]] could not be implemented; planner state:
>>> > Root: rel#10:Subset#2.NONE.[]
>>> > Original rel:
>>> > ....
>>> > at
>>> >
>>> > org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplac
>>> er.visit(RelSubset.java:443)
>>> > at
>>> >
>>> > org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(
>>> RelSubset.java:293)
>>> > at
>>> >
>>> > org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(V
>>> olcanoPlanner.java:835)
>>> > at org.apache.calcite.tools.Programs$RuleSetProgram.run(Program
>>> s.java:334)
>>> > at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl
>>> .java:308)
>>> > at calcite.VolcanoTester.main(VolcanoTester.java:77)
>>> >
>>> > My table is defined here:
>>> >
>>> > https://github.com/giwrgostheod/Calcite-Saber/blob/master/sr
>>> c/main/java/calcite/utils/OrdersTableFactory.java
>>> >
>>> >
>>> > Thank you for your time,
>>> > George
>>> >
>>> >
>>> > 2016-10-04 2:38 GMT+03:00 Jordan Halterman <jordan.halterman@gmail.com
>>> >:
>>> >
>>> > > The link you provided is a pretty good example. Build a
>>> FrameworkConfig
>>> > > with your schema, parser config, and other information, and use it to
>>> > > create a Planner. That Planner uses a VolcanoPlanner internally.
>>> What's
>>> > > missing from that particular example is just the addition of
>>> programs.
>>> > > Programs are effectively sets of rules you will use to optimize your
>>> > query.
>>> > > So, to add your FilterProjectTransposeRule to the planner, call this
>>> when
>>> > > building your FrameworkConfig:
>>> > >
>>> > > .programs(Programs.ofRules(FilterProjectTransposeRule.INSTANCE))
>>> > >
>>> > > That adds your program(s) to the set of programs in the planner, and
>>> > those
>>> > > programs can be accessed to optimize the query. Use the planner to
>>> > parse()
>>> > > your query, validate() your query, and then convert() your query
>>> into a
>>> > > logical plan. Then call...
>>> > >
>>> > > RelTraitSet traitSet = planner.getEmptyTraitSet().replace(Convention.NONE);
>>> > > planner.transform(0, traitSet, logicalPlan);
>>> > >
>>> > > to apply the rules you added to the configuration. That should use
>>> the
>>> > > VolcanoPlanner to apply the rules you added in your Program. The
>>> trait
>>> > set
>>> > > that's passed to that method is the required output trait set. So,
>>> if you
>>> > > wanted to convert the logical plan into some physical convention,
>>> you'd
>>> > > pass your physical convention instead of Convention.NONE. I can
>>> respond
>>> > > with a full example if you need it in a bit. I just don't have the
>>> > capacity
>>> > > to write it ATM.
>>> > >
>>> > > On Mon, Oct 3, 2016 at 8:51 AM, Γιώργος Θεοδωράκης <
>>> > > giwrgosrtheod@gmail.com>
>>> > > wrote:
>>> > >
>>> > > > Hi,
>>> > > >
>>> > > > I want to parse an Sql query and transform it to an optimized
>>> > relational
>>> > > > plan (not convert it to physical !!) using calcite rules based on
>>> my
>>> > > > database schema and metadata. Right now, the only helpful example I
>>> > have
>>> > > > found for my purpose is taken from
>>> > > > https://github.com/milinda/samza-sql/blob/master/samza-
>>> > > >
>>> > sql-planner/src/main/java/org/apache/samza/sql/planner/Query
>>> Planner.java
>>> > > > ,
>>> > > > in which a simple Planner is used for parsing and validating Sql
>>> and a
>>> > > > HepPlanner is used for searching for an optimized plan based on
>>> > imported
>>> > > > rules.
>>> > > >
>>> > > > Is there any way to use in my case the VolcanoPlanner? The only
>>> > examples
>>> > > I
>>> > > > have seen so far from the test classes suggest that it should be
>>> used
>>> > for
>>> > > > converting relational expressions to physical ones. How can I make
>>> the
>>> > > > Volcano Planner "see" my SchemaPlus schema ,when I can only define
>>> > > > RelOptSchema? Can someone provide me with a complete example of
>>> using
>>> > > > Volcano Planner and adding rules, such
>>> > > > as FilterProjectTransposeRule.INSTANCE?
>>> > > >
>>> > > > Thanks in advance,
>>> > > > George
>>> > > >
>>> > >
>>> >
>>>
>>
>>
>

Re: How to use VolcanoPlanner

Posted by Γιώργος Θεοδωράκης <gi...@gmail.com>.
I fixed the second error by changing my Statistics to:

public Statistic getStatistic() {
  int rowCount = rows.size();
  return Statistics.of(rowCount, ImmutableList.<ImmutableBitSet>of());
}

Any suggestions of a better solution are welcome. The project push-down
problem still exists...
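
A sketch of the same method with a declared key, in case the table really
does have a unique column (assumptions: column 0, productid, is the unique
column; ImmutableBitSet is org.apache.calcite.util.ImmutableBitSet):

public Statistic getStatistic() {
  int rowCount = rows.size();
  // Assumption: column 0 is unique, so declare it as a key; the empty
  // list above is the safe fallback when no key is known.
  return Statistics.of(rowCount,
      ImmutableList.of(ImmutableBitSet.of(0)));
}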

2016-10-27 15:42 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:

> Hi,
> I was missing the implementations of operators, and I added the built-in
> EnumerableRules until I create my own, in order to fix it. However, the
> plan I get from Volcano Optimizer is different from the one I get from
> HepPlanner, although I use the same rules. My problem is about Projection
> push-down. The hepPlanner pushes Projections to the bottom of the RelNode
> tree, and VolcanoPlanner keeps them always at the top (doesn't push them
> through joins). I use these rules in both:
>
>        ProjectRemoveRule.INSTANCE,
>        ProjectJoinTransposeRule.INSTANCE,
>         ProjectFilterTransposeRule.INSTANCE, /*it is better to use filter
> first and then project*/
>        ProjectTableScanRule.INSTANCE,
>        ProjectWindowTransposeRule.INSTANCE,
>        ProjectMergeRule.INSTANCE
>
> and
>                             EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE,
>         EnumerableRules.ENUMERABLE_PROJECT_RULE,
>                             ...
>
> Finally, when trying to use an aggregate I get this error:
>
> Exception in thread "main" java.lang.AssertionError: Internal error: Error
> while applying rule EnumerableTableScanRule, args
> [rel#4:LogicalTableScan.NONE.[](table=[s, orders])]
> at org.apache.calcite.util.Util.newInternal(Util.java:792)
> at org.apache.calcite.plan.volcano.VolcanoRuleCall.
> onMatch(VolcanoRuleCall.java:236)
> at org.apache.calcite.plan.volcano.VolcanoPlanner.
> findBestExp(VolcanoPlanner.java:819)
> at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:334)
> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:308)
> at calcite.VolcanoTester.main(VolcanoTester.java:106)
> Caused by: java.lang.AssertionError: Internal error: Error occurred while
> applying rule EnumerableTableScanRule
> at org.apache.calcite.util.Util.newInternal(Util.java:792)
> at org.apache.calcite.plan.volcano.VolcanoRuleCall.
> transformTo(VolcanoRuleCall.java:148)
> at org.apache.calcite.plan.RelOptRuleCall.transformTo(
> RelOptRuleCall.java:225)
> at org.apache.calcite.rel.convert.ConverterRule.onMatch(
> ConverterRule.java:117)
> at org.apache.calcite.plan.volcano.VolcanoRuleCall.
> onMatch(VolcanoRuleCall.java:213)
> ... 4 more
> Caused by: java.lang.NullPointerException
> at org.apache.calcite.schema.Statistics$2.isKey(Statistics.java:70)
> at org.apache.calcite.prepare.RelOptTableImpl.isKey(
> RelOptTableImpl.java:288)
> at org.apache.calcite.rel.metadata.RelMdColumnUniqueness.areColumnsUnique(
> RelMdColumnUniqueness.java:76)
> at GeneratedMetadataHandler_ColumnUniqueness.areColumnsUnique_$(Unknown
> Source)
> at GeneratedMetadataHandler_ColumnUniqueness.areColumnsUnique(Unknown
> Source)
> at org.apache.calcite.rel.metadata.RelMetadataQuery.areColumnsUnique(
> RelMetadataQuery.java:461)
> at org.apache.calcite.rel.metadata.RelMdUtil.areColumnsDefinitelyUnique(
> RelMdUtil.java:216)
> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.
> getDistinctRowCount(RelMdDistinctRowCount.java:75)
> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
> Source)
> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
> Source)
> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(
> RelMetadataQuery.java:700)
> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.
> getDistinctRowCount(RelMdDistinctRowCount.java:292)
> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
> Source)
> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
> Source)
> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(
> RelMetadataQuery.java:700)
> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.
> getDistinctRowCount(RelMdDistinctRowCount.java:138)
> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
> Source)
> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
> Source)
> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(
> RelMetadataQuery.java:700)
> at org.apache.calcite.rel.metadata.RelMdDistinctRowCount.
> getDistinctRowCount(RelMdDistinctRowCount.java:292)
> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
> Source)
> at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
> Source)
> at org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(
> RelMetadataQuery.java:700)
> at org.apache.calcite.rel.metadata.RelMdRowCount.
> getRowCount(RelMdRowCount.java:194)
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
> at org.apache.calcite.rel.metadata.RelMetadataQuery.
> getRowCount(RelMetadataQuery.java:201)
> at org.apache.calcite.rel.core.Aggregate.computeSelfCost(
> Aggregate.java:304)
> at org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.
> getNonCumulativeCost(RelMdPercentageOriginalRows.java:162)
> at GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost_$(Unknown
> Source)
> at GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost(Unknown
> Source)
> at org.apache.calcite.rel.metadata.RelMetadataQuery.getNonCumulativeCost(
> RelMetadataQuery.java:258)
> at org.apache.calcite.plan.volcano.VolcanoPlanner.
> getCost(VolcanoPlanner.java:1128)
> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(
> RelSubset.java:336)
> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(
> RelSubset.java:319)
> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(
> RelSubset.java:348)
> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(
> RelSubset.java:319)
> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(
> RelSubset.java:348)
> at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(
> RelSubset.java:319)
> at org.apache.calcite.plan.volcano.VolcanoPlanner.
> addRelToSet(VolcanoPlanner.java:1830)
> at org.apache.calcite.plan.volcano.VolcanoPlanner.
> registerImpl(VolcanoPlanner.java:1766)
> at org.apache.calcite.plan.volcano.VolcanoPlanner.
> register(VolcanoPlanner.java:1032)
> at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(
> VolcanoPlanner.java:1052)
> at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(
> VolcanoPlanner.java:1942)
> at org.apache.calcite.plan.volcano.VolcanoRuleCall.
> transformTo(VolcanoRuleCall.java:136)
> ... 7 more
>
> I define the Statistics in the tables I use like this:
> public Statistic getStatistic() {
> int rowCount = rows.size();
> return Statistics.of(rowCount, null); //add List<ImmutableBitSet>
> }
>
>
> Thanks,
> George
>
> 2016-10-16 7:28 GMT+03:00 Jungtaek Lim <ka...@gmail.com>:
>
>> Hi George,
>>
>> This patch is a ported version (with small fixes) of Milinda's samza-sql
>> implementation for Storm SQL.
>> https://github.com/apache/storm/pull/1736
>>
>> In this patch I removed the HepPlanner and just rely on the Volcano Planner
>> (so the patch may be closer to what you want).
>> For now I also removed the code regarding metadata, since I'm not clear on
>> how it works and what it helps with, but I'll re-address it once I can find
>> its usage and benefits.
>>
>> Hope this helps.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Tue, Oct 4, 2016 at 7:08 PM, Γιώργος Θεοδωράκης <gi...@gmail.com> wrote:
>>
>> > I think I did as you said:
>> >
>> > https://github.com/giwrgostheod/Calcite-Saber/blob/master/
>> src/main/java/calcite/VolcanoTester.java
>> >
>> > and I get for every query I use:
>> > Exception in thread "main"
>> > org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node
>> > [rel#10:Subset#2.NONE.[]] could not be implemented; planner state:
>> > Root: rel#10:Subset#2.NONE.[]
>> > Original rel:
>> > ....
>> > at
>> >
>> > org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplac
>> er.visit(RelSubset.java:443)
>> > at
>> >
>> > org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(
>> RelSubset.java:293)
>> > at
>> >
>> > org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(
>> VolcanoPlanner.java:835)
>> > at org.apache.calcite.tools.Programs$RuleSetProgram.run(Program
>> s.java:334)
>> > at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl
>> .java:308)
>> > at calcite.VolcanoTester.main(VolcanoTester.java:77)
>> >
>> > My table is defined here:
>> >
>> > https://github.com/giwrgostheod/Calcite-Saber/blob/master/
>> src/main/java/calcite/utils/OrdersTableFactory.java
>> >
>> >
>> > Thank you for your time,
>> > George
>> >
>> >
>> > 2016-10-04 2:38 GMT+03:00 Jordan Halterman <jordan.halterman@gmail.com
>> >:
>> >
>> > > The link you provided is a pretty good example. Build a
>> FrameworkConfig
>> > > with your schema, parser config, and other information, and use it to
>> > > create a Planner. That Planner uses a VolcanoPlanner internally.
>> What's
>> > > missing from that particular example is just the addition of programs.
>> > > Programs are effectively sets of rules you will use to optimize your
>> > query.
>> > > So, to add your FilterProjectTransposeRule to the planner, call this
>> when
>> > > building your FrameworkConfig:
>> > >
>> > > .programs(Programs.ofRules(FilterProjectTransposeRule.INSTANCE))
>> > >
>> > > That adds your program(s) to the set of programs in the planner, and
>> > those
>> > > programs can be accessed to optimize the query. Use the planner to
>> > parse()
>> > > your query, validate() your query, and then convert() your query into
>> a
>> > > logical plan. Then call...
>> > >
>> > > RelTraitSet traitSet = planner.getEmptyTraitSet().replace(Convention.NONE);
>> > > planner.transform(0, traitSet, logicalPlan);
>> > >
>> > > to apply the rules you added to the configuration. That should use the
>> > > VolcanoPlanner to apply the rules you added in your Program. The trait
>> > set
>> > > that's passed to that method is the required output trait set. So, if
>> you
>> > > wanted to convert the logical plan into some physical convention,
>> you'd
>> > > pass your physical convention instead of Convention.NONE. I can
>> respond
>> > > with a full example if you need it in a bit. I just don't have the
>> > capacity
>> > > to write it ATM.
>> > >
>> > > On Mon, Oct 3, 2016 at 8:51 AM, Γιώργος Θεοδωράκης <
>> > > giwrgosrtheod@gmail.com>
>> > > wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I want to parse an Sql query and transform it to an optimized
>> > relational
>> > > > plan (not convert it to physical !!) using calcite rules based on my
>> > > > database schema and metadata. Right now, the only helpful example I
>> > have
>> > > > found for my purpose is taken from
>> > > > https://github.com/milinda/samza-sql/blob/master/samza-
>> > > >
>> > sql-planner/src/main/java/org/apache/samza/sql/planner/Query
>> Planner.java
>> > > > ,
>> > > > in which a simple Planner is used for parsing and validating Sql
>> and a
>> > > > HepPlanner is used for searching for an optimized plan based on
>> > imported
>> > > > rules.
>> > > >
>> > > > Is there any way to use in my case the VolcanoPlanner? The only
>> > examples
>> > > I
>> > > > have seen so far from the test classes suggest that it should be
>> used
>> > for
>> > > > converting relational expressions to physical ones. How can I make
>> the
>> > > > Volcano Planner "see" my SchemaPlus schema ,when I can only define
>> > > > RelOptSchema? Can someone provide me with a complete example of
>> using
>> > > > Volcano Planner and adding rules, such
>> > > > as FilterProjectTransposeRule.INSTANCE?
>> > > >
>> > > > Thanks in advance,
>> > > > George
>> > > >
>> > >
>> >
>>
>
>

Re: How to use VolcanoPlanner

Posted by Γιώργος Θεοδωράκης <gi...@gmail.com>.
Hi,
I was missing the implementations of the operators, and to fix it I added
the built-in EnumerableRules until I create my own. However, the plan I get
from the Volcano optimizer is different from the one I get from HepPlanner,
although I use the same rules. My problem is projection push-down:
HepPlanner pushes projections to the bottom of the RelNode tree, while
VolcanoPlanner always keeps them at the top (it doesn't push them through
joins). I use these rules in both (wired together in the sketch after the
lists):

        ProjectRemoveRule.INSTANCE,
        ProjectJoinTransposeRule.INSTANCE,
        ProjectFilterTransposeRule.INSTANCE, /* it is better to filter first and then project */
        ProjectTableScanRule.INSTANCE,
        ProjectWindowTransposeRule.INSTANCE,
        ProjectMergeRule.INSTANCE

and
        EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE,
        EnumerableRules.ENUMERABLE_PROJECT_RULE,
        ...
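
For reference, a minimal sketch of handing both groups of rules to the
Volcano-backed Planner as a single program ("schema" is a placeholder for
the SchemaPlus in use; RuleSets and Programs are from
org.apache.calcite.tools):

RuleSet rules = RuleSets.ofList(
    ProjectRemoveRule.INSTANCE,
    ProjectJoinTransposeRule.INSTANCE,
    ProjectFilterTransposeRule.INSTANCE,
    ProjectTableScanRule.INSTANCE,
    ProjectWindowTransposeRule.INSTANCE,
    ProjectMergeRule.INSTANCE,
    EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE,
    EnumerableRules.ENUMERABLE_PROJECT_RULE);
FrameworkConfig config = Frameworks.newConfigBuilder()
    .defaultSchema(schema)           // placeholder: your SchemaPlus
    .programs(Programs.of(rules))
    .build();
Planner planner = Frameworks.getPlanner(config);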

Finally, when trying to use an aggregate I get this error:

Exception in thread "main" java.lang.AssertionError: Internal error: Error
while applying rule EnumerableTableScanRule, args
[rel#4:LogicalTableScan.NONE.[](table=[s, orders])]
at org.apache.calcite.util.Util.newInternal(Util.java:792)
at
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:236)
at
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:819)
at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:334)
at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:308)
at calcite.VolcanoTester.main(VolcanoTester.java:106)
Caused by: java.lang.AssertionError: Internal error: Error occurred while
applying rule EnumerableTableScanRule
at org.apache.calcite.util.Util.newInternal(Util.java:792)
at
org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
at
org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:225)
at
org.apache.calcite.rel.convert.ConverterRule.onMatch(ConverterRule.java:117)
at
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:213)
... 4 more
Caused by: java.lang.NullPointerException
at org.apache.calcite.schema.Statistics$2.isKey(Statistics.java:70)
at
org.apache.calcite.prepare.RelOptTableImpl.isKey(RelOptTableImpl.java:288)
at
org.apache.calcite.rel.metadata.RelMdColumnUniqueness.areColumnsUnique(RelMdColumnUniqueness.java:76)
at GeneratedMetadataHandler_ColumnUniqueness.areColumnsUnique_$(Unknown
Source)
at GeneratedMetadataHandler_ColumnUniqueness.areColumnsUnique(Unknown
Source)
at
org.apache.calcite.rel.metadata.RelMetadataQuery.areColumnsUnique(RelMetadataQuery.java:461)
at
org.apache.calcite.rel.metadata.RelMdUtil.areColumnsDefinitelyUnique(RelMdUtil.java:216)
at
org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDistinctRowCount(RelMdDistinctRowCount.java:75)
at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
Source)
at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
Source)
at
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:700)
at
org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDistinctRowCount(RelMdDistinctRowCount.java:292)
at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
Source)
at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
Source)
at
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:700)
at
org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDistinctRowCount(RelMdDistinctRowCount.java:138)
at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
Source)
at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
Source)
at
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:700)
at
org.apache.calcite.rel.metadata.RelMdDistinctRowCount.getDistinctRowCount(RelMdDistinctRowCount.java:292)
at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown
Source)
at GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown
Source)
at
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:700)
at
org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:194)
at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
at
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
at org.apache.calcite.rel.core.Aggregate.computeSelfCost(Aggregate.java:304)
at
org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getNonCumulativeCost(RelMdPercentageOriginalRows.java:162)
at
GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost_$(Unknown
Source)
at GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost(Unknown
Source)
at
org.apache.calcite.rel.metadata.RelMetadataQuery.getNonCumulativeCost(RelMetadataQuery.java:258)
at
org.apache.calcite.plan.volcano.VolcanoPlanner.getCost(VolcanoPlanner.java:1128)
at
org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:336)
at
org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:319)
at
org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:348)
at
org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:319)
at
org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:348)
at
org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:319)
at
org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1830)
at
org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1766)
at
org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:1032)
at
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1052)
at
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1942)
at
org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:136)
... 7 more

I define the Statistics in the tables I use like this:
public Statistic getStatistic() {
  int rowCount = rows.size();
  // Passing null for the key list is what the NullPointerException above
  // comes from (Statistics$2.isKey); the fix shown earlier in the thread
  // passes an empty List<ImmutableBitSet> instead.
  return Statistics.of(rowCount, null); // add List<ImmutableBitSet>
}


Thanks,
George

2016-10-16 7:28 GMT+03:00 Jungtaek Lim <ka...@gmail.com>:

> Hi George,
>
> This patch is a ported version (with small fixes) of Milinda's samza-sql
> implementation for Storm SQL.
> https://github.com/apache/storm/pull/1736
>
> In this patch I removed the HepPlanner and just rely on the Volcano Planner
> (so the patch may be closer to what you want).
> For now I also removed the code regarding metadata, since I'm not clear on
> how it works and what it helps with, but I'll re-address it once I can find
> its usage and benefits.
>
> Hope this helps.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Tue, Oct 4, 2016 at 7:08 PM, Γιώργος Θεοδωράκης <gi...@gmail.com> wrote:
>
> > I think I did as you said:
> >
> > https://github.com/giwrgostheod/Calcite-Saber/blob/master/src/main/java/
> calcite/VolcanoTester.java
> >
> > and I get for every query I use:
> > Exception in thread "main"
> > org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node
> > [rel#10:Subset#2.NONE.[]] could not be implemented; planner state:
> > Root: rel#10:Subset#2.NONE.[]
> > Original rel:
> > ....
> > at
> >
> > org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(
> RelSubset.java:443)
> > at
> >
> > org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.
> java:293)
> > at
> >
> > org.apache.calcite.plan.volcano.VolcanoPlanner.
> findBestExp(VolcanoPlanner.java:835)
> > at org.apache.calcite.tools.Programs$RuleSetProgram.run(
> Programs.java:334)
> > at org.apache.calcite.prepare.PlannerImpl.transform(
> PlannerImpl.java:308)
> > at calcite.VolcanoTester.main(VolcanoTester.java:77)
> >
> > My table is defined here:
> >
> > https://github.com/giwrgostheod/Calcite-Saber/blob/master/src/main/java/
> calcite/utils/OrdersTableFactory.java
> >
> >
> > Thank you for your time,
> > George
> >
> >
> > 2016-10-04 2:38 GMT+03:00 Jordan Halterman <jo...@gmail.com>:
> >
> > > The link you provided is a pretty good example. Build a FrameworkConfig
> > > with your schema, parser config, and other information, and use it to
> > > create a Planner. That Planner uses a VolcanoPlanner internally. What's
> > > missing from that particular example is just the addition of programs.
> > > Programs are effectively sets of rules you will use to optimize your
> > query.
> > > So, to add your FilterProjectTransposeRule to the planner, call this
> when
> > > building your FrameworkConfig:
> > >
> > > .programs(Programs.ofRules(FilterProjectTransposeRule.INSTANCE))
> > >
> > > That adds your program(s) to the set of programs in the planner, and
> > those
> > > programs can be accessed to optimize the query. Use the planner to
> > parse()
> > > your query, validate() your query, and then convert() your query into a
> > > logical plan. Then call...
> > >
> > > RelTraitSet traitSet = planner.getEmptyTraitSet().replace(Convention.NONE);
> > > planner.transform(0, traitSet, logicalPlan);
> > >
> > > to apply the rules you added to the configuration. That should use the
> > > VolcanoPlanner to apply the rules you added in your Program. The trait
> > set
> > > that's passed to that method is the required output trait set. So, if
> you
> > > wanted to convert the logical plan into some physical convention, you'd
> > > pass your physical convention instead of Convention.NONE. I can respond
> > > with a full example if you need it in a bit. I just don't have the
> > capacity
> > > to write it ATM.
> > >
> > > On Mon, Oct 3, 2016 at 8:51 AM, Γιώργος Θεοδωράκης <
> > > giwrgosrtheod@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I want to parse an Sql query and transform it to an optimized
> > relational
> > > > plan (not convert it to physical !!) using calcite rules based on my
> > > > database schema and metadata. Right now, the only helpful example I
> > have
> > > > found for my purpose is taken from
> > > > https://github.com/milinda/samza-sql/blob/master/samza-
> > > >
> > sql-planner/src/main/java/org/apache/samza/sql/planner/QueryPlanner.java
> > > > ,
> > > > in which a simple Planner is used for parsing and validating Sql and
> a
> > > > HepPlanner is used for searching for an optimized plan based on
> > imported
> > > > rules.
> > > >
> > > > Is there any way to use in my case the VolcanoPlanner? The only
> > examples
> > > I
> > > > have seen so far from the test classes suggest that it should be used
> > for
> > > > converting relational expressions to physical ones. How can I make
> the
> > > > Volcano Planner "see" my SchemaPlus schema ,when I can only define
> > > > RelOptSchema? Can someone provide me with a complete example of using
> > > > Volcano Planner and adding rules, such
> > > > as FilterProjectTransposeRule.INSTANCE?
> > > >
> > > > Thanks in advance,
> > > > George
> > > >
> > >
> >
>

Re: How to use VolcanoPlanner

Posted by Jungtaek Lim <ka...@gmail.com>.
Hi George,

This patch is a ported version (with small fixes) of Milinda's samza-sql
implementation for Storm SQL.
https://github.com/apache/storm/pull/1736

In this patch I removed the HepPlanner and just rely on the Volcano Planner
(so the patch may be closer to what you want).
For now I also removed the code regarding metadata, since I'm not clear on
how it works and what it helps with, but I'll re-address it once I can find
its usage and benefits.

Hope this helps.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Tue, Oct 4, 2016 at 7:08 PM, Γιώργος Θεοδωράκης <gi...@gmail.com> wrote:

> I think I did as you said:
>
> https://github.com/giwrgostheod/Calcite-Saber/blob/master/src/main/java/calcite/VolcanoTester.java
>
> and I get for every query I use:
> Exception in thread "main"
> org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node
> [rel#10:Subset#2.NONE.[]] could not be implemented; planner state:
> Root: rel#10:Subset#2.NONE.[]
> Original rel:
> ....
> at
>
> org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:443)
> at
>
> org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:293)
> at
>
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:835)
> at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:334)
> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:308)
> at calcite.VolcanoTester.main(VolcanoTester.java:77)
>
> My table is defined here:
>
> https://github.com/giwrgostheod/Calcite-Saber/blob/master/src/main/java/calcite/utils/OrdersTableFactory.java
>
>
> Thank you for your time,
> George
>
>
> 2016-10-04 2:38 GMT+03:00 Jordan Halterman <jo...@gmail.com>:
>
> > The link you provided is a pretty good example. Build a FrameworkConfig
> > with your schema, parser config, and other information, and use it to
> > create a Planner. That Planner uses a VolcanoPlanner internally. What's
> > missing from that particular example is just the addition of programs.
> > Programs are effectively sets of rules you will use to optimize your
> query.
> > So, to add your FilterProjectTransposeRule to the planner, call this when
> > building your FrameworkConfig:
> >
> > .programs(Programs.ofRules(FilterProjectTransposeRule.INSTANCE))
> >
> > That adds your program(s) to the set of programs in the planner, and
> those
> > programs can be accessed to optimize the query. Use the planner to
> parse()
> > your query, validate() your query, and then convert() your query into a
> > logical plan. Then call...
> >
> > RelTraitSet traitSet = planner.getEmptyTraitSet().replace(Convention.NONE);
> > planner.transform(0, traitSet, logicalPlan);
> >
> > to apply the rules you added to the configuration. That should use the
> > VolcanoPlanner to apply the rules you added in your Program. The trait
> set
> > that's passed to that method is the required output trait set. So, if you
> > wanted to convert the logical plan into some physical convention, you'd
> > pass your physical convention instead of Convention.NONE. I can respond
> > with a full example if you need it in a bit. I just don't have the
> capacity
> > to write it ATM.
> >
> > On Mon, Oct 3, 2016 at 8:51 AM, Γιώργος Θεοδωράκης <
> > giwrgosrtheod@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I want to parse an Sql query and transform it to an optimized
> relational
> > > plan (not convert it to physical !!) using calcite rules based on my
> > > database schema and metadata. Right now, the only helpful example I
> have
> > > found for my purpose is taken from
> > > https://github.com/milinda/samza-sql/blob/master/samza-
> > >
> sql-planner/src/main/java/org/apache/samza/sql/planner/QueryPlanner.java
> > > ,
> > > in which a simple Planner is used for parsing and validating Sql and a
> > > HepPlanner is used for searching for an optimized plan based on
> imported
> > > rules.
> > >
> > > Is there any way to use in my case the VolcanoPlanner? The only
> examples
> > I
> > > have seen so far from the test classes suggest that it should be used
> for
> > > converting relational expressions to physical ones. How can I make the
> > > Volcano Planner "see" my SchemaPlus schema ,when I can only define
> > > RelOptSchema? Can someone provide me with a complete example of using
> > > Volcano Planner and adding rules, such
> > > as FilterProjectTransposeRule.INSTANCE?
> > >
> > > Thanks in advance,
> > > George
> > >
> >
>

Re: How to use VolcanoPlanner

Posted by Γιώργος Θεοδωράκης <gi...@gmail.com>.
I think I did as you said:
https://github.com/giwrgostheod/Calcite-Saber/blob/master/src/main/java/calcite/VolcanoTester.java

and for every query I use I get:
Exception in thread "main"
org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node
[rel#10:Subset#2.NONE.[]] could not be implemented; planner state:
Root: rel#10:Subset#2.NONE.[]
Original rel:
....
at
org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:443)
at
org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:293)
at
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:835)
at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:334)
at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:308)
at calcite.VolcanoTester.main(VolcanoTester.java:77)

My table is defined here:
https://github.com/giwrgostheod/Calcite-Saber/blob/master/src/main/java/calcite/utils/OrdersTableFactory.java
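
As the follow-up messages in this thread show, the CannotPlanException goes
away once converter rules for a runnable convention are registered. A
sketch of the relevant additions ("schema", "planner" and "logicalPlan" are
placeholders; EnumerableConvention is
org.apache.calcite.adapter.enumerable.EnumerableConvention):

FrameworkConfig config = Frameworks.newConfigBuilder()
    .defaultSchema(schema)   // placeholder: your SchemaPlus
    .programs(Programs.ofRules(
        FilterProjectTransposeRule.INSTANCE,
        EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE,
        EnumerableRules.ENUMERABLE_PROJECT_RULE,
        EnumerableRules.ENUMERABLE_FILTER_RULE,
        EnumerableRules.ENUMERABLE_JOIN_RULE))
    .build();
// Ask for a convention the registered rules can implement:
RelTraitSet traits = logicalPlan.getTraitSet()
    .replace(EnumerableConvention.INSTANCE);
RelNode physical = planner.transform(0, traits, logicalPlan);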


Thank you for your time,
George


2016-10-04 2:38 GMT+03:00 Jordan Halterman <jo...@gmail.com>:

> The link you provided is a pretty good example. Build a FrameworkConfig
> with your schema, parser config, and other information, and use it to
> create a Planner. That Planner uses a VolcanoPlanner internally. What's
> missing from that particular example is just the addition of programs.
> Programs are effectively sets of rules you will use to optimize your query.
> So, to add your FilterProjectTransposeRule to the planner, call this when
> building your FrameworkConfig:
>
> .programs(Programs.ofRules(FilterProjectTransposeRule.INSTANCE))
>
> That adds your program(s) to the set of programs in the planner, and those
> programs can be accessed to optimize the query. Use the planner to parse()
> your query, validate() your query, and then convert() your query into a
> logical plan. Then call...
>
> RelTraitSet traitSet = planner.getEmptyTraitSet().replace(Convention.NONE);
> planner.transform(0, traitSet, logicalPlan);
>
> to apply the rules you added to the configuration. That should use the
> VolcanoPlanner to apply the rules you added in your Program. The trait set
> that's passed to that method is the required output trait set. So, if you
> wanted to convert the logical plan into some physical convention, you'd
> pass your physical convention instead of Convention.NONE. I can respond
> with a full example if you need it in a bit. I just don't have the capacity
> to write it ATM.
>
> On Mon, Oct 3, 2016 at 8:51 AM, Γιώργος Θεοδωράκης <
> giwrgosrtheod@gmail.com>
> wrote:
>
> > Hi,
> >
> > I want to parse an Sql query and transform it to an optimized relational
> > plan (not convert it to physical !!) using calcite rules based on my
> > database schema and metadata. Right now, the only helpful example I have
> > found for my purpose is taken from
> > https://github.com/milinda/samza-sql/blob/master/samza-
> > sql-planner/src/main/java/org/apache/samza/sql/planner/QueryPlanner.java
> > ,
> > in which a simple Planner is used for parsing and validating Sql and a
> > HepPlanner is used for searching for an optimized plan based on imported
> > rules.
> >
> > Is there any way to use in my case the VolcanoPlanner? The only examples
> I
> > have seen so far from the test classes suggest that it should be used for
> > converting relational expressions to physical ones. How can I make the
> > Volcano Planner "see" my SchemaPlus schema ,when I can only define
> > RelOptSchema? Can someone provide me with a complete example of using
> > Volcano Planner and adding rules, such
> > as FilterProjectTransposeRule.INSTANCE?
> >
> > Thanks in advance,
> > George
> >
>

Re: How to use VolcanoPlanner

Posted by Jordan Halterman <jo...@gmail.com>.
The link you provided is a pretty good example. Build a FrameworkConfig
with your schema, parser config, and other information, and use it to
create a Planner. That Planner uses a VolcanoPlanner internally. What's
missing from that particular example is just the addition of programs.
Programs are effectively sets of rules you will use to optimize your query.
So, to add your FilterProjectTransposeRule to the planner, call this when
building your FrameworkConfig:

.programs(Programs.ofRules(FilterProjectTransposeRule.INSTANCE))

That adds your program(s) to the set of programs in the planner, and those
programs can be accessed to optimize the query. Use the planner to parse()
your query, validate() your query, and then convert() your query into a
logical plan. Then call...

RelTraitSet traitSet = planner.getEmptyTraitSet().replace(Convention.NONE);
planner.transform(0, traitSet, logicalPlan);

to apply the rules you added to the configuration. That should use the
VolcanoPlanner to apply the rules you added in your Program. The trait set
that's passed to that method is the required output trait set. So, if you
wanted to convert the logical plan into some physical convention, you'd
pass your physical convention instead of Convention.NONE. I can respond
with a full example if you need it in a bit. I just don't have the capacity
to write it ATM.
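
A minimal end-to-end sketch of the steps above ("schema" and "sql" are
placeholders; checked exceptions elided):

FrameworkConfig config = Frameworks.newConfigBuilder()
    .defaultSchema(schema)   // the SchemaPlus the planner should "see"
    .programs(Programs.ofRules(FilterProjectTransposeRule.INSTANCE))
    .build();
Planner planner = Frameworks.getPlanner(config);

SqlNode parsed = planner.parse(sql);
SqlNode validated = planner.validate(parsed);
RelNode logicalPlan = planner.convert(validated);

// 0 is the index of the program registered above.
RelTraitSet traitSet = logicalPlan.getTraitSet().replace(Convention.NONE);
RelNode optimized = planner.transform(0, traitSet, logicalPlan);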

On Mon, Oct 3, 2016 at 8:51 AM, Γιώργος Θεοδωράκης <gi...@gmail.com>
wrote:

> Hi,
>
> I want to parse an Sql query and transform it to an optimized relational
> plan (not convert it to physical !!) using calcite rules based on my
> database schema and metadata. Right now, the only helpful example I have
> found for my purpose is taken from
> https://github.com/milinda/samza-sql/blob/master/samza-
> sql-planner/src/main/java/org/apache/samza/sql/planner/QueryPlanner.java
> ,
> in which a simple Planner is used for parsing and validating Sql and a
> HepPlanner is used for searching for an optimized plan based on imported
> rules.
>
> Is there any way to use in my case the VolcanoPlanner? The only examples I
> have seen so far from the test classes suggest that it should be used for
> converting relational expressions to physical ones. How can I make the
> Volcano Planner "see" my SchemaPlus schema ,when I can only define
> RelOptSchema? Can someone provide me with a complete example of using
> Volcano Planner and adding rules, such
> as FilterProjectTransposeRule.INSTANCE?
>
> Thanks in advance,
> George
>