Posted to issues@calcite.apache.org by "Danny Chen (Jira)" <ji...@apache.org> on 2020/01/11 02:27:00 UTC

[jira] [Comment Edited] (CALCITE-2356) Allow user defined policy to dynamically define all or some specific rules' execution order or even skip some rules

    [ https://issues.apache.org/jira/browse/CALCITE-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013337#comment-17013337 ] 

Danny Chen edited comment on CALCITE-2356 at 1/11/20 2:26 AM:
--------------------------------------------------------------

Here is what Roman has tried in our thread:  [DISCUSS] Proposal to add API to force rules matching specific rels

As Vladimir Ozerov mentioned, the general problem here is that
VolcanoPlanner applies both logical and physical rules in a top-down
manner. It's more like the original volcano paper approach [1]:

bq. In the Volcano search strategy, a first phase applied all
bq. transformation rules to create all possible logical expressions for a query and all its subtrees. The second phase,
bq. which performed the actual optimization, navigated within that network of equivalence classes and expressions,
bq. applied implementation rules to obtain plans, and determined the best plan.

The Cascades paper [2] says

bq. In the Cascades optimizer, this separation into two phases is abolished, because it is not useful to derive all
bq. logically equivalent forms of all expressions, e.g., of a predicate. A group is explored using transformation rules
bq. only on demand

Unfortunately, it is not clear from this paper what the actual "on
demand" phrase means. But a great explanation of the Cascades
optimization process can be found in [3]:

bq. It begins with the input query and, ..., 
bq. proceeds top-down, using recursion on the inputs
bq. of the current multiexpression MExpr. However, plan
bq. costing actually proceeds bottom-up, based on the order
bq. of the returns from the top-down recursive calls.

and this is what VolcanoPlanner lacks: when a logical rule is
applied in a top-down way and a new LogicalRelNode is created, we need
to understand whether this move improved the overall plan cost or
not. To understand that, we need to know the actual cost of the
new plan. As [3] says:

bq. A plan is an expression made up entirely of physical operators.

Hence, to evaluate the plan cost we need a fully implemented
physical tree of operators for the new LogicalRelNode that was created
while applying the logical rule above.

This process can be described with the pseudo-code:

for (LogicalRuleMatch match : logicalRuleMatchesSortedTopDown) {
  // apply the logical rule first (top-down order) ...
  LogicalNode newNode = match.apply();
  // ... then physically implement the new node's subtree (bottom-up)
  implementRecursivelyBottomUp(newNode);
}

We first apply logical rules in a top-down fashion; then, after each
logical rule invocation, we implement the newly created nodes with
the physical rules recursively, in a bottom-up way.

The outcome here is that when a physical converter rule is applied to a
RelNode, all children of this node are already implemented, so their
traits can easily be derived and their precise cost can be calculated.
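The bottom-up implementation step above can be sketched as a simple post-order traversal. This is a hypothetical illustration only: the Node class and method names below are invented for the sketch and are not Calcite API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not actual Calcite API: Node and the method names
// below are illustrative stand-ins for RelNodes and physical implementation.
class BottomUpImplementor {

    static class Node {
        final String name;
        final List<Node> inputs = new ArrayList<>();

        Node(String name, Node... inputs) {
            this.name = name;
            for (Node n : inputs) {
                this.inputs.add(n);
            }
        }
    }

    // Post-order traversal: every input is "implemented" (visited) before
    // its parent, so when the parent is implemented its children's traits
    // and costs are already known.
    static void implementRecursivelyBottomUp(Node node, List<String> implementedOrder) {
        for (Node input : node.inputs) {
            implementRecursivelyBottomUp(input, implementedOrder);
        }
        implementedOrder.add(node.name);
    }
}
```

For a `LogicalProject` over a `LogicalFilter`, the traversal implements the filter first and the project last, which is exactly the property the converter rules rely on.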

I tried to simulate this behavior with minor changes in
RuleQueue.RuleMatchImportanceComparator (see PR [4]) and it worked well
for me. Tests started to perform better. I counted planner ticks (see
VolcanoPlanner#findBestExp) to evaluate performance improvements. For
example, the number of ticks in JdbcTest#testJoinManyWay before the
comparator change:

ticks=11
ticks=1041
ticks=31362

and after:

ticks=11
ticks=678
ticks=26231

In some Apache Ignite test scenarios the number of ticks was reduced by
half.

The essence of this change is a minor adjustment of
RuleQueue.RuleMatchImportanceComparator:

- Converter rules (physical rules) always have priority over the logical
ones (if there are physical rules in the queue, they are applied first)
- Physical rules are ordered in a bottom-up fashion
- Logical rules are ordered in a top-down fashion
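Under those three ordering rules, the comparator could look roughly like this. This is a hypothetical sketch: RuleMatch, its physical flag, and the depth field are illustrative stand-ins for the planner's internal bookkeeping, not the actual RuleQueue API.

```java
import java.util.Comparator;

// Hypothetical sketch of the ordering idea, not actual Calcite code.
class RuleMatchOrdering {

    static class RuleMatch {
        final boolean physical; // is this a converter/implementation rule match?
        final int depth;        // distance of the matched node from the plan root
        RuleMatch(boolean physical, int depth) {
            this.physical = physical;
            this.depth = depth;
        }
    }

    // Physical matches always sort before logical ones. Among physical
    // matches, deeper nodes (children) come first (bottom-up); among
    // logical matches, shallower nodes come first (top-down).
    static final Comparator<RuleMatch> IMPORTANCE =
        Comparator.<RuleMatch>comparingInt(m -> m.physical ? 0 : 1)
            .thenComparingInt(m -> m.physical ? -m.depth : m.depth);
}
```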

For example, if we have the initial rel:

LogicalProject
  LogicalFilter

the rule queue for this RelNode would be ordered like this:

1. PhysicalFilterRule <- children's physical rules go first
2. PhysicalProjectRule <- then parents' physical rules
3. FilterProjectTransposeRule <- logical rules go last

This minor improvement might help in some cases:
- slightly reducing the planning time;
- simplifying bottom-up trait propagation, because when it's time to
physically implement a parent node, all its children are already
physically implemented.

But in general this approach is not an actual Cascades implementation,
because it is still missing one of the most useful Cascades features:
group pruning. Details can be found in [3]. Long story short: we can
skip the optimization of some RelNodes if it is known that their
children have a bigger cost than the best RelNode in the same Subset.
More fundamental changes are required to fix this.
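The pruning check itself can be sketched as a simple lower-bound test. This is a hypothetical illustration of the Cascades idea from [3], not how Calcite currently works; the method and parameter names are invented for the sketch.

```java
// Hypothetical sketch of Cascades-style group pruning, not Calcite code:
// the sum of the children's best known costs is already a lower bound on
// the cost of any plan built on top of them, so if that bound meets or
// exceeds the best cost found so far in the same subset, the candidate
// can be skipped without being optimized.
class GroupPruning {

    static boolean shouldPrune(double bestCostInSubset, double... childLowerBounds) {
        double lowerBound = 0;
        for (double c : childLowerBounds) {
            lowerBound += c; // children's costs alone bound the full plan cost
        }
        return lowerBound >= bestCostInSubset; // no need to explore further
    }
}
```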

@Haisheng, are you doing something like that?

[1] https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf
[2]
https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/Cascades-graefe.pdf
[3]
https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/shapiro-ideas2001.pdf
[4] https://github.com/apache/calcite/pull/1732

-- 
Kind Regards
Roman Kondakov



> Allow user defined policy to dynamically define all or some specific rules' execution order or even skip some rules
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2356
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2356
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Chunhui Shi
>            Priority: Major
>
> We have seen that the order of rule execution does impact VolcanoPlanner's behavior; for example, in CALCITE-2223, if we reverse the order decided by name in RuleMatchImportanceComparator, we get a different result for the CALCITE-2223 case.
> And in some of our practices, we have seen that rules on overlapping patterns can also trigger unnecessary chaos and a much bigger exploration space, which makes planning time much longer.
> So the proposal of this Jira is: while each rule focuses on its local pattern, Calcite allows a pluggable coordinator of rule execution to utilize the knowledge we have about the rules and the current state. The output of this coordinator is the sequence of rules to execute on matching patterns. The input is still the matching rules and patterns discovered by Calcite. When new nodes are added and new rules need to be triggered, they should be added through the coordinator, which decides whether/how they will be executed.
> This proposed feature should not impact any current Calcite users, since they don't define their own policies.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)