Posted to issues@calcite.apache.org by "Vladimir Sitnikov (JIRA)" <ji...@apache.org> on 2018/09/22 20:02:00 UTC
[jira] [Commented] (CALCITE-2223) ProjectMergeRule is infinitely
matched when is applied after ProjectReduceExpressionsRule
[ https://issues.apache.org/jira/browse/CALCITE-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624811#comment-16624811 ]
Vladimir Sitnikov commented on CALCITE-2223:
--------------------------------------------
It looks like the case was somewhat "expected" in [org.apache.calcite.plan.volcano.RuleQueue#checkDuplicateSubsets|https://github.com/apache/calcite/blob/295ab13e8338bdd0e0c29e051907371c9b2929aa/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L508]:
{code:java}
// If the same subset appears more than once along any path from root
// operand to a leaf operand, we have matched a cycle. A relational
// expression that consumes its own output can never be implemented, and
// furthermore, if we fire rules on it we may generate lots of garbage.
// For example, if
// Project(A, X = X + 0)
// is in the same subset as A, then we would generate
// Project(A, X = X + 0 + 0)
// Project(A, X = X + 0 + 0 + 0)
// also in the same subset. They are valid but useless.
final Deque<RelSubset> subsets = new ArrayDeque<>();
try {
checkDuplicateSubsets(subsets, match.rule.getOperand(), match.rels);
} catch (Util.FoundOne e) {
return true;
}
{code}
{code:java}
/** Recursively checks whether there are any duplicate subsets along any path
* from root of the operand tree to one of the leaves.
*
* <p>It is OK for a match to have duplicate subsets if they are not on the
* same path. For example,
*
* <blockquote><pre>
* Join
* / \
* X X
* </pre></blockquote>
*
* <p>is a valid match.
*
* @throws org.apache.calcite.util.Util.FoundOne on match
*/
private void checkDuplicateSubsets(Deque<RelSubset> subsets,
RelOptRuleOperand operand, RelNode[] rels) {
final RelSubset subset = planner.getSubset(rels[operand.ordinalInRule]);
if (subsets.contains(subset)) {
throw Util.FoundOne.NULL;
}
if (!operand.getChildOperands().isEmpty()) {
subsets.push(subset);
for (RelOptRuleOperand childOperand : operand.getChildOperands()) {
checkDuplicateSubsets(subsets, childOperand, rels);
}
final RelSubset x = subsets.pop();
assert x == subset;
}
}
{code}
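To make the "same path" rule from the javadoc concrete, here is a minimal standalone sketch. It is a toy model and not Calcite code: subsets are plain integers, and {{OperandPathSketch}}, {{Operand}} and {{hasDuplicateOnPath}} are names made up for this illustration. It returns a boolean where the real method throws {{Util.FoundOne}}.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy model of the path check; none of these types are Calcite API. */
final class OperandPathSketch {
  /** Hypothetical operand node: the subset id of the rel it matched, plus child operands. */
  public static final class Operand {
    final int subset;
    final List<Operand> children;

    public Operand(int subset, Operand... children) {
      this.subset = subset;
      this.children = Arrays.asList(children);
    }
  }

  /** Returns true if some subset id repeats along a single root-to-leaf path. */
  public static boolean hasDuplicateOnPath(Operand root) {
    return visit(new ArrayDeque<Integer>(), root);
  }

  private static boolean visit(Deque<Integer> path, Operand operand) {
    if (path.contains(operand.subset)) {
      return true; // the same subset is already an ancestor on this path
    }
    path.push(operand.subset);
    try {
      for (Operand child : operand.children) {
        if (visit(path, child)) {
          return true;
        }
      }
      return false;
    } finally {
      path.pop(); // siblings must not see each other's subsets
    }
  }

  public static void main(String[] args) {
    // Join over X and X: the duplicate is on two different paths.
    Operand join = new Operand(1, new Operand(2), new Operand(2));
    // Project over Project where both rels sit in the same subset.
    Operand selfLoop = new Operand(1, new Operand(1));
    System.out.println(hasDuplicateOnPath(join));
    System.out.println(hasDuplicateOnPath(selfLoop));
  }
}
```

Under these assumptions the Join over X and X is accepted, while a Project whose child operand matched a rel in the same subset is reported as a duplicate.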
However, {{checkDuplicateSubsets}} does not inspect the inputs of the matched relational expressions.
In other words, it only compares the subsets of the rels bound to each {{RelOptRuleOperand}}.
In the case of {{ProjectMergeRule}}, the operands are Project(Project(...)) (two project nodes).
However, the actual configuration is rel#7:Project( ... rel#9:Project( ..., rel#7:Project ) )
{noformat}rel#7:LogicalProject.NONE.[](input=RelSubset#6,f0=CAST($0):INTEGER NOT NULL,f=$0)
rel#9:LogicalProject.NONE.[](input=RelSubset#8,f=$0)
{noformat}
rel#7 is in rel#8:Subset#2
rel#9 is in rel#6:Subset#1
{{checkDuplicateSubsets}} starts by adding {{rel#8}} to {{Deque<RelSubset> subsets}}, then it adds {{rel#6}}, and it does not see that rel#9 consumes {{rel#8}}, which forms a cycle.
It looks like {{checkDuplicateSubsets}} needs more robust cycle detection. However, it is not clear whether more complicated cycles are possible.
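As a rough illustration of the gap, here is a toy model of the match above. It is not Calcite code: subsets are plain integers (8 and 6, as in the trace), and {{CycleCheckSketch}}, {{Rel}}, {{operandPathHasDuplicate}} and {{inputsCloseCycle}} are hypothetical names. The second method is only one possible stricter check; it looks at the direct inputs of the matched rels and would not catch longer cycles.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy model of the rel#7 / rel#9 match; none of these types are Calcite API. */
final class CycleCheckSketch {
  /** A matched rel: the subset it belongs to and the subsets it reads from. */
  public static final class Rel {
    final int subset;
    final List<Integer> inputSubsets;

    public Rel(int subset, Integer... inputSubsets) {
      this.subset = subset;
      this.inputSubsets = Arrays.asList(inputSubsets);
    }
  }

  /** Mimics the existing check for a linear operand chain: only the rels' own subsets are compared. */
  public static boolean operandPathHasDuplicate(List<Rel> chain) {
    Deque<Integer> seen = new ArrayDeque<Integer>();
    for (Rel rel : chain) {
      if (seen.contains(rel.subset)) {
        return true;
      }
      seen.push(rel.subset);
    }
    return false;
  }

  /** Hypothetical stricter check: also flags a rel that reads from a subset already on the path. */
  public static boolean inputsCloseCycle(List<Rel> chain) {
    Deque<Integer> seen = new ArrayDeque<Integer>();
    for (Rel rel : chain) {
      if (seen.contains(rel.subset)) {
        return true;
      }
      seen.push(rel.subset);
      for (Integer input : rel.inputSubsets) {
        if (seen.contains(input)) {
          return true; // the rel consumes its own subset or an ancestor's subset
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // rel#7 is in subset 8 and reads subset 6; rel#9 is in subset 6 and reads subset 8.
    List<Rel> match = Arrays.asList(new Rel(8, 6), new Rel(6, 8));
    System.out.println(operandPathHasDuplicate(match)); // false: the cycle is missed
    System.out.println(inputsCloseCycle(match));        // true: rel#9 reads subset 8
  }
}
```

In this model the operand-only check sees subsets 8 and 6 and finds nothing wrong, while the input-aware variant notices that the bottom Project reads from subset 8 again.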
> ProjectMergeRule is infinitely matched when is applied after ProjectReduceExpressionsRule
> -----------------------------------------------------------------------------------------
>
> Key: CALCITE-2223
> URL: https://issues.apache.org/jira/browse/CALCITE-2223
> Project: Calcite
> Issue Type: Bug
> Reporter: Volodymyr Vysotskyi
> Assignee: Julian Hyde
> Priority: Critical
>
> For queries like this:
> {code:sql}
> select t1.f from (select cast(f as int) f, f from (select cast(f as int) f from (values('1')) t(f))) as t1
> {code}
> OOM is thrown when {{ProjectMergeRule}} is applied before applying {{ProjectReduceExpressionsRule}} in VolcanoPlanner.
> A simple test to reproduce this issue (in {{RelOptRulesTest}}):
> {code:java}
> @Test public void testOomProjectMergeRule() {
> RelBuilder relBuilder = RelBuilder.create(RelBuilderTest.config().build());
> RelNode relNode = relBuilder
> .values(new String[]{"f"}, "1")
> .project(
> relBuilder.alias(
> relBuilder.cast(relBuilder.field(0), SqlTypeName.INTEGER),
> "f"))
> .project(
> relBuilder.alias(
> relBuilder.cast(relBuilder.field(0), SqlTypeName.INTEGER),
> "f0"),
> relBuilder.alias(relBuilder.field(0), "f"))
> .project(
> relBuilder.alias(relBuilder.field(0), "f"))
> .build();
> RelOptPlanner planner = relNode.getCluster().getPlanner();
> RuleSet ruleSet =
> RuleSets.ofList(
> ReduceExpressionsRule.PROJECT_INSTANCE,
> new ProjectMergeRuleWithLongerName(),
> EnumerableRules.ENUMERABLE_PROJECT_RULE,
> EnumerableRules.ENUMERABLE_VALUES_RULE);
> Program program = Programs.of(ruleSet);
> RelTraitSet toTraits =
> relNode.getCluster().traitSet()
> .replace(0, EnumerableConvention.INSTANCE);
> RelNode output = program.run(planner, relNode, toTraits,
> ImmutableList.<RelOptMaterialization>of(), ImmutableList.<RelOptLattice>of());
> // check for output
> }
> /**
> * ProjectMergeRule inheritor which has
> * class name greater than ProjectReduceExpressionsRule class name (String.compareTo()).
> *
> * It is needed for RuleQueue.popMatch() method
> * to apply this rule before ProjectReduceExpressionsRule.
> */
> private static class ProjectMergeRuleWithLongerName extends ProjectMergeRule {
> public ProjectMergeRuleWithLongerName() {
> super(true, RelFactories.LOGICAL_BUILDER);
> }
> }
> {code}