You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by vdiravka <gi...@git.apache.org> on 2018/04/19 19:45:51 UTC
[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...
GitHub user vdiravka opened a pull request:
https://github.com/apache/drill/pull/1226
DRILL-3855: Enable FilterSetOpTransposeRule, DrillProjectSetOpTranspo…
…seRule
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vdiravka/drill DRILL-3855
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/1226.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1226
----
commit 797cf596549531e1e0c613e956ecbaea17b44db8
Author: Vitalii Diravka <vi...@...>
Date: 2018-04-19T11:31:17Z
DRILL-3855: Enable FilterSetOpTransposeRule, DrillProjectSetOpTransposeRule
----
---
[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...
Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:
https://github.com/apache/drill/pull/1226#discussion_r183768380
--- Diff: exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java ---
@@ -634,10 +634,14 @@ public void testFilterPushDownOverUnionAll() throws Exception {
+ "order by n_regionkey";
// Validate the plan
- final String[] expectedPlan = {".*Filter.*\n" +
- ".*UnionAll.*\n" +
- ".*Scan.*columns=\\[`n_regionkey`\\].*\n" +
- ".*Scan.*columns=\\[`r_regionkey`\\].*"};
+ final String[] expectedPlan = {"Sort.*\n" +
--- End diff --
Looks like there are no issues related to it. I have added additional test case to verify it:
`select n_nationkey from (select n_nationkey, n_name, n_comment from cp."tpch/nation.parquet" union all select r_regionkey, r_name, r_comment from cp."tpch/region.parquet") where n_nationkey > 4;`
When filter is pushed down the result from right side of UNION will be empty.
Generally speaking using UNION operator with empty data was resolved in [DRILL-4185](https://issues.apache.org/jira/browse/DRILL-4185) and we have test cases with empty data batches.
---
[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...
Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on a diff in the pull request:
https://github.com/apache/drill/pull/1226#discussion_r183087404
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java ---
@@ -238,26 +238,27 @@ protected DrillRel convertToRawDrel(final RelNode relNode) throws SqlUnsupported
// HEP Directory pruning .
final RelNode pruned = transform(PlannerType.HEP_BOTTOM_UP, PlannerPhase.DIRECTORY_PRUNING, relNode);
final RelTraitSet logicalTraits = pruned.getTraitSet().plus(DrillRel.DRILL_LOGICAL);
+ final RelNode intermediateNode = transform(PlannerType.HEP, PlannerPhase.PRE_LOGICAL_PLANNING, pruned);
--- End diff --
Did you consider applying this even before the DIRECTORY_PRUNING ? If a filter is on dirN column and gets pushed on both sides of the union-all, then directory pruning can be done.
---
[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...
Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:
https://github.com/apache/drill/pull/1226#discussion_r183763991
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java ---
@@ -178,6 +178,12 @@ public RuleSet getRules(OptimizerRulesContext context, Collection<StoragePlugin>
PlannerPhase.getPhysicalRules(context),
getStorageRules(context, plugins, this));
}
+ },
+
+ PRE_LOGICAL_PLANNING("Planning with Hep planner only for rules, which are failed for Volcano planner") {
--- End diff --
You are right, it is necessary to add these rules to Drill's `staticRuleSet`, so then these setOpTranspose rules can cover more cases.
When it will be added to Volcano Planner we can think, whether to remove `PlannerPhase.PRE_LOGICAL_PLANNING` or to leave it for more complete `directory_pruning` or think to move `PlannerPhase.DIRECTORY_PRUNING` to `staticRuleSet` for Volcano Planner too.
---
[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...
Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on a diff in the pull request:
https://github.com/apache/drill/pull/1226#discussion_r183091956
--- Diff: exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java ---
@@ -634,10 +634,14 @@ public void testFilterPushDownOverUnionAll() throws Exception {
+ "order by n_regionkey";
// Validate the plan
- final String[] expectedPlan = {".*Filter.*\n" +
- ".*UnionAll.*\n" +
- ".*Scan.*columns=\\[`n_regionkey`\\].*\n" +
- ".*Scan.*columns=\\[`r_regionkey`\\].*"};
+ final String[] expectedPlan = {"Sort.*\n" +
--- End diff --
Apart from these tests, do we have any unit tests that produce empty results once the filter is pushed on either side of the union-all ? I expect that there could be some issues uncovered due to empty batch handling. However, those issues could be treated as separate JIRA.
---
[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...
Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:
https://github.com/apache/drill/pull/1226#discussion_r183764377
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java ---
@@ -238,26 +238,27 @@ protected DrillRel convertToRawDrel(final RelNode relNode) throws SqlUnsupported
// HEP Directory pruning .
final RelNode pruned = transform(PlannerType.HEP_BOTTOM_UP, PlannerPhase.DIRECTORY_PRUNING, relNode);
final RelTraitSet logicalTraits = pruned.getTraitSet().plus(DrillRel.DRILL_LOGICAL);
+ final RelNode intermediateNode = transform(PlannerType.HEP, PlannerPhase.PRE_LOGICAL_PLANNING, pruned);
--- End diff --
Nice suggestion. Done.
---
[GitHub] drill issue #1226: DRILL-3855: Enable FilterSetOpTransposeRule, DrillProject...
Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:
https://github.com/apache/drill/pull/1226
@amansinha100 / @vvysotskyi could you please review?
---
[GitHub] drill issue #1226: DRILL-3855: Enable FilterSetOpTransposeRule, DrillProject...
Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on the issue:
https://github.com/apache/drill/pull/1226
Thanks for making the changes. +1
---
[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/drill/pull/1226
---
[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...
Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on a diff in the pull request:
https://github.com/apache/drill/pull/1226#discussion_r183091126
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java ---
@@ -178,6 +178,12 @@ public RuleSet getRules(OptimizerRulesContext context, Collection<StoragePlugin>
PlannerPhase.getPhysicalRules(context),
getStorageRules(context, plugins, this));
}
+ },
+
+ PRE_LOGICAL_PLANNING("Planning with Hep planner only for rules, which are failed for Volcano planner") {
--- End diff --
I thought about whether a separate group of rules is needed instead of adding it to existing group of HEP rules. But I suppose it is ok to create the separate group with the expectation that once Calcite-1271 is fixed for Volcano planner, we have the option to remove this group. Thinking more about this..filter pushdown past a union-all is cost-safe since union-all output rows is equal or greater than input rows. (it is not cost-safe for pushdown past union-distinct).
---