You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by vdiravka <gi...@git.apache.org> on 2018/04/19 19:45:51 UTC

[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...

GitHub user vdiravka opened a pull request:

    https://github.com/apache/drill/pull/1226

    DRILL-3855: Enable FilterSetOpTransposeRule, DrillProjectSetOpTranspo…

    …seRule

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vdiravka/drill DRILL-3855

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1226.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1226
    
----
commit 797cf596549531e1e0c613e956ecbaea17b44db8
Author: Vitalii Diravka <vi...@...>
Date:   2018-04-19T11:31:17Z

    DRILL-3855: Enable FilterSetOpTransposeRule, DrillProjectSetOpTransposeRule

----


---

[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1226#discussion_r183768380
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java ---
    @@ -634,10 +634,14 @@ public void testFilterPushDownOverUnionAll() throws Exception {
             + "order by n_regionkey";
     
         // Validate the plan
    -    final String[] expectedPlan = {".*Filter.*\n" +
    -            ".*UnionAll.*\n" +
    -            ".*Scan.*columns=\\[`n_regionkey`\\].*\n" +
    -            ".*Scan.*columns=\\[`r_regionkey`\\].*"};
    +    final String[] expectedPlan = {"Sort.*\n" +
    --- End diff --
    
    Looks like there are no issues related to it. I have added additional test case to verify it:
    `select n_nationkey from (select n_nationkey, n_name, n_comment from cp."tpch/nation.parquet" union all select r_regionkey, r_name, r_comment  from cp."tpch/region.parquet") where n_nationkey > 4;`
    When filter is pushed down the result from right side of UNION will be empty.
    
    Generally speaking using UNION operator with empty data was resolved in [DRILL-4185](https://issues.apache.org/jira/browse/DRILL-4185) and we have test cases with empty data batches.


---

[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1226#discussion_r183087404
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java ---
    @@ -238,26 +238,27 @@ protected DrillRel convertToRawDrel(final RelNode relNode) throws SqlUnsupported
           // HEP Directory pruning .
           final RelNode pruned = transform(PlannerType.HEP_BOTTOM_UP, PlannerPhase.DIRECTORY_PRUNING, relNode);
           final RelTraitSet logicalTraits = pruned.getTraitSet().plus(DrillRel.DRILL_LOGICAL);
    +      final RelNode intermediateNode = transform(PlannerType.HEP, PlannerPhase.PRE_LOGICAL_PLANNING, pruned);
    --- End diff --
    
    Did you consider applying this even before the DIRECTORY_PRUNING ?  If a filter is on dirN column and gets pushed on both sides of the union-all, then directory pruning can be done.  


---

[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1226#discussion_r183763991
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java ---
    @@ -178,6 +178,12 @@ public RuleSet getRules(OptimizerRulesContext context, Collection<StoragePlugin>
               PlannerPhase.getPhysicalRules(context),
               getStorageRules(context, plugins, this));
         }
    +  },
    +
    +  PRE_LOGICAL_PLANNING("Planning with Hep planner only for rules, which are failed for Volcano planner") {
    --- End diff --
    
    You are right, it is necessary to add these rules to Drill's `staticRuleSet`, so then these setOpTranspose rules can cover more cases.
    
    When it will be added to Volcano Planner we can think, whether to remove `PlannerPhase.PRE_LOGICAL_PLANNING` or to leave it for more complete `directory_pruning` or think to move `PlannerPhase.DIRECTORY_PRUNING` to `staticRuleSet` for Volcano Planner too.


---

[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1226#discussion_r183091956
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java ---
    @@ -634,10 +634,14 @@ public void testFilterPushDownOverUnionAll() throws Exception {
             + "order by n_regionkey";
     
         // Validate the plan
    -    final String[] expectedPlan = {".*Filter.*\n" +
    -            ".*UnionAll.*\n" +
    -            ".*Scan.*columns=\\[`n_regionkey`\\].*\n" +
    -            ".*Scan.*columns=\\[`r_regionkey`\\].*"};
    +    final String[] expectedPlan = {"Sort.*\n" +
    --- End diff --
    
    Apart from these tests, do we have any unit tests that produce empty results once the filter is pushed on either side of the union-all ?  I expect that there could be some issues uncovered due to empty batch handling.  However, those issues could be treated as separate JIRA.  


---

[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1226#discussion_r183764377
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java ---
    @@ -238,26 +238,27 @@ protected DrillRel convertToRawDrel(final RelNode relNode) throws SqlUnsupported
           // HEP Directory pruning .
           final RelNode pruned = transform(PlannerType.HEP_BOTTOM_UP, PlannerPhase.DIRECTORY_PRUNING, relNode);
           final RelTraitSet logicalTraits = pruned.getTraitSet().plus(DrillRel.DRILL_LOGICAL);
    +      final RelNode intermediateNode = transform(PlannerType.HEP, PlannerPhase.PRE_LOGICAL_PLANNING, pruned);
    --- End diff --
    
    Nice suggestion. Done.


---

[GitHub] drill issue #1226: DRILL-3855: Enable FilterSetOpTransposeRule, DrillProject...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/1226
  
    @amansinha100 / @vvysotskyi could you please review?


---

[GitHub] drill issue #1226: DRILL-3855: Enable FilterSetOpTransposeRule, DrillProject...

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on the issue:

    https://github.com/apache/drill/pull/1226
  
    Thanks for making the changes.  +1


---

[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/1226


---

[GitHub] drill pull request #1226: DRILL-3855: Enable FilterSetOpTransposeRule, Drill...

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1226#discussion_r183091126
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java ---
    @@ -178,6 +178,12 @@ public RuleSet getRules(OptimizerRulesContext context, Collection<StoragePlugin>
               PlannerPhase.getPhysicalRules(context),
               getStorageRules(context, plugins, this));
         }
    +  },
    +
    +  PRE_LOGICAL_PLANNING("Planning with Hep planner only for rules, which are failed for Volcano planner") {
    --- End diff --
    
    I thought about whether a separate group of rules is needed instead of adding it to existing group of HEP rules.  But I suppose it is ok to create the separate group with the expectation that once Calcite-1271 is fixed for Volcano planner, we have the option to remove this group.   Thinking more about this..filter pushdown past a union-all is cost-safe since union-all output rows is equal or greater than input rows.  (it is not cost-safe for pushdown past union-distinct).  


---