You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Aman Sinha <as...@maprtech.com> on 2015/04/24 19:37:46 UTC

Review Request 33523: DRILL-1957: Support nested loop join planning (for NOT-IN, uncorrelated EXISTS, Inequality)

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33523/
-----------------------------------------------------------

Review request for drill and Jinfeng Ni.


Bugs: DRILL-1957
    https://issues.apache.org/jira/browse/DRILL-1957


Repository: drill-git


Description
-------

After Jinfeng's rebasing of Drill on Calcite master branch, the new Calcite logical plan for NOT-IN consists of a cartesian join with a scalar subquery.  In order to support this, we need nested loop join.  This patch adds support for such plans and works in complement with execution side changes already done earlier. 

The check for scalar subquery is restricted to scalar aggregates.  Note that a Filter may appear after a scalar aggregate, hence we also check for that. 

NL Join plan will do a broadcast of the right (scalar) child.  The join condition for NL join is always TRUE; however if there is a filter condition present, the planner will create a Filter-NLJ plan where the filter is done right after NLJ.  In this way, it is possible to test the NLJ for equality joins also; in order to support that, an option 'planner.enable_nljoin_for_scalar_only'  has been provided.  The default is TRUE. 

Besides NOT-IN, other SQL constructs that will be enabled are uncorrelated EXISTS, cartesian joins (involving scalars unless the above option is enabled) and inequality joins (also involving scalars).


Diffs
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java 41bf786 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 8dc5cf1 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java a914f47 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java 1f602c7 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRule.java f832dfe 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 532fd43 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java aca55a0 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrule.java 24df0b1 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPrel.java 59b9f41 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPruleBase.java d6f1672 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrel.java 3c0022f 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrule.java cbcc920 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrel.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrule.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java ac86c4a 
  exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java a394efe 
  exec/java-exec/src/test/java/org/apache/drill/TestDisabledFunctionality.java e049943 
  exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributed.java 2b41912 
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java PRE-CREATION 
  exec/java-exec/src/test/resources/queries/tpch/11_1.sql PRE-CREATION 

Diff: https://reviews.apache.org/r/33523/diff/


Testing
-------

Added tests for NOT-IN, EXISTS, Inequality.  Enabled TPCH-16, TPCH-15 and a slight variant of TPCH-11.
Ran all unit tests. One test: TestDisabledFunctionality.testSubqueryWithoutCorrelatedJoinCondition encounters an error in NLJ execution which will be fixed shortly. This test is marked ignored for now. 

Full functional/TPCH-100 tests are in progress.


Thanks,

Aman Sinha


Re: Review Request 33523: DRILL-1957: Support nested loop join planning (for NOT-IN, uncorrelated EXISTS, Inequality)

Posted by Aman Sinha <as...@maprtech.com>.

> On April 27, 2015, 3:46 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java, line 42
> > <https://reviews.apache.org/r/33523/diff/1/?file=941039#file941039line42>
> >
> >     Equality join includes both EQUALS and IS_NOT_DISTINCT_FROM comparator, right?

yes, currently I am not distinguishing between those for the join categorization...perhaps I should but will do this as part of refactoring to use JoinInfo (based on Julian's suggestion below).


> On April 27, 2015, 3:46 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java, line 43
> > <https://reviews.apache.org/r/33523/diff/1/?file=941039#file941039line43>
> >
> >     If a join has both equality and inequality condition, which category will this join fall in? 
> >     
> >     e.g: 
> >       T1.c1 = T2.C2 and T1.C3 > T1.C4
> 
> Julian Hyde wrote:
>     Rather than JoinCategory, consider modeling joins the same way as Calcite's JoinInfo. A join is decomposed into 0 or more equi-join keys and a remaining condition. So, you can easily tell whether a join is equi, theta, cartesian, or hybrid.
>     
>     You often want to handle hybrid expressions like "T1.c1 = T2.C2 and T1.C3 > T1.C4" as an equi-join plus a remaining condition (e.g. you want to implement using hash-join followed by filter).
>     
>     Ideally this code would end up in Calcite at some point. Adopting the same core concepts will help.
>     
>     Regarding the above remark: Calcite does not recognize IS_NOT_DISTINCT_FROM as an equi condition, but it ought to.

Agree.. makes sense to use JoinInfo.  Given the time constaints for 0.9 release, I would like to get this in as-is and make the changes as part of refactoring for the 1.0 release.  I will file a JIRA to keep track of it.


> On April 27, 2015, 3:46 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java, line 121
> > <https://reviews.apache.org/r/33523/diff/1/?file=941039#file941039line121>
> >
> >     Why do we treat DrillFilterRel differently from other Rel node?

Currently, I am limiting the scalar check for either subquery with scalar aggregate or a filter on top of the scalar subquery.  It could be generalized to non-filter Rel in the future.


> On April 27, 2015, 3:46 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java, line 145
> > <https://reviews.apache.org/r/33523/diff/1/?file=941039#file941039line145>
> >
> >     Is it good enough to have just the fist condition? If the remaining is not always true, that means it's inequality join?

I preserved whatever we were doing earlier in DrillJoinRelBase.computeSelfCost().  This code will get obsolete once I do the refactoring to use JoinInfo.


> On April 27, 2015, 3:46 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java, line 87
> > <https://reviews.apache.org/r/33523/diff/1/?file=941040#file941040line87>
> >
> >     For alwaysTrue condition, is it the case that the join cardinality will be simply the product of both inputs?

For pure cartesian it is the right estimate.  For inequality conditions, it is an approximation but a reasonable estimate.


> On April 27, 2015, 3:46 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java, line 239
> > <https://reviews.apache.org/r/33523/diff/1/?file=941044#file941044line239>
> >
> >     Please add comment here, explian why NLJ will be enabled, only when broadcast Join is eanbled as well.

Added comment.


- Aman


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33523/#review81663
-----------------------------------------------------------


On April 24, 2015, 5:37 p.m., Aman Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33523/
> -----------------------------------------------------------
> 
> (Updated April 24, 2015, 5:37 p.m.)
> 
> 
> Review request for drill and Jinfeng Ni.
> 
> 
> Bugs: DRILL-1957
>     https://issues.apache.org/jira/browse/DRILL-1957
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> After Jinfeng's rebasing of Drill on Calcite master branch, the new Calcite logical plan for NOT-IN consists of a cartesian join with a scalar subquery.  In order to support this, we need nested loop join.  This patch adds support for such plans and works in complement with execution side changes already done earlier. 
> 
> The check for scalar subquery is restricted to scalar aggregates.  Note that a Filter may appear after a scalar aggregate, hence we also check for that. 
> 
> NL Join plan will do a broadcast of the right (scalar) child.  The join condition for NL join is always TRUE; however if there is a filter condition present, the planner will create a Filter-NLJ plan where the filter is done right after NLJ.  In this way, it is possible to test the NLJ for equality joins also; in order to support that, an option 'planner.enable_nljoin_for_scalar_only'  has been provided.  The default is TRUE. 
> 
> Besides NOT-IN, other SQL constructs that will be enabled are uncorrelated EXISTS, cartesian joins (involving scalars unless the above option is enabled) and inequality joins (also involving scalars).
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java 41bf786 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 8dc5cf1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java a914f47 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java 1f602c7 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRule.java f832dfe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 532fd43 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java aca55a0 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrule.java 24df0b1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPrel.java 59b9f41 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPruleBase.java d6f1672 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrel.java 3c0022f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrule.java cbcc920 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrel.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrule.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java ac86c4a 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java a394efe 
>   exec/java-exec/src/test/java/org/apache/drill/TestDisabledFunctionality.java e049943 
>   exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributed.java 2b41912 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java PRE-CREATION 
>   exec/java-exec/src/test/resources/queries/tpch/11_1.sql PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33523/diff/
> 
> 
> Testing
> -------
> 
> Added tests for NOT-IN, EXISTS, Inequality.  Enabled TPCH-16, TPCH-15 and a slight variant of TPCH-11.
> Ran all unit tests. One test: TestDisabledFunctionality.testSubqueryWithoutCorrelatedJoinCondition encounters an error in NLJ execution which will be fixed shortly. This test is marked ignored for now. 
> 
> Full functional/TPCH-100 tests are in progress.
> 
> 
> Thanks,
> 
> Aman Sinha
> 
>


Re: Review Request 33523: DRILL-1957: Support nested loop join planning (for NOT-IN, uncorrelated EXISTS, Inequality)

Posted by Julian Hyde <ju...@hydromatic.net>.

> On April 27, 2015, 3:46 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java, line 43
> > <https://reviews.apache.org/r/33523/diff/1/?file=941039#file941039line43>
> >
> >     If a join has both equality and inequality condition, which category will this join fall in? 
> >     
> >     e.g: 
> >       T1.c1 = T2.C2 and T1.C3 > T1.C4

Rather than JoinCategory, consider modeling joins the same way as Calcite's JoinInfo. A join is decomposed into 0 or more equi-join keys and a remaining condition. So, you can easily tell whether a join is equi, theta, cartesian, or hybrid.

You often want to handle hybrid expressions like "T1.c1 = T2.C2 and T1.C3 > T1.C4" as an equi-join plus a remaining condition (e.g. you want to implement using hash-join followed by filter).

Ideally this code would end up in Calcite at some point. Adopting the same core concepts will help.

Regarding the above remark: Calcite does not recognize IS_NOT_DISTINCT_FROM as an equi condition, but it ought to.


- Julian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33523/#review81663
-----------------------------------------------------------


On April 24, 2015, 5:37 p.m., Aman Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33523/
> -----------------------------------------------------------
> 
> (Updated April 24, 2015, 5:37 p.m.)
> 
> 
> Review request for drill and Jinfeng Ni.
> 
> 
> Bugs: DRILL-1957
>     https://issues.apache.org/jira/browse/DRILL-1957
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> After Jinfeng's rebasing of Drill on Calcite master branch, the new Calcite logical plan for NOT-IN consists of a cartesian join with a scalar subquery.  In order to support this, we need nested loop join.  This patch adds support for such plans and works in complement with execution side changes already done earlier. 
> 
> The check for scalar subquery is restricted to scalar aggregates.  Note that a Filter may appear after a scalar aggregate, hence we also check for that. 
> 
> NL Join plan will do a broadcast of the right (scalar) child.  The join condition for NL join is always TRUE; however if there is a filter condition present, the planner will create a Filter-NLJ plan where the filter is done right after NLJ.  In this way, it is possible to test the NLJ for equality joins also; in order to support that, an option 'planner.enable_nljoin_for_scalar_only'  has been provided.  The default is TRUE. 
> 
> Besides NOT-IN, other SQL constructs that will be enabled are uncorrelated EXISTS, cartesian joins (involving scalars unless the above option is enabled) and inequality joins (also involving scalars).
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java 41bf786 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 8dc5cf1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java a914f47 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java 1f602c7 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRule.java f832dfe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 532fd43 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java aca55a0 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrule.java 24df0b1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPrel.java 59b9f41 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPruleBase.java d6f1672 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrel.java 3c0022f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrule.java cbcc920 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrel.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrule.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java ac86c4a 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java a394efe 
>   exec/java-exec/src/test/java/org/apache/drill/TestDisabledFunctionality.java e049943 
>   exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributed.java 2b41912 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java PRE-CREATION 
>   exec/java-exec/src/test/resources/queries/tpch/11_1.sql PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33523/diff/
> 
> 
> Testing
> -------
> 
> Added tests for NOT-IN, EXISTS, Inequality.  Enabled TPCH-16, TPCH-15 and a slight variant of TPCH-11.
> Ran all unit tests. One test: TestDisabledFunctionality.testSubqueryWithoutCorrelatedJoinCondition encounters an error in NLJ execution which will be fixed shortly. This test is marked ignored for now. 
> 
> Full functional/TPCH-100 tests are in progress.
> 
> 
> Thanks,
> 
> Aman Sinha
> 
>


Re: Review Request 33523: DRILL-1957: Support nested loop join planning (for NOT-IN, uncorrelated EXISTS, Inequality)

Posted by Jinfeng Ni <jn...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33523/#review81663
-----------------------------------------------------------



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java
<https://reviews.apache.org/r/33523/#comment132092>

    Equality join includes both EQUALS and IS_NOT_DISTINCT_FROM comparator, right?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java
<https://reviews.apache.org/r/33523/#comment132091>

    If a join has both equality and inequality condition, which category will this join fall in? 
    
    e.g: 
      T1.c1 = T2.C2 and T1.C3 > T1.C4



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java
<https://reviews.apache.org/r/33523/#comment132110>

    Why do we treat DrillFilterRel differently from other Rel node?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java
<https://reviews.apache.org/r/33523/#comment132111>

    Is it good enough to have just the fist condition? If the remaining is not always true, that means it's inequality join?



exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java
<https://reviews.apache.org/r/33523/#comment132108>

    For alwaysTrue condition, is it the case that the join cardinality will be simply the product of both inputs?



exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java
<https://reviews.apache.org/r/33523/#comment132109>

    Please add comment here, explian why NLJ will be enabled, only when broadcast Join is eanbled as well.


- Jinfeng Ni


On April 24, 2015, 10:37 a.m., Aman Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33523/
> -----------------------------------------------------------
> 
> (Updated April 24, 2015, 10:37 a.m.)
> 
> 
> Review request for drill and Jinfeng Ni.
> 
> 
> Bugs: DRILL-1957
>     https://issues.apache.org/jira/browse/DRILL-1957
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> After Jinfeng's rebasing of Drill on Calcite master branch, the new Calcite logical plan for NOT-IN consists of a cartesian join with a scalar subquery.  In order to support this, we need nested loop join.  This patch adds support for such plans and works in complement with execution side changes already done earlier. 
> 
> The check for scalar subquery is restricted to scalar aggregates.  Note that a Filter may appear after a scalar aggregate, hence we also check for that. 
> 
> NL Join plan will do a broadcast of the right (scalar) child.  The join condition for NL join is always TRUE; however if there is a filter condition present, the planner will create a Filter-NLJ plan where the filter is done right after NLJ.  In this way, it is possible to test the NLJ for equality joins also; in order to support that, an option 'planner.enable_nljoin_for_scalar_only'  has been provided.  The default is TRUE. 
> 
> Besides NOT-IN, other SQL constructs that will be enabled are uncorrelated EXISTS, cartesian joins (involving scalars unless the above option is enabled) and inequality joins (also involving scalars).
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java 41bf786 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 8dc5cf1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java a914f47 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java 1f602c7 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRule.java f832dfe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 532fd43 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java aca55a0 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrule.java 24df0b1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPrel.java 59b9f41 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPruleBase.java d6f1672 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrel.java 3c0022f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrule.java cbcc920 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrel.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrule.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java ac86c4a 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java a394efe 
>   exec/java-exec/src/test/java/org/apache/drill/TestDisabledFunctionality.java e049943 
>   exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributed.java 2b41912 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java PRE-CREATION 
>   exec/java-exec/src/test/resources/queries/tpch/11_1.sql PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33523/diff/
> 
> 
> Testing
> -------
> 
> Added tests for NOT-IN, EXISTS, Inequality.  Enabled TPCH-16, TPCH-15 and a slight variant of TPCH-11.
> Ran all unit tests. One test: TestDisabledFunctionality.testSubqueryWithoutCorrelatedJoinCondition encounters an error in NLJ execution which will be fixed shortly. This test is marked ignored for now. 
> 
> Full functional/TPCH-100 tests are in progress.
> 
> 
> Thanks,
> 
> Aman Sinha
> 
>


Re: Review Request 33523: DRILL-1957: Support nested loop join planning (for NOT-IN, uncorrelated EXISTS, Inequality)

Posted by Aman Sinha <as...@maprtech.com>.

> On April 27, 2015, 6:44 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java, line 50
> > <https://reviews.apache.org/r/33523/diff/1/?file=941045#file941045line50>
> >
> >     this line of code seems to be duplicated, since this(...) will call line 56.

Good catch.  Fixed.


- Aman


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33523/#review81696
-----------------------------------------------------------


On April 24, 2015, 5:37 p.m., Aman Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33523/
> -----------------------------------------------------------
> 
> (Updated April 24, 2015, 5:37 p.m.)
> 
> 
> Review request for drill and Jinfeng Ni.
> 
> 
> Bugs: DRILL-1957
>     https://issues.apache.org/jira/browse/DRILL-1957
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> After Jinfeng's rebasing of Drill on Calcite master branch, the new Calcite logical plan for NOT-IN consists of a cartesian join with a scalar subquery.  In order to support this, we need nested loop join.  This patch adds support for such plans and works in complement with execution side changes already done earlier. 
> 
> The check for scalar subquery is restricted to scalar aggregates.  Note that a Filter may appear after a scalar aggregate, hence we also check for that. 
> 
> NL Join plan will do a broadcast of the right (scalar) child.  The join condition for NL join is always TRUE; however if there is a filter condition present, the planner will create a Filter-NLJ plan where the filter is done right after NLJ.  In this way, it is possible to test the NLJ for equality joins also; in order to support that, an option 'planner.enable_nljoin_for_scalar_only'  has been provided.  The default is TRUE. 
> 
> Besides NOT-IN, other SQL constructs that will be enabled are uncorrelated EXISTS, cartesian joins (involving scalars unless the above option is enabled) and inequality joins (also involving scalars).
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java 41bf786 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 8dc5cf1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java a914f47 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java 1f602c7 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRule.java f832dfe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 532fd43 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java aca55a0 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrule.java 24df0b1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPrel.java 59b9f41 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPruleBase.java d6f1672 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrel.java 3c0022f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrule.java cbcc920 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrel.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrule.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java ac86c4a 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java a394efe 
>   exec/java-exec/src/test/java/org/apache/drill/TestDisabledFunctionality.java e049943 
>   exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributed.java 2b41912 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java PRE-CREATION 
>   exec/java-exec/src/test/resources/queries/tpch/11_1.sql PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33523/diff/
> 
> 
> Testing
> -------
> 
> Added tests for NOT-IN, EXISTS, Inequality.  Enabled TPCH-16, TPCH-15 and a slight variant of TPCH-11.
> Ran all unit tests. One test: TestDisabledFunctionality.testSubqueryWithoutCorrelatedJoinCondition encounters an error in NLJ execution which will be fixed shortly. This test is marked ignored for now. 
> 
> Full functional/TPCH-100 tests are in progress.
> 
> 
> Thanks,
> 
> Aman Sinha
> 
>


Re: Review Request 33523: DRILL-1957: Support nested loop join planning (for NOT-IN, uncorrelated EXISTS, Inequality)

Posted by Jinfeng Ni <jn...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33523/#review81696
-----------------------------------------------------------



exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java
<https://reviews.apache.org/r/33523/#comment132125>

    this line of code seems to be duplicated, since this(...) will call line 56.



exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPruleBase.java
<https://reviews.apache.org/r/33523/#comment132138>

    I assume that the reason that Planner has to add Filter on top of NLJ is because NLJ execution is only handle "true" condition.
    
    Adding Filter and move the join condition to Filter is fine when right side is scalar subquery. But in general, if right side is a regular subquery or table, move the join condition (including the equality comparison part) completely to filter seems to have performance impact.
    
    For now, it is probably fine, since we mainly use NLJ for scalar subquery.


- Jinfeng Ni


On April 24, 2015, 10:37 a.m., Aman Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33523/
> -----------------------------------------------------------
> 
> (Updated April 24, 2015, 10:37 a.m.)
> 
> 
> Review request for drill and Jinfeng Ni.
> 
> 
> Bugs: DRILL-1957
>     https://issues.apache.org/jira/browse/DRILL-1957
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> After Jinfeng's rebasing of Drill on Calcite master branch, the new Calcite logical plan for NOT-IN consists of a cartesian join with a scalar subquery.  In order to support this, we need nested loop join.  This patch adds support for such plans and works in complement with execution side changes already done earlier. 
> 
> The check for scalar subquery is restricted to scalar aggregates.  Note that a Filter may appear after a scalar aggregate, hence we also check for that. 
> 
> NL Join plan will do a broadcast of the right (scalar) child.  The join condition for NL join is always TRUE; however if there is a filter condition present, the planner will create a Filter-NLJ plan where the filter is done right after NLJ.  In this way, it is possible to test the NLJ for equality joins also; in order to support that, an option 'planner.enable_nljoin_for_scalar_only'  has been provided.  The default is TRUE. 
> 
> Besides NOT-IN, other SQL constructs that will be enabled are uncorrelated EXISTS, cartesian joins (involving scalars unless the above option is enabled) and inequality joins (also involving scalars).
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java 41bf786 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 8dc5cf1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java a914f47 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java 1f602c7 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRule.java f832dfe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 532fd43 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java aca55a0 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrule.java 24df0b1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPrel.java 59b9f41 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPruleBase.java d6f1672 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrel.java 3c0022f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrule.java cbcc920 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrel.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrule.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java ac86c4a 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java a394efe 
>   exec/java-exec/src/test/java/org/apache/drill/TestDisabledFunctionality.java e049943 
>   exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributed.java 2b41912 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java PRE-CREATION 
>   exec/java-exec/src/test/resources/queries/tpch/11_1.sql PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33523/diff/
> 
> 
> Testing
> -------
> 
> Added tests for NOT-IN, EXISTS, Inequality.  Enabled TPCH-16, TPCH-15 and a slight variant of TPCH-11.
> Ran all unit tests. One test: TestDisabledFunctionality.testSubqueryWithoutCorrelatedJoinCondition encounters an error in NLJ execution which will be fixed shortly. This test is marked ignored for now. 
> 
> Full functional/TPCH-100 tests are in progress.
> 
> 
> Thanks,
> 
> Aman Sinha
> 
>


Re: Review Request 33523: DRILL-1957: Support nested loop join planning (for NOT-IN, uncorrelated EXISTS, Inequality)

Posted by Jinfeng Ni <jn...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33523/#review81923
-----------------------------------------------------------

Ship it!


Ship It!

- Jinfeng Ni


On April 24, 2015, 10:37 a.m., Aman Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33523/
> -----------------------------------------------------------
> 
> (Updated April 24, 2015, 10:37 a.m.)
> 
> 
> Review request for drill and Jinfeng Ni.
> 
> 
> Bugs: DRILL-1957
>     https://issues.apache.org/jira/browse/DRILL-1957
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> After Jinfeng's rebasing of Drill on Calcite master branch, the new Calcite logical plan for NOT-IN consists of a cartesian join with a scalar subquery.  In order to support this, we need nested loop join.  This patch adds support for such plans and works in complement with execution side changes already done earlier. 
> 
> The check for scalar subquery is restricted to scalar aggregates.  Note that a Filter may appear after a scalar aggregate, hence we also check for that. 
> 
> NL Join plan will do a broadcast of the right (scalar) child.  The join condition for NL join is always TRUE; however if there is a filter condition present, the planner will create a Filter-NLJ plan where the filter is done right after NLJ.  In this way, it is possible to test the NLJ for equality joins also; in order to support that, an option 'planner.enable_nljoin_for_scalar_only'  has been provided.  The default is TRUE. 
> 
> Besides NOT-IN, other SQL constructs that will be enabled are uncorrelated EXISTS, cartesian joins (involving scalars unless the above option is enabled) and inequality joins (also involving scalars).
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java 41bf786 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 8dc5cf1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java a914f47 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java 1f602c7 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRule.java f832dfe 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 532fd43 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java aca55a0 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrule.java 24df0b1 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPrel.java 59b9f41 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/JoinPruleBase.java d6f1672 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrel.java 3c0022f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/MergeJoinPrule.java cbcc920 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrel.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrule.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java ac86c4a 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java a394efe 
>   exec/java-exec/src/test/java/org/apache/drill/TestDisabledFunctionality.java e049943 
>   exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributed.java 2b41912 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java PRE-CREATION 
>   exec/java-exec/src/test/resources/queries/tpch/11_1.sql PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33523/diff/
> 
> 
> Testing
> -------
> 
> Added tests for NOT-IN, EXISTS, Inequality.  Enabled TPCH-16, TPCH-15 and a slight variant of TPCH-11.
> Ran all unit tests. One test: TestDisabledFunctionality.testSubqueryWithoutCorrelatedJoinCondition encounters an error in NLJ execution which will be fixed shortly. This test is marked ignored for now. 
> 
> Full functional/TPCH-100 tests are in progress.
> 
> 
> Thanks,
> 
> Aman Sinha
> 
>