You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Jinfeng Ni <jn...@maprtech.com> on 2015/04/18 22:34:21 UTC
Re: Review Request 33136: DRILL-1384: Rebase Drill's forked Optiq
library onto Calcite release 1.0.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33136/
-----------------------------------------------------------
(Updated April 18, 2015, 1:34 p.m.)
Review request for drill and Aman Sinha.
Changes
-------
Exclude most of changes coming from Calcite pacakge rename/restructure, and keep the other major changes, to make it easier for review purpose.
Summary (updated)
-----------------
DRILL-1384: Rebase Drill's forked Optiq library onto Calcite release 1.0.
Repository: drill-git
Description
-------
Drill currently uses a forked Optiq (Renamed to Calcite) version, dated back in July 2014. The forked version has 10-20 Drill specific patches. However, we did not rebase the forked version onto the on-going Calcite release. As such, Drill misses some bug fixes/new feature development on Calcite side.
This patch is trying to rebase the Drill's forked version onto Calcite release 1.0. (More precisely, on commit of CALCITE-603, when this rebasing work started).
Most of changes happen due to Calcite's package structure / renaming (See CALCITE-296].
Some Drill specific changes:
1. Provide a Drill specifc RelDataTypeSystem, to support decimal with precision/scale up to 38.
2. Modify Drill's parser, to allow * in Compound Identifier.
3. Provide Drill specific FilterJoinRule, to enforce Drill only support equal-join in JOIN operator.
4. Modify Drill costing comparison, such that the costing oder is a total order when compare different plans.
5. Modify costing estimation for Drill Project operator.
6. Use a ProjectRemove rule, such that it will honor parent's output field name.
7. Modify Calcite's Frameworks/planner interface, such that Drill will use validatedRowType to construct a top-level project, to ensure the final output field is what the query specified. (Calcite could inject "$F0", or "$EXPR0 into converted RelNode tree, in Sql2RelConverter)
8. Fix couple of Drill unit test cases, since the expected result by query semantics are not fixed.
9. Some type-related to Drill Sql operators.
Some impact of such rebasing.
1. TPCH Q16, or query with NOT IN predicate involving NULLABLE column could hit CanNotPlanException.
The previous plan for Q16, although return the correct result, is not valid, when the column is nullable. See DRILL-1957 OR CALCITE-373.
2. Plan changed in some TPCH queries, which may cause timeout for Q5 in TPCH scale factor 100 run.
We probably need continue refine the costing estimation formula, especially for Join operator.
Diffs (updated)
-----
common/pom.xml 525b533
contrib/storage-hive/core/pom.xml 9bd6293
exec/java-exec/pom.xml f5313ca
exec/java-exec/src/main/codegen/includes/compoundIdentifier.ftl 50d8c20
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 3b3aa1a
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 7cf98cd
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java bbe7cf3
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillCostBase.java 87a1ea3
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterJoinRules.java PRE-CREATION
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 14ea894
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java fcfced2
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillReduceExpressionsRule.java 29175e5
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 84a0b51
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java f63057f
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlOperator.java 7bd48c8
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java a17a604
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 5ee502d
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ExplainHandler.java 1cce6a5
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/SqlHandlerUtil.java c347bef
exec/jdbc-all/pom.xml b369aed
exec/jdbc/pom.xml b4ec758
pom.xml f6bcd91
Diff: https://reviews.apache.org/r/33136/diff/
Testing
-------
Unit testcase clean run. (except for disabled testcases. See DRILL-2630, DRILL-2761, etc)
Thanks,
Jinfeng Ni
Re: Review Request 33136: DRILL-1384: Rebase Drill's forked Optiq
library onto Calcite release 1.0.
Posted by Jinfeng Ni <jn...@maprtech.com>.
> On April 18, 2015, 4:10 p.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillCostBase.java, line 61
> > <https://reviews.apache.org/r/33136/diff/3/?file=934734#file934734line61>
> >
> > Is this factor actually helping with any of the plans ? If not, we should skip it since having more factors makes it difficult to estimate the cost.
This is used when in DrillCostBase we compare the total cost of cpu/io/network/memory of two different plans. Previously, memory cost is excluded, while in HashAgg/HashJoin/StreamAgg, the memory cost is one important factor when choose a plan. For io/network, seems we calculated in terms of # of instruction cycles, yet memory cost is calcuted propotionly to bytes. That's why we need this ratio, when combine them together. I understand this ratio for now is just a magic number, requiring further tuning in the near future.
> On April 18, 2015, 4:10 p.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java, line 123
> > <https://reviews.apache.org/r/33136/diff/3/?file=934739#file934739line123>
> >
> > Remove all the commented out old rule names that are not valid in the master branch of Calcite since it adds confusion.
I'll remove. Originally, I kept them, so that we know what is the prior rule name, before the rebasing work, since we're more faimilar with the old rule name.
> On April 18, 2015, 4:10 p.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java, line 78
> > <https://reviews.apache.org/r/33136/diff/3/?file=934740#file934740line78>
> >
> > Since this logic has moved to DrillJoinRelBase, you should just remove it from here.
remove.
- Jinfeng
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33136/#review80617
-----------------------------------------------------------
On April 18, 2015, 1:34 p.m., Jinfeng Ni wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33136/
> -----------------------------------------------------------
>
> (Updated April 18, 2015, 1:34 p.m.)
>
>
> Review request for drill and Aman Sinha.
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> Drill currently uses a forked Optiq (Renamed to Calcite) version, dated back in July 2014. The forked version has 10-20 Drill specific patches. However, we did not rebase the forked version onto the on-going Calcite release. As such, Drill misses some bug fixes/new feature development on Calcite side.
>
> This patch is trying to rebase the Drill's forked version onto Calcite release 1.0. (More precisely, on commit of CALCITE-603, when this rebasing work started).
>
> Most of changes happen due to Calcite's package structure / renaming (See CALCITE-296].
>
> Some Drill specific changes:
>
> 1. Provide a Drill specifc RelDataTypeSystem, to support decimal with precision/scale up to 38.
> 2. Modify Drill's parser, to allow * in Compound Identifier.
> 3. Provide Drill specific FilterJoinRule, to enforce Drill only support equal-join in JOIN operator.
> 4. Modify Drill costing comparison, such that the costing oder is a total order when compare different plans.
> 5. Modify costing estimation for Drill Project operator.
> 6. Use a ProjectRemove rule, such that it will honor parent's output field name.
> 7. Modify Calcite's Frameworks/planner interface, such that Drill will use validatedRowType to construct a top-level project, to ensure the final output field is what the query specified. (Calcite could inject "$F0", or "$EXPR0 into converted RelNode tree, in Sql2RelConverter)
> 8. Fix couple of Drill unit test cases, since the expected result by query semantics are not fixed.
> 9. Some type-related to Drill Sql operators.
>
> Some impact of such rebasing.
> 1. TPCH Q16, or query with NOT IN predicate involving NULLABLE column could hit CanNotPlanException.
>
> The previous plan for Q16, although return the correct result, is not valid, when the column is nullable. See DRILL-1957 OR CALCITE-373.
>
> 2. Plan changed in some TPCH queries, which may cause timeout for Q5 in TPCH scale factor 100 run.
> We probably need continue refine the costing estimation formula, especially for Join operator.
>
>
> Diffs
> -----
>
> common/pom.xml 525b533
> contrib/storage-hive/core/pom.xml 9bd6293
> exec/java-exec/pom.xml f5313ca
> exec/java-exec/src/main/codegen/includes/compoundIdentifier.ftl 50d8c20
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 3b3aa1a
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 7cf98cd
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java bbe7cf3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillCostBase.java 87a1ea3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterJoinRules.java PRE-CREATION
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 14ea894
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java fcfced2
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillReduceExpressionsRule.java 29175e5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 84a0b51
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java f63057f
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlOperator.java 7bd48c8
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java a17a604
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 5ee502d
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ExplainHandler.java 1cce6a5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/SqlHandlerUtil.java c347bef
> exec/jdbc-all/pom.xml b369aed
> exec/jdbc/pom.xml b4ec758
> pom.xml f6bcd91
>
> Diff: https://reviews.apache.org/r/33136/diff/
>
>
> Testing
> -------
>
> Unit testcase clean run. (except for disabled testcases. See DRILL-2630, DRILL-2761, etc)
>
>
> Thanks,
>
> Jinfeng Ni
>
>
Re: Review Request 33136: DRILL-1384: Rebase Drill's forked Optiq
library onto Calcite release 1.0.
Posted by Aman Sinha <as...@maprtech.com>.
> On April 18, 2015, 11:10 p.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillCostBase.java, line 61
> > <https://reviews.apache.org/r/33136/diff/3/?file=934734#file934734line61>
> >
> > Is this factor actually helping with any of the plans ? If not, we should skip it since having more factors makes it difficult to estimate the cost.
>
> Jinfeng Ni wrote:
> This is used when in DrillCostBase we compare the total cost of cpu/io/network/memory of two different plans. Previously, memory cost is excluded, while in HashAgg/HashJoin/StreamAgg, the memory cost is one important factor when choose a plan. For io/network, seems we calculated in terms of # of instruction cycles, yet memory cost is calcuted propotionly to bytes. That's why we need this ratio, when combine them together. I understand this ratio for now is just a magic number, requiring further tuning in the near future.
Ok to keep it in for now...
- Aman
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33136/#review80617
-----------------------------------------------------------
On April 18, 2015, 8:34 p.m., Jinfeng Ni wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33136/
> -----------------------------------------------------------
>
> (Updated April 18, 2015, 8:34 p.m.)
>
>
> Review request for drill and Aman Sinha.
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> Drill currently uses a forked Optiq (Renamed to Calcite) version, dated back in July 2014. The forked version has 10-20 Drill specific patches. However, we did not rebase the forked version onto the on-going Calcite release. As such, Drill misses some bug fixes/new feature development on Calcite side.
>
> This patch is trying to rebase the Drill's forked version onto Calcite release 1.0. (More precisely, on commit of CALCITE-603, when this rebasing work started).
>
> Most of changes happen due to Calcite's package structure / renaming (See CALCITE-296].
>
> Some Drill specific changes:
>
> 1. Provide a Drill specifc RelDataTypeSystem, to support decimal with precision/scale up to 38.
> 2. Modify Drill's parser, to allow * in Compound Identifier.
> 3. Provide Drill specific FilterJoinRule, to enforce Drill only support equal-join in JOIN operator.
> 4. Modify Drill costing comparison, such that the costing oder is a total order when compare different plans.
> 5. Modify costing estimation for Drill Project operator.
> 6. Use a ProjectRemove rule, such that it will honor parent's output field name.
> 7. Modify Calcite's Frameworks/planner interface, such that Drill will use validatedRowType to construct a top-level project, to ensure the final output field is what the query specified. (Calcite could inject "$F0", or "$EXPR0 into converted RelNode tree, in Sql2RelConverter)
> 8. Fix couple of Drill unit test cases, since the expected result by query semantics are not fixed.
> 9. Some type-related to Drill Sql operators.
>
> Some impact of such rebasing.
> 1. TPCH Q16, or query with NOT IN predicate involving NULLABLE column could hit CanNotPlanException.
>
> The previous plan for Q16, although return the correct result, is not valid, when the column is nullable. See DRILL-1957 OR CALCITE-373.
>
> 2. Plan changed in some TPCH queries, which may cause timeout for Q5 in TPCH scale factor 100 run.
> We probably need continue refine the costing estimation formula, especially for Join operator.
>
>
> Diffs
> -----
>
> common/pom.xml 525b533
> contrib/storage-hive/core/pom.xml 9bd6293
> exec/java-exec/pom.xml f5313ca
> exec/java-exec/src/main/codegen/includes/compoundIdentifier.ftl 50d8c20
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 3b3aa1a
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 7cf98cd
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java bbe7cf3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillCostBase.java 87a1ea3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterJoinRules.java PRE-CREATION
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 14ea894
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java fcfced2
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillReduceExpressionsRule.java 29175e5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 84a0b51
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java f63057f
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlOperator.java 7bd48c8
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java a17a604
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 5ee502d
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ExplainHandler.java 1cce6a5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/SqlHandlerUtil.java c347bef
> exec/jdbc-all/pom.xml b369aed
> exec/jdbc/pom.xml b4ec758
> pom.xml f6bcd91
>
> Diff: https://reviews.apache.org/r/33136/diff/
>
>
> Testing
> -------
>
> Unit testcase clean run. (except for disabled testcases. See DRILL-2630, DRILL-2761, etc)
>
>
> Thanks,
>
> Jinfeng Ni
>
>
Re: Review Request 33136: DRILL-1384: Rebase Drill's forked Optiq
library onto Calcite release 1.0.
Posted by Aman Sinha <as...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33136/#review80617
-----------------------------------------------------------
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java
<https://reviews.apache.org/r/33136/#comment130713>
Remove this..
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillCostBase.java
<https://reviews.apache.org/r/33136/#comment130718>
Is this factor actually helping with any of the plans ? If not, we should skip it since having more factors makes it difficult to estimate the cost.
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterJoinRules.java
<https://reviews.apache.org/r/33136/#comment130712>
Keeping a chunk of commented out code creates confusion..better to remove it.
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java
<https://reviews.apache.org/r/33136/#comment130714>
Remove all the commented out old rule names that are not valid in the master branch of Calcite since it adds confusion.
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java
<https://reviews.apache.org/r/33136/#comment130715>
Since this logic has moved to DrillJoinRelBase, you should just remove it from here.
- Aman Sinha
On April 18, 2015, 8:34 p.m., Jinfeng Ni wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33136/
> -----------------------------------------------------------
>
> (Updated April 18, 2015, 8:34 p.m.)
>
>
> Review request for drill and Aman Sinha.
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> Drill currently uses a forked Optiq (Renamed to Calcite) version, dated back in July 2014. The forked version has 10-20 Drill specific patches. However, we did not rebase the forked version onto the on-going Calcite release. As such, Drill misses some bug fixes/new feature development on Calcite side.
>
> This patch is trying to rebase the Drill's forked version onto Calcite release 1.0. (More precisely, on commit of CALCITE-603, when this rebasing work started).
>
> Most of changes happen due to Calcite's package structure / renaming (See CALCITE-296].
>
> Some Drill specific changes:
>
> 1. Provide a Drill specifc RelDataTypeSystem, to support decimal with precision/scale up to 38.
> 2. Modify Drill's parser, to allow * in Compound Identifier.
> 3. Provide Drill specific FilterJoinRule, to enforce Drill only support equal-join in JOIN operator.
> 4. Modify Drill costing comparison, such that the costing oder is a total order when compare different plans.
> 5. Modify costing estimation for Drill Project operator.
> 6. Use a ProjectRemove rule, such that it will honor parent's output field name.
> 7. Modify Calcite's Frameworks/planner interface, such that Drill will use validatedRowType to construct a top-level project, to ensure the final output field is what the query specified. (Calcite could inject "$F0", or "$EXPR0 into converted RelNode tree, in Sql2RelConverter)
> 8. Fix couple of Drill unit test cases, since the expected result by query semantics are not fixed.
> 9. Some type-related to Drill Sql operators.
>
> Some impact of such rebasing.
> 1. TPCH Q16, or query with NOT IN predicate involving NULLABLE column could hit CanNotPlanException.
>
> The previous plan for Q16, although return the correct result, is not valid, when the column is nullable. See DRILL-1957 OR CALCITE-373.
>
> 2. Plan changed in some TPCH queries, which may cause timeout for Q5 in TPCH scale factor 100 run.
> We probably need continue refine the costing estimation formula, especially for Join operator.
>
>
> Diffs
> -----
>
> common/pom.xml 525b533
> contrib/storage-hive/core/pom.xml 9bd6293
> exec/java-exec/pom.xml f5313ca
> exec/java-exec/src/main/codegen/includes/compoundIdentifier.ftl 50d8c20
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 3b3aa1a
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 7cf98cd
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java bbe7cf3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillCostBase.java 87a1ea3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterJoinRules.java PRE-CREATION
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 14ea894
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java fcfced2
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillReduceExpressionsRule.java 29175e5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 84a0b51
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java f63057f
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlOperator.java 7bd48c8
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java a17a604
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 5ee502d
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ExplainHandler.java 1cce6a5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/SqlHandlerUtil.java c347bef
> exec/jdbc-all/pom.xml b369aed
> exec/jdbc/pom.xml b4ec758
> pom.xml f6bcd91
>
> Diff: https://reviews.apache.org/r/33136/diff/
>
>
> Testing
> -------
>
> Unit testcase clean run. (except for disabled testcases. See DRILL-2630, DRILL-2761, etc)
>
>
> Thanks,
>
> Jinfeng Ni
>
>
Re: Review Request 33136: DRILL-1384: Rebase Drill's forked Optiq
library onto Calcite release 1.0.
Posted by Jinfeng Ni <jn...@maprtech.com>.
> On April 20, 2015, 9:27 p.m., Julian Hyde wrote:
> > Difficult to tell the rationale behind each change, but all of the changes seem to make sense.
> >
> > I noticed that you had to move to the version of ProjectRemoveRule.isTrivial with a boolean parameter. That was temporary, and was removed shortly afterwards. See CALCITE-575 and CALCITE-577. Unavoidable, I suppose.
> >
> > At a later point, consider renaming Drill's RelNodes and Rules to match Calcite's new naming scheme. Strictly optional of course.
That's right. ProjectRemoveRule.isTrivial used in Drill is a deprecated version. The reason we use that version is because Drill's execution uses name-based logic, different from Calcite's ordinal-based executiion. A project with identical expressions, yet different names should not be removed in Drill's plan; otherwise, the query may not return correct result. We probably have to re-visit this deprecated code issue, when we keep going on with Calcite's new code.
Agree that we should renaming Drill's Rules to match Calcite's new naming scheme. I'll do that in a follow-up patch. This patch is mainly focus on not causing any regression to the existing test workloads.
- Jinfeng
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33136/#review80902
-----------------------------------------------------------
On April 18, 2015, 1:34 p.m., Jinfeng Ni wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33136/
> -----------------------------------------------------------
>
> (Updated April 18, 2015, 1:34 p.m.)
>
>
> Review request for drill and Aman Sinha.
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> Drill currently uses a forked Optiq (Renamed to Calcite) version, dated back in July 2014. The forked version has 10-20 Drill specific patches. However, we did not rebase the forked version onto the on-going Calcite release. As such, Drill misses some bug fixes/new feature development on Calcite side.
>
> This patch is trying to rebase the Drill's forked version onto Calcite release 1.0. (More precisely, on commit of CALCITE-603, when this rebasing work started).
>
> Most of changes happen due to Calcite's package structure / renaming (See CALCITE-296].
>
> Some Drill specific changes:
>
> 1. Provide a Drill specifc RelDataTypeSystem, to support decimal with precision/scale up to 38.
> 2. Modify Drill's parser, to allow * in Compound Identifier.
> 3. Provide Drill specific FilterJoinRule, to enforce Drill only support equal-join in JOIN operator.
> 4. Modify Drill costing comparison, such that the costing oder is a total order when compare different plans.
> 5. Modify costing estimation for Drill Project operator.
> 6. Use a ProjectRemove rule, such that it will honor parent's output field name.
> 7. Modify Calcite's Frameworks/planner interface, such that Drill will use validatedRowType to construct a top-level project, to ensure the final output field is what the query specified. (Calcite could inject "$F0", or "$EXPR0 into converted RelNode tree, in Sql2RelConverter)
> 8. Fix couple of Drill unit test cases, since the expected result by query semantics are not fixed.
> 9. Some type-related to Drill Sql operators.
>
> Some impact of such rebasing.
> 1. TPCH Q16, or query with NOT IN predicate involving NULLABLE column could hit CanNotPlanException.
>
> The previous plan for Q16, although return the correct result, is not valid, when the column is nullable. See DRILL-1957 OR CALCITE-373.
>
> 2. Plan changed in some TPCH queries, which may cause timeout for Q5 in TPCH scale factor 100 run.
> We probably need continue refine the costing estimation formula, especially for Join operator.
>
>
> Diffs
> -----
>
> common/pom.xml 525b533
> contrib/storage-hive/core/pom.xml 9bd6293
> exec/java-exec/pom.xml f5313ca
> exec/java-exec/src/main/codegen/includes/compoundIdentifier.ftl 50d8c20
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 3b3aa1a
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 7cf98cd
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java bbe7cf3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillCostBase.java 87a1ea3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterJoinRules.java PRE-CREATION
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 14ea894
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java fcfced2
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillReduceExpressionsRule.java 29175e5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 84a0b51
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java f63057f
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlOperator.java 7bd48c8
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java a17a604
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 5ee502d
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ExplainHandler.java 1cce6a5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/SqlHandlerUtil.java c347bef
> exec/jdbc-all/pom.xml b369aed
> exec/jdbc/pom.xml b4ec758
> pom.xml f6bcd91
>
> Diff: https://reviews.apache.org/r/33136/diff/
>
>
> Testing
> -------
>
> Unit testcase clean run. (except for disabled testcases. See DRILL-2630, DRILL-2761, etc)
>
>
> Thanks,
>
> Jinfeng Ni
>
>
Re: Review Request 33136: DRILL-1384: Rebase Drill's forked Optiq
library onto Calcite release 1.0.
Posted by Julian Hyde <ju...@hydromatic.net>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33136/#review80902
-----------------------------------------------------------
Ship it!
Difficult to tell the rationale behind each change, but all of the changes seem to make sense.
I noticed that you had to move to the version of ProjectRemoveRule.isTrivial with a boolean parameter. That was temporary, and was removed shortly afterwards. See CALCITE-575 and CALCITE-577. Unavoidable, I suppose.
At a later point, consider renaming Drill's RelNodes and Rules to match Calcite's new naming scheme. Strictly optional of course.
- Julian Hyde
On April 18, 2015, 8:34 p.m., Jinfeng Ni wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33136/
> -----------------------------------------------------------
>
> (Updated April 18, 2015, 8:34 p.m.)
>
>
> Review request for drill and Aman Sinha.
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> Drill currently uses a forked Optiq (Renamed to Calcite) version, dated back in July 2014. The forked version has 10-20 Drill specific patches. However, we did not rebase the forked version onto the on-going Calcite release. As such, Drill misses some bug fixes/new feature development on Calcite side.
>
> This patch is trying to rebase the Drill's forked version onto Calcite release 1.0. (More precisely, on commit of CALCITE-603, when this rebasing work started).
>
> Most of changes happen due to Calcite's package structure / renaming (See CALCITE-296].
>
> Some Drill specific changes:
>
> 1. Provide a Drill specifc RelDataTypeSystem, to support decimal with precision/scale up to 38.
> 2. Modify Drill's parser, to allow * in Compound Identifier.
> 3. Provide Drill specific FilterJoinRule, to enforce Drill only support equal-join in JOIN operator.
> 4. Modify Drill costing comparison, such that the costing oder is a total order when compare different plans.
> 5. Modify costing estimation for Drill Project operator.
> 6. Use a ProjectRemove rule, such that it will honor parent's output field name.
> 7. Modify Calcite's Frameworks/planner interface, such that Drill will use validatedRowType to construct a top-level project, to ensure the final output field is what the query specified. (Calcite could inject "$F0", or "$EXPR0 into converted RelNode tree, in Sql2RelConverter)
> 8. Fix couple of Drill unit test cases, since the expected result by query semantics are not fixed.
> 9. Some type-related to Drill Sql operators.
>
> Some impact of such rebasing.
> 1. TPCH Q16, or query with NOT IN predicate involving NULLABLE column could hit CanNotPlanException.
>
> The previous plan for Q16, although return the correct result, is not valid, when the column is nullable. See DRILL-1957 OR CALCITE-373.
>
> 2. Plan changed in some TPCH queries, which may cause timeout for Q5 in TPCH scale factor 100 run.
> We probably need continue refine the costing estimation formula, especially for Join operator.
>
>
> Diffs
> -----
>
> common/pom.xml 525b533
> contrib/storage-hive/core/pom.xml 9bd6293
> exec/java-exec/pom.xml f5313ca
> exec/java-exec/src/main/codegen/includes/compoundIdentifier.ftl 50d8c20
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 3b3aa1a
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 7cf98cd
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java bbe7cf3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillCostBase.java 87a1ea3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterJoinRules.java PRE-CREATION
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 14ea894
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java fcfced2
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillReduceExpressionsRule.java 29175e5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 84a0b51
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java f63057f
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlOperator.java 7bd48c8
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java a17a604
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 5ee502d
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ExplainHandler.java 1cce6a5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/SqlHandlerUtil.java c347bef
> exec/jdbc-all/pom.xml b369aed
> exec/jdbc/pom.xml b4ec758
> pom.xml f6bcd91
>
> Diff: https://reviews.apache.org/r/33136/diff/
>
>
> Testing
> -------
>
> Unit testcase clean run. (except for disabled testcases. See DRILL-2630, DRILL-2761, etc)
>
>
> Thanks,
>
> Jinfeng Ni
>
>
Re: Review Request 33136: DRILL-1384: Rebase Drill's forked Optiq
library onto Calcite release 1.0.
Posted by Aman Sinha <as...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33136/#review80631
-----------------------------------------------------------
Ship it!
Ship It!
- Aman Sinha
On April 18, 2015, 8:34 p.m., Jinfeng Ni wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33136/
> -----------------------------------------------------------
>
> (Updated April 18, 2015, 8:34 p.m.)
>
>
> Review request for drill and Aman Sinha.
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> Drill currently uses a forked Optiq (Renamed to Calcite) version, dated back in July 2014. The forked version has 10-20 Drill specific patches. However, we did not rebase the forked version onto the on-going Calcite release. As such, Drill misses some bug fixes/new feature development on Calcite side.
>
> This patch is trying to rebase the Drill's forked version onto Calcite release 1.0. (More precisely, on commit of CALCITE-603, when this rebasing work started).
>
> Most of changes happen due to Calcite's package structure / renaming (See CALCITE-296].
>
> Some Drill specific changes:
>
> 1. Provide a Drill specifc RelDataTypeSystem, to support decimal with precision/scale up to 38.
> 2. Modify Drill's parser, to allow * in Compound Identifier.
> 3. Provide Drill specific FilterJoinRule, to enforce Drill only support equal-join in JOIN operator.
> 4. Modify Drill costing comparison, such that the costing oder is a total order when compare different plans.
> 5. Modify costing estimation for Drill Project operator.
> 6. Use a ProjectRemove rule, such that it will honor parent's output field name.
> 7. Modify Calcite's Frameworks/planner interface, such that Drill will use validatedRowType to construct a top-level project, to ensure the final output field is what the query specified. (Calcite could inject "$F0", or "$EXPR0 into converted RelNode tree, in Sql2RelConverter)
> 8. Fix couple of Drill unit test cases, since the expected result by query semantics are not fixed.
> 9. Some type-related to Drill Sql operators.
>
> Some impact of such rebasing.
> 1. TPCH Q16, or query with NOT IN predicate involving NULLABLE column could hit CanNotPlanException.
>
> The previous plan for Q16, although return the correct result, is not valid, when the column is nullable. See DRILL-1957 OR CALCITE-373.
>
> 2. Plan changed in some TPCH queries, which may cause timeout for Q5 in TPCH scale factor 100 run.
> We probably need continue refine the costing estimation formula, especially for Join operator.
>
>
> Diffs
> -----
>
> common/pom.xml 525b533
> contrib/storage-hive/core/pom.xml 9bd6293
> exec/java-exec/pom.xml f5313ca
> exec/java-exec/src/main/codegen/includes/compoundIdentifier.ftl 50d8c20
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 3b3aa1a
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 7cf98cd
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java bbe7cf3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillCostBase.java 87a1ea3
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterJoinRules.java PRE-CREATION
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 14ea894
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java fcfced2
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillReduceExpressionsRule.java 29175e5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 84a0b51
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java f63057f
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlOperator.java 7bd48c8
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java a17a604
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 5ee502d
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ExplainHandler.java 1cce6a5
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/SqlHandlerUtil.java c347bef
> exec/jdbc-all/pom.xml b369aed
> exec/jdbc/pom.xml b4ec758
> pom.xml f6bcd91
>
> Diff: https://reviews.apache.org/r/33136/diff/
>
>
> Testing
> -------
>
> Unit testcase clean run. (except for disabled testcases. See DRILL-2630, DRILL-2761, etc)
>
>
> Thanks,
>
> Jinfeng Ni
>
>