Posted to issues@drill.apache.org by "Vitalii Diravka (JIRA)" <ji...@apache.org> on 2018/05/21 14:20:00 UTC
[jira] [Updated] (DRILL-6371) Use FilterSetOpTransposeRule,
DrillProjectSetOpTransposeRule in main logical stage
[ https://issues.apache.org/jira/browse/DRILL-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vitalii Diravka updated DRILL-6371:
-----------------------------------
Description:
FilterSetOpTransposeRule and ProjectSetOpTransposeRule are used as of DRILL-3855.
They currently run in HepPlanner, but if they are also enabled in the main logical planning stage for the Volcano planner, these rules will cover more cases.
For example:
{code:java}
WITH year_total_1
     AS (SELECT c.r_regionkey customer_id,
                1 year_total
         FROM cp.`tpch/region.parquet` c
         UNION ALL
         SELECT c.n_nationkey customer_id,
                1 year_total
         FROM cp.`tpch/nation.parquet` c),
     year_total_2
     AS (SELECT c.r_regionkey customer_id,
                1 year_total
         FROM cp.`tpch/region.parquet` c
         UNION ALL
         SELECT c.n_nationkey customer_id,
                1 year_total
         FROM cp.`tpch/nation.parquet` c)
SELECT count(t_w_firstyear.customer_id) as ct
FROM year_total_1 t_w_firstyear,
     year_total_2 t_w_secyear
WHERE t_w_firstyear.year_total = t_w_secyear.year_total
      AND t_w_firstyear.year_total > 0 and t_w_secyear.year_total > 0
{code}
Current plan after performing rules:
{code:java}
LogicalAggregate(group=[{}], ct=[COUNT($0)])
  LogicalProject(customer_id=[$0])
    LogicalFilter(condition=[AND(=($1, $3), >($1, 0), >($3, 0))])
      LogicalJoin(condition=[true], joinType=[inner])
        LogicalUnion(all=[true])
          LogicalProject(customer_id=[$1], year_total=[1])
            EnumerableTableScan(table=[[cp, tpch/region.parquet]])
          LogicalProject(customer_id=[$1], year_total=[1])
            EnumerableTableScan(table=[[cp, tpch/nation.parquet]])
        LogicalUnion(all=[true])
          LogicalProject(customer_id=[$1], year_total=[1])
            EnumerableTableScan(table=[[cp, tpch/region.parquet]])
          LogicalProject(customer_id=[$1], year_total=[1])
            EnumerableTableScan(table=[[cp, tpch/nation.parquet]])
{code}
Since the LogicalFilter sits on top of the LogicalJoin rather than directly on top of a LogicalUnion, FilterSetOpTransposeRule does not fire.
FilterJoinRule in the main Drill logical stage later pushes the LogicalFilter below the join, but by then the stage that runs FilterSetOpTransposeRule has already finished.
That's why FilterSetOpTransposeRule and ProjectSetOpTransposeRule should also be used in the main Drill logical stage with the Volcano planner.
Currently, using them in the Volcano planner can cause infinite loops (CALCITE-1271); this can be resolved once CALCITE-2223 is fixed.
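
The ordering problem described above can be sketched with a small, self-contained Java model. Everything in it (the Node class, filterIntoJoin, filterBelowUnion, runStage, and the "L." prefix marking a predicate on the left join input) is hypothetical and exists only for illustration; it is not the Calcite or Drill planner API. It does not model Volcano's cost-based search or the CALCITE-1271 loop. It only contrasts running the two rewrites in separate stages with running them in one stage.

```java
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

final class PushdownSketch {
    /** Toy plan node for illustration only; not Calcite's RelNode. */
    static final class Node {
        final String op;         // "Scan", "Union", "Join" or "Filter"
        final String detail;     // table name or predicate text, may be empty
        final List<Node> inputs;

        Node(String op, String detail, Node... inputs) {
            this.op = op;
            this.detail = detail;
            this.inputs = List.of(inputs);
        }

        @Override public String toString() {
            String head = detail.isEmpty() ? op : op + "{" + detail + "}";
            if (inputs.isEmpty()) return head;
            return head + inputs.stream().map(Node::toString)
                    .collect(Collectors.joining(", ", "(", ")"));
        }
    }

    // Idea behind FilterJoinRule (simplified): a filter that only references
    // the left join input (assumed "L." prefix) moves onto that input.
    static Node filterIntoJoin(Node n) {
        if (n.op.equals("Filter") && n.detail.startsWith("L.")
                && n.inputs.get(0).op.equals("Join")) {
            Node join = n.inputs.get(0);
            Node pushed = new Node("Filter", n.detail.substring(2), join.inputs.get(0));
            return new Node("Join", "", pushed, join.inputs.get(1));
        }
        return n;
    }

    // Idea behind FilterSetOpTransposeRule (simplified): a filter directly on
    // top of a union is copied onto every union input.
    static Node filterBelowUnion(Node n) {
        if (n.op.equals("Filter") && n.inputs.get(0).op.equals("Union")) {
            Node[] branches = n.inputs.get(0).inputs.stream()
                    .map(in -> new Node("Filter", n.detail, in))
                    .toArray(Node[]::new);
            return new Node("Union", "", branches);
        }
        return n;
    }

    // One top-down pass: try every rule at this node, then recurse into inputs.
    static Node pass(Node n, List<UnaryOperator<Node>> rules) {
        for (UnaryOperator<Node> rule : rules) n = rule.apply(n);
        Node[] kids = new Node[n.inputs.size()];
        for (int i = 0; i < kids.length; i++) kids[i] = pass(n.inputs.get(i), rules);
        return new Node(n.op, n.detail, kids);
    }

    // A "stage" repeats passes with a fixed rule set until the plan stops changing.
    static Node runStage(Node root, List<UnaryOperator<Node>> rules) {
        Node current = root;
        Node next = pass(current, rules);
        while (!next.toString().equals(current.toString())) {
            current = next;
            next = pass(current, rules);
        }
        return current;
    }

    // Filter over a cross join of two UNION ALL inputs, as in the plan above.
    static Node examplePlan() {
        Node left = new Node("Union", "", new Node("Scan", "region"), new Node("Scan", "nation"));
        Node right = new Node("Union", "", new Node("Scan", "region"), new Node("Scan", "nation"));
        return new Node("Filter", "L.year_total > 0", new Node("Join", "", left, right));
    }

    // Union rule in an earlier stage, join rule in a later stage.
    static Node staged(Node plan) {
        UnaryOperator<Node> unionRule = PushdownSketch::filterBelowUnion;
        UnaryOperator<Node> joinRule = PushdownSketch::filterIntoJoin;
        return runStage(runStage(plan, List.of(unionRule)), List.of(joinRule));
    }

    // Both rules available in the same stage.
    static Node combined(Node plan) {
        UnaryOperator<Node> unionRule = PushdownSketch::filterBelowUnion;
        UnaryOperator<Node> joinRule = PushdownSketch::filterIntoJoin;
        return runStage(plan, List.of(joinRule, unionRule));
    }

    public static void main(String[] args) {
        System.out.println("staged:   " + staged(examplePlan()));
        System.out.println("combined: " + combined(examplePlan()));
    }
}
```

In this toy model the staged run leaves the filter sitting on top of the left union, because the union rule had nothing to match when its stage ran. The combined run pushes the filter onto each scan, which is the effect this issue asks for in the main logical stage.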
was:
FilterSetOpTransposeRule, ProjectSetOpTransposeRule are leveraged in DRILL-3855.
They are used in HepPlanner, but if they additionally will be enabled in main logical planning stage for Volcano planner, more cases will be covered with these rules.
For example:
{code}
WITH year_total_1
AS (SELECT c.r_regionkey customer_id,
1 year_total
FROM cp.`tpch/region.parquet` c
UNION ALL
SELECT c.n_nationkey customer_id,
1 year_total
FROM cp.`tpch/nation.parquet` c),
year_total_2
AS (SELECT c.r_regionkey customer_id,
1 year_total
FROM cp.`tpch/region.parquet` c
UNION ALL
SELECT c.n_nationkey customer_id,
1 year_total
FROM cp.`tpch/nation.parquet` c)
SELECT count(t_w_firstyear.customer_id) as ct
FROM year_total_1 t_w_firstyear,
year_total_2 t_w_secyear
WHERE t_w_firstyear.year_total = t_w_secyear.year_total
AND t_w_firstyear.year_total > 0 and t_w_secyear.year_total > 0
{code}
Current plan after performing rules:
{code}
LogicalAggregate(group=[{}], ct=[COUNT($0)])
LogicalProject(customer_id=[$0])
LogicalFilter(condition=[AND(=($1, $3), >($1, 0), >($3, 0))])
LogicalJoin(condition=[true], joinType=[inner])
LogicalUnion(all=[true])
LogicalProject(customer_id=[$1], year_total=[1])
EnumerableTableScan(table=[[cp, tpch/region.parquet]])
LogicalProject(customer_id=[$1], year_total=[1])
EnumerableTableScan(table=[[cp, tpch/nation.parquet]])
LogicalUnion(all=[true])
LogicalProject(customer_id=[$1], year_total=[1])
EnumerableTableScan(table=[[cp, tpch/region.parquet]])
LogicalProject(customer_id=[$1], year_total=[1])
EnumerableTableScan(table=[[cp, tpch/nation.parquet]])
{code}
Since LogicalFilter isn't under LogicalUnion the FilterSetOpTransposeRule is not performed.
FilterJoinRule from main Drill logical stage pushes LogicalFilter below, but the stage with FilterSetOpTransposeRule is already finished.
That's why FilterSetOpTransposeRule and ProjectSetOpTransposeRule should be used in Drill main logical stage with Volcano planner.
Currently using them in Volcano Planner can cause infinite loops - CALCITE-1271
> Use FilterSetOpTransposeRule, DrillProjectSetOpTransposeRule in main logical stage
> ----------------------------------------------------------------------------------
>
> Key: DRILL-6371
> URL: https://issues.apache.org/jira/browse/DRILL-6371
> Project: Apache Drill
> Issue Type: Improvement
> Components: Query Planning & Optimization
> Affects Versions: 1.13.0
> Reporter: Vitalii Diravka
> Priority: Minor
> Fix For: Future