Posted to dev@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2015/09/18 19:58:04 UTC
[jira] [Created] (DRILL-3803) Support inequality filter evaluation as part of join operators
Aman Sinha created DRILL-3803:
---------------------------------
Summary: Support inequality filter evaluation as part of join operators
Key: DRILL-3803
URL: https://issues.apache.org/jira/browse/DRILL-3803
Project: Apache Drill
Issue Type: Improvement
Components: Execution - Relational Operators
Reporter: Aman Sinha
Assignee: Aman Sinha
Currently, Drill evaluates an inequality join condition in a separate Filter operator, after the join has applied the equality condition. See the plan below:
{code}
0: jdbc:drill:zk=local> explain plan for select n1.n_name from cp.`tpch/nation.parquet` n1 inner join cp.`tpch/region.parquet` n2 on n1.n_nationkey = n2.n_nationkey and n1.n_regionkey < n2.n_regionkey;
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(n_name=[$2])
00-02 SelectionVectorRemover
00-03 Filter(condition=[<($1, $4)])
00-04 HashJoin(condition=[=($0, $3)], joinType=[inner])
00-06 Project(n_nationkey=[$2], n_regionkey=[$0], n_name=[$1])
00-08 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/nation.parquet]], selectionRoot=classpath:/tpch/nation.parquet, numFiles=1, columns=[`n_nationkey`, `n_regionkey`, `n_name`]]])
00-05 Project(n_nationkey0=[$0], n_regionkey0=[$1])
00-07 Project(n_nationkey=[$1], n_regionkey=[$0])
00-09 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/region.parquet]], selectionRoot=classpath:/tpch/region.parquet, numFiles=1, columns=[`n_nationkey`, `n_regionkey`]]])
{code}
Suppose the inequality filter is highly selective but the equality join's output cardinality is large. The join then materializes many rows that the Filter immediately discards. It would be substantially better to push this filter into the join and evaluate both the equality and inequality conditions as part of the join.
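To illustrate the difference, here is a minimal Python sketch, not Drill's implementation. It compares a hash join followed by a separate filter against a hash join that checks the inequality while probing. The function names, the row layout (`k` as the equality key, `v` as the inequality column) and the sample data are all assumptions made up for this example.

```python
from collections import defaultdict


def build_hash_table(rows, key):
    """Group the build-side rows by their equality join key."""
    table = defaultdict(list)
    for row in rows:
        table[row[key]].append(row)
    return table


def join_then_filter(left, right, key, residual):
    """Current plan shape: equality join first, inequality filter afterwards."""
    table = build_hash_table(right, key)
    # Every equality match is materialized before the filter sees it.
    joined = [(l, r) for l in left for r in table.get(l[key], ())]
    result = [(l, r) for (l, r) in joined if residual(l, r)]
    return result, len(joined)


def join_with_residual(left, right, key, residual):
    """Proposed plan shape: inequality is checked during the probe."""
    table = build_hash_table(right, key)
    # Only pairs passing both conditions are ever materialized.
    result = [
        (l, r)
        for l in left
        for r in table.get(l[key], ())
        if residual(l, r)
    ]
    return result, len(result)


# Hypothetical data: two distinct key values, so the equality join fans out.
left = [{"k": i % 2, "v": i} for i in range(100)]
right = [{"k": i % 2, "v": i} for i in range(100)]


def residual(l, r):
    # Highly selective inequality condition (an assumption for this example).
    return l["v"] + 96 < r["v"]


filtered, materialized_separate = join_then_filter(left, right, "k", residual)
fused, materialized_fused = join_with_residual(left, right, "k", residual)
```

Both strategies return the same rows, but the first materializes every equality match before filtering, while the second materializes only the rows that survive the inequality. A MergeJoin variant would apply the same residual check while advancing through the sorted inputs.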
This is an enhancement. We may decide at a later time to split this into two JIRAs: one for HashJoin and one for MergeJoin.