You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2015/04/10 23:53:12 UTC
[jira] [Updated] (DRILL-2454) Support for local join filter in
outer joins
[ https://issues.apache.org/jira/browse/DRILL-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aman Sinha updated DRILL-2454:
------------------------------
Fix Version/s: (was: 0.9.0)
1.0.0
> Support for local join filter in outer joins
> --------------------------------------------
>
> Key: DRILL-2454
> URL: https://issues.apache.org/jira/browse/DRILL-2454
> Project: Apache Drill
> Issue Type: Improvement
> Components: Query Planning & Optimization
> Reporter: Victoria Markman
> Assignee: Jinfeng Ni
> Fix For: 1.0.0
>
>
> Currently in the case of outer join, we can't have local join filter on the preserving side of the outer join. "Can not plan" exception will be thrown.
> Below you will find examples of queries that are going to fail:
> -- Join filter on the left side of the left outer join
> {code:sql}
> select
> count(*)
> from
> cp.`tpch/orders.parquet` o
> left outer join
> cp.`tpch/lineitem.parquet` l
> on (o.o_orderkey = l.l_orderkey and o.o_orderkey = 1);
> {code}
> -- Join filter on the right side of the right outer join
> {code:sql}
> select
> count(*)
> from
> cp.`tpch/lineitem.parquet` l
> right outer join
> cp.`tpch/orders.parquet` o
> on (o.o_orderkey = l.l_orderkey and o.o_orderkey >= 1)
> where
> o.o_orderkey = 1;
> {code}
> Full outer join with the join filter on the left side:
> {code:sql}
> select
> count(*)
> from
> cp.`tpch/orders.parquet` o
> full outer join
> cp.`tpch/lineitem.parquet` l
> on (o.o_orderkey = l.l_orderkey and o.o_orderkey >= 1)
> ;
> {code}
> Full outer join with the join filter on the right side:
> {code:sql}
> select
> count(*)
> from
> cp.`tpch/orders.parquet` o
> full outer join
> cp.`tpch/lineitem.parquet` l
> on (o.o_orderkey = l.l_orderkey and l.l_orderkey >= 1)
> ;
> {code}
> The only case that will currently work is an inner join:
> {code:sql}
> select
> count(*)
> from
> cp.`tpch/orders.parquet` o
> inner join
> cp.`tpch/lineitem.parquet` l
> on (o.o_orderkey = l.l_orderkey and o.o_orderkey >= 1)
> ;
> {code}
> Or outer join with the local join filter on "non-preserving side" of the outer join:
> {code:sql}
> select
> count(*)
> from
> cp.`tpch/orders.parquet` o
> left outer join
> cp.`tpch/lineitem.parquet` l
> on (o.o_orderkey = l.l_orderkey and l.l_orderkey >= 1)
> ;
> {code}
> As a reminder, queries with filter in the ON clause of the join condition are semantically different from those with the same condition in the WHERE clause.
> This query will return an order and its lines, if any, for order number =100:
> {code:sql}
> select *
> from
> cp.`tpch/orders.parquet` o
> left outer join
> cp.`tpch/lineitem.parquet` l
> on (o.o_orderkey = l.l_orderkey)
> where o.o_orderkey = 100
> ;
> {code}
> This query will return ALL orders, but only order 100 will have any lines associated with it:
> {code:sql}
> select *
> from
> cp.`tpch/orders.parquet` o
> left outer join
> cp.`tpch/lineitem.parquet` l
> on (o.o_orderkey = l.l_orderkey and o.o_orderkey = 100)
> ;
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)