You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2015/04/10 23:53:12 UTC

[jira] [Updated] (DRILL-2454) Support for local join filter in outer joins

     [ https://issues.apache.org/jira/browse/DRILL-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aman Sinha updated DRILL-2454:
------------------------------
    Fix Version/s:     (was: 0.9.0)
                   1.0.0

> Support for local join filter in outer joins
> --------------------------------------------
>
>                 Key: DRILL-2454
>                 URL: https://issues.apache.org/jira/browse/DRILL-2454
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Query Planning & Optimization
>            Reporter: Victoria Markman
>            Assignee: Jinfeng Ni
>             Fix For: 1.0.0
>
>
> Currently in the case of outer join, we can't have local join filter on the preserving side of the outer join. "Can not plan" exception will be thrown.
> Below you will find examples of queries that are going to fail:
> -- Join filter on the left side of the left outer join 
> {code:sql}
> select
>         count(*)
> from
>         cp.`tpch/orders.parquet` o
>                 left outer join
>         cp.`tpch/lineitem.parquet` l
> on      (o.o_orderkey = l.l_orderkey and o.o_orderkey = 1);
> {code}
> -- Join filter on the right side of the right outer join
> {code:sql}
> select
>         count(*)
> from
>         cp.`tpch/lineitem.parquet` l
>                 right outer join
>         cp.`tpch/orders.parquet` o
> on      (o.o_orderkey = l.l_orderkey and o.o_orderkey >= 1)
> where
>         o.o_orderkey = 1;
> {code}
> Full outer join with the join filter on the left side:
> {code:sql}
> select
>         count(*)
> from
>         cp.`tpch/orders.parquet` o
>                 full outer join
>         cp.`tpch/lineitem.parquet` l
> on      (o.o_orderkey = l.l_orderkey and o.o_orderkey >= 1)
> ;
> {code}
> Full outer join with the join filter on the right side:
> {code:sql}
> select
>         count(*)
> from
>         cp.`tpch/orders.parquet` o
>                 full outer join
>         cp.`tpch/lineitem.parquet` l
> on      (o.o_orderkey = l.l_orderkey and l.l_orderkey >= 1)
> ;
> {code}
> The only case that will currently work is an inner join:
> {code:sql}
> select
>         count(*)
> from
>         cp.`tpch/orders.parquet` o
>                 inner join
>         cp.`tpch/lineitem.parquet` l
> on      (o.o_orderkey = l.l_orderkey and o.o_orderkey >= 1)
> ;
> {code}
> Or outer join with the local join filter on "non-preserving side" of the outer join:
> {code:sql}
> select
>         count(*)
> from
>         cp.`tpch/orders.parquet` o
>                 left outer join
>         cp.`tpch/lineitem.parquet` l
> on      (o.o_orderkey = l.l_orderkey and l.l_orderkey >= 1)
> ;
> {code}
> As a reminder, queries with filter in the ON clause of the join condition are semantically different from those with the same condition in the WHERE clause.
> This query will return an order and its lines, if any, for order number =100:
> {code:sql}
> select  *
> from
>         cp.`tpch/orders.parquet` o
>                 left outer join
>         cp.`tpch/lineitem.parquet` l
> on      (o.o_orderkey = l.l_orderkey)
> where   o.o_orderkey = 100
> ;
> {code}
> This query will return ALL orders, but only order 100 will have any lines associated with it:
> {code:sql}
> select  *
> from
>         cp.`tpch/orders.parquet` o
>                 left outer join
>         cp.`tpch/lineitem.parquet` l
> on      (o.o_orderkey = l.l_orderkey and o.o_orderkey = 100)
> ;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)