You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Parth Chandra (JIRA)" <ji...@apache.org> on 2015/04/21 22:05:59 UTC

[jira] [Updated] (DRILL-280) Multiple tables join may get join sequence where two tables joined do not have any join condition.

     [ https://issues.apache.org/jira/browse/DRILL-280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Parth Chandra updated DRILL-280:
--------------------------------
    Component/s: Execution - Relational Operators

> Multiple tables join may get join sequence where two tables joined do not have any join condition.  
> ----------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-280
>                 URL: https://issues.apache.org/jira/browse/DRILL-280
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>            Reporter: Jinfeng Ni
>            Assignee: Jinfeng Ni
>            Priority: Critical
>             Fix For: 0.4.0
>
>
> When we have > 2 tables joined together, if we put the join predicates in the WHERE clause, the query may get a join sequence where two tables joined together do not have any join condition.  This will lead to error in Merge Join OP, which assume join always have a join condition.
> For example, the following query:
> SELECT S.S_ACCTBAL, S.S_NAME
> FROM
>  ( SELECT _MAP['P_PARTKEY'] as P_PARTKEY,
>           _MAP['P_MFGR'] as P_MFGR
>    FROM "/Users/jni//work/tpc-h-parquet/part") P,
>  ( SELECT _MAP['S_SUPPKEY'] AS S_SUPPKEY,
>           _MAP['S_NATIONKEY'] AS S_NATIONKEY,
>           _MAP['S_ACCTBAL'] AS S_ACCTBAL,
>           _MAP['S_NAME']  AS S_NAME,
>           _MAP['S_ADDRESS'] AS S_ADDRESS,
>           _MAP['S_PHONE'] AS S_PHONE,
>           _MAP['S_COMMENT'] AS S_COMMENT
>    FROM "/Users/jni//work/tpc-h-parquet/supplier") S,
>  (SELECT _MAP['PS_PARTKEY'] AS PS_PARTKEY,
>          _MAP['PS_SUPPKEY'] AS PS_SUPPKEY
>   FROM "/Users/jni//work/tpc-h-parquet/partsupp") PS
>  WHERE P.P_PARTKEY  = PS.PS_PARTKEY and
>       S.S_SUPPKEY = PS.PS_SUPPKEY
>  LIMIT 100;
> The join sequence in logical and physical plan is : P -> S -> PS. However, since there is no direct predicate between P and S, the Merge Join between P and S will have no join conditions. This lead to the following error in MergeJoinBatch.java:
> "Failure while setting up Foreman. < ArrayIndexOutOfBoundsException:[ 0 ]
> Since almost all TPC-H queries have multiple tables joins, it's important that we get this issue resolved, in order to run TPC-H queries.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)