You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2015/06/19 20:35:01 UTC

[jira] [Resolved] (HIVE-11034) Joining multiple tables producing different results with different order of join

     [ https://issues.apache.org/jira/browse/HIVE-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang resolved HIVE-11034.
--------------------------------
    Resolution: Duplicate

I close this as a dupe. However, please feel free to reopen if the problem persists.

> Joining multiple tables producing different results with different order of join
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-11034
>                 URL: https://issues.apache.org/jira/browse/HIVE-11034
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.13.0
>         Environment: Linux 2.6.32-279.19.1.el6.x86_64
>            Reporter: Srini Pindi
>            Priority: Critical
>
> {panel}
> Join between tables with different join columns from main table yielding wrong results in hive. 
> Changing the order of the joins between main table and other tables is producing different results.
> {panel}
> Please see below for the steps to reproduce the issue:
> 1. Create tables as follows:
>     create table p(ck string, email string);
>     create table a1(ck string, flag string);
>     create table a2(email string, flag string);
>     create table a3(ck string, flag string);
> 2. Load data into the tables as follows:
>     P
>     ||ck||email||
>     |10|e10|
>     |20|e20|
>     |30|e30|
>     |40|e40|
>  
>     A1
>     ||ck||flag||
>     |10|N|
>     |20|Y|
>     |30|Y|
>     |40|Y|
>     A2
>     ||email||flag||
>     |e10|Y|
>     |e20|N|
>     |e30|Y|
>     |e40|Y|
>  
>     A3
>     ||ck||flag||
>     |10|Y|
>     |20|Y|
>     |30|N|
>     |40|Y|
>   
>  3. Good query:
> {panel}
> select p.ck 
> from p 
> left outer join a1 on p.ck = a1.ck 
> left outer join a3 on p.ck = a3.ck 
> left outer join a2 on p.email = a2.email 
> where a1.flag = 'Y'
>   and a3.flag = 'Y'
>   and a2.flag = 'Y'
> ;
> {panel}
> and results are
>   40
> 4. Bad query
> {panel}
> select p.ck 
> from p 
> left outer join a1 on p.ck = a1.ck 
> left outer join a2 on p.email = a2.email 
> left outer join a3 on p.ck = a3.ck 
> where a1.flag = 'Y'
>   and a2.flag = 'Y'
>   and a3.flag = 'Y'
> ;
> {panel}
>  Producing results as:
>  30
>  40



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)