You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2010/08/10 09:22:17 UTC

[jira] Commented: (HIVE-741) NULL is not handled correctly in join

    [ https://issues.apache.org/jira/browse/HIVE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896789#action_12896789 ] 

Amareshwari Sriramadasu commented on HIVE-741:
----------------------------------------------

bq. When 'null' is replaced by '' it works.
I see the same result even if 'null' is replaced with ''.

To reproduce the above, I created a table input1 with
{code}
hive> create table input1 (key int, value int);
{code}

Loaded the following input using Load data command
{noformat}
in.txt
^A35
12^A
10^A1000
10^A100
100^A100
{noformat}

I see the following output for join queries executed.
{noformat}

hive> select * from input1 a join input1 b on a.key=b.value;

Output:
NULL    35      12      NULL
100     100     10      100
100     100     100     100

Expected Output:
100     100     10      100
100     100     100     100

hive> select * from input1 a left outer join input1 b on a.key=b.value;

OUTPUT:
NULL    35      12      NULL
10      1000    NULL    NULL
10      100     NULL    NULL
12      NULL    NULL    NULL
100     100     10      100
100     100     100     100

Expected Output:
NULL    35      NULL    NULL
10      1000    NULL    NULL
10      100     NULL    NULL
12      NULL    NULL    NULL
100     100     10      100
100     100     100     100

hive> select * from input1 a right outer join input1 b on a.key=b.value;

OUTPUT:
NULL    35      12      NULL
NULL    NULL    NULL    35
100     100     10      100
100     100     100     100
NULL    NULL    10      1000

ExpectedOutput:
NULL    NULL    NULL    35
NULL    NULL    12      NULL
100     100     10      100
100     100     100     100
NULL    NULL    10      1000

{noformat}

Expected output is obtained from mysql db for a similar query.

Ning, if you are not working on the fix for this, I would like to contribute. Would need your help understanding join code also, as I'm a new to hive. 


> NULL is not handled correctly in join
> -------------------------------------
>
>                 Key: HIVE-741
>                 URL: https://issues.apache.org/jira/browse/HIVE-741
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>
> With the following data in table input4_cb:
> Key        Value
> ------       --------
> NULL     325
> 18          NULL
> The following query:
> {code}
> select * from input4_cb a join input4_cb b on a.key = b.value;
> {code}
> returns the following result:
> NULL    325    18   NULL
> The correct result should be empty set.
> When 'null' is replaced by '' it works.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.