You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Vaibhav Gumashta (JIRA)" <ji...@apache.org> on 2016/06/21 16:40:58 UTC

[jira] [Commented] (HIVE-14027) NULL values produced by left outer join do not behave as NULL

    [ https://issues.apache.org/jira/browse/HIVE-14027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342133#comment-15342133 ] 

Vaibhav Gumashta commented on HIVE-14027:
-----------------------------------------

Explain output:
{code}
hive> explain select a.n, a.t, isnull(b.n), isnull(b.t) from (select * from biq998_bigint where n = 1) a left outer join  (select * from biq998_bigint where 1 = 2) b on a.n = b.n;
OK
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-3 depends on stages: Stage-4
  Stage-0 depends on stages: Stage-3

STAGE PLANS:
  Stage: Stage-4
    Map Reduce Local Work
      Alias -> Map Local Tables:
        $hdt$_1:biq998_bigint 
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        $hdt$_1:biq998_bigint 
          TableScan
            alias: biq998_bigint
            Statistics: Num rows: 2 Data size: 10 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: false (type: boolean)
              Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: n (type: bigint), t (type: string)
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                HashTable Sink Operator
                  keys:
                    0 1 (type: bigint)
                    1 _col0 (type: bigint)

  Stage: Stage-3
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: biq998_bigint
            Statistics: Num rows: 2 Data size: 10 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (n = 1) (type: boolean)
              Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: t (type: string)
                outputColumnNames: _col1
                Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                Map Join Operator
                  condition map:
                       Left Outer Join0 to 1
                  keys:
                    0 1 (type: bigint)
                    1 _col0 (type: bigint)
                  outputColumnNames: _col1, _col2, _col3
                  Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: 1 (type: bigint), _col1 (type: string), _col2 is null (type: boolean), _col3 is null (type: boolean)
                    outputColumnNames: _col0, _col1, _col2, _col3
                    Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                    File Output Operator
                      compressed: false
                      Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                      table:
                          input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                          output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                          serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink

Time taken: 0.637 seconds, Fetched: 70 row(s)
{code}

Looks like stage-4 after the filter operator should output 0 rows; stats shown are different.

> NULL values produced by left outer join do not behave as NULL
> -------------------------------------------------------------
>
>                 Key: HIVE-14027
>                 URL: https://issues.apache.org/jira/browse/HIVE-14027
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.2.1, 2.0.1
>            Reporter: Vaibhav Gumashta
>            Assignee: Vaibhav Gumashta
>
> Consider the following setup:
> {code}
> create table tbl (n bigint, t string); 
> insert into tbl values (1, 'one'); 
> insert into tbl values(2, 'two');
> select a.n, a.t, isnull(b.n), isnull(b.t) from (select * from tbl where n = 1) a  left outer join  (select * from tbl where 1 = 2) b on a.n = b.n;
> 1    one    false    true
> {code}
> The query should return true for isnull(b.n).
> I've tested by inserting a row with null value for the bigint column into tbl, and isnull returns true in that case. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)