Posted to dev@hive.apache.org by "Aihua Xu (JIRA)" <ji...@apache.org> on 2015/05/15 17:04:00 UTC

[jira] [Created] (HIVE-10720) Pig using HCatLoader to access RCFile and perform join but get incorrect result.

Aihua Xu created HIVE-10720:
-------------------------------

             Summary: Pig using HCatLoader to access RCFile and perform join but get incorrect result.
                 Key: HIVE-10720
                 URL: https://issues.apache.org/jira/browse/HIVE-10720
             Project: Hive
          Issue Type: Bug
          Components: HCatalog
    Affects Versions: 1.3.0
            Reporter: Aihua Xu


Hive script:
{noformat}
Create table tbl1 (key string, value string) stored as rcfile;
Create table tbl2 (key string, value string);
insert into tbl1 values('1', 'value1');
insert into tbl2 values('1', 'value2');
{noformat}
Pig script:
{noformat}
tbl1 = LOAD 'tbl1' USING org.apache.hive.hcatalog.pig.HCatLoader();
tbl2 = LOAD 'tbl2' USING org.apache.hive.hcatalog.pig.HCatLoader();

src_tbl1 = FILTER tbl1 BY (key == '1');
prj_tbl1 = FOREACH src_tbl1 GENERATE
           key as tbl1_key,
           value as tbl1_value,
           '333' as tbl1_v1;
           
src_tbl2 = FILTER tbl2 BY (key == '1');
prj_tbl2 = FOREACH src_tbl2 GENERATE
           key as tbl2_key,
           value as tbl2_value;
           
result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
prj_result = FOREACH result 
      GENERATE  prj_tbl1::tbl1_key AS key1,
                prj_tbl1::tbl1_value AS value1,
                prj_tbl1::tbl1_v1 AS v1,
                prj_tbl2::tbl2_key AS key2,
                prj_tbl2::tbl2_value AS value2;
               
dump prj_result;
{noformat}

With this Pig script, we see various incorrect results, or even no result at all, when rows should be returned.
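
For reference, the expected output can be worked out by hand from the script. The following is a minimal Python sketch, not part of the original reproduction, that mirrors the filter, projection, and inner join. It assumes each table contains only the single row inserted above; the variable names simply follow the Pig aliases.

```python
# Reference model of the Pig script, using plain Python lists.
# Assumption: each table holds only the single row inserted in the report.
tbl1 = [{"key": "1", "value": "value1"}]  # stored as RCFile in the report
tbl2 = [{"key": "1", "value": "value2"}]

# FILTER ... BY (key == '1') followed by FOREACH ... GENERATE
prj_tbl1 = [
    {"tbl1_key": r["key"], "tbl1_value": r["value"], "tbl1_v1": "333"}
    for r in tbl1
    if r["key"] == "1"
]
prj_tbl2 = [
    {"tbl2_key": r["key"], "tbl2_value": r["value"]}
    for r in tbl2
    if r["key"] == "1"
]

# JOIN prj_tbl1 BY tbl1_key, prj_tbl2 BY tbl2_key (inner join),
# then the final FOREACH ... GENERATE projection.
prj_result = [
    (a["tbl1_key"], a["tbl1_value"], a["tbl1_v1"], b["tbl2_key"], b["tbl2_value"])
    for a in prj_tbl1
    for b in prj_tbl2
    if a["tbl1_key"] == b["tbl2_key"]
]

print(prj_result)  # [('1', 'value1', '333', '1', 'value2')]
```

Under that assumption, the dump would be expected to show a single tuple with the five fields key1, value1, v1, key2, and value2.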



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)