Posted to dev@hive.apache.org by "Pengcheng Xiong (JIRA)" <ji...@apache.org> on 2015/03/23 22:04:53 UTC

[jira] [Created] (HIVE-10062) HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data

Pengcheng Xiong created HIVE-10062:
--------------------------------------

             Summary: HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
                 Key: HIVE-10062
                 URL: https://issues.apache.org/jira/browse/HIVE-10062
             Project: Hive
          Issue Type: Bug
            Reporter: Pengcheng Xiong
            Priority: Critical


In the q.test environment, with the src table available, execute the following queries:
{code}
CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE;

CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE;

FROM (SELECT 'tst1' AS key, CAST(COUNT(1) AS STRING) AS value FROM src s1
      UNION ALL
      SELECT s2.key AS key, s2.value AS value FROM src s2) unionsrc
INSERT OVERWRITE TABLE DEST1
  SELECT unionsrc.key, COUNT(DISTINCT SUBSTR(unionsrc.value,5))
  GROUP BY unionsrc.key
INSERT OVERWRITE TABLE DEST2
  SELECT unionsrc.key, unionsrc.value, COUNT(DISTINCT SUBSTR(unionsrc.value,5))
  GROUP BY unionsrc.key, unionsrc.value;

select * from DEST1;
select * from DEST2;
{code}

DEST1 and DEST2 should both have 310 rows. However, DEST2 has only 1 row: "tst1    500     1".
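
To compare actual and expected row counts directly, queries along the following lines could be run after the multi-insert. This is only a sketch: it assumes that no key in src equals 'tst1', so the expected count for each table is the number of distinct groups in src plus one row for the 'tst1' branch of the union.
{code}
-- Actual row counts after the multi-insert (both expected to be 310).
SELECT COUNT(*) FROM DEST1;
SELECT COUNT(*) FROM DEST2;

-- Expected counts derived from src: distinct groups plus the single 'tst1' row.
-- Assumes 'tst1' does not already occur as a key in src.
SELECT COUNT(DISTINCT key) + 1 FROM src;
SELECT COUNT(DISTINCT key, value) + 1 FROM src;
{code}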



