Posted to issues@hive.apache.org by "GuangMing Lu (Jira)" <ji...@apache.org> on 2022/03/09 13:16:00 UTC
[jira] [Updated] (HIVE-26018) The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR
[ https://issues.apache.org/jira/browse/HIVE-26018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
GuangMing Lu updated HIVE-26018:
--------------------------------
Description:
The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR, and the Tez result is not correct. For example:
CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc;
CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc;
insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333');
insert into T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333');
SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);
Hive on Tez result: wrong
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
| NULL   | ddd    |
+--------+--------+
Hive on MR result: right
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
+--------+--------+
SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);
Hive on Tez result: wrong
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
| NULL   | ddd    |
+--------+--------+
Hive on MR result: right
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| ccc    | ccc    |
+--------+--------+
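One way to compare the two engines within a single session is to switch hive.execution.engine between runs. The following is a minimal sketch, assuming both the mr and tez engines are available on the cluster; it reuses only the tables and queries from this description.

```sql
-- Run both UNIQUEJOIN queries on MapReduce
-- (assumption: the mr engine is still enabled on this cluster).
SET hive.execution.engine=mr;
SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);
SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);

-- Repeat the same queries on Tez and compare the returned rows.
SET hive.execution.engine=tez;
SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);
SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);
```

Prefixing each query with EXPLAIN under both engines may also help show where the plans diverge.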
was:
The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR, and the Tez result is not correct. For example:
CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc;
CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc;
insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333');
insert into T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333');
SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);
Hive on Tez result: wrong
+--------+--------+
| a.key | b.key |
+--------+--------+
| aaa | aaa |
| bbb | NULL |
| ccc | ccc |
| NULL | ddd |
+--------+--------+
Hive on MR result: right
+--------+--------+
| a.key | b.key |
+--------+--------+
| aaa | aaa |
| bbb | NULL |
| ccc | ccc |
+--------+--------+
SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);
Hive on Tez result: wrong
+--------+--------+
| a.key | b.key |
+--------+--------+
| aaa | aaa |
| bbb | NULL |
| ccc | ccc |
| NULL | ddd |
+--------+--------+
Hive on MR result: right
+--------+--------+
| a.key | b.key |
+--------+--------+
| aaa | aaa |
| ccc | ccc |
+--------+--------+
> The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR
> -----------------------------------------------------------------------
>
> Key: HIVE-26018
> URL: https://issues.apache.org/jira/browse/HIVE-26018
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Affects Versions: 3.1.0, 4.0.0
> Reporter: GuangMing Lu
> Priority: Major
> Attachments: image-2022-03-09-21-08-17-835.png
>
>
> The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR, and the Tez result is not correct. For example:
> CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc;
> CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc;
> insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333');
> insert into T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333');
> SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);
> Hive on Tez result: wrong
> +--------+--------+
> | a.key  | b.key  |
> +--------+--------+
> | aaa    | aaa    |
> | bbb    | NULL   |
> | ccc    | ccc    |
> | NULL   | ddd    |
> +--------+--------+
> Hive on MR result: right
> +--------+--------+
> | a.key  | b.key  |
> +--------+--------+
> | aaa    | aaa    |
> | bbb    | NULL   |
> | ccc    | ccc    |
> +--------+--------+
> SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);
> Hive on Tez result: wrong
> +--------+--------+
> | a.key  | b.key  |
> +--------+--------+
> | aaa    | aaa    |
> | bbb    | NULL   |
> | ccc    | ccc    |
> | NULL   | ddd    |
> +--------+--------+
> Hive on MR result: right
> +--------+--------+
> | a.key  | b.key  |
> +--------+--------+
> | aaa    | aaa    |
> | ccc    | ccc    |
> +--------+--------+
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)