You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Furcy Pin (JIRA)" <ji...@apache.org> on 2015/06/12 14:44:00 UTC
[jira] [Updated] (HIVE-10931) Wrong columns selected on multiple
joins
[ https://issues.apache.org/jira/browse/HIVE-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Furcy Pin updated HIVE-10931:
-----------------------------
Description:
The following set of queries :
{code:sql}
DROP TABLE IF EXISTS test1 ;
DROP TABLE IF EXISTS test2 ;
DROP TABLE IF EXISTS test3 ;
CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ;
INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,"A") ;
CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ;
INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,"X") ;
CREATE TABLE test3 (coL1 STRING) ;
INSERT INTO TABLE test3 VALUES ("A") ;
SELECT
T2.val
FROM test1 T1
LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,"") as val FROM test2) T2
ON T2.col1 = T1.col1
LEFT JOIN test3 T3
ON T3.col1 = T1.col6
;
{code}
will return this :
{noformat}
+----------+--+
| t2.val |
+----------+--+
| A |
+----------+--+
{noformat}
Obviously, this result is wrong as table `test2` contains a "X" and no "A".
This is the most minimal example we found of this issue, in particular
having less than 6 columns in the tables will work, for instance :
{code:sql}
SELECT
T2.val
FROM test1 T1
LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,"") as val FROM test2) T2
ON T2.col1 = T1.col1
LEFT JOIN test3 T3
ON T3.col1 = T1.col6
;
{code:sql}
(same query as before, but `col5` was removed from the select)
will return :
{noformat}
+----------+--+
| t2.val |
+----------+--+
| X |
+----------+--+
{noformat}
Removing the `COALESCE` also removes the bug...
was:
The following set of queries :
```
DROP TABLE IF EXISTS test1 ;
DROP TABLE IF EXISTS test2 ;
DROP TABLE IF EXISTS test3 ;
CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ;
INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,"A") ;
CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ;
INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,"X") ;
CREATE TABLE test3 (coL1 STRING) ;
INSERT INTO TABLE test3 VALUES ("A") ;
SELECT
T2.val
FROM test1 T1
LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,"") as val FROM test2) T2
ON T2.col1 = T1.col1
LEFT JOIN test3 T3
ON T3.col1 = T1.col6
;
```
will return this :
```
+----------+--+
| t2.val |
+----------+--+
| A |
+----------+--+
```
Obviously, this result is wrong as table `test2` contains a "X" and no "A".
This is the most minimal example we found of this issue, in particular
having less than 6 columns in the tables will work, for instance :
```
SELECT
T2.val
FROM test1 T1
LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,"") as val FROM test2) T2
ON T2.col1 = T1.col1
LEFT JOIN test3 T3
ON T3.col1 = T1.col6
;
```
(same query as before, but `col5` was removed from the select)
will return :
```
+----------+--+
| t2.val |
+----------+--+
| X |
+----------+--+
```
Removing the `COALESCE` also removes the bug...
> Wrong columns selected on multiple joins
> ----------------------------------------
>
> Key: HIVE-10931
> URL: https://issues.apache.org/jira/browse/HIVE-10931
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.1.0
> Environment: Cloudera cdh5.4.2
> Reporter: Furcy Pin
>
> The following set of queries :
> {code:sql}
> DROP TABLE IF EXISTS test1 ;
> DROP TABLE IF EXISTS test2 ;
> DROP TABLE IF EXISTS test3 ;
> CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ;
> INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,"A") ;
> CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ;
> INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,"X") ;
> CREATE TABLE test3 (coL1 STRING) ;
> INSERT INTO TABLE test3 VALUES ("A") ;
> SELECT
> T2.val
> FROM test1 T1
> LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,"") as val FROM test2) T2
> ON T2.col1 = T1.col1
> LEFT JOIN test3 T3
> ON T3.col1 = T1.col6
> ;
> {code}
> will return this :
> {noformat}
> +----------+--+
> | t2.val |
> +----------+--+
> | A |
> +----------+--+
> {noformat}
> Obviously, this result is wrong as table `test2` contains a "X" and no "A".
> This is the most minimal example we found of this issue, in particular
> having less than 6 columns in the tables will work, for instance :
> {code:sql}
> SELECT
> T2.val
> FROM test1 T1
> LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,"") as val FROM test2) T2
> ON T2.col1 = T1.col1
> LEFT JOIN test3 T3
> ON T3.col1 = T1.col6
> ;
> {code:sql}
> (same query as before, but `col5` was removed from the select)
> will return :
> {noformat}
> +----------+--+
> | t2.val |
> +----------+--+
> | X |
> +----------+--+
> {noformat}
> Removing the `COALESCE` also removes the bug...
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)