You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2017/03/10 12:29:04 UTC

[jira] [Created] (DRILL-5347) Join query returns incorrect results, returns an extra column.

Khurram Faraaz created DRILL-5347:
-------------------------------------

             Summary: Join query returns incorrect results, returns an extra column.
                 Key: DRILL-5347
                 URL: https://issues.apache.org/jira/browse/DRILL-5347
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.10.0
            Reporter: Khurram Faraaz
            Priority: Critical



Join query returns incorrect results, returns an extra column.
Note that the columns in both the parquet files are in different order.

Postgres returns correct results.

{noformat}
postgres=# create table using_l1(c1 int, c2 varchar(25), c3 date);
CREATE TABLE
postgres=# create table using_r1(c2 varchar(25), c3 date, c1 int);
CREATE TABLE
postgres=# copy using_l1 from '/root/using_clause/using_l1.txt' DELIMITER ',' CSV;
COPY 17
postgres=# copy using_r1 from '/root/using_clause/using_r1.txt' DELIMITER ',' CSV;
COPY 18
postgres=# select * from using_l1 JOIN using_r1 USING(c1);
  c1   |    c2     |     c3     |     c2     |     c3
-------+-----------+------------+------------+------------
     1 | test      | 1982-09-11 | teststring | 2009-10-10
     1 | test      | 1982-09-11 | yes        | 2001-09-09
     2 | string    | 1989-04-12 | never      | 1989-11-11
     3 | foo       | 2000-10-12 | none       | 1967-03-05
     3 | foo       | 2000-10-12 | here       | 2011-09-09
     4 | foobar    | 2015-01-07 | champ      | 2008-06-07
     5 | bar       | 2004-05-08 | bar        | 2010-02-04
    17 | barfoobar | 1992-12-06 | hello      | 2014-07-02
    19 | hello     | 1980-10-10 | hi         | 2000-01-01
     0 | now       | 2007-07-02 | foobar     | 2011-01-01
    10 | here      | 2009-11-11 | done       | 1947-08-14
 99999 | there     | 1993-09-03 | there      | 1993-09-03
    22 | hi        | 2000-01-04 | there      | 2016-03-22
    13 | no        | 2016-10-01 | foo        | 1982-10-11
    13 | no        | 2016-10-01 | no         | 1992-12-06
(15 rows)
{noformat}

Results from Drill 1.10.0 (3dfb4972), for the same input data we see an extra column c10 in the results of join query.

{noformat}

## using_l1 => (c1 INTEGER, c2 VARCHAR(25), c3 DATE)

0: jdbc:drill:schema=dfs.tmp> CREATE TABLE using_l1 AS SELECT CAST(columns[0] AS INTEGER) c1, CAST(columns[1] AS VARCHAR(25)) c2, CAST(columns[2] AS DATE) c3 FROM `using_l1.txt`;
+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 0_0       | 17                         |
+-----------+----------------------------+
1 row selected (0.258 seconds)

## using_r1 => (c2 VARCHAR, c3 DATE, c1 INTEGER)

0: jdbc:drill:schema=dfs.tmp> CREATE TABLE using_r1 AS SELECT CAST(columns[0] AS VARCHAR(25)) c2, CAST(columns[1] AS DATE) c3, CAST(columns[2] AS INTEGER) c1 FROM `using_r1.txt`;
+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 0_0       | 18                         |
+-----------+----------------------------+
1 row selected (0.249 seconds)
0: jdbc:drill:schema=dfs.tmp>

## Last column c10 is un-expected in the results of the JOIN

0: jdbc:drill:schema=dfs.tmp> select * from using_l1 JOIN using_r1 USING(c1);
+--------+------------+-------------+-------------+-------------+--------+
|   c1   |     c2     |     c3      |     c20     |     c30     |  c10   |
+--------+------------+-------------+-------------+-------------+--------+
| 1      | test       | 1982-09-11  | yes         | 2001-09-09  | 1      |
| 1      | test       | 1982-09-11  | teststring  | 2009-10-10  | 1      |
| 2      | string     | 1989-04-12  | never       | 1989-11-11  | 2      |
| 3      | foo        | 2000-10-12  | here        | 2011-09-09  | 3      |
| 3      | foo        | 2000-10-12  | none        | 1967-03-05  | 3      |
| 4      | foobar     | 2015-01-07  | champ       | 2008-06-07  | 4      |
| 5      | bar        | 2004-05-08  | bar         | 2010-02-04  | 5      |
| 17     | barfoobar  | 1992-12-06  | hello       | 2014-07-02  | 17     |
| 19     | hello      | 1980-10-10  | hi          | 2000-01-01  | 19     |
| 0      | now        | 2007-07-02  | foobar      | 2011-01-01  | 0      |
| 10     | here       | 2009-11-11  | done        | 1947-08-14  | 10     |
| 99999  | there      | 1993-09-03  | there       | 1993-09-03  | 99999  |
| 22     | hi         | 2000-01-04  | there       | 2016-03-22  | 22     |
| 13     | no         | 2016-10-01  | no          | 1992-12-06  | 13     |
| 13     | no         | 2016-10-01  | foo         | 1982-10-11  | 13     |
+--------+------------+-------------+-------------+-------------+--------+
15 rows selected (0.358 seconds)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)