You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2017/03/10 12:29:04 UTC
[jira] [Created] (DRILL-5347) Join query returns incorrect results,
returns an extra column.
Khurram Faraaz created DRILL-5347:
-------------------------------------
Summary: Join query returns incorrect results, returns an extra column.
Key: DRILL-5347
URL: https://issues.apache.org/jira/browse/DRILL-5347
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.10.0
Reporter: Khurram Faraaz
Priority: Critical
Join query returns incorrect results, returns an extra column.
Note that the columns in both the parquet files are in different order.
Postgres returns correct results.
{noformat}
postgres=# create table using_l1(c1 int, c2 varchar(25), c3 date);
CREATE TABLE
postgres=# create table using_r1(c2 varchar(25), c3 date, c1 int);
CREATE TABLE
postgres=# copy using_l1 from '/root/using_clause/using_l1.txt' DELIMITER ',' CSV;
COPY 17
postgres=# copy using_r1 from '/root/using_clause/using_r1.txt' DELIMITER ',' CSV;
COPY 18
postgres=# select * from using_l1 JOIN using_r1 USING(c1);
c1 | c2 | c3 | c2 | c3
-------+-----------+------------+------------+------------
1 | test | 1982-09-11 | teststring | 2009-10-10
1 | test | 1982-09-11 | yes | 2001-09-09
2 | string | 1989-04-12 | never | 1989-11-11
3 | foo | 2000-10-12 | none | 1967-03-05
3 | foo | 2000-10-12 | here | 2011-09-09
4 | foobar | 2015-01-07 | champ | 2008-06-07
5 | bar | 2004-05-08 | bar | 2010-02-04
17 | barfoobar | 1992-12-06 | hello | 2014-07-02
19 | hello | 1980-10-10 | hi | 2000-01-01
0 | now | 2007-07-02 | foobar | 2011-01-01
10 | here | 2009-11-11 | done | 1947-08-14
99999 | there | 1993-09-03 | there | 1993-09-03
22 | hi | 2000-01-04 | there | 2016-03-22
13 | no | 2016-10-01 | foo | 1982-10-11
13 | no | 2016-10-01 | no | 1992-12-06
(15 rows)
{noformat}
Results from Drill 1.10.0 (3dfb4972), for the same input data we see an extra column c10 in the results of join query.
{noformat}
## using_l1 => (c1 INTEGER, c2 VARCHAR(25), c3 DATE)
0: jdbc:drill:schema=dfs.tmp> CREATE TABLE using_l1 AS SELECT CAST(columns[0] AS INTEGER) c1, CAST(columns[1] AS VARCHAR(25)) c2, CAST(columns[2] AS DATE) c3 FROM `using_l1.txt`;
+-----------+----------------------------+
| Fragment | Number of records written |
+-----------+----------------------------+
| 0_0 | 17 |
+-----------+----------------------------+
1 row selected (0.258 seconds)
## using_r1 => (c2 VARCHAR, c3 DATE, c1 INTEGER)
0: jdbc:drill:schema=dfs.tmp> CREATE TABLE using_r1 AS SELECT CAST(columns[0] AS VARCHAR(25)) c2, CAST(columns[1] AS DATE) c3, CAST(columns[2] AS INTEGER) c1 FROM `using_r1.txt`;
+-----------+----------------------------+
| Fragment | Number of records written |
+-----------+----------------------------+
| 0_0 | 18 |
+-----------+----------------------------+
1 row selected (0.249 seconds)
0: jdbc:drill:schema=dfs.tmp>
## Last column c10 is un-expected in the results of the JOIN
0: jdbc:drill:schema=dfs.tmp> select * from using_l1 JOIN using_r1 USING(c1);
+--------+------------+-------------+-------------+-------------+--------+
| c1 | c2 | c3 | c20 | c30 | c10 |
+--------+------------+-------------+-------------+-------------+--------+
| 1 | test | 1982-09-11 | yes | 2001-09-09 | 1 |
| 1 | test | 1982-09-11 | teststring | 2009-10-10 | 1 |
| 2 | string | 1989-04-12 | never | 1989-11-11 | 2 |
| 3 | foo | 2000-10-12 | here | 2011-09-09 | 3 |
| 3 | foo | 2000-10-12 | none | 1967-03-05 | 3 |
| 4 | foobar | 2015-01-07 | champ | 2008-06-07 | 4 |
| 5 | bar | 2004-05-08 | bar | 2010-02-04 | 5 |
| 17 | barfoobar | 1992-12-06 | hello | 2014-07-02 | 17 |
| 19 | hello | 1980-10-10 | hi | 2000-01-01 | 19 |
| 0 | now | 2007-07-02 | foobar | 2011-01-01 | 0 |
| 10 | here | 2009-11-11 | done | 1947-08-14 | 10 |
| 99999 | there | 1993-09-03 | there | 1993-09-03 | 99999 |
| 22 | hi | 2000-01-04 | there | 2016-03-22 | 22 |
| 13 | no | 2016-10-01 | no | 1992-12-06 | 13 |
| 13 | no | 2016-10-01 | foo | 1982-10-11 | 13 |
+--------+------------+-------------+-------------+-------------+--------+
15 rows selected (0.358 seconds)
{noformat}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)