You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2017/03/10 12:31:04 UTC
[jira] [Updated] (DRILL-5347) Join query returns incorrect results,
returns an extra column.
[ https://issues.apache.org/jira/browse/DRILL-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Khurram Faraaz updated DRILL-5347:
----------------------------------
Attachment: using_r1.txt
using_l1.txt
data files used in test are attached here.
> Join query returns incorrect results, returns an extra column.
> --------------------------------------------------------------
>
> Key: DRILL-5347
> URL: https://issues.apache.org/jira/browse/DRILL-5347
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.10.0
> Reporter: Khurram Faraaz
> Priority: Critical
> Attachments: using_l1.txt, using_r1.txt
>
>
> Join query returns incorrect results, returns an extra column.
> Note that the columns in both the parquet files are in different order.
> Postgres returns correct results.
> {noformat}
> postgres=# create table using_l1(c1 int, c2 varchar(25), c3 date);
> CREATE TABLE
> postgres=# create table using_r1(c2 varchar(25), c3 date, c1 int);
> CREATE TABLE
> postgres=# copy using_l1 from '/root/using_clause/using_l1.txt' DELIMITER ',' CSV;
> COPY 17
> postgres=# copy using_r1 from '/root/using_clause/using_r1.txt' DELIMITER ',' CSV;
> COPY 18
> postgres=# select * from using_l1 JOIN using_r1 USING(c1);
> c1 | c2 | c3 | c2 | c3
> -------+-----------+------------+------------+------------
> 1 | test | 1982-09-11 | teststring | 2009-10-10
> 1 | test | 1982-09-11 | yes | 2001-09-09
> 2 | string | 1989-04-12 | never | 1989-11-11
> 3 | foo | 2000-10-12 | none | 1967-03-05
> 3 | foo | 2000-10-12 | here | 2011-09-09
> 4 | foobar | 2015-01-07 | champ | 2008-06-07
> 5 | bar | 2004-05-08 | bar | 2010-02-04
> 17 | barfoobar | 1992-12-06 | hello | 2014-07-02
> 19 | hello | 1980-10-10 | hi | 2000-01-01
> 0 | now | 2007-07-02 | foobar | 2011-01-01
> 10 | here | 2009-11-11 | done | 1947-08-14
> 99999 | there | 1993-09-03 | there | 1993-09-03
> 22 | hi | 2000-01-04 | there | 2016-03-22
> 13 | no | 2016-10-01 | foo | 1982-10-11
> 13 | no | 2016-10-01 | no | 1992-12-06
> (15 rows)
> {noformat}
> Results from Drill 1.10.0 (3dfb4972), for the same input data we see an extra column c10 in the results of join query.
> {noformat}
> ## using_l1 => (c1 INTEGER, c2 VARCHAR(25), c3 DATE)
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE using_l1 AS SELECT CAST(columns[0] AS INTEGER) c1, CAST(columns[1] AS VARCHAR(25)) c2, CAST(columns[2] AS DATE) c3 FROM `using_l1.txt`;
> +-----------+----------------------------+
> | Fragment | Number of records written |
> +-----------+----------------------------+
> | 0_0 | 17 |
> +-----------+----------------------------+
> 1 row selected (0.258 seconds)
> ## using_r1 => (c2 VARCHAR, c3 DATE, c1 INTEGER)
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE using_r1 AS SELECT CAST(columns[0] AS VARCHAR(25)) c2, CAST(columns[1] AS DATE) c3, CAST(columns[2] AS INTEGER) c1 FROM `using_r1.txt`;
> +-----------+----------------------------+
> | Fragment | Number of records written |
> +-----------+----------------------------+
> | 0_0 | 18 |
> +-----------+----------------------------+
> 1 row selected (0.249 seconds)
> 0: jdbc:drill:schema=dfs.tmp>
> ## Last column c10 is un-expected in the results of the JOIN
> 0: jdbc:drill:schema=dfs.tmp> select * from using_l1 JOIN using_r1 USING(c1);
> +--------+------------+-------------+-------------+-------------+--------+
> | c1 | c2 | c3 | c20 | c30 | c10 |
> +--------+------------+-------------+-------------+-------------+--------+
> | 1 | test | 1982-09-11 | yes | 2001-09-09 | 1 |
> | 1 | test | 1982-09-11 | teststring | 2009-10-10 | 1 |
> | 2 | string | 1989-04-12 | never | 1989-11-11 | 2 |
> | 3 | foo | 2000-10-12 | here | 2011-09-09 | 3 |
> | 3 | foo | 2000-10-12 | none | 1967-03-05 | 3 |
> | 4 | foobar | 2015-01-07 | champ | 2008-06-07 | 4 |
> | 5 | bar | 2004-05-08 | bar | 2010-02-04 | 5 |
> | 17 | barfoobar | 1992-12-06 | hello | 2014-07-02 | 17 |
> | 19 | hello | 1980-10-10 | hi | 2000-01-01 | 19 |
> | 0 | now | 2007-07-02 | foobar | 2011-01-01 | 0 |
> | 10 | here | 2009-11-11 | done | 1947-08-14 | 10 |
> | 99999 | there | 1993-09-03 | there | 1993-09-03 | 99999 |
> | 22 | hi | 2000-01-04 | there | 2016-03-22 | 22 |
> | 13 | no | 2016-10-01 | no | 1992-12-06 | 13 |
> | 13 | no | 2016-10-01 | foo | 1982-10-11 | 13 |
> +--------+------------+-------------+-------------+-------------+--------+
> 15 rows selected (0.358 seconds)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)