Posted to dev@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2015/09/15 20:20:46 UTC

[jira] [Created] (DRILL-3783) Incorrect results : COUNT(<column-name>) over results returned by UNION ALL

Khurram Faraaz created DRILL-3783:
-------------------------------------

             Summary: Incorrect results : COUNT(<column-name>) over results returned by UNION ALL 
                 Key: DRILL-3783
                 URL: https://issues.apache.org/jira/browse/DRILL-3783
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.2.0
         Environment: 4 node cluster on CentOS
            Reporter: Khurram Faraaz
            Assignee: Sean Hsuan-Yi Chu
            Priority: Critical
             Fix For: 1.2.0


COUNT over the results of a UNION ALL query returns incorrect results. The query below used to return an Exception (please see DRILL-2637). That JIRA was marked as fixed; however, the query now returns incorrect results.

{code}
0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) c2 from `testWindow.csv`);
+---------+
| EXPR$0  |
+---------+
| 11      |
| 100     |
| 10      |
| 2       |
| 50      |
| 55      |
| 67      |
| 113     |
| 119     |
| 89      |
| 57      |
| 61      |
+---------+
12 rows selected (0.753 seconds)
{code}

Results returned by the queries on the LHS and RHS of the UNION ALL operator are:
{code}
0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from `testWindow.csv`;
+------+
|  c1  |
+------+
| 100  |
| 10   |
| 2    |
| 50   |
| 55   |
| 67   |
| 113  |
| 119  |
| 89   |
| 57   |
| 61   |
+------+
11 rows selected (0.197 seconds)
0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from `testWindow.csv`;
+------+
|  c2  |
+------+
| 100  |
| 10   |
| 2    |
| 50   |
| 55   |
| 67   |
| 113  |
| 119  |
| 89   |
| 57   |
| 61   |
+------+
11 rows selected (0.173 seconds)
{code}

Note that enclosing the whole UNION ALL expression in an additional pair of parentheses returns correct results. We do not want to return incorrect results to the user when those parentheses are missing.
{code}
0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) c2 from `testWindow.csv`));
+---------+
| EXPR$0  |
+---------+
| 22      |
+---------+
1 row selected (0.234 seconds)
{code}
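
One possible reading of the 12-row output in the first query: it is the count of the left-hand query (11) followed by the 11 rows of the right-hand query, which is what results if UNION ALL is applied after the outer SELECT instead of inside its FROM clause. The sketch below reproduces both groupings with Python's sqlite3 module, not Drill. The in-memory table `t` and the function `run_both_groupings` are hypothetical stand-ins for `testWindow.csv` and the sqlline session, and the sketch makes no claim about how Drill's planner actually handles the query.

```python
import sqlite3

# Values copied from the c1 column shown in the report above.
VALUES = [100, 10, 2, 50, 55, 67, 113, 119, 89, 57, 61]


def run_both_groupings():
    """Run the count query with UNION ALL outside and inside the FROM clause."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for testWindow.csv.
    conn.execute("CREATE TABLE t (c1 INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in VALUES])

    # UNION ALL outside the FROM clause: one count row for the left side,
    # then every row of the right side.
    outside = conn.execute(
        "SELECT count(c1) FROM (SELECT c1 FROM t) "
        "UNION ALL SELECT c1 FROM t"
    ).fetchall()

    # UNION ALL inside the FROM clause: a single count over both sides.
    inside = conn.execute(
        "SELECT count(c1) FROM "
        "(SELECT c1 FROM t UNION ALL SELECT c1 FROM t)"
    ).fetchall()

    conn.close()
    return outside, inside
```

Under these assumptions, the first grouping yields 12 rows (the value 11 plus the 11 input values) and the second yields a single row containing 22, matching the two outputs in this report.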



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)