Posted to issues@drill.apache.org by "Sudheesh Katkam (JIRA)" <ji...@apache.org> on 2015/04/15 21:11:59 UTC

[jira] [Updated] (DRILL-2801) ORDER BY produces extra records

     [ https://issues.apache.org/jira/browse/DRILL-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sudheesh Katkam updated DRILL-2801:
-----------------------------------
    Attachment: data.csv

> ORDER BY produces extra records
> -------------------------------
>
>                 Key: DRILL-2801
>                 URL: https://issues.apache.org/jira/browse/DRILL-2801
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 0.8.0
>            Reporter: Sudheesh Katkam
>            Assignee: Chris Westin
>            Priority: Critical
>         Attachments: data.csv
>
>
> Running Drill in embedded mode on my Mac.
> {code}
> $ wc -w data.csv
>    50000 data.csv
> {code}
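> Here are the queries. COUNT(*) reports 50000 records, but the ORDER BY query returns 50,001 rows, and grouping its output shows the extra record under the value 5: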
> {code}
> 0: jdbc:drill:zk=local> SELECT count(*) FROM dfs.`data.csv`;
> +------------+
> |   EXPR$0   |
> +------------+
> | 50000      |
> +------------+
> 1 row selected (0.223 seconds)
> 0: jdbc:drill:zk=local> SELECT columns[0] FROM dfs.`data.csv` ORDER BY columns[0];
> +------------+
> |   EXPR$0   |
> +------------+
> ...
> | 6          |
> +------------+
> 50,001 rows selected (0.928 seconds)
> 0: jdbc:drill:zk=local> SELECT tab.col, COUNT(tab.col) FROM (SELECT columns[0] col FROM dfs.`data.csv` ORDER BY columns[0]) tab GROUP BY tab.col;
> +------------+------------+
> |    col     |   EXPR$1   |
> +------------+------------+
> | 2          | 10000      |
> | 3          | 10000      |
> | 4          | 10000      |
> | 5          | 10001      |
> | 6          | 10000      |
> +------------+------------+
> 5 rows selected (0.704 seconds)
> {code}
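
One way to cross-check the expected counts independently of Drill is to tally the first column of the file directly. The following is only a sketch: the contents of the attached data.csv are not shown in this message, so the script writes a hypothetical stand-in file (one value per line, values 2 through 6, 10000 of each, inferred from the GROUP BY output above). The helper names write_stand_in and count_first_column are made up for illustration.

```python
import csv
import os
import tempfile
from collections import Counter


def write_stand_in(path, values=("2", "3", "4", "5", "6"), repeats=10000):
    """Write a hypothetical stand-in for data.csv: one value per line.

    Assumption: the real attachment may be laid out differently.
    """
    with open(path, "w", newline="") as f:
        for _ in range(repeats):
            for v in values:
                f.write(v + "\n")


def count_first_column(path):
    """Return (total records, per-value counts) for the first CSV column."""
    counts = Counter()
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if row:  # csv.reader yields [] for blank lines; skip them
                counts[row[0]] += 1
    return sum(counts.values()), counts


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "data.csv")
    write_stand_in(path)
    total, counts = count_first_column(path)
    print("total records:", total)
    for value in sorted(counts):
        print(value, counts[value])
```

To check the real attachment, point count_first_column at it instead of the stand-in. If the tally shows 10000 for the value 5 while Drill reports 10001 after ORDER BY, that would suggest the extra record comes from the sort and not from the data.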



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)