You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Victoria Markman (JIRA)" <ji...@apache.org> on 2015/05/27 20:18:19 UTC
[jira] [Updated] (DRILL-3182) Window function with DISTINCT qualifier returns seemingly incorrect result

     [ https://issues.apache.org/jira/browse/DRILL-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Victoria Markman updated DRILL-3182:
------------------------------------
    Assignee: Deneche A. Hakim  (was: Daniel Barclay (Drill))

> Window function with DISTINCT qualifier returns seemingly incorrect result
> --------------------------------------------------------------------------
>
>                 Key: DRILL-3182
>                 URL: https://issues.apache.org/jira/browse/DRILL-3182
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.0.0
>            Reporter: Victoria Markman
>            Assignee: Deneche A. Hakim
>
> Both count(distinct <colname>) and count(all <colname>) return the same result. It does not look correct to me and I'm not sure what the correct behavior is going to be.
> (1) Latest postgres does not support distinct with Window functions:
> postgres=# select a2, count(distinct b2) over(partition by a2) from t2;
> ERROR:  DISTINCT is not implemented for window functions
> LINE 1: select a2, count(distinct b2) over(partition by a2) from t2;
>                    ^
> postgres=# select a2, avg(distinct a2) over(partition by a2) from t2;
> ERROR:  DISTINCT is not implemented for window functions
> LINE 1: select a2, avg(distinct a2) over(partition by a2) from t2;
> (2) Calcite does not support this either:  https://github.com/apache/incubator-calcite/blob/master/doc/reference.md
> Do we support it ? If not,  I think we should throw an error ...
> {code}
> 0: jdbc:drill:schema=dfs> select * from t2;
> +-----+--------+-------------+
> | a2  |   b2   |     c2      |
> +-----+--------+-------------+
> | 0   | zzz    | 2014-12-31  |
> | 1   | aaaaa  | 2015-01-01  |
> | 2   | bbbbb  | 2015-01-02  |
> | 2   | bbbbb  | 2015-01-02  |
> | 2   | bbbbb  | 2015-01-02  |
> | 3   | ccccc  | 2015-01-03  |
> | 4   | ddddd  | 2015-01-04  |
> | 5   | eeeee  | 2015-01-05  |
> | 6   | fffff  | 2015-01-06  |
> | 7   | ggggg  | 2015-01-07  |
> | 7   | ggggg  | 2015-01-07  |
> | 8   | hhhhh  | 2015-01-08  |
> | 9   | iiiii  | 2015-01-09  |
> +-----+--------+-------------+
> 13 rows selected (0.134 seconds)
> 0: jdbc:drill:schema=dfs> select a2, count(distinct b2) over(partition by a2) from t2;
> +-----+---------+
> | a2  | EXPR$1  |
> +-----+---------+
> | 0   | 1       |
> | 1   | 1       |
> | 2   | 3       |
> | 2   | 3       |
> | 2   | 3       |
> | 3   | 1       |
> | 4   | 1       |
> | 5   | 1       |
> | 6   | 1       |
> | 7   | 2       |
> | 7   | 2       |
> | 8   | 1       |
> | 9   | 1       |
> +-----+---------+
> 13 rows selected (0.224 seconds)
> 0: jdbc:drill:schema=dfs> select a2, count(b2) over(partition by a2) from t2;
> +-----+---------+
> | a2  | EXPR$1  |
> +-----+---------+
> | 0   | 1       |
> | 1   | 1       |
> | 2   | 3       |
> | 2   | 3       |
> | 2   | 3       |
> | 3   | 1       |
> | 4   | 1       |
> | 5   | 1       |
> | 6   | 1       |
> | 7   | 2       |
> | 7   | 2       |
> | 8   | 1       |
> | 9   | 1       |
> +-----+---------+
> 13 rows selected (0.219 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)