You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Victoria Markman (JIRA)" <ji...@apache.org> on 2015/05/26 20:53:17 UTC
[jira] [Created] (DRILL-3182) Window function with DISTINCT
qualifier returns seemingly incorrect result
Victoria Markman created DRILL-3182:
---------------------------------------
Summary: Window function with DISTINCT qualifier returns seemingly incorrect result
Key: DRILL-3182
URL: https://issues.apache.org/jira/browse/DRILL-3182
Project: Apache Drill
Issue Type: Bug
Components: Functions - Drill
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Daniel Barclay (Drill)
Both count(distinct <colname>) and count(all <colname>) return the same result. It does not look correct to me and I'm not sure what the correct behavior is going to be.
(1) Latest postgres does not support distinct with Window functions:
postgres=# select a2, count(distinct b2) over(partition by a2) from t2;
ERROR: DISTINCT is not implemented for window functions
LINE 1: select a2, count(distinct b2) over(partition by a2) from t2;
^
postgres=# select a2, avg(distinct a2) over(partition by a2) from t2;
ERROR: DISTINCT is not implemented for window functions
LINE 1: select a2, avg(distinct a2) over(partition by a2) from t2;
(2) Calcite does not support this either: https://github.com/apache/incubator-calcite/blob/master/doc/reference.md
Do we support it ? If not, I think we should throw an error ...
{code}
0: jdbc:drill:schema=dfs> select * from t2;
+-----+--------+-------------+
| a2 | b2 | c2 |
+-----+--------+-------------+
| 0 | zzz | 2014-12-31 |
| 1 | aaaaa | 2015-01-01 |
| 2 | bbbbb | 2015-01-02 |
| 2 | bbbbb | 2015-01-02 |
| 2 | bbbbb | 2015-01-02 |
| 3 | ccccc | 2015-01-03 |
| 4 | ddddd | 2015-01-04 |
| 5 | eeeee | 2015-01-05 |
| 6 | fffff | 2015-01-06 |
| 7 | ggggg | 2015-01-07 |
| 7 | ggggg | 2015-01-07 |
| 8 | hhhhh | 2015-01-08 |
| 9 | iiiii | 2015-01-09 |
+-----+--------+-------------+
13 rows selected (0.134 seconds)
0: jdbc:drill:schema=dfs> select a2, count(distinct b2) over(partition by a2) from t2;
+-----+---------+
| a2 | EXPR$1 |
+-----+---------+
| 0 | 1 |
| 1 | 1 |
| 2 | 3 |
| 2 | 3 |
| 2 | 3 |
| 3 | 1 |
| 4 | 1 |
| 5 | 1 |
| 6 | 1 |
| 7 | 2 |
| 7 | 2 |
| 8 | 1 |
| 9 | 1 |
+-----+---------+
13 rows selected (0.224 seconds)
0: jdbc:drill:schema=dfs> select a2, count(b2) over(partition by a2) from t2;
+-----+---------+
| a2 | EXPR$1 |
+-----+---------+
| 0 | 1 |
| 1 | 1 |
| 2 | 3 |
| 2 | 3 |
| 2 | 3 |
| 3 | 1 |
| 4 | 1 |
| 5 | 1 |
| 6 | 1 |
| 7 | 2 |
| 7 | 2 |
| 8 | 1 |
| 9 | 1 |
+-----+---------+
13 rows selected (0.219 seconds)
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)