You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2015/06/26 23:04:04 UTC
[jira] [Created] (DRILL-3404) Filter on window function does not
appear in query plan
Khurram Faraaz created DRILL-3404:
-------------------------------------
Summary: Filter on window function does not appear in query plan
Key: DRILL-3404
URL: https://issues.apache.org/jira/browse/DRILL-3404
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.1.0
Environment: 4 node cluster on CentOS
Reporter: Khurram Faraaz
Assignee: Jinfeng Ni
Priority: Critical
Filter is missing in the query plan for the below query in Drill, and hence wrong results are returned.
Results from Drill
{code}
0: jdbc:drill:schema=dfs.tmp> select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
+-------------+-------+-------------+
| c1 | c2 | w_sum |
+-------------+-------+-------------+
| 0 | a | 0 |
| 1 | a | 1 |
| 5 | a | 6 |
| 10 | a | 16 |
| 11 | a | 27 |
| 14 | a | 41 |
| 11111 | a | 11152 |
| 2 | b | 2 |
| 9 | b | 11 |
| 13 | b | 24 |
| 17 | b | 41 |
| null | c | null |
| 4 | c | 4 |
| 6 | c | 10 |
| 8 | c | 18 |
| 12 | c | 30 |
| 13 | c | 56 |
| 13 | c | 56 |
| null | d | null |
| null | d | null |
| 10 | d | 10 |
| 11 | d | 21 |
| 2147483647 | d | 4294967315 |
| 2147483647 | d | 4294967315 |
| -1 | e | -1 |
| 15 | e | 14 |
| null | null | null |
| 19 | null | 19 |
| 65536 | null | 65555 |
| 1000000 | null | 1065555 |
+-------------+-------+-------------+
30 rows selected (0.337 seconds)
{code}
Explain plan for the above query from Drill
{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| text | json |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| 00-00 Screen
00-01 Project(c1=[$0], c2=[$1], w_sum=[$2])
00-02 Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
00-03 Window(window#0=[window(partition {1} order by [0 ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])])
00-04 SelectionVectorRemover
00-05 Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC-nulls-first])
00-06 Project(c1=[$1], c2=[$0])
00-07 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls]], selectionRoot=/tmp/tblWnulls, numFiles=1, columns=[`c1`, `c2`]]])
{code}
Results from Postgres 9.3
{code}
postgres=# select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from t222 ) sub_query where w_sum is not null;
c1 | c2 | w_sum
------------+----+------------
0 | a | 0
1 | a | 1
5 | a | 6
10 | a | 16
11 | a | 27
14 | a | 41
11111 | a | 11152
2 | b | 2
9 | b | 11
13 | b | 24
17 | b | 41
4 | c | 4
6 | c | 10
8 | c | 18
12 | c | 30
13 | c | 56
13 | c | 56
10 | d | 10
11 | d | 21
2147483647 | d | 4294967315
2147483647 | d | 4294967315
-1 | e | -1
15 | e | 14
19 | | 19
65536 | | 65555
1000000 | | 1065555
(26 rows)
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)