You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2015/06/26 23:54:05 UTC
[jira] [Updated] (DRILL-3404) Filter on window function does not
appear in query plan
[ https://issues.apache.org/jira/browse/DRILL-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Khurram Faraaz updated DRILL-3404:
----------------------------------
Attachment: 0_0_0.parquet
Attached parquet input file used in the test.
> Filter on window function does not appear in query plan
> -------------------------------------------------------
>
> Key: DRILL-3404
> URL: https://issues.apache.org/jira/browse/DRILL-3404
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.1.0
> Environment: 4 node cluster on CentOS
> Reporter: Khurram Faraaz
> Assignee: Jinfeng Ni
> Priority: Critical
> Attachments: 0_0_0.parquet
>
>
> Filter is missing in the query plan for the below query in Drill, and hence wrong results are returned.
> Results from Drill
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
> +-------------+-------+-------------+
> | c1 | c2 | w_sum |
> +-------------+-------+-------------+
> | 0 | a | 0 |
> | 1 | a | 1 |
> | 5 | a | 6 |
> | 10 | a | 16 |
> | 11 | a | 27 |
> | 14 | a | 41 |
> | 11111 | a | 11152 |
> | 2 | b | 2 |
> | 9 | b | 11 |
> | 13 | b | 24 |
> | 17 | b | 41 |
> | null | c | null |
> | 4 | c | 4 |
> | 6 | c | 10 |
> | 8 | c | 18 |
> | 12 | c | 30 |
> | 13 | c | 56 |
> | 13 | c | 56 |
> | null | d | null |
> | null | d | null |
> | 10 | d | 10 |
> | 11 | d | 21 |
> | 2147483647 | d | 4294967315 |
> | 2147483647 | d | 4294967315 |
> | -1 | e | -1 |
> | 15 | e | 14 |
> | null | null | null |
> | 19 | null | 19 |
> | 65536 | null | 65555 |
> | 1000000 | null | 1065555 |
> +-------------+-------+-------------+
> 30 rows selected (0.337 seconds)
> {code}
> Explain plan for the above query from Drill
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
> | text | json |
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
> | 00-00 Screen
> 00-01 Project(c1=[$0], c2=[$1], w_sum=[$2])
> 00-02 Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
> 00-03 Window(window#0=[window(partition {1} order by [0 ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])])
> 00-04 SelectionVectorRemover
> 00-05 Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC-nulls-first])
> 00-06 Project(c1=[$1], c2=[$0])
> 00-07 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls]], selectionRoot=/tmp/tblWnulls, numFiles=1, columns=[`c1`, `c2`]]])
> {code}
> Results from Postgres 9.3
> {code}
> postgres=# select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from t222 ) sub_query where w_sum is not null;
> c1 | c2 | w_sum
> ------------+----+------------
> 0 | a | 0
> 1 | a | 1
> 5 | a | 6
> 10 | a | 16
> 11 | a | 27
> 14 | a | 41
> 11111 | a | 11152
> 2 | b | 2
> 9 | b | 11
> 13 | b | 24
> 17 | b | 41
> 4 | c | 4
> 6 | c | 10
> 8 | c | 18
> 12 | c | 30
> 13 | c | 56
> 13 | c | 56
> 10 | d | 10
> 11 | d | 21
> 2147483647 | d | 4294967315
> 2147483647 | d | 4294967315
> -1 | e | -1
> 15 | e | 14
> 19 | | 19
> 65536 | | 65555
> 1000000 | | 1065555
> (26 rows)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)