You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2015/06/26 23:54:05 UTC
[jira] [Updated] (DRILL-3404) Filter on window function does not appear in query plan

     [ https://issues.apache.org/jira/browse/DRILL-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khurram Faraaz updated DRILL-3404:
----------------------------------
    Attachment: 0_0_0.parquet

Attached parquet input file used in the test.

> Filter on window function does not appear in query plan
> -------------------------------------------------------
>
>                 Key: DRILL-3404
>                 URL: https://issues.apache.org/jira/browse/DRILL-3404
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.1.0
>         Environment: 4 node cluster on CentOS
>            Reporter: Khurram Faraaz
>            Assignee: Jinfeng Ni
>            Priority: Critical
>         Attachments: 0_0_0.parquet
>
>
> Filter is missing in the query plan for the below query in Drill, and hence wrong results are returned.
> Results from Drill
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
> +-------------+-------+-------------+
> |     c1      |  c2   |    w_sum    |
> +-------------+-------+-------------+
> | 0           | a     | 0           |
> | 1           | a     | 1           |
> | 5           | a     | 6           |
> | 10          | a     | 16          |
> | 11          | a     | 27          |
> | 14          | a     | 41          |
> | 11111       | a     | 11152       |
> | 2           | b     | 2           |
> | 9           | b     | 11          |
> | 13          | b     | 24          |
> | 17          | b     | 41          |
> | null        | c     | null        |
> | 4           | c     | 4           |
> | 6           | c     | 10          |
> | 8           | c     | 18          |
> | 12          | c     | 30          |
> | 13          | c     | 56          |
> | 13          | c     | 56          |
> | null        | d     | null        |
> | null        | d     | null        |
> | 10          | d     | 10          |
> | 11          | d     | 21          |
> | 2147483647  | d     | 4294967315  |
> | 2147483647  | d     | 4294967315  |
> | -1          | e     | -1          |
> | 15          | e     | 14          |
> | null        | null  | null        |
> | 19          | null  | 19          |
> | 65536       | null  | 65555       |
> | 1000000     | null  | 1065555     |
> +-------------+-------+-------------+
> 30 rows selected (0.337 seconds)
> {code}
> Explain plan for the above query from Drill
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from `tblWnulls` ) sub_query where w_sum is not null;
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
> |                                                                                                                                                                                                                                                                                                                                       text                                                                                                                                                                                                                                                                                                                                       | json  |
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
> | 00-00    Screen
> 00-01      Project(c1=[$0], c2=[$1], w_sum=[$2])
> 00-02        Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
> 00-03          Window(window#0=[window(partition {1} order by [0 ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])])
> 00-04            SelectionVectorRemover
> 00-05              Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC-nulls-first])
> 00-06                Project(c1=[$1], c2=[$0])
> 00-07                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls]], selectionRoot=/tmp/tblWnulls, numFiles=1, columns=[`c1`, `c2`]]])
> {code}
> Results from Postgres 9.3
> {code}
> postgres=# select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from t222 ) sub_query where w_sum is not null;
>      c1     | c2 |   w_sum    
> ------------+----+------------
>           0 | a  |          0
>           1 | a  |          1
>           5 | a  |          6
>          10 | a  |         16
>          11 | a  |         27
>          14 | a  |         41
>       11111 | a  |      11152
>           2 | b  |          2
>           9 | b  |         11
>          13 | b  |         24
>          17 | b  |         41
>           4 | c  |          4
>           6 | c  |         10
>           8 | c  |         18
>          12 | c  |         30
>          13 | c  |         56
>          13 | c  |         56
>          10 | d  |         10
>          11 | d  |         21
>  2147483647 | d  | 4294967315
>  2147483647 | d  | 4294967315
>          -1 | e  |         -1
>          15 | e  |         14
>          19 |    |         19
>       65536 |    |      65555
>     1000000 |    |    1065555
> (26 rows)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)