You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Bridget Bevens (JIRA)" <ji...@apache.org> on 2015/10/07 23:48:26 UTC

[jira] [Closed] (DRILL-3652) Need to document order of operations with window functions and flatten

     [ https://issues.apache.org/jira/browse/DRILL-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bridget Bevens closed DRILL-3652.
---------------------------------
    Assignee: Bridget Bevens

Updated documentation with a note about this behavior.

> Need to document order of operations with window functions and flatten
> ----------------------------------------------------------------------
>
>                 Key: DRILL-3652
>                 URL: https://issues.apache.org/jira/browse/DRILL-3652
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.2.0
>            Reporter: Victoria Markman
>            Assignee: Bridget Bevens
>              Labels: window_function
>             Fix For: 1.2.0
>
>
> In standard SQL, window functions are the last set of operations performed in a query except for the final order by clause. 
> Using window function with flatten is a bit confusing, because it appears as an operator in the query plan and I expected flatten to run first followed by a window function.
> This is not what is happening:
> {code}
> 0: jdbc:drill:schema=dfs> select * from `complex.json`;
> +----+-----------+----------+
> | x  |     y     |    z     |
> +----+-----------+----------+
> | 5  | a string  | [1,2,3]  |
> +----+-----------+----------+
> 1 row selected (0.128 seconds)
> 0: jdbc:drill:schema=dfs> select sum(x) over(), x , y, flatten(z) from `complex.json`;
> +---------+----+-----------+---------+
> | EXPR$0  | x  |     y     | EXPR$3  |
> +---------+----+-----------+---------+
> | 5       | 5  | a string  | 1       |
> | 5       | 5  | a string  | 2       |
> | 5       | 5  | a string  | 3       |
> +---------+----+-----------+---------+
> 3 rows selected (0.152 seconds)
> 0: jdbc:drill:schema=dfs> explain plan for select sum(x) over(), x , y, flatten(z) from `complex.json`;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      ProjectAllowDup(EXPR$0=[$0], x=[$1], y=[$2], EXPR$3=[$3])
> 00-02        Project(w0$o0=[$3], x=[$0], y=[$1], EXPR$3=[$4])
> 00-03          Flatten(flattenField=[$4])
> 00-04            Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3], EXPR$5=[$2])
> 00-05              Project(x=[$1], y=[$2], z=[$3], w0$o0=[$4])
> 00-06                Window(window#0=[window(partition {} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
> 00-07                  Project(T38¦¦*=[$0], x=[$1], y=[$2], z=[$3])
> 00-08                    Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/subqueries/complex.json, numFiles=1, columns=[`*`], files=[maprfs:///drill/testdata/subqueries/complex.json]]]
> {code}
> We should suggest to users to put flatten in a subquery if they want to run window function on top of the result set returned by flatten.
> {code}
> 0: jdbc:drill:schema=dfs> select x, y, a, sum(x) over() from  ( select x , y, flatten(z) as a from `complex.json`);
> +----+-----------+----+---------+
> | x  |     y     | a  | EXPR$3  |
> +----+-----------+----+---------+
> | 5  | a string  | 1  | 15      |
> | 5  | a string  | 2  | 15      |
> | 5  | a string  | 3  | 15      |
> +----+-----------+----+---------+
> 3 rows selected (0.145 seconds)
> {code}
> I suggest we document this issue in the window function section, perhaps in "Usage notes".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)