You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Bridget Bevens (JIRA)" <ji...@apache.org> on 2015/10/07 23:48:26 UTC
[jira] [Closed] (DRILL-3652) Need to document order of operations
with window functions and flatten
[ https://issues.apache.org/jira/browse/DRILL-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bridget Bevens closed DRILL-3652.
---------------------------------
Assignee: Bridget Bevens
Updated documentation with a note about this behavior.
> Need to document order of operations with window functions and flatten
> ----------------------------------------------------------------------
>
> Key: DRILL-3652
> URL: https://issues.apache.org/jira/browse/DRILL-3652
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.2.0
> Reporter: Victoria Markman
> Assignee: Bridget Bevens
> Labels: window_function
> Fix For: 1.2.0
>
>
> In standard SQL, window functions are the last set of operations performed in a query except for the final order by clause.
> Using window function with flatten is a bit confusing, because it appears as an operator in the query plan and I expected flatten to run first followed by a window function.
> This is not what is happening:
> {code}
> 0: jdbc:drill:schema=dfs> select * from `complex.json`;
> +----+-----------+----------+
> | x | y | z |
> +----+-----------+----------+
> | 5 | a string | [1,2,3] |
> +----+-----------+----------+
> 1 row selected (0.128 seconds)
> 0: jdbc:drill:schema=dfs> select sum(x) over(), x , y, flatten(z) from `complex.json`;
> +---------+----+-----------+---------+
> | EXPR$0 | x | y | EXPR$3 |
> +---------+----+-----------+---------+
> | 5 | 5 | a string | 1 |
> | 5 | 5 | a string | 2 |
> | 5 | 5 | a string | 3 |
> +---------+----+-----------+---------+
> 3 rows selected (0.152 seconds)
> 0: jdbc:drill:schema=dfs> explain plan for select sum(x) over(), x , y, flatten(z) from `complex.json`;
> +------+------+
> | text | json |
> +------+------+
> | 00-00 Screen
> 00-01 ProjectAllowDup(EXPR$0=[$0], x=[$1], y=[$2], EXPR$3=[$3])
> 00-02 Project(w0$o0=[$3], x=[$0], y=[$1], EXPR$3=[$4])
> 00-03 Flatten(flattenField=[$4])
> 00-04 Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3], EXPR$5=[$2])
> 00-05 Project(x=[$1], y=[$2], z=[$3], w0$o0=[$4])
> 00-06 Window(window#0=[window(partition {} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
> 00-07 Project(T38¦¦*=[$0], x=[$1], y=[$2], z=[$3])
> 00-08 Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/subqueries/complex.json, numFiles=1, columns=[`*`], files=[maprfs:///drill/testdata/subqueries/complex.json]]]
> {code}
> We should suggest to users to put flatten in a subquery if they want to run window function on top of the result set returned by flatten.
> {code}
> 0: jdbc:drill:schema=dfs> select x, y, a, sum(x) over() from ( select x , y, flatten(z) as a from `complex.json`);
> +----+-----------+----+---------+
> | x | y | a | EXPR$3 |
> +----+-----------+----+---------+
> | 5 | a string | 1 | 15 |
> | 5 | a string | 2 | 15 |
> | 5 | a string | 3 | 15 |
> +----+-----------+----+---------+
> 3 rows selected (0.145 seconds)
> {code}
> I suggest we document this issue in the window function section, perhaps in "Usage notes".
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)