Posted to dev@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2016/12/21 00:50:58 UTC
[jira] [Created] (DRILL-5143) Planner creates redundant sort for nested query on single fragment
Paul Rogers created DRILL-5143:
----------------------------------
Summary: Planner creates redundant sort for nested query on single fragment
Key: DRILL-5143
URL: https://issues.apache.org/jira/browse/DRILL-5143
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.8.0
Reporter: Paul Rogers
Priority: Minor
The unit test {{TestWindowFrame.testUnboundedFollowing}} identified an optimization opportunity in the planner. It uses the following query:
{code}
SELECT
position_id,
employee_id,
MAX(employee_id) OVER(PARTITION BY position_id) AS `last_value`
FROM (
SELECT *
FROM dfs_test.`%s/window/b4.p4`
ORDER BY position_id, employee_id )
{code}
This produces the following (heavily elided) plan:
{code}
"graph" : [ {
"pop" : "fs-scan",
...
}, {
"pop" : "project",
...
}, {
"pop" : "external-sort",
"orderings" : [ {
"order" : "ASC",
"expr" : "`position_id`",
"nullDirection" : "UNSPECIFIED"
}, {
"order" : "ASC",
"expr" : "`employee_id`",
"nullDirection" : "UNSPECIFIED"
} ],
...
}, {
"pop" : "selection-vector-remover",
...
}, {
"pop" : "project",
...
"exprs" : [ {
"ref" : "`$0`",
"expr" : "`T0¦¦position_id`"
}...
}, {
"pop" : "external-sort",
...
"orderings" : [ {
"order" : "ASC",
"expr" : "`$0`",
"nullDirection" : "UNSPECIFIED"
} ],
...
}, {
"pop" : "selection-vector-remover",
...
}, {
"pop" : "window",
...
"aggregations" : [ {
"ref" : "`w0$o0`",
"expr" : "max(`$1`) "
} ],
...
}, {
"pop" : "project",
...
}, {
"pop" : "screen",
...
} ]
{code}
Note that two sorts are stacked one atop the other, with no exchange between them. This is a "degenerate" plan: in a distributed query, a shuffle operation would normally occur between the two sorts. But because the query is so small, it runs on a single node, and the plan contains a redundant sort.
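For illustration only, the stacked-sort pattern can be detected mechanically from the operator list. The sketch below is not Drill code: {{stacked_sorts}} is a hypothetical helper, the plan is reduced to its "pop" names in the order shown above, and it assumes that exchange operators can be recognized by a name ending in "-exchange" and that any exchange breaks the ordering.

```python
# Hypothetical helper (not part of Drill): find sorts stacked within one fragment.
# Assumption: exchange operators have "pop" names ending in "-exchange".
EXCHANGE_SUFFIX = "-exchange"


def stacked_sorts(pops):
    """Return (upstream, downstream) index pairs of sorts with no exchange between them."""
    pairs = []
    last_sort = None
    for i, pop in enumerate(pops):
        if pop.endswith(EXCHANGE_SUFFIX):
            # Simplification: treat every exchange as discarding the upstream ordering.
            last_sort = None
        elif pop == "external-sort":
            if last_sort is not None:
                pairs.append((last_sort, i))
            last_sort = i
    return pairs


# The "pop" names from the elided plan, in order.
plan = [
    "fs-scan", "project", "external-sort", "selection-vector-remover",
    "project", "external-sort", "selection-vector-remover",
    "window", "project", "screen",
]

print(stacked_sorts(plan))  # [(2, 5)]
```

On this plan the helper reports the pair at positions 2 and 5, the two {{external-sort}} operators.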
The planner could either:
* Recognize that the data is already sorted and omit the downstream sort, or
* Recognize that the inner query's sort is ignored by downstream operators and discard the inner sort.
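As a rough sketch of the first option: data sorted on (position_id, employee_id) is already sorted on position_id, so the downstream sort is unnecessary whenever its ordering is a prefix of the ordering that is already in place. The code below is illustrative, not the planner's implementation. {{ordering_satisfied}} is a hypothetical name, and it assumes that column references have already been traced back through the intervening project (which renames position_id to {{$0}}) and that null ordering is the same on both sorts.

```python
# Hypothetical check (not Drill's planner code): does an existing sort order
# already satisfy the order required by a downstream sort?
def ordering_satisfied(existing, required):
    """Each ordering is a list of (column, direction) pairs.

    The required ordering is satisfied when it is a prefix of the existing one.
    """
    return len(required) <= len(existing) and existing[:len(required)] == required


# Orderings from the plan, with `$0` mapped back to position_id (an assumption).
inner_sort = [("position_id", "ASC"), ("employee_id", "ASC")]
outer_sort = [("position_id", "ASC")]

print(ordering_satisfied(inner_sort, outer_sort))  # True: outer sort can be omitted
print(ordering_satisfied(outer_sort, inner_sort))  # False: the reverse does not hold
```

The check is deliberately conservative: a differing direction or a non-prefix column order returns False, so the sort is kept.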