You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2015/02/05 18:51:34 UTC
[jira] [Created] (DRILL-2170) Wrong result when joining to a
subquery that has group-by, order-by and LIMIT
Aman Sinha created DRILL-2170:
---------------------------------
Summary: Wrong result when joining to a subquery that has group-by, order-by and LIMIT
Key: DRILL-2170
URL: https://issues.apache.org/jira/browse/DRILL-2170
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 0.7.0
Reporter: Aman Sinha
Assignee: Aman Sinha
Priority: Critical
On TPCH SF1: first query gives right result, second produces wrong result. The main difference is the order-by in the second query is on the aggregation function, not the grouping key.
{code}
// Correct result
: jdbc:drill:zk=local> select count(*) from (select l_orderkey, sum(l_quantity), sum(l_extendedprice) from lineitem group by l_orderkey order by 1 limit 100) sq inner join orders o on sq.l_orderkey = o.o_orderkey;
+------------+
| EXPR$0 |
+------------+
| 100 |
+------------+
1 row selected (2.67 seconds)
// Wrong result
0: jdbc:drill:zk=local> select count(*) from (select l_orderkey, sum(l_quantity), sum(l_extendedprice) from lineitem group by l_orderkey order by 3 limit 100) sq inner join orders o on sq.l_orderkey = o.o_orderkey;
+------------+
| EXPR$0 |
+------------+
| 400 |
+------------+
1 row selected (3.163 seconds)
Both queries should be having identical plans, but the plan for the second query shows a missing SingleMergeExchange after the TopN operation.
{code}
0: jdbc:drill:zk=local> explain plan for select count(*) from (select l_orderkey, sum(l_quantity), sum(l_extendedprice) from lineitem group by l_orderkey order by 3 limit 100) sq inner join orders o on sq.l_orderkey = o.o_orderkey;
+------------+------------+
| text | json |
+------------+------------+
| 00-00 Screen
00-01 StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
00-02 UnionExchange
01-01 StreamAgg(group=[{}], EXPR$0=[COUNT()])
01-02 Project($f0=[0])
01-03 HashJoin(condition=[=($0, $1)], joinType=[inner])
01-05 HashToRandomExchange(dist0=[[$0]])
02-01 Project(l_orderkey=[$0])
02-02 SelectionVectorRemover
02-03 Limit(fetch=[100])
02-04 SelectionVectorRemover
02-05 TopN(limit=[100])
02-06 HashToRandomExchange(dist0=[[$2]])
04-01 HashAgg(group=[{0}], EXPR$1=[SUM($1)], EXPR$2=[SUM($2)])
04-02 HashToRandomExchange(dist0=[[$0]])
05-01 HashAgg(group=[{0}], EXPR$1=[SUM($1)], EXPR$2=[SUM($2)])
05-02 Project(l_orderkey=[$1], l_quantity=[$2], l_extendedprice=[$0])
05-03 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=file:/Users/asinha/data/tpch-sf1/lineitem]], selectionRoot=/Users/asinha/data/tpch-sf1/lineitem, numFiles=1, columns=[`l_orderkey`, `l_quantity`, `l_extendedprice`]]])
01-04 HashToRandomExchange(dist0=[[$0]])
03-01 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=file:/Users/asinha/data/tpch-sf1/orders]], selectionRoot=/Users/asinha/data/tpch-sf1/orders, numFiles=1, columns=[`o_orderkey`]]])
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)