Posted to dev@drill.apache.org by "Rahul Challapalli (JIRA)" <ji...@apache.org> on 2016/12/21 21:37:58 UTC
[jira] [Created] (DRILL-5148) Replace hash-distribution with a
simple round-robin distribution for a simple order by query
Rahul Challapalli created DRILL-5148:
----------------------------------------
Summary: Replace hash-distribution with a simple round-robin distribution for a simple order by query
Key: DRILL-5148
URL: https://issues.apache.org/jira/browse/DRILL-5148
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators, Query Planning & Optimization
Affects Versions: 1.10.0
Reporter: Rahul Challapalli
git.commit.id.abbrev=cf2b7c7
The plan below shows that we use hash-distribution to avoid data skew. However, for a simple order by query like this one, a simple round-robin distribution would be sufficient.
{code}
explain plan for select * from dfs.`/drill/testdata/resource-manager/5kwidecolumns_500k.tbl` order by columns[0];
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(*=[$0])
00-02 Project(T2¦¦*=[$0])
00-03 SingleMergeExchange(sort0=[1 ASC])
01-01 SelectionVectorRemover
01-02 Sort(sort0=[$1], dir0=[ASC])
01-03 Project(T2¦¦*=[$0], EXPR$1=[$1])
01-04 HashToRandomExchange(dist0=[[$1]])
02-01 UnorderedMuxExchange
03-01 Project(T2¦¦*=[$0], EXPR$1=[$1], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1)])
03-02 Project(T2¦¦*=[$0], EXPR$1=[ITEM($1, 0)])
03-03 Project(T2¦¦*=[$0], columns=[$1])
03-04 Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/resource-manager/5kwidecolumns_500k.tbl, numFiles=1, columns=[`*`], files=[maprfs:///drill/testdata/resource-manager/5kwidecolumns_500k.tbl]]])
{code}
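As a rough illustration of the point above (not Drill's actual exchange implementation), the sketch below distributes rows to a fixed number of sort fragments in two ways: by hashing the sort key and by round-robin. Each fragment then sorts locally, and a final merge combines the sorted streams, loosely mirroring the Sort and SingleMergeExchange steps in the plan. The function names, the CRC32 hash, and the sample data are all assumptions made for this example.

```python
import heapq
import zlib


def hash_distribute(rows, num_fragments, key):
    """Send each row to a fragment chosen by hashing its sort key."""
    fragments = [[] for _ in range(num_fragments)]
    for row in rows:
        # CRC32 is only a deterministic stand-in for a real hash function.
        h = zlib.crc32(str(key(row)).encode("utf-8"))
        fragments[h % num_fragments].append(row)
    return fragments


def round_robin_distribute(rows, num_fragments):
    """Send rows to fragments in turn, ignoring their contents."""
    fragments = [[] for _ in range(num_fragments)]
    for i, row in enumerate(rows):
        fragments[i % num_fragments].append(row)
    return fragments


def sort_and_merge(fragments, key):
    """Sort each fragment locally, then merge the sorted streams."""
    sorted_streams = [sorted(f, key=key) for f in fragments]
    return list(heapq.merge(*sorted_streams, key=key))


if __name__ == "__main__":
    # Hypothetical skewed input: 90 rows share one sort key value.
    rows = [("a", i) for i in range(90)] + [("b", i) for i in range(10)]
    sort_key = lambda row: row[0]

    hashed = hash_distribute(rows, 4, sort_key)
    robin = round_robin_distribute(rows, 4)

    print("hash fragment sizes:       ", [len(f) for f in hashed])
    print("round-robin fragment sizes:", [len(f) for f in robin])

    merged = sort_and_merge(robin, sort_key)
    print("globally ordered:", [sort_key(r) for r in merged] == sorted(sort_key(r) for r in rows))
```

In this sketch the merge step produces a globally ordered result no matter how rows were assigned to fragments, so the sort does not need rows with equal keys to land in the same fragment. Round-robin keeps fragment sizes within one row of each other, while hashing on the sort key sends every row with the same key value to a single fragment.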