Posted to issues@drill.apache.org by "Chun Chang (JIRA)" <ji...@apache.org> on 2015/01/27 20:03:35 UTC
[jira] [Created] (DRILL-2083) order by on large dataset returns wrong results
Chun Chang created DRILL-2083:
---------------------------------
Summary: order by on large dataset returns wrong results
Key: DRILL-2083
URL: https://issues.apache.org/jira/browse/DRILL-2083
Project: Apache Drill
Issue Type: Bug
Components: Execution - Operators
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Chris Westin
Priority: Critical
#Mon Jan 26 14:10:51 PST 2015
git.commit.id.abbrev=3c6d0ef
Test data has 1 million rows and can be accessed at
http://apache-drill.s3.amazonaws.com/files/complex.json.gz
{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count (t.id) from `complex.json` t;
+------------+
| EXPR$0 |
+------------+
| 1000000 |
+------------+
{code}
But the same data queried with order by returned 1,000,030 rows, 30 more than the count.
{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id from `complex.json` t order by t.id;
....
| 999997 |
| 999998 |
| 999999 |
| 1000000 |
+------------+
1,000,030 rows selected (19.449 seconds)
{code}
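One way to narrow down where the 30 extra rows come from, assuming the order by output has been saved off as a list of id values, is to check whether the surplus rows are repeated ids and whether the output is actually in order. The sketch below is only an illustration: the helper name summarize_ids and the sample list are hypothetical and not part of Drill, and whether ids in complex.json are unique is itself an assumption.

```python
from collections import Counter


def summarize_ids(ids, expected_rows):
    """Summarize a result column: row surplus, repeated ids, and ordering.

    `ids` is the list of values returned by the ORDER BY query (assumed to
    have been exported by some other means); `expected_rows` is the count
    reported by the COUNT query.
    """
    counts = Counter(ids)
    # Values that appear more than once in the result.
    duplicates = {value: n for value, n in counts.items() if n > 1}
    # Check that the output is non-decreasing, as ORDER BY ... ASC should be.
    is_sorted = all(a <= b for a, b in zip(ids, ids[1:]))
    return {
        "rows": len(ids),
        "extra_rows": len(ids) - expected_rows,
        "distinct_ids": len(counts),
        "duplicates": duplicates,
        "sorted": is_sorted,
    }


# Tiny made-up sample standing in for the real query output.
sample = [1, 2, 2, 3, 4, 4, 5]
report = summarize_ids(sample, expected_rows=5)
print(report)
```

If the surplus rows all show up under "duplicates", the extra rows are repeats of existing records rather than unexpected values, which would point at the sort or exchange path rather than the scan.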
Physical plan:
{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select t.id from `complex.json` t order by t.id;
+------------+------------+
| text | json |
+------------+------------+
| 00-00    Screen
00-01      SingleMergeExchange(sort0=[0 ASC])
01-01        SelectionVectorRemover
01-02          Sort(sort0=[$0], dir0=[ASC])
01-03            HashToRandomExchange(dist0=[[$0]])
02-01              Scan(groupscan=[EasyGroupScan [selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, columns=[`id`], files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
{code}