Posted to issues@drill.apache.org by "Tomer Shiran (JIRA)" <ji...@apache.org> on 2014/11/03 06:45:33 UTC
[jira] [Commented] (DRILL-1629) Cast issue during query
[ https://issues.apache.org/jira/browse/DRILL-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194268#comment-14194268 ]
Tomer Shiran commented on DRILL-1629:
-------------------------------------
Note that the same query does work when both datasets come from DFS:
{code}
0: jdbc:drill:zk=local> SELECT u.user_id, u.name, count(*) reviews FROM dfs.yelp.`yelp_academic_dataset_user.json` u, dfs.yelp.`yelp_academic_dataset_review.json` r WHERE u.user_id = r.user_id GROUP BY u.user_id, u.name ORDER BY reviews DESC LIMIT 10;
+------------+------------+------------+
| user_id | name | reviews |
+------------+------------+------------+
| kGgAARL2UmvCcTRfiscjug | J | 1399 |
| ikm0UCahtK34LbLCEw4YTw | Rand | 1137 |
| Iu3Jo9ROp2IWC9FwtWOaUQ | Norm | 1046 |
| glRXVWWD6x1EZKfjJawTOg | Jade | 1013 |
| PV5voYSD43Cn_3gHmxG7DA | David | 895 |
| fczQCSmaWF78toLEmb0Zsw | Gabi | 872 |
| lHHwLi_YZuDSfdlSShFkug | Nelson | 816 |
| ia1nTRAQEaFWv0cwADeK7g | Emily | 811 |
| 3gIfcQq5KxAegwCPXc83cQ | Jennifer | 790 |
| DrWLhrK8WMZf7Jb-Oqc7ww | Brad | 765 |
+------------+------------+------------+
10 rows selected (54.847 seconds)
0: jdbc:drill:zk=local>
{code}
And here's the explain plan:
{code}
0: jdbc:drill:zk=local> EXPLAIN PLAN FOR SELECT u.user_id, u.name, count(*) reviews FROM dfs.yelp.`yelp_academic_dataset_user.json` u, dfs.yelp.`yelp_academic_dataset_review.json` r WHERE u.user_id = r.user_id GROUP BY u.user_id, u.name ORDER BY reviews DESC LIMIT 10;
+------------+------------+
| text | json |
+------------+------------+
| 00-00 Screen
00-01 Project(user_id=[$0], name=[$1], reviews=[$2])
00-02 SelectionVectorRemover
00-03 Limit(fetch=[10])
00-04 SingleMergeExchange(sort0=[2 DESC])
01-01 SelectionVectorRemover
01-02 TopN(limit=[10])
01-03 HashToRandomExchange(dist0=[[$2]])
02-01 HashAgg(group=[{0, 1}], reviews=[COUNT()])
02-02 Project(user_id=[$0], name=[$1])
02-03 HashJoin(condition=[=($0, $2)], joinType=[inner])
02-05 HashToRandomExchange(dist0=[[$0]])
03-01 Scan(groupscan=[EasyGroupScan [selectionRoot=/Users/tshiran/Development/yelp/yelp_academic_dataset_user.json, numFiles=1, columns = [SchemaPath [`user_id`], SchemaPath [`name`]]]])
02-04 Project(user_id0=[$0])
02-06 HashToRandomExchange(dist0=[[$0]])
04-01 Scan(groupscan=[Ea |
+------------+------------+
1 row selected (0.415 seconds)
{code}
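A possible workaround to try against the failing Mongo query quoted below (an untested sketch, not a confirmed fix): the error signature suggests that one side of the join key is being treated as INT rather than VARCHAR, so casting both keys explicitly might sidestep the missing compare_to function. The VARCHAR(64) length is an assumption chosen to be comfortably longer than the 22-character user IDs shown above.
{code}
-- Untested sketch: explicit casts on both join keys; VARCHAR(64) is an assumed length.
SELECT u.user_id, u.name, count(*) reviews FROM mongo.yelp.users u, dfs.yelp.`yelp_academic_dataset_review.json` r WHERE CAST(u.user_id AS VARCHAR(64)) = CAST(r.user_id AS VARCHAR(64)) GROUP BY u.user_id, u.name ORDER BY reviews DESC LIMIT 10;
{code}
If the failure comes from how the Mongo scan is planned rather than from the comparison itself, the casts may make no difference.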
> Cast issue during query
> -----------------------
>
> Key: DRILL-1629
> URL: https://issues.apache.org/jira/browse/DRILL-1629
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Tomer Shiran
>
> Imported the Yelp user.json file into a local MongoDB instance (via mongoimport), then ran the following query to find the names of the users with the most reviews:
> {code}
> 0: jdbc:drill:zk=local> SELECT u.user_id, u.name, count(*) reviews FROM mongo.yelp.users u, dfs.yelp.`yelp_academic_dataset_review.json` r WHERE u.user_id = r.user_id GROUP BY u.user_id, u.name ORDER BY reviews DESC LIMIT 10;
> Query failed: Failure while running fragment. Failure finding function that runtime code generation expected. Signature: compare_to( VARCHAR:OPTIONALINT:OPTIONAL, ) returns INT:REQUIRED [c796f1fb-b73a-49e8-8b4a-ed973d89b7d3]
> Error: exception while executing query: Failure while trying to get next result batch. (state=,code=0)
> {code}
> Here's the explain plan:
> {code}
> 0: jdbc:drill:zk=local> EXPLAIN PLAN FOR SELECT u.user_id, u.name, count(*) reviews FROM mongo.yelp.users u, dfs.yelp.`yelp_academic_dataset_review.json` r WHERE u.user_id = r.user_id GROUP BY u.user_id, u.name ORDER BY reviews DESC LIMIT 10;
> +------------+------------+
> | text | json |
> +------------+------------+
> | 00-00 Screen
> 00-01 Project(user_id=[$0], name=[$1], reviews=[$2])
> 00-02 SelectionVectorRemover
> 00-03 Limit(fetch=[10])
> 00-04 SingleMergeExchange(sort0=[2 DESC])
> 01-01 SelectionVectorRemover
> 01-02 TopN(limit=[10])
> 01-03 HashToRandomExchange(dist0=[[$2]])
> 02-01 HashAgg(group=[{0, 1}], reviews=[COUNT()])
> 02-02 Project(user_id=[$0], name=[$1])
> 02-03 HashJoin(condition=[=($0, $2)], joinType=[inner])
> 02-05 HashToRandomExchange(dist0=[[$0]])
> 03-01 Scan(groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec [dbName=yelp, collectionName=users, filters=null], columns=[SchemaPath [`user_id`], SchemaPath [`name`]]]])
> 02-04 HashToRandomExchange(dist0=[[$0]])
> 04-01 Project(T8¦¦*=[$0])
> 04-02 Project(T8¦¦*=[$0], T8¦¦user_id=[$ |
> +------------+------------+
> 1 row selected (0.42 seconds)
> {code}