Posted to issues@drill.apache.org by "Cem Ezberci (JIRA)" <ji...@apache.org> on 2015/02/21 09:20:11 UTC

[jira] [Commented] (DRILL-1475) MongoDB storage plugin errors with sum/avg functions

    [ https://issues.apache.org/jira/browse/DRILL-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14330075#comment-14330075 ] 

Cem Ezberci commented on DRILL-1475:
------------------------------------

I know my problem is not related to MongoDB, but this was the closest issue I could find in JIRA.

I'm observing the same behavior when I create a table from a gzipped CSV file. The table is stored in Parquet format. Below are the commands I ran and their results:

{code}
0: jdbc:drill:zk=local> create table raw_weather_data(wsid, `year`, `month`, `day`, `hour`, temperature, dewpoint, pressure, wind_direction, wind_speed, sky_condition, one_hour_precip, six_hour_precip) as select columns[0], columns[1], columns[2], columns[3], columns[4], columns[5], columns[6], columns[7], columns[8], columns[9], columns[10], columns[11], columns[12] from dfs.`/home/ezbercih/downloads/sfo-nyc-mia-lax-chi-2008-2014.csv.gz`;
+------------+---------------------------+
|  Fragment  | Number of records written |
+------------+---------------------------+
| 0_0        | 395453                    |
+------------+---------------------------+
1 row selected (3.328 seconds)
0: jdbc:drill:zk=local> select * from raw_weather_data limit 2;
+------------+------------+------------+------------+------------+-------------+------------+------------+----------------+------------+---------------+-----------------+-----------------+
|    wsid    |    year    |   month    |    day     |    hour    | temperature |  dewpoint  |  pressure  | wind_direction | wind_speed | sky_condition | one_hour_precip | six_hour_precip |
+------------+------------+------------+------------+------------+-------------+------------+------------+----------------+------------+---------------+-----------------+-----------------+
| 724940:23234 | 2008       | 01         | 01         | 00         | 11.7        | -0.6       | 1023.8     | 50             | 7.2        | 2             | 0.0             | 0.0             |
| 724940:23234 | 2008       | 01         | 01         | 01         | 10.6        | 3.3        | 1023.5     | 100            | 4.1        | 4             | 0.0             | 0.0             |
+------------+------------+------------+------------+------------+-------------+------------+------------+----------------+------------+---------------+-----------------+-----------------+
2 rows selected (0.239 seconds)
0: jdbc:drill:zk=local> select avg(temperature) from raw_weather_data where wsid='724940:23234';
+------------+
|   EXPR$0   |
+------------+
Query failed: Query failed: Failure while running fragment., Invalid value for boolean: 11.7 [ 2ab002a7-94aa-47c4-8dd2-ad0955351452 on ubuntu-server:31010 ]
[ 2ab002a7-94aa-47c4-8dd2-ad0955351452 on ubuntu-server:31010 ]


java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
	at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
	at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
	at sqlline.SqlLine.print(SqlLine.java:1809)
	at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
	at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
	at sqlline.SqlLine.dispatch(SqlLine.java:889)
	at sqlline.SqlLine.begin(SqlLine.java:763)
	at sqlline.SqlLine.start(SqlLine.java:498)
	at sqlline.SqlLine.main(SqlLine.java:460)
{code}
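
A possible workaround sketch, untested against this dataset. It assumes the failure comes from aggregating text values: Drill reads every field of a delimited text file as VARCHAR, so the CTAS above stores temperature as a string. Casting the column explicitly before aggregating might avoid the error. The table name raw_weather_data_typed is hypothetical, and any row whose temperature is empty or non-numeric would make the cast itself fail.

{code}
-- Option 1 (assumption: temperature was stored as VARCHAR): cast at query time.
select avg(cast(temperature as double))
from raw_weather_data
where wsid = '724940:23234';

-- Option 2: cast while creating the table, so the Parquet file holds a DOUBLE.
-- Only two columns are shown; the remaining columns would follow the same pattern.
create table raw_weather_data_typed(wsid, temperature) as
select columns[0], cast(columns[5] as double)
from dfs.`/home/ezbercih/downloads/sfo-nyc-mia-lax-chi-2008-2014.csv.gz`;
{code}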

Any ideas?

> MongoDB storage plugin errors with sum/avg functions
> ----------------------------------------------------
>
>                 Key: DRILL-1475
>                 URL: https://issues.apache.org/jira/browse/DRILL-1475
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Neeraja
>            Assignee: B Anil Kumar
>             Fix For: 0.6.0
>
>
> I'm using the latest code from the Apache Drill 0.6 branch and hitting issues with the MongoDB plugin.
> The dataset used is the MongoDB zips dataset.
> 0: jdbc:drill:zk=local> select state, sum(pop) from zips group by state order by state;
> Query failed: Failure while running fragment. Invalid value for boolean: 15338 [8b6073af-a6ba-4516-ad9b-144f486c548c]
> Error: exception while executing query: Failure while trying to get next result batch. (state=,code=0)
> 0: jdbc:drill:zk=local> select state, avg(pop) from zips group by state order by state;
> Query failed: Failure while running fragment. Invalid value for boolean: 15338 [79d074a9-6299-4e8c-8ca4-aedef3873c49]
> Error: exception while executing query: Failure while trying to get next result batch. (state=,code=0)
> -------
> Explain plan output.
> 0: jdbc:drill:zk=local> explain plan for select state, avg(pop) from zips group by state order by state;
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(state=[$0], EXPR$1=[$1])
> 00-02        SelectionVectorRemover
> 00-03          Sort(sort0=[$0], dir0=[ASC])
> 00-04            Project(state=[$0], EXPR$1=[CAST(/(CastHigh(CASE(=($2, 0), null, $1)), $2)):ANY])
> 00-05              HashAgg(group=[{0}], agg#0=[$SUM0($1)], agg#1=[COUNT($1)])
> 00-06                Project(state=[$1], pop=[$0])
> 00-07                  Scan(groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec [dbName=newzipdb, collectionName=zips, filters=null], columns=[SchemaPath [`state`], SchemaPath [`pop`]]]])
>  | {
>   "head" : {
>     "version" : 1,
>     "generator" : {
>       "type" : "ExplainHandler",
>       "info" : ""
>     },
>     "type" : "APACHE_DRILL_PHYSICAL",
>     "options" : [ ],
>     "queue" : 0,
>     "resultMode" : "EXEC"
>   },
>   "graph" : [ {
>     "pop" : "mongo-scan",
>     "@id" : 7,
>     "mongoScanSpec" : {
>       "dbName" : "newzipdb",
>       "collectionName" : "zips",
>       "filters" : null
>     },
>     "storage" : {
>       "type" : "mongo",
>       "connection" : "mongodb://localhost:27017",
>       "enabled" : true
>     },
>     "columns" : [ "`state`", "`pop`" ],
>     "cost" : 29353.0
>   }, {
>     "pop" : "project",
>     "@id" : 6,
>     "exprs" : [ {
>       "ref" : "`state`",
>       "expr" : "`state`"
>     }, {
>       "ref" : "`pop`",
>       "expr" : "`pop`"
>     } ],
>     "child" : 7,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 29353.0
>   }, {
>     "pop" : "hash-aggregate",
>     "@id" : 5,
>     "child" : 6,
>     "cardinality" : 1.0,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 14676.5,
>     "groupByExprs" : [ {
>       "ref" : "`state`",
>       "expr" : "`state`"
>     } ],
>     "aggrExprs" : [ {
>       "ref" : "`$f1`",
>       "expr" : "$sum0(`pop`) "
>     }, {
>       "ref" : "`$f2`",
>       "expr" : "count(`pop`) "
>     } ]
>   }, {
>     "pop" : "project",
>     "@id" : 4,
>     "exprs" : [ {
>       "ref" : "`state`",
>       "expr" : "`state`"
>     }, {
>       "ref" : "`EXPR$1`",
>       "expr" : "divide(casthigh( ( if (equal(`$f2`, 0)  ) then (NULL )  else (`$f1` )  end  ) ) , `$f2`) "
>     } ],
>     "child" : 5,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 2935.3
>   }, {
>     "pop" : "external-sort",
>     "@id" : 3,
>     "child" : 4,
>     "orderings" : [ {
>       "order" : "ASC",
>       "expr" : "`state`",
>       "nullDirection" : "UNSPECIFIED"
>     } ],
>     "reverse" : false,
>     "initialAllocation" : 20000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 2935.3
>   }, {
>     "pop" : "selection-vector-remover",
>     "@id" : 2,
>     "child" : 3,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 2935.3
>   }, {
>     "pop" : "project",
>     "@id" : 1,
>     "exprs" : [ {
>       "ref" : "`state`",
>       "expr" : "`state`"
>     }, {
>       "ref" : "`EXPR$1`",
>       "expr" : "`EXPR$1`"
>     } ],
>     "child" : 2,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 2935.3
>   }, {
>     "pop" : "screen",
>     "@id" : 0,
>     "child" : 1,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : 2935.3
>   } ]
> } |
> +------------+------------+