Posted to issues@drill.apache.org by "Jinfeng Ni (JIRA)" <ji...@apache.org> on 2015/04/14 21:29:58 UTC

[jira] [Commented] (DRILL-2785) Aggregate MAX query on a directory fails with SQLException

    [ https://issues.apache.org/jira/browse/DRILL-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494716#comment-14494716 ] 

Jinfeng Ni commented on DRILL-2785:
-----------------------------------

The query plan looks fine to me. From the error message, I think the problem is caused by the input data. Is there any field in your CSV files that is not a valid number?


java.lang.NumberFormatException: ",fr
        at org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeL(StringFunctionHelpers.java:90) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]

If that's the case, I think it is expected that the query will hit this NumberFormatException.
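
To illustrate the failure mode, here is a minimal Java sketch. It is not Drill's code: the class and method names (FirstColumnCheck, isValidLong, invalidPositions) are hypothetical, and Long.parseLong is used only as an approximation of what the cast to BIGINT does. It shows that a value such as ",fr cannot be parsed as a long, and how one might flag such values in a sample of first-column data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Drill.
final class FirstColumnCheck {

    // Returns true when the value parses as a Java long.
    // Assumption: Long.parseLong only approximates Drill's CAST(... AS BIGINT).
    static boolean isValidLong(String value) {
        if (value == null) {
            return false;
        }
        try {
            Long.parseLong(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Collects the 0-based positions of the values that do not parse as a long.
    static List<Integer> invalidPositions(List<String> values) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            if (!isValidLong(values.get(i))) {
                bad.add(i);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Sample values; the second one is the string reported in the exception.
        List<String> sample = Arrays.asList("12345", "\",fr", "678");
        System.out.println(invalidPositions(sample)); // prints [1]
    }
}
```

Running a check like this over a sample of the first column of each file should help locate the rows that trigger the exception.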



> Aggregate MAX query on a directory fails with SQLException
> ----------------------------------------------------------
>
>                 Key: DRILL-2785
>                 URL: https://issues.apache.org/jira/browse/DRILL-2785
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.9.0
>         Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT 
>            Reporter: Khurram Faraaz
>            Assignee: Jinfeng Ni
>
> An aggregate query that should return the maximum value across 63036308 records fails with an SQLException. The test was run on a 4-node cluster on CentOS.
> {code}
> 0: jdbc:drill:> select max(cast(columns[0] as bigint)) from `deletions`;
> +------------+
> |   EXPR$0   |
> +------------+
> Query failed: RemoteRpcException: Failure while running fragment., [ 67b3a1e6-0f1e-4bae-9e61-2e1d7fb7ba0d on centos-03.qa.lab:31010 ]
> [ 67b3a1e6-0f1e-4bae-9e61-2e1d7fb7ba0d on centos-03.qa.lab:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
> 	at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
> 	at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
> 	at sqlline.SqlLine.print(SqlLine.java:1809)
> 	at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
> 	at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:889)
> 	at sqlline.SqlLine.begin(SqlLine.java:763)
> 	at sqlline.SqlLine.start(SqlLine.java:498)
> 	at sqlline.SqlLine.main(SqlLine.java:460)
> {code}
> Physical plan for the failing query
> {code}
> 0: jdbc:drill:> explain plan for select max(cast(columns[0] as bigint)) from `deletions`;
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      StreamAgg(group=[{}], EXPR$0=[MAX($0)])
> 00-02        UnionExchange
> 01-01          StreamAgg(group=[{}], EXPR$0=[MAX($0)])
> 01-02            Project($f0=[CAST(ITEM($0, 0)):BIGINT])
> 01-03              Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/deletions, numFiles=20, columns=[`columns`[0]], files=[maprfs:/tmp/deletions/deletions-00007-of-00020.csv, maprfs:/tmp/deletions/deletions-00016-of-00020.csv, maprfs:/tmp/deletions/deletions-00012-of-00020.csv, maprfs:/tmp/deletions/deletions-00008-of-00020.csv, maprfs:/tmp/deletions/deletions-00019-of-00020.csv, maprfs:/tmp/deletions/deletions-00015-of-00020.csv, maprfs:/tmp/deletions/deletions-00018-of-00020.csv, maprfs:/tmp/deletions/deletions-00004-of-00020.csv, maprfs:/tmp/deletions/deletions-00000-of-00020.csv, maprfs:/tmp/deletions/deletions-00002-of-00020.csv, maprfs:/tmp/deletions/deletions-00005-of-00020.csv, maprfs:/tmp/deletions/deletions-00014-of-00020.csv, maprfs:/tmp/deletions/deletions-00017-of-00020.csv, maprfs:/tmp/deletions/deletions-00010-of-00020.csv, maprfs:/tmp/deletions/deletions-00006-of-00020.csv, maprfs:/tmp/deletions/deletions-00003-of-00020.csv, maprfs:/tmp/deletions/deletions-00001-of-00020.csv, maprfs:/tmp/deletions/deletions-00013-of-00020.csv, maprfs:/tmp/deletions/deletions-00009-of-00020.csv, maprfs:/tmp/deletions/deletions-00011-of-00020.csv]]])
> {code}
> Note that counting the total number of records in the files in that directory took close to 11 seconds.
> {code}
> 0: jdbc:drill:> select count(columns[0]) from `deletions`;
> +------------+
> |   EXPR$0   |
> +------------+
> | 63036308   |
> +------------+
> 1 row selected (10.669 seconds)
> {code}
> stack trace from drillbit.log
> {code}
> 2015-04-14 18:36:56,513 [2ad2a1bd-cbb9-e1b9-d454-5f1882f6f427:frag:1:0] WARN  o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing fragment
> java.lang.NumberFormatException: ",fr
>         at org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeL(StringFunctionHelpers.java:90) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varCharToLong(StringFunctionHelpers.java:61) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.test.generated.ProjectorGen0.doEval(ProjectorTemplate.java:35) ~[na:na]
>         at org.apache.drill.exec.test.generated.ProjectorGen0.projectRecords(ProjectorTemplate.java:62) ~[na:na]
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:174) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.test.generated.StreamingAggregatorGen5.doWork(StreamingAggTemplate.java:169) ~[na:na]
>         at org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext(StreamingAggBatch.java:127) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:68) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:99) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:58) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:163) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> {code}
> Details of data used in test
> {code}
> [root@centos-01 deletions]# hadoop fs -ls /tmp/deletions
> Found 20 items
> -rwxr-xr-x   3 root root  395624293 2015-04-14 18:10 /tmp/deletions/deletions-00000-of-00020.csv
> -rwxr-xr-x   3 root root  395340408 2015-04-14 18:10 /tmp/deletions/deletions-00001-of-00020.csv
> -rwxr-xr-x   3 root root  395214289 2015-04-14 18:10 /tmp/deletions/deletions-00002-of-00020.csv
> -rwxr-xr-x   3 root root  395567609 2015-04-14 18:10 /tmp/deletions/deletions-00003-of-00020.csv
> -rwxr-xr-x   3 root root  395835996 2015-04-14 18:11 /tmp/deletions/deletions-00004-of-00020.csv
> -rwxr-xr-x   3 root root  395577943 2015-04-14 18:11 /tmp/deletions/deletions-00005-of-00020.csv
> -rwxr-xr-x   3 root root  395107664 2015-04-14 18:11 /tmp/deletions/deletions-00006-of-00020.csv
> -rwxr-xr-x   3 root root  395552816 2015-04-14 18:11 /tmp/deletions/deletions-00007-of-00020.csv
> -rwxr-xr-x   3 root root  395667155 2015-04-14 18:11 /tmp/deletions/deletions-00008-of-00020.csv
> -rwxr-xr-x   3 root root  395615707 2015-04-14 18:11 /tmp/deletions/deletions-00009-of-00020.csv
> -rwxr-xr-x   3 root root  395510351 2015-04-14 18:11 /tmp/deletions/deletions-00010-of-00020.csv
> -rwxr-xr-x   3 root root  395675873 2015-04-14 18:11 /tmp/deletions/deletions-00011-of-00020.csv
> -rwxr-xr-x   3 root root  395411794 2015-04-14 18:11 /tmp/deletions/deletions-00012-of-00020.csv
> -rwxr-xr-x   3 root root  395315641 2015-04-14 18:11 /tmp/deletions/deletions-00013-of-00020.csv
> -rwxr-xr-x   3 root root  395074152 2015-04-14 18:11 /tmp/deletions/deletions-00014-of-00020.csv
> -rwxr-xr-x   3 root root  395805979 2015-04-14 18:11 /tmp/deletions/deletions-00015-of-00020.csv
> -rwxr-xr-x   3 root root  395209117 2015-04-14 18:11 /tmp/deletions/deletions-00016-of-00020.csv
> -rwxr-xr-x   3 root root  395356744 2015-04-14 18:11 /tmp/deletions/deletions-00017-of-00020.csv
> -rwxr-xr-x   3 root root  395563580 2015-04-14 18:11 /tmp/deletions/deletions-00018-of-00020.csv
> -rwxr-xr-x   3 root root  395068587 2015-04-14 18:11 /tmp/deletions/deletions-00019-of-00020.csv
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)