Posted to issues@drill.apache.org by "Jason Altekruse (JIRA)" <ji...@apache.org> on 2015/07/07 01:18:04 UTC

[jira] [Comment Edited] (DRILL-2745) Query returns IOB Exception when JSON data with empty arrays is input to flatten function

    [ https://issues.apache.org/jira/browse/DRILL-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615860#comment-14615860 ] 

Jason Altekruse edited comment on DRILL-2745 at 7/6/15 11:17 PM:
-----------------------------------------------------------------

This is currently expected behavior. As the error message reports, Drill doesn't support lists of different types. This also applies to nested lists, and the error is produced by the column shown below. I have clipped the list, but it still illustrates the issue: the individual nested lists cannot hold different scalar types, and here one list contains numbers and the next contains strings.

This input also includes a null in a list, which Drill does not support. You should be able to work around all of these problems by turning on `store.json.all_text_mode` for the JSON reader. In all_text_mode every scalar value is read as a string, so the nested lists no longer differ in type, and null values in lists are turned into a string containing the word "null". If there is still an issue after turning on all_text_mode, please file a new JIRA, as these issues are unrelated to flatten.

{code}
{ "outkey":[[1000000,10000000,2000000,999999,1,0,-1,100000],["a","b","c","d","e","p","o","f","m","q","d","s","v"]] }
{code}
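
A minimal sketch of that workaround, assuming a sqlline session like the one in the report below and the same `nestedJArry.json` file. The option is set for the current session only. Whether flatten then succeeds on the full file has not been verified here.

```sql
-- Read every JSON scalar as text for this session only.
ALTER SESSION SET `store.json.all_text_mode` = true;

-- Re-run the query from the report with the option enabled.
SELECT flatten(outkey) FROM `nestedJArry.json`;

-- Restore the default reader behavior afterwards.
ALTER SESSION SET `store.json.all_text_mode` = false;
```

With the option on, every value in `outkey` comes back as a string, so any numeric use of the flattened values needs an explicit CAST.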







> Query returns IOB Exception when JSON data with empty arrays is input to flatten function
> -----------------------------------------------------------------------------------------
>
>                 Key: DRILL-2745
>                 URL: https://issues.apache.org/jira/browse/DRILL-2745
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 0.9.0
>         Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT 
>            Reporter: Khurram Faraaz
>            Assignee: Jason Altekruse
>             Fix For: 1.2.0
>
>
> An IndexOutOfBoundsException (IOB) is returned when a JSON file that has many empty arrays and arrays with different types of data is passed to the flatten function.
> Tested on a 4-node cluster running CentOS.
> {code}
> 0: jdbc:drill:> select flatten(outkey) from `nestedJArry.json` ;
> Query failed: RemoteRpcException: Failure while running fragment., index: 176, length: 4 (expected: range(0, 176)) [ 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ]
> [ 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> 0: jdbc:drill:> select outkey from `nestedJArry.json`;
> +------------+
> |   outkey   |
> +------------+
> | [["1000000","10000000","2000000","999999","1","0","-1","100000"],["a","b","c","d","e","p","o","f","m","q","d","s","v"],["2012-04-01","1998-02-20","2011-08-05","1992-01-01"],["10:30:29.123","12:29:21.999"],["sdfklgjsdlkjfghlsidhfgopiuesrtoipuertoiurtyoiurotuiydkfjlbn,bfn;waokefpqowertoipuwergklnjdfbpdsiofgoigiuewqrqiugkjehgjksdhbvkjshdfkjsdfbnlkfbkljrghljrelkhbdlkfjbgkdfjbgkndfbnkldfgklbhjdflkghjlnkoiurty984756897345609782-3458745uiyoheirluht7895e6y"],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],["null"],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],["test string","hello world!","just do it!","houston we have a problem"],["1","2","3","4","5","6","7","8","9","0"]] |
> +------------+
> 1 row selected (0.088 seconds)
> Stack trace from drillbit.log
> 2015-04-09 23:54:41,965 [2ad8eebd-adb6-6f7e-469e-4bb8ca276984:frag:0:0] WARN  o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing fragment
> java.lang.IndexOutOfBoundsException: index: 176, length: 4 (expected: range(0, 176))
>         at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:187) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at io.netty.buffer.DrillBuf.chk(DrillBuf.java:209) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at io.netty.buffer.DrillBuf.setInt(DrillBuf.java:513) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at org.apache.drill.exec.vector.UInt4Vector$Mutator.set(UInt4Vector.java:363) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.vector.RepeatedVarCharVector.splitAndTransferTo(RepeatedVarCharVector.java:173) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.vector.RepeatedVarCharVector$TransferImpl.splitAndTransfer(RepeatedVarCharVector.java:200) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.test.generated.FlattenerGen1107.flattenRecords(FlattenTemplate.java:106) ~[na:na]
>         at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.doWork(FlattenRecordBatch.java:156) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:68) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:96) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:58) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:163) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> {code}


