Posted to user@drill.apache.org by yousuf <yo...@css.org.sa> on 2016/12/07 11:11:13 UTC
Fwd: Exception : IndexOutOfBoundsException: index: 0, length: 264 -
... querying mongodb
Hi,
I'm currently exploring Apache Drill, running in cluster mode. My
data source is MongoDB. My data source table contains 5 million documents. I
can't execute a simple query:
|select body from mongo.twitter.tweets limit 10;|
*Throwing exception*
|Query Failed: An Error Occurred
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IndexOutOfBoundsException: index: 0, length: 264 (expected: range(0, 256))
Fragment 1:2
[Error Id: 8903127a-e9e9-407e-8afc-2092b4c03cf0 on test01.css.org:31010]
(java.lang.IndexOutOfBoundsException) index: 0, length: 264 (expected: range(0, 256))
    io.netty.buffer.AbstractByteBuf.checkIndex():1134
    io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes():272
    io.netty.buffer.WrappedByteBuf.setBytes():390
    io.netty.buffer.UnsafeDirectLittleEndian.setBytes():30
    io.netty.buffer.DrillBuf.setBytes():753
    io.netty.buffer.AbstractByteBuf.setBytes():510
    org.apache.drill.exec.store.bson.BsonRecordReader.writeString():265
    org.apache.drill.exec.store.bson.BsonRecordReader.writeToListOrMap():167
    org.apache.drill.exec.store.bson.BsonRecordReader.write():75
    org.apache.drill.exec.store.mongo.MongoRecordReader.next():186
    org.apache.drill.exec.physical.impl.ScanBatch.next():178
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():115
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.physical.impl.BaseRootExec.next():104
    org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
    org.apache.drill.exec.physical.impl.BaseRootExec.next():94
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1657
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():745|
*Working query which is fetching results:*
|select body from mongo.twitter.tweets where tweet_id = 'tag:search.twitter.com,2005:xxxxxxxxxx';|
Sample document in source:
|{
    "_id": ObjectId("58402ad5757d7fede822e641"),
    "rule_list": [
        "x",
        "(contains:x (contains:y OR contains:y1)) OR (contains:v contains:b) OR (contains:v (contains:r OR contains:t))"
    ],
    "actor_friends_count": 79,
    "klout_score": 19,
    "actor_favorites_count": 0,
    "actor_preferred_username": "xxxxxxx",
    "sentiment": "neg",
    "tweet_id": "tag:search.twitter.com,2005:xxxxxxxxx",
    "object_actor_followers_count": 1286,
    "actor_posted_time": "2016-07-16T14:08:25.000Z",
    "actor_id": "id:twitter.com:xxxxxxxx",
    "actor_display_name": "xxxxx",
    "retweet_count": 6,
    "hashtag_list": ["myhashtag"],
    "body": "my tweet body",
    "actor_followers_count": 25,
    "actor_status_count": 243,
    "verb": "share",
    "posted_time": "2016-08-01T07:49:00.000Z",
    "object_actor_status_count": 206,
    "lang": "ar",
    "object_actor_preferred_username": "xxxxxx",
    "original_tweet_id": "tag:search.twitter.com,2005:xxxxxx",
    "gender": "male",
    "object_actor_id": "id:twitter.com:xxxxxxx",
    "favorites_count": 0,
    "object_posted_time": "2016-06-20T04:12:02.000Z",
    "object_actor_friends_count": 2516,
    "generator_display_name": "Twitter for iPhone",
    "object_actor_display_name": "sdfsf",
    "actor_listed_count": 0
}|
Any help is appreciated!
Yousuf
Re: Exception : IndexOutOfBoundsException: index: 0, length: 264 -
... querying mongodb
Posted by yousuf <yo...@css.org.sa>.
Thanks for your reply. I was able to fix the issue by setting:
set store.mongo.bson.record.reader = false;
On 12/07/2016 08:28 PM, Chunhui Shi wrote:
> The length of a UTF-8 encoded byte array is not guaranteed to be the same as
> String.length(). A fix should be in BsonRecordReader.writeString().
Re: Exception : IndexOutOfBoundsException: index: 0, length: 264 -
... querying mongodb
Posted by Chunhui Shi <cs...@maprtech.com>.
The length of a UTF-8 encoded byte array is not guaranteed to be the same as
String.length(). A fix should be in BsonRecordReader.writeString().
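To illustrate the point, here is a minimal standalone sketch. It is not Drill's code: the class and method names are hypothetical, and a plain java.nio.ByteBuffer stands in for Drill's buffer. It shows that String.length() counts UTF-16 code units while the UTF-8 encoding can need more bytes, so a buffer sized by character count can overflow on non-ASCII text such as Arabic. That would be consistent with the reported length 264 against an expected range of (0, 256), although the thread does not confirm the exact cause.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from Drill.
final class Utf8LengthDemo {

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Problematic pattern: the buffer is sized by character count but filled
    // with encoded bytes. ByteBuffer.put throws BufferOverflowException when
    // the encoded form is longer than the character count.
    static ByteBuffer copySizedByChars(String s) {
        ByteBuffer buf = ByteBuffer.allocate(s.length());
        buf.put(s.getBytes(StandardCharsets.UTF_8));
        return buf;
    }

    // Safer pattern: encode first, then size the buffer by the encoded length.
    static ByteBuffer copySizedByBytes(String s) {
        byte[] encoded = s.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(encoded.length);
        buf.put(encoded);
        return buf;
    }

    public static void main(String[] args) {
        String english = "my tweet body";
        String arabic = "\u0645\u0631\u062D\u0628\u0627"; // five Arabic letters

        System.out.println(english.length() + " chars, " + utf8Length(english) + " bytes");
        System.out.println(arabic.length() + " chars, " + utf8Length(arabic) + " bytes");

        // Sizing by encoded length works for both strings.
        System.out.println("safe copy wrote " + copySizedByBytes(arabic).position() + " bytes");

        try {
            copySizedByChars(arabic);
        } catch (java.nio.BufferOverflowException e) {
            System.out.println("buffer sized by String.length() overflowed");
        }
    }
}
```

For the ASCII string the two counts agree (13 and 13), which may be why a single English sample record does not trigger the problem. For the Arabic string they differ (5 characters, 10 bytes).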
Re: Exception : IndexOutOfBoundsException: index: 0, length: 264 -
... querying mongodb
Posted by yousuf <yo...@css.org.sa>.
I was able to fix the issue by setting:
set store.mongo.bson.record.reader = false;
Thanks
On 12/08/2016 10:04 AM, yousuf wrote:
> Hi,
>
> Thank you for your reply.
>
> FYI, the body field contains Arabic and English tweets. I'm using MongoDB
> 3.2.11 and Apache Drill 1.8.0.
>
> Thanks & Kind Regards
Re: Exception : IndexOutOfBoundsException: index: 0, length: 264 -
... querying mongodb
Posted by yousuf <yo...@css.org.sa>.
Hi,
Thank you for your reply.
FYI, the body field contains Arabic and English tweets. I'm using MongoDB
3.2.11 and Apache Drill 1.8.0.
Thanks & Kind Regards
On 12/07/2016 09:24 PM, Kathleen Li wrote:
> I am not able to reproduce your issue, at least with your one sample record.
Re: Exception : IndexOutOfBoundsException: index: 0, length: 264 -
... querying mongodb
Posted by Kathleen Li <kl...@maprtech.com>.
I am not able to reproduce your issue, at least with your one sample record. Steps to reproduce:
(1) From MongoDB, display your sample record:
> db.kath.find().pretty();
{
    "_id" : ObjectId("58402ad5757d7fede822e641"),
    "rule_list" : [
        "x",
        "(contains:x(contains:y OR contains:y1)) OR (contains:v contains:b) OR (contains:v(contains:r OR contains:t))"
    ],
    "actor_friends_count" : 79,
    "klout_score" : 19,
    "actor_favorites_count" : 0,
    "actor_preferred_username" : "xxxxxxx",
    "sentiment" : "neg",
    "tweet_id" : "tag:search.twitter.com,2005:xxxxxxxxx",
    "object_actor_followers_count" : 1286,
    "actor_posted_time" : "2016-07-16T14:08:25.000Z",
    "actor_id" : "id:twitter.com:xxxxxxxx",
    "actor_display_name" : "xxxxx",
    "retweet_count" : 6,
    "hashtag_list" : [
        "myhashtag"
    ],
    "body" : "my tweet body",
    "actor_followers_count" : 25,
    "actor_status_count" : 243,
    "verb" : "share",
    "posted_time" : "2016-08-01T07:49:00.000Z",
    "object_actor_status_count" : 206,
    "lang" : "ar",
    "object_actor_preferred_username" : "xxxxxx",
    "original_tweet_id" : "tag:search.twitter.com,2005:xxxxxx",
    "gender" : "male",
    "object_actor_id" : "id:twitter.com:xxxxxxx",
    "favorites_count" : 0,
    "object_posted_time" : "2016-06-20T04:12:02.000Z",
    "object_actor_friends_count" : 2516,
    "generator_display_name" : "Twitter for iPhone",
    "object_actor_display_name" : "sdfsf",
    "actor_listed_count" : 0
}
(2) Query from Drill:
0: jdbc:drill:zk=drill1:5181,drill2:5181,dril> select body from kath where tweet_id='tag:search.twitter.com,2005:xxxxxxxxx'
. . . . . . . . . . . . . . . . . . . . . . .> ;
+----------------+
|      body      |
+----------------+
| my tweet body  |
+----------------+
1 row selected (0.285 seconds)
0: jdbc:drill:zk=drill1:5181,drill2:5181,dril> select body from kath limit 1;
+----------------+
|      body      |
+----------------+
| my tweet body  |
+----------------+
The Drill version I am using is:
0: jdbc:drill:zk=drill1:5181,drill2:5181,dril> select * from sys.version;
+----------+-------------------------------------------+-----------------------------------------------------------------+----------------------------+--------------+----------------------------+
| version  |                 commit_id                 |                         commit_message                          |        commit_time         | build_email  |         build_time         |
+----------+-------------------------------------------+-----------------------------------------------------------------+----------------------------+--------------+----------------------------+
| 1.8.0    | cd599b4ab670aa5d317b80a31326f9bcf8c0aa72  | MD-1127: Add system property to disable loopback address check  | 19.09.2016 @ 22:46:34 UTC  | Unknown      | 19.09.2016 @ 22:53:13 UTC  |
+----------+-------------------------------------------+-----------------------------------------------------------------+----------------------------+--------------+----------------------------+