Posted to user@hive.apache.org by Fernando Andrés Doglio Turissini <fe...@globant.com> on 2013/01/15 19:31:10 UTC

HIVE: java.lang.ArrayIndexOutOfBoundsException: 2 during JOIN

Hello everyone, I'm struggling with an exception that one particular query
keeps throwing, and it's driving me crazy!

Here is the exception I get:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable@71412b61
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable@71412b61
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
        ... 8 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:506)
        ... 9 more


Here is the query I'm running:

INSERT INTO TABLE variance
SELECT id, collect_set(name)[0], SUM(POW(age - age_mean, 2)) / COUNT(1)
FROM age_mean JOIN data_table ON (age_mean.id = '01' AND data_table.q1 = 1)
WHERE age IS NOT NULL AND age_mean IS NOT NULL
GROUP BY id;
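In case it clarifies what I'm trying to do: the ON clause above only has constant filters, no equality between the two tables. If they were meant to be joined on a shared key, the query would look something like this (the common `id` join column here is just a guess for illustration, not the real schema):

```sql
-- Illustrative variant only: explicit equi-join condition, with the
-- constant filters moved into WHERE. The shared `id` key is assumed.
INSERT INTO TABLE variance
SELECT data_table.id, collect_set(name)[0],
       SUM(POW(age - age_mean, 2)) / COUNT(1)
FROM age_mean JOIN data_table ON (age_mean.id = data_table.id)
WHERE age_mean.id = '01' AND data_table.q1 = 1
  AND age IS NOT NULL AND age_mean IS NOT NULL
GROUP BY data_table.id;
```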

It's probably relevant to mention that I'm doing this on an EMR cluster.

Any idea what might be causing the exception?

Thanks!
Fernando

Re: HIVE: java.lang.ArrayIndexOutOfBoundsException: 2 during JOIN

Posted by Fernando Andrés Doglio Turissini <fe...@globant.com>.
BTW, one of those tables is partitioned and the other one isn't.

I don't know if that makes any difference.

Fernando


Re: HIVE: java.lang.ArrayIndexOutOfBoundsException: 2 during JOIN

Posted by Fernando Andrés Doglio Turissini <fe...@globant.com>.
Sorry about that; I'm using the columnar SerDe on both tables. Do you need
anything else?
I don't have the CREATE TABLE statements for them, so I can't share that
particular code.
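If it helps, I can try pulling the storage details out of the metastore instead; I believe these commands do that, though I'm not certain which of them the Hive version on our EMR cluster supports:

```sql
-- Reports the columns plus the SerDe and storage format of the table:
DESCRIBE FORMATTED data_table;
-- On newer Hive versions, prints a reconstructed CREATE TABLE statement:
SHOW CREATE TABLE data_table;
```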


Re: HIVE: java.lang.ArrayIndexOutOfBoundsException: 2 during JOIN

Posted by Mark Grover <gr...@gmail.com>.
I was more interested in knowing whether you were using any particular SerDes.
You don't have to list out the columns; just the skeleton CREATE TABLE
statement should do.


Re: HIVE: java.lang.ArrayIndexOutOfBoundsException: 2 during JOIN

Posted by Fernando Andrés Doglio Turissini <fe...@globant.com>.
The "data_table" has around 5k fields, all doubles.
As for the "age_mean" table, here it is:

hive> desc age_mean;
OK
id        string
name      string
age_mean  double
Time taken: 0.127 seconds
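In case the shape of the DDL matters, a skeleton consistent with that output would presumably look like the following. This is hypothetical, since I don't have the real CREATE TABLE; the RCFile/ColumnarSerDe storage clause is a guess based on our setup:

```sql
-- Hypothetical skeleton reconstructed from `desc age_mean`; the storage
-- clause assumes RCFile with the columnar SerDe, not the actual DDL.
CREATE TABLE age_mean (
  id       STRING,
  name     STRING,
  age_mean DOUBLE
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
STORED AS RCFILE;
```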

Does this help?

Thanks!
Fernando


Re: HIVE: java.lang.ArrayIndexOutOfBoundsException: 2 during JOIN

Posted by Mark Grover <gr...@gmail.com>.
Fernando,
Could you share your table definitions as well please?
