Posted to user@hive.apache.org by Tom Nichols <tm...@gmail.com> on 2010/03/04 21:34:55 UTC

All Map jobs fail with NPE in LazyStruct.uncheckedGetField

I am trying out Hive, using Cloudera's EC2 distribution (Hadoop
0.18.3, Hive 0.4.1, I believe).

I'm trying to run the query below, which causes every map task to
fail with the following NPE before making any progress:

java.lang.NullPointerException
	at org.apache.hadoop.hive.serde2.lazy.LazyStruct.uncheckedGetField(LazyStruct.java:205)
	at org.apache.hadoop.hive.serde2.lazy.LazyStruct.getField(LazyStruct.java:182)
	at org.apache.hadoop.hive.serde2.objectinspector.LazySimpleStructObjectInspector.getStructFieldData(LazySimpleStructObjectInspector.java:141)
	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.evaluate(ExprNodeColumnEvaluator.java:53)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:74)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:332)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:49)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:332)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:175)
	at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:71)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)


The query:
-- Get the node's max price and the corresponding day/hour
select isone.node_id, isone.day, isone.hour, isone.lmp
from (select max(lmp) as mlmp, node_id
    from isone_lmp
    where isone_lmp.node_id = 400
    group by node_id) maxlmp
join isone_lmp isone on ( isone.node_id = maxlmp.node_id
  and isone.lmp=maxlmp.mlmp );

The table:
CREATE TABLE isone_lmp (
  node_id int,
  day string,
  hour int,
  minute int,
  energy float,
  congestion float,
  loss float,
  lmp float
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

The data looks like the following:
396,20090120,00,00,62.77,0,.78,63.55
397,20090120,00,00,62.77,0,.65,63.42
398,20090120,00,00,62.77,0,.65,63.42
399,20090120,00,00,62.77,0,.65,63.42
400,20090120,00,00,62.77,0,.65,63.42
401,20090120,00,00,62.77,0,-1.02,61.75
405,20090120,00,00,62.77,0,.21,62.98

It's about 15GB of data total; I can do a simple "select count(1) from
isone_lmp;" which executes as expected.  Any thoughts?  I've been able
to execute the same query on a smaller subset of data (2M rows as
opposed to 500M) on a non-distributed setup locally.

Thanks.
-Tom
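
The shape of the query above (a per-node maximum joined back to the base table) can be exercised locally on the seven sample rows. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in, SQLite is not Hive, and it says nothing about the NPE itself. The helper name max_price_rows is made up for illustration.

```python
import sqlite3

# Sample rows copied from the post. SQLite is a local stand-in, not Hive.
ROWS = [
    (396, "20090120", 0, 0, 62.77, 0.0, 0.78, 63.55),
    (397, "20090120", 0, 0, 62.77, 0.0, 0.65, 63.42),
    (398, "20090120", 0, 0, 62.77, 0.0, 0.65, 63.42),
    (399, "20090120", 0, 0, 62.77, 0.0, 0.65, 63.42),
    (400, "20090120", 0, 0, 62.77, 0.0, 0.65, 63.42),
    (401, "20090120", 0, 0, 62.77, 0.0, -1.02, 61.75),
    (405, "20090120", 0, 0, 62.77, 0.0, 0.21, 62.98),
]

# Same query as in the post, only re-indented.
QUERY = """
SELECT isone.node_id, isone.day, isone.hour, isone.lmp
FROM (SELECT MAX(lmp) AS mlmp, node_id
      FROM isone_lmp
      WHERE isone_lmp.node_id = 400
      GROUP BY node_id) maxlmp
JOIN isone_lmp isone
  ON (isone.node_id = maxlmp.node_id AND isone.lmp = maxlmp.mlmp)
"""


def max_price_rows(rows):
    """Load rows into an in-memory table and return the max-price rows for node 400."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE isone_lmp (
               node_id INTEGER, day TEXT, hour INTEGER, minute INTEGER,
               energy REAL, congestion REAL, loss REAL, lmp REAL)"""
    )
    conn.executemany("INSERT INTO isone_lmp VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(max_price_rows(ROWS))
```

Note that the join compares floating-point prices for equality. That works here because the maximum is one of the stored values, but it is worth keeping in mind if prices are ever recomputed.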

Re: All Map jobs fail with NPE in LazyStruct.uncheckedGetField

Posted by Tom Nichols <tm...@gmail.com>.
Just a follow-up here -- when I upgraded to Hive 0.5 everything
worked...  Thanks again for the help.

On Fri, Mar 5, 2010 at 5:04 AM, Zheng Shao <zs...@gmail.com> wrote:
> Do you want to try hive release 0.5.0 or hive trunk?
> We should have provided better error messages here:
> https://issues.apache.org/jira/browse/HIVE-1216
>
> Zheng
>

Re: All Map jobs fail with NPE in LazyStruct.uncheckedGetField

Posted by Zheng Shao <zs...@gmail.com>.
Do you want to try Hive release 0.5.0 or Hive trunk?
We should have provided better error messages here:
https://issues.apache.org/jira/browse/HIVE-1216

Zheng


-- 
Yours,
Zheng

Re: CTAS- Hive 0.5.0

Posted by Sonal Goyal <so...@gmail.com>.
Sanjay,

I use the following:

create table products_bought (....);
insert overwrite table products_bought select ... from tableMaster;


Thanks and Regards,
Sonal
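
The two-step pattern above (declare the table, then fill it from a SELECT) can be tried out locally as a rough sketch. Assumptions: Python's built-in sqlite3 module stands in for Hive, the columns product, qty and total_qty are invented because the post elides the real ones, and SQLite has no INSERT OVERWRITE, so the sketch clears the table before inserting to approximate it.

```python
import sqlite3


def build_products_bought(master_rows):
    """Create products_bought with an explicit schema, then populate it from a SELECT."""
    conn = sqlite3.connect(":memory:")

    # Hypothetical source table and columns; the post does not show the real ones.
    conn.execute("CREATE TABLE tableMaster (product TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO tableMaster VALUES (?, ?)", master_rows)

    # Step 1: declare the target schema explicitly.
    conn.execute("CREATE TABLE products_bought (product TEXT, total_qty INTEGER)")

    # Step 2: replace the table contents with the result of a SELECT.
    # DELETE followed by INSERT approximates Hive's INSERT OVERWRITE here.
    with conn:
        conn.execute("DELETE FROM products_bought")
        conn.execute(
            "INSERT INTO products_bought "
            "SELECT product, SUM(qty) FROM tableMaster GROUP BY product"
        )

    rows = conn.execute(
        "SELECT product, total_qty FROM products_bought ORDER BY product"
    ).fetchall()
    conn.close()
    return rows
```

The trade-off compared with CREATE TABLE ... AS SELECT is that the column names and types are written out by hand, so they do not depend on the aliases in the select clause.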


On Fri, Mar 5, 2010 at 7:36 PM, Sanjay Sharma <sa...@impetus.co.in> wrote:

> Hi,
> I am trying to get CREATE TABLE ... AS SELECT working in Hive 0.5.0 but
> am still getting the {mismatched input 'AS' expecting EOF} error.
>
> The HIVE-31 patch seems to be present in Hive 0.5.0, so it might have
> something to do with the syntax.
>
> Any suggestions on what the correct syntax is, or whether it is supposed
> to work in 0.5?
>
>
> Regards,
> Sanjay
>
> Impetus Technologies is participating at the CTIA Wireless 2010 from 23rd
> to 25th March 2010. Meet Impetus in Las Vegas to experience our mobile and
> wireless domain expertise. Click http://impetus.com/events to know more.
>
> Follow our updates on www.twitter.com/impetuscalling.
>
> NOTE: This message may contain information that is confidential,
> proprietary, privileged or otherwise protected by law. The message is
> intended solely for the named addressee. If received in error, please
> destroy and notify the sender. Any use of this email is prohibited when
> received in error. Impetus does not represent, warrant and/or guarantee,
> that the integrity of this communication has been maintained nor that the
> communication is free of errors, virus, interception or interference.
>

RE: CTAS- Hive 0.5.0

Posted by Sanjay Sharma <sa...@impetus.co.in>.
Thanks Ning.
It works now. I had to restart the Hadoop cluster to get it running, though, and I could not understand why.

Regards,
Sanjay Sharma
Impetus
t: +91-120-4363300 Extn 2761

-----Original Message-----
From: Ning Zhang [mailto:nzhang@facebook.com]
Sent: Friday, March 05, 2010 10:08 PM
To: hive-user@hadoop.apache.org
Subject: Re: CTAS- Hive 0.5.0

Can you post your query? It should work like

create table T as select a, b+1 b1, c*2 c2 from S where ...


You don't need to specify the schema of T because it is derived from the select clause. T's column names are the same as the alias names in the select clause.

Thanks,
Ning


Re: CTAS- Hive 0.5.0

Posted by Ning Zhang <nz...@facebook.com>.
Can you post your query? It should work like 

create table T as select a, b+1 b1, c*2 c2 from S where ...


You don't need to specify the schema of T because it is derived from the select clause. T's column names are the same as the alias names in the select clause.

Thanks,
Ning
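
The point that T's columns come from the select clause can be checked with a small local sketch. Assumptions: Python's built-in sqlite3 module is used as a stand-in (SQLite is not Hive, though it accepts the same CREATE TABLE ... AS SELECT form), the single sample row is made up, and the aliases are written with an explicit AS for clarity.

```python
import sqlite3


def ctas_column_names():
    """Create T from a SELECT over S and report T's column names and rows."""
    conn = sqlite3.connect(":memory:")

    # S and its sample row are made up for illustration.
    conn.execute("CREATE TABLE S (a INTEGER, b INTEGER, c INTEGER)")
    conn.execute("INSERT INTO S VALUES (1, 2, 3)")

    # No schema is declared for T; it is derived from the select clause.
    conn.execute("CREATE TABLE T AS SELECT a, b + 1 AS b1, c * 2 AS c2 FROM S")

    cursor = conn.execute("SELECT * FROM T")
    names = [column[0] for column in cursor.description]
    rows = cursor.fetchall()
    conn.close()
    return names, rows


if __name__ == "__main__":
    print(ctas_column_names())
```

In this stand-in, the unaliased column keeps its source name (a) and the computed columns take their aliases (b1, c2), which matches the behaviour described above.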



CTAS- Hive 0.5.0

Posted by Sanjay Sharma <sa...@impetus.co.in>.
Hi,
I am trying to get CREATE TABLE ... AS SELECT working in Hive 0.5.0 but am still getting the {mismatched input 'AS' expecting EOF} error.

The HIVE-31 patch seems to be present in Hive 0.5.0, so it might have something to do with the syntax.

Any suggestions on what the correct syntax is, or whether it is supposed to work in 0.5?


Regards,
Sanjay
