Posted to user@hive.apache.org by Jason Shih <hl...@gmail.com> on 2013/03/27 21:25:50 UTC

hive hbase storage handler fail

Hi all,


I am trying to insert data into a Hive table stored via the HBase storage handler, but the job fails with the exception below, which appears at the end of the MapReduce stage output.

However, we have no problem using HBaseStorageHandler if we force the job to run on YARN rather than MapReduce. (hive: 0.9.0, hbase: 0.92.1).

Could an expert shed some light on how to tweak the settings so that the mapper tasks can insert data through HBaseStorageHandler? Thanks.


Here is the simple Hive script I tried:

CREATE TABLE dest(num string, name string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name")
TBLPROPERTIES ("hbase.table.name" = "dest");


and the exception: 

...
2013-03-28 03:49:41,107 Stage-0 map = 0%,  reduce = 0%, Cumulative CPU 0.67 sec
2013-03-28 03:49:42,125 Stage-0 map = 0%,  reduce = 0%, Cumulative CPU 0.67 sec
2013-03-28 03:49:43,143 Stage-0 map = 0%,  reduce = 0%, Cumulative CPU 0.67 sec
2013-03-28 03:49:44,153 Stage-0 map = 0%,  reduce = 0%
2013-03-28 03:49:47,184 Stage-0 map = 100%,  reduce = 100%
MapReduce Total cumulative CPU time: 670 msec
Ended Job = job_201303251402_0005 with errors
Error during job, obtaining debugging information...
Examining task ID: task_201303251402_0005_m_000002 (and more) from job job_201303251402_0005
Exception in thread "Thread-35" java.lang.NullPointerException
       at org.apache.hadoop.hive.shims.Hadoop23Shims.getTaskAttemptLogUrl(Hadoop23Shims.java:44)
       at org.apache.hadoop.hive.ql.exec.JobDebugger$TaskInfoGrabber.getTaskInfos(JobDebugger.java:186)
       at org.apache.hadoop.hive.ql.exec.JobDebugger$TaskInfoGrabber.run(JobDebugger.java:142)
       at java.lang.Thread.run(Thread.java:662)
               No encryption was performed by peer.
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 1   Cumulative CPU: 0.67 sec   HDFS Read: 0 HDFS Write: 0 FAIL



Cheers,
Jason

hive.limit.optimize.fetch.max

Posted by Sanjay Subramanian <Sa...@wizecommerce.com>.

Hi,
I have the following settings in hive-site.xml:
<property>
  <name>hive.limit.row.max.size</name>
  <value>10</value>
</property>

<property>
  <name>hive.limit.optimize.enable</name>
  <value>true</value>
</property>

<property>
  <name>hive.limit.optimize.fetch.max</name>
  <value>11</value>
</property>

When I run a SELECT query with a WHERE clause, the results are not
limited to 10 rows.

How do you limit SELECT query results to 10 rows?

My end goal is to create a GROUP in Hive in production that allows a
maximum of 10 rows to be returned, because I don't want business analysts
going SQL happy on Beeswax and bringing my server down with heavy-duty
queries.

Thanks
sanjay


CONFIDENTIALITY NOTICE
======================
This email message and any attachments are for the exclusive use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message along with any attachments, from your computer system. If you are the intended recipient, please be advised that the content of this message is subject to access, review and disclosure by the sender's Email System Administrator.

Re: hive hbase storage handler fail

Posted by Jason Shih <hl...@gmail.com>.

Hi Sanjay,


Thanks for the info. Indeed, I had raised the log level earlier to avoid dumping too much information into the application log. After lowering it to INFO, I get much the same output and the same exception. I am also testing with CDH 4.1.2, except that I am running MapReduce rather than YARN. The exception is observed only with MapReduce (MR1); with YARN, I can successfully insert the data into HBase through the mapper tasks launched by the insert statement.


I also forgot to paste the log observed on the datanodes. Here is the stack trace of the java.io.IOException from the datanode log file:

---
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:522)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:373)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.Child.main



Cheers,
Jason

On Mar 28, 2013, at 6:18 AM, Sanjay Subramanian wrote:

> If you can run your hive insert data script with debug option u may get
> some clues
> 
> /usr/lib/hive/bin/hive -hiveconf hive.root.logger=INFO,console -e "insert
> into dest select * from some_table_same_structure_as_dest limit 10;"
> 
> I created a small demo usecase and this is failing for me as well
> The error I get is
> Unable to retrieve URL for Hadoop Task logs. port out of range:999999
> 13/03/27 14:02:32 ERROR exec.Task: Unable to retrieve URL for Hadoop Task
> logs. port out of range:999999
> 
> I am using ClouderaManager 4.1.2 Hadoop, Yarn, HDFS, Oozie, Hive and Hue
> 
> sanjay
> 
> 

Re: hive hbase storage handler fail

Posted by Sanjay Subramanian <Sa...@wizecommerce.com>.

If you run your Hive insert script with the debug option, you may get
some clues:

/usr/lib/hive/bin/hive -hiveconf hive.root.logger=INFO,console -e "insert
into dest select * from some_table_same_structure_as_dest limit 10;"

I created a small demo use case, and it fails for me as well.
The error I get is:
Unable to retrieve URL for Hadoop Task logs. port out of range:999999
13/03/27 14:02:32 ERROR exec.Task: Unable to retrieve URL for Hadoop Task
logs. port out of range:999999

I am using Cloudera Manager 4.1.2 with Hadoop, YARN, HDFS, Oozie, Hive and Hue.

sanjay
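
A side note on the "port out of range:999999" text above: valid TCP ports lie between 0 and 65535, and the wording matches the message that Java's java.net.InetSocketAddress puts in the IllegalArgumentException it throws for a port outside that range. The sketch below only demonstrates that JDK behaviour. It does not show where Hive obtains the value 999999, which this thread does not establish. The class name PortRangeDemo, the helper tryPort and the example port 50060 are illustrative assumptions, not taken from the thread.

```java
import java.net.InetSocketAddress;

public class PortRangeDemo {
    // Hypothetical helper: returns the exception message for a rejected port,
    // or null when the port is accepted.
    public static String tryPort(int port) {
        try {
            // createUnresolved performs no DNS lookup; it only validates
            // the host (non-null) and the port range (0..65535).
            InetSocketAddress.createUnresolved("localhost", port);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // 50060 is just an example of an in-range port.
        System.out.println("50060  -> " + tryPort(50060));
        // 999999 is the value reported in the error above.
        System.out.println("999999 -> " + tryPort(999999));
    }
}
```

If the error does originate from such a check, the value 999999 may be a placeholder or a wrongly parsed address rather than a real port, so the configured task log address would be one place to look.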


