Posted to dev@hive.apache.org by "SeanM (JIRA)" <ji...@apache.org> on 2010/01/26 08:50:34 UTC

[jira] Commented: (HIVE-758) function to load data from hive to hbase

    [ https://issues.apache.org/jira/browse/HIVE-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804938#action_12804938 ] 

SeanM commented on HIVE-758:
----------------------------

This UDAF works well, but I've encountered two gotchas:

*Nested queries with a WHERE clause that filters out records throw an exception, even if the table contains no null values at all*
{noformat} 
Hive> SELECT hbase_put("test", rowid, "data", colfamily, value, 0) FROM ( SELECT * FROM some_table WHERE value = "some_value") t1;

java.lang.RuntimeException: Error while closing operators
	at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:232)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public boolean org.apache.hadoop.hive.contrib.udaf.hbase.UDAFHbasePut$UDAFHbasePutEvaluator.iterate(java.lang.String,java.lang.String,java.lang.String,java.lang.String,java.lang.String,int)  on object org.apache.hadoop.hive.contrib.udaf.hbase.UDAFHbasePut$UDAFHbasePutEvaluator@70e35d5 of class org.apache.hadoop.hive.contrib.udaf.hbase.UDAFHbasePut$UDAFHbasePutEvaluator with arguments {null, null, null, null, null, null} of size 6
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:799)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:462)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:470)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:470)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:470)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:470)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:470)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:470)
	at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:211)
	... 4 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public boolean org.apache.hadoop.hive.contrib.udaf.hbase.UDAFHbasePut$UDAFHbasePutEvaluator.iterate(java.lang.String,java.lang.String,java.lang.String,java.lang.String,java.lang.String,int)  on object org.apache.hadoop.hive.contrib.udaf.hbase.UDAFHbasePut$UDAFHbasePutEvaluator@70e35d5 of class org.apache.hadoop.hive.contrib.udaf.hbase.UDAFHbasePut$UDAFHbasePutEvaluator with arguments {null, null, null, null, null, null} of size 6
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:661)
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge$GenericUDAFBridgeEvaluator.iterate(GenericUDAFBridge.java:167)
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:110)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:768)
	... 12 more
Caused by: java.lang.IllegalArgumentException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:638)
	... 15 more
{noformat}

It's just a hunch, but it looks like the UDAF's iterate() is being called with null values for rows that were filtered out.
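
One plausible reading of the innermost IllegalArgumentException: the iterate() signature in the trace ends in a primitive int, and Java reflection cannot unbox a null into a primitive, so Method.invoke fails before the method body runs. The sketch below reproduces that behaviour in isolation. NullArgDemo, iteratePrimitive and iterateBoxed are hypothetical names made up for illustration; they are not part of the attached patches, and whether the UDAF should switch to a boxed Integer is an assumption, not a confirmed fix.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the evaluator; NOT the real UDAFHbasePutEvaluator.
class NullArgDemo {

    // Mirrors the signature in the trace: the last parameter is a primitive int.
    public boolean iteratePrimitive(String table, String rowId, String family,
                                    String qualifier, String value, int ts) {
        return true;
    }

    // Assumed alternative: a boxed Integer lets a null reach the method body,
    // where it can be detected and skipped.
    public boolean iterateBoxed(String table, String rowId, String family,
                                String qualifier, String value, Integer ts) {
        if (table == null || rowId == null || ts == null) {
            return false; // nothing to write for an all-null row
        }
        return true;
    }

    // Invokes the named method reflectively with six null arguments and
    // returns either the result as text or the name of the exception thrown.
    static String invokeWithNulls(String methodName, Class<?> tsType) {
        try {
            Method m = NullArgDemo.class.getMethod(methodName,
                    String.class, String.class, String.class,
                    String.class, String.class, tsType);
            Object[] args = new Object[6]; // six nulls, as in the trace
            return String.valueOf(m.invoke(new NullArgDemo(), args));
        } catch (IllegalArgumentException e) {
            // Thrown by Method.invoke when null is passed for a primitive parameter.
            return "IllegalArgumentException";
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] unused) {
        System.out.println(invokeWithNulls("iteratePrimitive", int.class));  // IllegalArgumentException
        System.out.println(invokeWithNulls("iterateBoxed", Integer.class));  // false
    }
}
```

If this is what is happening, the exception comes from the all-null call itself (made from GroupByOperator.closeOp in the trace), not from any null in the table data.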


*Null values*
The UDAF is very sensitive to null values. If an argument is a map or array type, or comes from any field that may be null, wrap it in an IF expression for safety:
{noformat}
if (some_field IS NULL, "", some_field)
{noformat}




> function to load data from hive to hbase
> ----------------------------------------
>
>                 Key: HIVE-758
>                 URL: https://issues.apache.org/jira/browse/HIVE-758
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Contrib
>            Reporter: Raghotham Murthy
>            Priority: Minor
>         Attachments: hive-758.1.patch, hive-758.2.patch
>
>
> support a query like: SELECT hbase_put('hive_hbase_table', rowid, colfamily, col, value, ts) FROM src;
