Posted to issues@hive.apache.org by "Hao Zhu (JIRA)" <ji...@apache.org> on 2015/08/12 02:53:46 UTC

[jira] [Commented] (HIVE-11532) UDTF failed with "org.apache.hadoop.hive.serde2.lazy.LazyString cannot be cast to org.apache.hadoop.io.Text"

    [ https://issues.apache.org/jira/browse/HIVE-11532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692677#comment-14692677 ] 

Hao Zhu commented on HIVE-11532:
--------------------------------

I further narrowed down the issue by adding the following debug code to the process() function:
{code}
for (int i = 1; i < args.length; i++) {
    PrimitiveObjectInspector tmpExtraColumnOI =
            (PrimitiveObjectInspector) extraColumnList.get(i - 1);
    System.out.println("DEBUG: class for tmpExtraColumnOI is " + tmpExtraColumnOI.getClass().getName());
    System.out.println("DEBUG: class for args[i] is " + args[i].getClass().getName());
    String extraColumnString = (String) tmpExtraColumnOI.getPrimitiveJavaObject(args[i]);
    outputColumnAddList.add(extraColumnString);
}
{code}

In Hive 0.12 (working fine):
{code}
DEBUG: class for tmpExtraColumnOI is org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyStringObjectInspector
DEBUG: class for args[i] is org.apache.hadoop.hive.serde2.lazy.LazyString
{code}

In Hive 0.13 (failing):
{code}
DEBUG: class for tmpExtraColumnOI is org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector
DEBUG: class for args[i] is org.apache.hadoop.hive.serde2.lazy.LazyString
{code}

Is there a Hive code change between 0.12 and 0.13 that could switch the object inspector from LazyStringObjectInspector to WritableStringObjectInspector?
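From the debug output above, the object inspector handed to the UDTF is a WritableStringObjectInspector in 0.13, while the row objects at run time are still LazyString, so getPrimitiveJavaObject() casts a LazyString to Text and throws. The self-contained sketch below reproduces that mechanism with hypothetical stand-in classes (FakeLazyString, FakeText, and the two inspector constants are illustrations, not the real Hive classes). A UDTF that wants to be robust to this would extract values through something like PrimitiveObjectInspectorUtils.getString(args[i], oi), which dispatches on the inspector actually passed to initialize() rather than assuming a concrete object class.

```java
// Minimal sketch of the failure mode in this ticket.
// All class names below are hypothetical stand-ins, NOT real Hive classes.
public class InspectorMismatchDemo {

    // Stand-in for org.apache.hadoop.hive.serde2.lazy.LazyString
    static final class FakeLazyString {
        private final String value;
        FakeLazyString(String value) { this.value = value; }
        String materialize() { return value; }
    }

    // Stand-in for org.apache.hadoop.io.Text
    static final class FakeText {
        private final String value;
        FakeText(String value) { this.value = value; }
        @Override public String toString() { return value; }
    }

    interface StringObjectInspector {
        String getPrimitiveJavaObject(Object o);
    }

    // Stand-in for LazyStringObjectInspector: expects a lazy object
    static final StringObjectInspector LAZY_OI =
            o -> ((FakeLazyString) o).materialize();

    // Stand-in for WritableStringObjectInspector: blindly casts to the
    // writable type, just as the real one casts its argument to Text
    static final StringObjectInspector WRITABLE_OI =
            o -> ((FakeText) o).toString();

    public static void main(String[] args) {
        // The runtime still hands the UDTF a lazy object...
        Object row = new FakeLazyString("abc");

        // Hive 0.12 behavior: inspector and object agree
        System.out.println(LAZY_OI.getPrimitiveJavaObject(row)); // prints "abc"

        // Hive 0.13 behavior seen in this ticket: inspector expects the
        // writable type, so the cast fails exactly like the stack trace shows
        try {
            WRITABLE_OI.getPrimitiveJavaObject(row);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the stack trace");
        }
    }
}
```

The takeaway is the same either way: hard-coding an assumption about which concrete class (lazy vs. writable) the runtime will deliver is what breaks across versions; going through the inspector's own accessor keeps the UDTF version-independent.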


> UDTF failed with "org.apache.hadoop.hive.serde2.lazy.LazyString cannot be cast to org.apache.hadoop.io.Text"
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-11532
>                 URL: https://issues.apache.org/jira/browse/HIVE-11532
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13
>            Reporter: Hao Zhu
>
> This Hive UDTF works fine in Hive 0.12 but fails starting in Hive 0.13 with the stack trace below:
> {code}
> Task with the most failures(4):
> -----
> Task ID:
>   task_1436218099233_0158_m_000000
> -----
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"a":"abc","b":"xyz"}
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"a":"abc","b":"xyz"}
> 	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
> 	... 8 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.LazyString cannot be cast to org.apache.hadoop.io.Text
> 	at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:46)
> 	at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:26)
> 	at openkb.hive.udtf.DoubleColumn.process(DoubleColumn.java:64)
> 	at org.apache.hadoop.hive.ql.exec.UDTFOperator.processOp(UDTFOperator.java:107)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
> 	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
> 	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
> 	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
> 	... 9 more
> {code}
> Basically, this sample UDTF just duplicates the first input column.
> It works fine in Hive 0.12:
> {code}
> select * from testudtf;
> abc	xyz
> ADD JAR ~/target/DoubleColumn-1.0.0.jar;
> CREATE TEMPORARY FUNCTION double_column AS 'openkb.hive.udtf.DoubleColumn'; 
> SELECT double_column(a,b) as (a1,a2,b) FROM testudtf;
> abc	abc	xyz
> {code}
> The source code is here:
> https://github.com/viadea/HiveUDTF
> Is there any change between Hive 0.12 and 0.13 that may cause this to fail?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)