Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/22 15:03:00 UTC

[jira] [Work logged] (HIVE-20771) LazyBinarySerDe fails on empty structs.

     [ https://issues.apache.org/jira/browse/HIVE-20771?focusedWorklogId=462111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-462111 ]

ASF GitHub Bot logged work on HIVE-20771:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jul/20 15:02
            Start Date: 22/Jul/20 15:02
    Worklog Time Spent: 10m 
      Work Description: HunterL opened a new pull request #1298:
URL: https://github.com/apache/hive/pull/1298


   Copy of PR 450
   
   https://github.com/apache/hive/pull/450
   
   CC @belugabehr 



Issue Time Tracking
-------------------

    Worklog Id:     (was: 462111)
    Time Spent: 1h 10m  (was: 1h)

> LazyBinarySerDe fails on empty structs.
> ---------------------------------------
>
>                 Key: HIVE-20771
>                 URL: https://issues.apache.org/jira/browse/HIVE-20771
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 1.2.2, 2.3.2, 3.1.0
>            Reporter: Clemens Valiente
>            Assignee: Clemens Valiente
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HIVE-20771.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> {code:sql}
> CREATE TABLE cvaliente.structtest AS
> SELECT named_struct();
> SHOW CREATE TABLE cvaliente.structtest;
> SELECT * FROM cvaliente.structtest ORDER BY rand();
> {code}
> The resulting schema is:
> {code:sql}
> CREATE TABLE `cvaliente.structtest`(
>   `_c0` struct<>)
> ROW FORMAT SERDE 
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> STORED AS INPUTFORMAT 
>   'org.apache.hadoop.mapred.TextInputFormat' 
> OUTPUTFORMAT 
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://nameservice1/user/cvaliente/cvaliente/structtest2'
> TBLPROPERTIES (
>   'COLUMN_STATS_ACCURATE'='true', 
>   'numFiles'='1', 
>   'numRows'='1', 
>   'rawDataSize'='0', 
>   'totalSize'='1', 
>   'transient_lastDdlTime'='1539781607');
> {code}
> Between the map and reduce phases, Hive serializes the row to a LazyBinaryStruct. When the reducer tries to read the same object back, the {{SELECT}} query above fails:
> {code}
> 2018-10-17 14:32:02,298 [FATAL] [TezChild] |tez.ReduceRecordSource|: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":0.13508293503238622},"value":{"_col0":{}}}
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:338)
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:259)
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:169)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
> 	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> 	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating VALUE._col0
> 	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:329)
> 	... 17 more
> Caused by: java.lang.RuntimeException: length should be positive!
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryNonPrimitive.init(LazyBinaryNonPrimitive.java:54)
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.init(LazyBinaryStruct.java:95)
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
> 	at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
> 	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:98)
> 	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
> 	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
> 	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
> 	... 18 more
> {code}
> This happens because LazyBinaryNonPrimitive requires a positive length and therefore rejects empty structs: https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryNonPrimitive.java#L53
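
The failure mode described above can be illustrated with a small standalone sketch. This is not Hive's code: the class EmptyStructInitSketch and both methods are hypothetical stand-ins, and the sketch assumes that an empty struct reaches the reader with a length of 0, as the "length should be positive!" message suggests. The strict variant mimics the reported check; the lenient variant shows one possible relaxation that rejects only negative lengths. Whether the linked pull request takes this exact approach is not stated in this message.

```java
// Hypothetical stand-in for the length check discussed in HIVE-20771.
// It does not use or reproduce any Hive class.
final class EmptyStructInitSketch {

    /** Mimics the reported behavior: a length of 0 is rejected. */
    static int strictInit(byte[] bytes, int start, int length) {
        if (bytes == null) {
            throw new IllegalArgumentException("bytes cannot be null!");
        }
        if (length <= 0) {
            throw new RuntimeException("length should be positive!");
        }
        return length;
    }

    /** One possible relaxation (an assumption): only negative lengths are rejected. */
    static int lenientInit(byte[] bytes, int start, int length) {
        if (bytes == null) {
            throw new IllegalArgumentException("bytes cannot be null!");
        }
        if (length < 0) {
            throw new RuntimeException("length should be non-negative!");
        }
        if (start < 0 || start + length > bytes.length) {
            throw new IndexOutOfBoundsException("range exceeds buffer");
        }
        return length;
    }

    public static void main(String[] args) {
        // Assumption: struct<> serializes to zero bytes.
        byte[] emptyStruct = new byte[0];

        try {
            strictInit(emptyStruct, 0, 0);
            System.out.println("strict: accepted");
        } catch (RuntimeException e) {
            System.out.println("strict: rejected (" + e.getMessage() + ")");
        }

        int accepted = lenientInit(emptyStruct, 0, 0);
        System.out.println("lenient: accepted length " + accepted);
    }
}
```

With a check like the lenient one, a struct<> value would pass initialization, while a null buffer or a negative length would still be rejected.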