Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/22 15:03:00 UTC
[jira] [Work logged] (HIVE-20771) LazyBinarySerDe fails on empty structs.
[ https://issues.apache.org/jira/browse/HIVE-20771?focusedWorklogId=462111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-462111 ]
ASF GitHub Bot logged work on HIVE-20771:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Jul/20 15:02
Start Date: 22/Jul/20 15:02
Worklog Time Spent: 10m
Work Description: HunterL opened a new pull request #1298:
URL: https://github.com/apache/hive/pull/1298
Copy of PR 450
https://github.com/apache/hive/pull/450
CC @belugabehr
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Issue Time Tracking
-------------------
Worklog Id: (was: 462111)
Time Spent: 1h 10m (was: 1h)
> LazyBinarySerDe fails on empty structs.
> ---------------------------------------
>
> Key: HIVE-20771
> URL: https://issues.apache.org/jira/browse/HIVE-20771
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Affects Versions: 1.2.2, 2.3.2, 3.1.0
> Reporter: Clemens Valiente
> Assignee: Clemens Valiente
> Priority: Minor
> Labels: pull-request-available
> Attachments: HIVE-20771.patch
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> {code:sql}
> CREATE TABLE cvaliente.structtest AS
> SELECT named_struct();
> SHOW CREATE TABLE cvaliente.structtest;
> SELECT * FROM cvaliente.structtest ORDER BY rand();
> {code}
> The resulting schema is:
> {code:sql}
> CREATE TABLE `cvaliente.structtest`(
> `_c0` struct<>)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
> 'hdfs://nameservice1/user/cvaliente/cvaliente/structtest2'
> TBLPROPERTIES (
> 'COLUMN_STATS_ACCURATE'='true',
> 'numFiles'='1',
> 'numRows'='1',
> 'rawDataSize'='0',
> 'totalSize'='1',
> 'transient_lastDdlTime'='1539781607');
> {code}
> Between the map and reduce phases, Hive serializes the row to a LazyBinaryStruct. When it tries to read the same object back, the {{SELECT}} query above fails:
> {code}
> 2018-10-17 14:32:02,298 [FATAL] [TezChild] |tez.ReduceRecordSource|: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":0.13508293503238622},"value":{"_col0":{}}}
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:338)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:259)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:169)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating VALUE._col0
> at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:329)
> ... 17 more
> Caused by: java.lang.RuntimeException: length should be positive!
> at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryNonPrimitive.init(LazyBinaryNonPrimitive.java:54)
> at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.init(LazyBinaryStruct.java:95)
> at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
> at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
> at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
> at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:98)
> at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
> at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
> at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
> ... 18 more
> {code}
> This happens because {{LazyBinaryNonPrimitive.init}} rejects any value whose length is not positive, so an empty struct, which serializes to zero bytes, cannot be read back. See https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryNonPrimitive.java#L53
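The failure mode described above can be sketched in isolation. The following is a simplified model, not the Hive source: the class and method names are hypothetical, the strict check is assumed to be `length <= 0` based on the "length should be positive!" message in the stack trace, and the lenient variant is only one possible relaxation, not necessarily what the attached patch or the pull request does.

```java
public final class EmptyStructLengthCheck {

    // Models the check implied by the stack trace: a zero-length value,
    // such as a serialized empty struct, is rejected. (Assumed condition.)
    static void strictInit(int length) {
        if (length <= 0) {
            throw new RuntimeException("length should be positive!");
        }
    }

    // Hypothetical relaxation (an assumption, not the actual patch):
    // zero is allowed so an empty struct can be read back,
    // while negative lengths are still rejected.
    static void lenientInit(int length) {
        if (length < 0) {
            throw new RuntimeException("length should be non-negative!");
        }
    }

    // Returns true if the strict check accepts the given length.
    static boolean strictAccepts(int length) {
        try {
            strictInit(length);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    // Returns true if the lenient check accepts the given length.
    static boolean lenientAccepts(int length) {
        try {
            lenientInit(length);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("strict accepts 0:  " + strictAccepts(0));
        System.out.println("lenient accepts 0: " + lenientAccepts(0));
    }
}
```

In this model the only difference between the two variants is whether zero is a legal length. Any real fix would also have to make sure that downstream field parsing copes with a zero-byte struct.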