Posted to user@hive.apache.org by Ch Wan <xm...@gmail.com> on 2014/11/15 18:23:36 UTC
union all with null value gives exception
Hi all, I'm using Hive 0.13.1 and have run into a problem. Here are the tables
and the query.
-- each table contains only one row
create table test_union_all_1(a double);
create table test_union_all_2(a double);
create table test_union_all_3(a double);
select max(a) from (
select NULL a from test_union_all_1
union all
select NULL a from test_union_all_2
union all
select a a from test_union_all_3
)t;
Then I got the following exception:
Error: java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error
while processing row (tag=0) {"key":{},"value":{"_col0":0.0}}
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:462)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive
Runtime Error while processing row (tag=0)
{"key":{},"value":{"_col0":0.0}}
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:271)
... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.ArrayIndexOutOfBoundsException: 7
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:262)
... 7 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.byteArrayToLong(LazyBinaryUtils.java:81)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryDouble.init(LazyBinaryDouble.java:43)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:216)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:197)
at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:98)
at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:616)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:905)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:737)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
... 8 more
Here is the EXPLAIN output for the query:
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test_union_all_2
            Statistics: Num rows: -1 Data size: 4 Basic stats: PARTIAL Column stats: COMPLETE
            Select Operator
              expressions: null (type: string)
              outputColumnNames: _col0
              Statistics: Num rows: -1 Data size: 4 Basic stats: PARTIAL Column stats: COMPLETE
              Union
                Statistics: Num rows: -2 Data size: 12 Basic stats: PARTIAL Column stats: PARTIAL
                Select Operator
                  expressions: _col0 (type: double)
                  outputColumnNames: _col0
                  Statistics: Num rows: -2 Data size: 12 Basic stats: PARTIAL Column stats: PARTIAL
                  Group By Operator
                    aggregations: max(_col0)
                    mode: hash
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                    Reduce Output Operator
                      sort order:
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                      value expressions: _col0 (type: double)
          TableScan
            alias: test_union_all_1
            Statistics: Num rows: -1 Data size: 4 Basic stats: PARTIAL Column stats: COMPLETE
            Select Operator
              expressions: null (type: string)
              outputColumnNames: _col0
              Statistics: Num rows: -1 Data size: 4 Basic stats: PARTIAL Column stats: COMPLETE
              Union
                Statistics: Num rows: -2 Data size: 12 Basic stats: PARTIAL Column stats: PARTIAL
                Select Operator
                  expressions: _col0 (type: double)
                  outputColumnNames: _col0
                  Statistics: Num rows: -2 Data size: 12 Basic stats: PARTIAL Column stats: PARTIAL
                  Group By Operator
                    aggregations: max(_col0)
                    mode: hash
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                    Reduce Output Operator
                      sort order:
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                      value expressions: _col0 (type: double)
          TableScan
            alias: test_union_all_3
            Statistics: Num rows: 0 Data size: 4 Basic stats: PARTIAL Column stats: NONE
            Select Operator
              expressions: a (type: double)
              outputColumnNames: _col0
              Statistics: Num rows: 0 Data size: 4 Basic stats: PARTIAL Column stats: NONE
              Union
                Statistics: Num rows: -2 Data size: 12 Basic stats: PARTIAL Column stats: PARTIAL
                Select Operator
                  expressions: _col0 (type: double)
                  outputColumnNames: _col0
                  Statistics: Num rows: -2 Data size: 12 Basic stats: PARTIAL Column stats: PARTIAL
                  Group By Operator
                    aggregations: max(_col0)
                    mode: hash
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                    Reduce Output Operator
                      sort order:
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                      value expressions: _col0 (type: double)
      Reduce Operator Tree:
        Group By Operator
          aggregations: max(VALUE._col0)
          mode: mergepartial
          outputColumnNames: _col0
          Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
          Select Operator
            expressions: _col0 (type: double)
            outputColumnNames: _col0
            Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
            File Output Operator
              compressed: false
              Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
It seems the query plan treats the NULL literal as a string, although the
column should in fact be a double (note the "expressions: null (type: string)"
lines above). So I changed the query as below, which works around the problem:
select max(a) from (
select cast(NULL as double) a from test_union_all_1
union all
select NULL a from test_union_all_2
union all
select a a from test_union_all_3
)t;
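For completeness, a variant of the same workaround (a sketch; I have not verified whether casting every branch is strictly necessary) is to give each NULL branch an explicit type, so that no branch depends on how Hive infers the type of a bare NULL:

```sql
-- Variant workaround (untested beyond the single-cast form above):
-- cast NULL to double in every branch of the UNION ALL.
select max(a) from (
  select cast(NULL as double) a from test_union_all_1
  union all
  select cast(NULL as double) a from test_union_all_2
  union all
  select a a from test_union_all_3
) t;
```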
Can anyone offer any suggestions about this? Thanks.