Posted to dev@hive.apache.org by "Sun Rui (JIRA)" <ji...@apache.org> on 2013/12/12 12:32:06 UTC
[jira] [Created] (HIVE-6021) Problem in GroupByOperator for handling distinct aggregations
Sun Rui created HIVE-6021:
-----------------------------
Summary: Problem in GroupByOperator for handling distinct aggregations
Key: HIVE-6021
URL: https://issues.apache.org/jira/browse/HIVE-6021
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.12.0
Reporter: Sun Rui
Assignee: Sun Rui
Use the following test case with Hive 0.12:
{code:sql}
create table src(key int, value string);
load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
set hive.map.aggr=false;
select count(key),count(distinct value) from src group by key;
{code}
The query fails with an ArrayIndexOutOfBoundsException thrown from GroupByOperator:
{code}
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:159)
    ... 10 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:281)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:152)
    ... 10 more
{code}
explain select count(key),count(distinct value) from src group by key;
{code}
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        src
          TableScan
            alias: src
            Select Operator
              expressions:
                expr: key
                type: int
                expr: value
                type: string
              outputColumnNames: key, value
              Reduce Output Operator
                key expressions:
                  expr: key
                  type: int
                  expr: value
                  type: string
                sort order: ++
                Map-reduce partition columns:
                  expr: key
                  type: int
                tag: -1
      Reduce Operator Tree:
        Group By Operator
          aggregations:
            expr: count(KEY._col0) // The parameter causes this problem
                        ^^^^^^^^^
            expr: count(DISTINCT KEY._col1:0._col0)
          bucketGroup: false
          keys:
            expr: KEY._col0
            type: int
          mode: complete
          outputColumnNames: _col0, _col1, _col2
          Select Operator
            expressions:
              expr: _col1
              type: bigint
              expr: _col2
              type: bigint
            outputColumnNames: _col0, _col1
            File Output Operator
              compressed: false
              GlobalTableId: 0
              table:
                input format: org.apache.hadoop.mapred.TextInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
  Stage: Stage-0
    Fetch Operator
      limit: -1
{code}
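The root cause is in GroupByOperator.initializeOp(). The method does not handle the following case:
In a query with distinct aggregations, an aggregation function takes a parameter that is a group-by key column but not a distinct key column. Such a parameter has the form KEY._col0, with no ":tag" part, so names[names.length - 2] is "KEY" and name.split("\\:")[1] indexes past the end of a one-element array. This matches the "ArrayIndexOutOfBoundsException: 1" in the trace above.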
{code}
if (unionExprEval != null) {
  String[] names = parameters.get(j).getExprString().split("\\.");
  // parameters of the form : KEY.colx:t.coly
  if (Utilities.ReduceField.KEY.name().equals(names[0])) {
    String name = names[names.length - 2];
    int tag = Integer.parseInt(name.split("\\:")[1]);
    ...
  } else {
    // will be VALUE._COLx
    if (!nonDistinctAggrs.contains(i)) {
      nonDistinctAggrs.add(i);
    }
  }
{code}
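The failure can be reproduced outside Hive with plain string handling. The sketch below copies only the three parsing lines from the excerpt above into a standalone method and runs them against the two expression strings from the plan. The class name, both method names, and the guarded variant are hypothetical illustrations; the guard is one possible check, not the fix that was committed for this issue.

```java
class DistinctParamParsing {

    // Same string handling as the excerpt: take the second-to-last
    // dot-separated component and read the tag after the ':'.
    static int parseTagUnguarded(String exprString) {
        String[] names = exprString.split("\\.");
        String name = names[names.length - 2];
        return Integer.parseInt(name.split("\\:")[1]);
    }

    // Hypothetical guard (an assumption, not Hive's actual fix):
    // returns -1 when the expression carries no "colx:tag" component,
    // i.e. it is a plain group-by key such as KEY._col0 or a VALUE column.
    static int parseTagGuarded(String exprString) {
        String[] names = exprString.split("\\.");
        if (names.length < 3 || !"KEY".equals(names[0])) {
            return -1;
        }
        String[] parts = names[names.length - 2].split("\\:");
        return parts.length == 2 ? Integer.parseInt(parts[1]) : -1;
    }

    public static void main(String[] args) {
        // Distinct key parameter: has the KEY.colx:t.coly form.
        System.out.println(parseTagUnguarded("KEY._col1:0._col0"));

        // Group-by key parameter: no tag, so the unguarded path throws.
        try {
            parseTagUnguarded("KEY._col0");
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unguarded parse failed: " + e);
        }

        // The guarded variant reports "no tag" instead of throwing.
        System.out.println(parseTagGuarded("KEY._col0"));
    }
}
```

With a check of this kind, a parameter like KEY._col0 could be routed to the same handling as the VALUE._colx branch instead of being parsed for a tag.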