Posted to dev@hive.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2013/12/05 18:48:37 UTC
[jira] [Updated] (HIVE-5878) Hive standard avg UDAF returns double as the return type for some exact input types
[ https://issues.apache.org/jira/browse/HIVE-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated HIVE-5878:
------------------------------
Description:
For the standard, non-partial avg result, Hive currently returns double as the result type.
{code}
hive> desc test;
OK
d int None
Time taken: 0.051 seconds, Fetched: 1 row(s)
hive> explain select avg(`d`) from test;
...
Reduce Operator Tree:
  Group By Operator
    aggregations:
          expr: avg(VALUE._col0)
    bucketGroup: false
    mode: mergepartial
    outputColumnNames: _col0
    Select Operator
      expressions:
            expr: _col0
            type: double
{code}
However, exact input types, including integers and decimal, should yield an exact result type. Here is what MySQL does:
{code}
mysql> desc test;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| i     | int(11)      | YES  |     | NULL    |       |
| b     | tinyint(1)   | YES  |     | NULL    |       |
| d     | double       | YES  |     | NULL    |       |
| s     | varchar(5)   | YES  |     | NULL    |       |
| dd    | decimal(5,2) | YES  |     | NULL    |       |
+-------+--------------+------+-----+---------+-------+
mysql> create table test62 as select avg(i) from test;
mysql> desc test62;
+--------+---------------+------+-----+---------+-------+
| Field  | Type          | Null | Key | Default | Extra |
+--------+---------------+------+-----+---------+-------+
| avg(i) | decimal(14,4) | YES  |     | NULL    |       |
+--------+---------------+------+-----+---------+-------+
1 row in set (0.00 sec)
{code}
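As a rough illustration of the behavior requested above, the following Python sketch averages exact values with decimal arithmetic rather than binary floating point. It is not Hive's or MySQL's implementation: exact_avg, double_avg, and EXTRA_SCALE are hypothetical names, and keeping four extra fractional digits is an assumption read off the decimal(14,4) that MySQL reports for avg(i).

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption: keep 4 extra fractional digits, mirroring the decimal(14,4)
# that MySQL reported for avg(i). This is not taken from Hive.
EXTRA_SCALE = 4


def exact_avg(values, input_scale=0):
    """Average exact values as a Decimal; return None when there are no rows.

    Hypothetical helper for illustration only. `input_scale` is the number
    of fractional digits of the input type (0 for integers).
    """
    # Skip NULLs (None) and convert via str so no binary float is involved.
    rows = [Decimal(str(v)) for v in values if v is not None]
    if not rows:
        return None
    # Quantum such as Decimal('0.0001') fixes the scale of the result.
    quantum = Decimal(1).scaleb(-(input_scale + EXTRA_SCALE))
    return (sum(rows) / len(rows)).quantize(quantum, rounding=ROUND_HALF_UP)


def double_avg(values):
    """Average the same values as binary doubles, as a double-typed avg would."""
    rows = [float(v) for v in values if v is not None]
    return sum(rows) / len(rows) if rows else None
```

With this sketch, exact_avg([1, 2, 2]) yields Decimal('1.6667'), whereas double_avg returns a binary float for the same input.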
was:
For the standard, non-partial avg result, Hive currently returns double as the result type.
{code}
hive> desc test;
OK
d int None
Time taken: 0.051 seconds, Fetched: 1 row(s)
hive> explain select avg(`d`) from test;
...
Reduce Operator Tree:
  Group By Operator
    aggregations:
          expr: avg(VALUE._col0)
    bucketGroup: false
    mode: mergepartial
    outputColumnNames: _col0
    Select Operator
      expressions:
            expr: _col0
            type: double
{code}
However, exact input types, including integers and decimal, should yield an exact result type. Here is what MySQL does:
{code}
mysql> desc test;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| i     | int(11)      | YES  |     | NULL    |       |
| b     | tinyint(1)   | YES  |     | NULL    |       |
| d     | double       | YES  |     | NULL    |       |
| s     | varchar(5)   | YES  |     | NULL    |       |
| dd    | decimal(5,2) | YES  |     | NULL    |       |
+-------+--------------+------+-----+---------+-------+
mysql> desc test62;
+-------+---------------+------+-----+---------+-------+
| Field | Type          | Null | Key | Default | Extra |
+-------+---------------+------+-----+---------+-------+
| sum_t | decimal(14,4) | YES  |     | NULL    |       |
+-------+---------------+------+-----+---------+-------+
1 row in set (0.00 sec)
{code}
> Hive standard avg UDAF returns double as the return type for some exact input types
> -----------------------------------------------------------------------------------
>
> Key: HIVE-5878
> URL: https://issues.apache.org/jira/browse/HIVE-5878
> Project: Hive
> Issue Type: Bug
> Components: Types, UDF
> Affects Versions: 0.12.0
> Reporter: Xuefu Zhang
> Assignee: Xuefu Zhang
> Attachments: HIVE-5878.patch
>
>