Posted to issues@hive.apache.org by "Makoto Yui (JIRA)" <ji...@apache.org> on 2017/08/29 07:32:00 UTC
[jira] [Updated] (HIVE-17406) UDAF throws IllegalArgumentException
for a complex input when column stats is not provided
[ https://issues.apache.org/jira/browse/HIVE-17406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Makoto Yui updated HIVE-17406:
------------------------------
Description:
I found that UDAFs (both generic and non-generic, with or without estimable) in Hive v2.2.0 throw an IllegalArgumentException for a complex-typed input (for example, an array) when column stats are not provided.
The exception does not occur in v2.1.0.
https://github.com/apache/hive/blob/34eebff194e81180202d198200e84058c4910d95/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1156
{code:sql}
select version();
> 2.3.0-amzn-0 rcb482944667f96f43c89932dcb66d61ee7e4ac1d
with t2 as (
select array(1,2) as c1
union all
select array(2,3) as c1
)
select collect_list(c1) from t2;
> FAILED: IllegalArgumentException Size requested for unknown type: java.util.Collection
{code}
On the other hand, the query succeeds when column stats are provided, as follows:
{code:sql}
create table t1 as (
select array(1,2) as c1
union all
select array(2,3) as c1
);
> select collect_list(c1) from t1;
[[1,2],[2,3]]
> desc formatted t1;
...
Table Parameters:
COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"}
numFiles 2
numRows 2
rawDataSize 6
totalSize 8
transient_lastDdlTime 1503990290
{code}
was:
I found that non-generic UDAFs in Hive v2.2.0 throw an IllegalArgumentException for a complex input when column stats are not provided. The exception does not occur in v2.1.0.
https://github.com/apache/hive/blob/34eebff194e81180202d198200e84058c4910d95/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1156
{code:sql}
select version();
> 2.3.0-amzn-0 rcb482944667f96f43c89932dcb66d61ee7e4ac1d
with t2 as (
select array(1,2) as c1
union all
select array(2,3) as c1
)
select collect_list(c1) from t2;
> FAILED: IllegalArgumentException Size requested for unknown type: java.util.Collection
{code}
On the other hand, the query succeeds when column stats are provided, as follows:
{code:sql}
create table t1 as (
select array(1,2) as c1
union all
select array(2,3) as c1
);
> select collect_list(c1) from t1;
[[1,2],[2,3]]
> desc formatted t1;
...
Table Parameters:
COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"}
numFiles 2
numRows 2
rawDataSize 6
totalSize 8
transient_lastDdlTime 1503990290
{code}
Summary: UDAF throws IllegalArgumentException for a complex input when column stats is not provided (was: Non-generic UDAF throws IllegalArgumentException for a complex input when column stats is not provided)
> UDAF throws IllegalArgumentException for a complex input when column stats is not provided
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-17406
> URL: https://issues.apache.org/jira/browse/HIVE-17406
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 2.2.0
> Reporter: Makoto Yui
> Priority: Minor
>
> I found that UDAFs (both generic and non-generic, with or without estimable) in Hive v2.2.0 throw an IllegalArgumentException for a complex-typed input (for example, an array) when column stats are not provided.
> The exception does not occur in v2.1.0.
> https://github.com/apache/hive/blob/34eebff194e81180202d198200e84058c4910d95/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1156
> {code:sql}
> select version();
> > 2.3.0-amzn-0 rcb482944667f96f43c89932dcb66d61ee7e4ac1d
> with t2 as (
> select array(1,2) as c1
> union all
> select array(2,3) as c1
> )
> select collect_list(c1) from t2;
> > FAILED: IllegalArgumentException Size requested for unknown type: java.util.Collection
> {code}
> On the other hand, the query succeeds when column stats are provided, as follows:
> {code:sql}
> create table t1 as (
> select array(1,2) as c1
> union all
> select array(2,3) as c1
> );
> > select collect_list(c1) from t1;
> [[1,2],[2,3]]
> > desc formatted t1;
> ...
> Table Parameters:
> COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"}
> numFiles 2
> numRows 2
> rawDataSize 6
> totalSize 8
> transient_lastDdlTime 1503990290
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)