Posted to issues@hive.apache.org by "Zoltan Haindrich (JIRA)" <ji...@apache.org> on 2017/08/29 09:34:00 UTC

[jira] [Assigned] (HIVE-17406) UDAF throws IllegalArgumentException for a complex input when column stats is not provided

     [ https://issues.apache.org/jira/browse/HIVE-17406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltan Haindrich reassigned HIVE-17406:
---------------------------------------

    Assignee: Zoltan Haindrich

> UDAF throws IllegalArgumentException for a complex input when column stats is not provided
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-17406
>                 URL: https://issues.apache.org/jira/browse/HIVE-17406
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 2.2.0
>            Reporter: Makoto Yui
>            Assignee: Zoltan Haindrich
>            Priority: Minor
>
> I found that UDAFs in Hive v2.2.0 (both generic and non-generic, whether or not marked estimable) throw an IllegalArgumentException for a complex-typed input when column stats are not provided.
> The exception does not occur in v2.1.0.
> https://github.com/apache/hive/blob/34eebff194e81180202d198200e84058c4910d95/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1156
> {code:sql}
> select version();
> > 2.3.0-amzn-0 rcb482944667f96f43c89932dcb66d61ee7e4ac1d
> with t2 as ( 
>   select array(1,2) as c1 
>   union all 
>   select array(2,3) as c1
> ) 
> select collect_list(c1) from t2;
> > FAILED: IllegalArgumentException Size requested for unknown type: java.util.Collection
> {code}
> On the other hand, it succeeds when column stats are provided, as follows:
> {code:sql}
> create table t1 as (
>   select array(1,2) as c1 
>   union all
>   select array(2,3) as c1
> );
> > select collect_list(c1) from t1;
> [[1,2],[2,3]]
> > desc formatted t1;
> ...       
> Table Parameters:                
>         COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>         numFiles                2                   
>         numRows                 2                   
>         rawDataSize             6                   
>         totalSize               8                   
>         transient_lastDdlTime   1503990290
> {code}
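
The message in the report ("Size requested for unknown type: java.util.Collection") suggests a size estimator that was handed a type name it has no entry for. The following is a minimal Java sketch of that failure mode only. It is not Hive's StatsUtils code: the class SizeEstimatorSketch, the method estimateSize, and the byte sizes are all hypothetical and chosen for illustration.

```java
// Hypothetical sketch, NOT Hive's StatsUtils: it only illustrates how a
// lookup keyed on type names fails for a name it was never taught.
class SizeEstimatorSketch {

    // Returns an illustrative size in bytes for a known type name and
    // rejects every other name with an IllegalArgumentException.
    static long estimateSize(String typeName) {
        switch (typeName) {
            case "boolean":
                return 1L;
            case "int":
                return 4L;
            case "bigint":
            case "double":
                return 8L;
            default:
                // A collection class name has no fixed size, so it ends up here.
                throw new IllegalArgumentException(
                    "Size requested for unknown type: " + typeName);
        }
    }

    public static void main(String[] args) {
        // A known primitive type name is sized without trouble.
        System.out.println(estimateSize("int"));
        try {
            // An unhandled type name triggers the exception.
            estimateSize("java.util.Collection");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In a design like this, any caller that falls back to a Java class name instead of a recognized type name would hit the default branch. Whether that is the actual path taken in Hive when column stats are missing is for the linked StatsUtils code to confirm.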


