Posted to issues@hive.apache.org by "Makoto Yui (JIRA)" <ji...@apache.org> on 2017/08/29 07:32:00 UTC

[jira] [Updated] (HIVE-17406) UDAF throws IllegalArgumentException for a complex input when column stats is not provided

     [ https://issues.apache.org/jira/browse/HIVE-17406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Makoto Yui updated HIVE-17406:
------------------------------
    Description: 
I found that UDAFs in Hive v2.2.0 (both generic and non-generic, with or without estimable) throw an IllegalArgumentException for a complex-typed input when column stats are not provided.

The exception does not occur in v2.1.0.

https://github.com/apache/hive/blob/34eebff194e81180202d198200e84058c4910d95/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1156

{code:sql}
select version();
> 2.3.0-amzn-0 rcb482944667f96f43c89932dcb66d61ee7e4ac1d

with t2 as ( 
  select array(1,2) as c1 
  union all 
  select array(2,3) as c1
) 
select collect_list(c1) from t2;

> FAILED: IllegalArgumentException Size requested for unknown type: java.util.Collection
{code}
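
The message suggests a size estimator that was asked about a type it has no fixed size for. The following is a minimal Java sketch of that failure mode, not Hive's actual code: the class `SizeEstimatorSketch` and the methods `fixedSizeOf` and `estimate` are hypothetical names, and the rule "use a stats-derived size if one exists, otherwise fall back to the type" is an assumption made only to mirror the behavior reported here.

```java
import java.util.Collection;

public class SizeEstimatorSketch {

    // Hypothetical helper: knows a fixed size only for a few scalar types.
    // Anything else (for example a collection) is rejected with the same
    // message text that appears in the report above.
    static int fixedSizeOf(Class<?> type) {
        if (type == Integer.class || type == int.class) return 4;
        if (type == Long.class || type == long.class) return 8;
        if (type == Double.class || type == double.class) return 8;
        if (type == Boolean.class || type == boolean.class) return 1;
        throw new IllegalArgumentException(
            "Size requested for unknown type: " + type.getName());
    }

    // Assumption: when an average size is available from statistics it is
    // used directly; only the no-stats path consults the type-based table.
    static long estimate(Class<?> type, Long avgSizeFromStats) {
        if (avgSizeFromStats != null) {
            return avgSizeFromStats;
        }
        return fixedSizeOf(type);
    }

    public static void main(String[] args) {
        // Stats present: the complex type is never looked up by class.
        System.out.println(estimate(Collection.class, 16L));

        // Stats absent: the type-based fallback cannot size a collection.
        try {
            estimate(Collection.class, null);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the scalar path works with or without stats, while the collection path works only when a stats-derived size short-circuits the lookup, which matches the symptom described in this issue.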

On the other hand, the query succeeds when column stats are provided, as follows:

{code:sql}
create table t1 as (
  select array(1,2) as c1 
  union all
  select array(2,3) as c1
);

> select collect_list(c1) from t1;
[[1,2],[2,3]]

> desc formatted t1;
...       
Table Parameters:                
        COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
        numFiles                2                   
        numRows                 2                   
        rawDataSize             6                   
        totalSize               8                   
        transient_lastDdlTime   1503990290
{code}

  was:
I found that non-generic UDAFs in Hive v2.2.0 throw an IllegalArgumentException for a complex-typed input when column stats are not provided. The exception does not occur in v2.1.0.

https://github.com/apache/hive/blob/34eebff194e81180202d198200e84058c4910d95/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1156

{code:sql}
select version();
> 2.3.0-amzn-0 rcb482944667f96f43c89932dcb66d61ee7e4ac1d

with t2 as ( 
  select array(1,2) as c1 
  union all 
  select array(2,3) as c1
) 
select collect_list(c1) from t2;

> FAILED: IllegalArgumentException Size requested for unknown type: java.util.Collection
{code}

On the other hand, the query succeeds when column stats are provided, as follows:

{code:sql}
create table t1 as (
  select array(1,2) as c1 
  union all
  select array(2,3) as c1
);

> select collect_list(c1) from t1;
[[1,2],[2,3]]

> desc formatted t1;
...       
Table Parameters:                
        COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
        numFiles                2                   
        numRows                 2                   
        rawDataSize             6                   
        totalSize               8                   
        transient_lastDdlTime   1503990290
{code}

        Summary: UDAF throws IllegalArgumentException for a complex input when column stats is not provided  (was: Non-generic UDAF throws IllegalArgumentException for a complex input when column stats is not provided)

> UDAF throws IllegalArgumentException for a complex input when column stats is not provided
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-17406
>                 URL: https://issues.apache.org/jira/browse/HIVE-17406
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 2.2.0
>            Reporter: Makoto Yui
>            Priority: Minor


