Posted to dev@hive.apache.org by "Frankline Jose S (JIRA)" <ji...@apache.org> on 2013/01/28 11:07:13 UTC
[jira] [Commented] (HIVE-3945) union all datatype do not match may result wrong result

    [ https://issues.apache.org/jira/browse/HIVE-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564169#comment-13564169 ] 

Frankline Jose S commented on HIVE-3945:
----------------------------------------

Hi, I just tried the following queries, and they seem to work fine.

// UNION ALL combines the result sets of two or more select statements.
   It keeps duplicate rows, including rows that match in both tables.

Input
33	35
55	55.62
20	44
55	100
44	44
33	33
33	55
44	44
33
55	55
	55

Query :>

select c.key, c.value from (
select a.key, a.value from uniontbl a where a.key='33'
union all
select b.key, sum( b.value) as value from uniontbl b where b.key='33'
group by b.key
)c;

o/p: 
33	35.0
33	33.0
33	55.0
33	NULL    --> this line and the lines above belong to alias a ( a.key, a.value )
33	123.0   --> belongs to alias b ( b.key, sum( b.value) )

select c.key, c.value  from (
select a.key, a.value from uniontbl a where a.key='55'
union all
select b.key, sum( b.value) as value from uniontbl b where b.key='55'
group by b.key )c

o/p
55	55.62
55	100.0
55	55.0 	--> this line and the lines above belong to alias a ( a.key, a.value )
55	210.62  --> belongs to alias b ( b.key, sum( b.value) )


Could you give me an example scenario that reproduces this, in case I have misunderstood the problem?
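
For comparison, the query in the report differs from the two above in one respect: its first branch selects string literals, so the value column is a string in one branch and a double in the other. A minimal sketch of that shape against uniontbl follows. It assumes uniontbl has the key and value columns used above, and it has not been run here.

```sql
-- Sketch only: the report's query shape, adapted to uniontbl.
-- First branch: string literals (string, string).
-- Second branch: key column plus a double produced by sum().
select c.key, c.value from (
  select 'key' as key, 'value' as value
  from uniontbl a limit 1
  union all
  select b.key as key, sum(b.value) as value
  from uniontbl b group by b.key
) c;
```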


                
> union all datatype do not match may result wrong result 
> --------------------------------------------------------
>
>                 Key: HIVE-3945
>                 URL: https://issues.apache.org/jira/browse/HIVE-3945
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.9.0
>            Reporter: caofangkun
>            Priority: Minor
>
> hive (default)> desc src;
> key	string	
> value	string	
> select key, value FROM 
> ( select 'key' as key, 'value' as value         -- datatype: string, string
>   from src s1 limit 1
>   UNION  ALL  
>   select s2.key as key, sum(s2.value) as value  -- datatype: string, double
>   from src s2 group by s2.key
>  ) unionsrc;
> This query executes normally but returns a wrong result:
> key	2.4081029415476845E-282    -- expected is 'value'
> 	35.0
> 100	100.0
> 48	0.0
> Sometimes, when the string title is too long, it may cause an ArrayIndexOutOfBoundsException:
> Caused by: java.lang.ArrayIndexOutOfBoundsException
> at java.lang.System.arraycopy(Native Method)
> at org.apache.hadoop.io.Text.set(Text.java:205)
> at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
> at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:216)
> at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:197)
> at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:61)
> at org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:125)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
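
A possible workaround for the mismatch described in the report is to make both branches of the union produce the same type with an explicit cast. The sketch below reuses the report's query and casts the aggregate to string. It is an assumption that this avoids the wrong result on 0.9.0; it has not been verified here.

```sql
-- Sketch only: the report's query with the aggregate cast to string,
-- so that both branches of the UNION ALL yield (string, string).
select key, value FROM
( select 'key' as key, 'value' as value                         -- string, string
  from src s1 limit 1
  UNION ALL
  select s2.key as key, cast(sum(s2.value) as string) as value  -- string, string after the cast
  from src s2 group by s2.key
) unionsrc;
```

If numeric output is wanted instead, the cast would go on the first branch's value rather than on the aggregate.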
