Posted to dev@hive.apache.org by "HBase Review Board (JIRA)" <ji...@apache.org> on 2010/09/07 12:20:33 UTC

[jira] Commented: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)

    [ https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906759#action_12906759 ] 

HBase Review Board commented on HIVE-537:
-----------------------------------------

Message from: "Amareshwari Sriramadasu" <am...@yahoo-inc.com>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/795/
-----------------------------------------------------------

Review request for Hive Developers.


Summary
-------

Adds Union type to Standard ObjectInspectors, TypeInfo and Lazy ObjectInspectors.


This addresses bug HIVE-537.
    http://issues.apache.org/jira/browse/HIVE-537


Diffs
-----

  trunk/serde/src/gen-java/org/apache/hadoop/hive/serde/Constants.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUnion.java PRE-CREATION 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/LazyObjectInspectorFactory.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/LazyUnionObjectInspector.java PRE-CREATION 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspector.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardUnionObjectInspector.java PRE-CREATION 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/UnionObject.java PRE-CREATION 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/UnionObjectInspector.java PRE-CREATION 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/TypeInfo.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/TypeInfoFactory.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils.java 991812 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/UnionTypeInfo.java PRE-CREATION 
  trunk/serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyArrayMapStruct.java 991812 
  trunk/serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestStandardObjectInspectors.java 991812 

Diff: http://review.cloudera.org/r/795/diff


Testing
-------


Thanks,

Amareshwari




> Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-537
>                 URL: https://issues.apache.org/jira/browse/HIVE-537
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Zheng Shao
>            Assignee: Amareshwari Sriramadasu
>         Attachments: HIVE-537.1.patch, patch-537-1.txt, patch-537.txt
>
>
> There are already some cases inside the code where we use heterogeneous data: JoinOperator and UnionOperator (in the sense that different parents can pass in records with different ObjectInspectors).
> We currently use the Operator's parentID to distinguish them. However, that approach does not extend to more complex plans that might be needed in the future.
> We will support the union type like this:
> {code}
> TypeDefinition:
>   type: primitivetype | structtype | arraytype | maptype | uniontype
>   uniontype: "union" "<" tag ":" type ("," tag ":" type)* ">"
> Example:
>   union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>>
> Example of serialized data format:
>   We store the tag byte first, then the serialized object. On deserialization, we read the tag byte first; it tells us the type of the object that follows, so we can deserialize it.
> Interface for ObjectInspector:
> interface UnionObjectInspector {
>   /** Returns the array of ObjectInspectors, one for each tag.
>    */
>   ObjectInspector[] getObjectInspectors();
>   /** Return the tag of the object.
>    */
>   byte getTag(Object o);
>   /** Return the field based on the tag value associated with the Object.
>    */
>   Object getField(Object o);
> };
> An example serialization format (using a delimited format, with ' ' as the first-level delimiter and '=' as the second-level delimiter):
> userid:int,log:union<0:struct<touserid:int,message:string>,1:string>
> 123 1=login
> 123 0=243=helloworld
> 123 1=logout
> {code}
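
The tag-byte scheme in the issue description can be illustrated with a minimal sketch. It is not taken from the patch under review: the class TaggedUnionSketch, its TaggedValue holder and all method names are hypothetical, and the union is fixed to the illustrative type union<0:int,1:string>. The binary layout (one tag byte, then a 4-byte int or raw UTF-8 bytes) is an assumption made for the example, not Hive's actual wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; none of these names come from the HIVE-537 patch.
final class TaggedUnionSketch {

  /** A value of the illustrative type union<0:int,1:string>. */
  static final class TaggedValue {
    final byte tag;
    final Object value;

    TaggedValue(byte tag, Object value) {
      this.tag = tag;
      this.value = value;
    }
  }

  /** Binary form: the tag byte comes first, then the value for that tag. */
  static byte[] serialize(TaggedValue v) {
    switch (v.tag) {
      case 0:
        return ByteBuffer.allocate(1 + 4).put(v.tag).putInt((Integer) v.value).array();
      case 1: {
        byte[] utf8 = ((String) v.value).getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(1 + utf8.length).put(v.tag).put(utf8).array();
      }
      default:
        throw new IllegalArgumentException("unknown tag " + v.tag);
    }
  }

  /** Read the tag byte first; it decides how the remaining bytes are decoded. */
  static TaggedValue deserialize(byte[] data) {
    ByteBuffer in = ByteBuffer.wrap(data);
    byte tag = in.get();
    switch (tag) {
      case 0:
        return new TaggedValue(tag, in.getInt());
      case 1: {
        byte[] utf8 = new byte[in.remaining()];
        in.get(utf8);
        return new TaggedValue(tag, new String(utf8, StandardCharsets.UTF_8));
      }
      default:
        throw new IllegalArgumentException("unknown tag " + tag);
    }
  }

  /** Delimited text form "<tag>=<payload>": split on the first '=' only. */
  static TaggedValue parseDelimited(String field) {
    int sep = field.indexOf('=');
    if (sep < 0) {
      throw new IllegalArgumentException("missing tag in: " + field);
    }
    byte tag = Byte.parseByte(field.substring(0, sep));
    // The payload is kept as raw text; decoding it depends on the tag's type.
    return new TaggedValue(tag, field.substring(sep + 1));
  }

  public static void main(String[] args) {
    TaggedValue back = deserialize(serialize(new TaggedValue((byte) 1, "login")));
    System.out.println(back.tag + " -> " + back.value);
    TaggedValue text = parseDelimited("0=243=helloworld");
    System.out.println(text.tag + " -> " + text.value);
  }
}
```

Splitting on the first '=' only leaves the payload of tag 0 ("243=helloworld") intact, so it can then be handed to whatever decodes the struct for that tag.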
