Posted to commits@hive.apache.org by zs...@apache.org on 2008/12/16 23:59:59 UTC

svn commit: r727209 - in /hadoop/hive/trunk/serde: README README.txt src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFieldList.java

Author: zshao
Date: Tue Dec 16 14:59:59 2008
New Revision: 727209

URL: http://svn.apache.org/viewvc?rev=727209&view=rev
Log:
Moved README to wiki and README.txt.

Added:
    hadoop/hive/trunk/serde/README.txt   (contents, props changed)
      - copied, changed from r727179, hadoop/hive/trunk/serde/README
Removed:
    hadoop/hive/trunk/serde/README
Modified:
    hadoop/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFieldList.java

Copied: hadoop/hive/trunk/serde/README.txt (from r727179, hadoop/hive/trunk/serde/README)
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/serde/README.txt?p2=hadoop/hive/trunk/serde/README.txt&p1=hadoop/hive/trunk/serde/README&r1=727179&r2=727209&rev=727209&view=diff
==============================================================================
--- hadoop/hive/trunk/serde/README (original)
+++ hadoop/hive/trunk/serde/README.txt Tue Dec 16 14:59:59 2008
@@ -1,52 +1,6 @@
-What is SerDe
------------
-SerDe is a short name for Serializer and Deserializer.
-Hive uses SerDe (and FileFormat) to read from/write to tables.
-
-* HDFS files --(InputFileFormat)--> <key, value> --(Deserializer)--> Row object
-* Row object --(Serializer)--> <key, value> --(OutputFileFormat)--> HDFS files
-
-Note that the "key" part is ignored when reading, and is always a constant when
-writing. Basically, the row object is stored only in the "value".
-
-One principle of Hive is that Hive does not own the HDFS file format - users
-should be able to directly read the HDFS files in Hive tables using other
-tools, or use other tools to directly write HDFS files that can be read by
-Hive through "CREATE EXTERNAL TABLE", or loaded into Hive through "LOAD
-DATA INPATH", which just moves the file into the Hive table directory.
-
-
-Note that org.apache.hadoop.hive.serde is the deprecated old serde library.
-Please look at org.apache.hadoop.hive.serde2 for the latest version.
-
-
-Existing FileFormats and SerDe classes
+What is a SerDe and how to write one?
 ------------------------
-Hive currently uses these FileFormats to read and write files:
-
-* TextInputFormat/NoKeyTextOutputFormat
-  These two classes read/write data in plain text file format.
-
-* SequenceFileInputFormat/SequenceFileOutputFormat
-  These two classes read/write data in Hadoop SequenceFile format.
-
-Hive currently uses these SerDe classes to serialize and deserialize data:
-
-* MetadataTypedColumnsetSerDe
-  This SerDe is used to read/write delimited records such as CSV, tab-separated,
-  or Control-A-separated records (sorry, quoting is not supported yet).
-
-* ThriftSerDe
-  This SerDe is used to read/write Thrift-serialized objects.  The class file
-  for the Thrift object must be loaded first.
-
-* DynamicSerDe
-  This SerDe also reads/writes Thrift-serialized objects, but it understands
-  Thrift DDL, so the schema of the object can be provided at runtime.  It also
-  supports many different protocols, including TBinaryProtocol, TJSONProtocol,
-  and TCTLSeparatedProtocol (which writes data in delimited records).
-
-
+Please refer to the SerDe section of http://wiki.apache.org/hadoop/Hive/DeveloperGuide.
 
 How to load data into Hive
 ------------------------
@@ -104,12 +58,3 @@
 Then we just need to directly open the HDFS file and read the data.
 
 
-How to write your own SerDe
-------------------------
-
-In most cases, users want to write a Deserializer instead of a SerDe.
-For example, the RegexDeserializer will deserialize the data using the
-configuration parameter 'regex', and possibly a list of column names (see
-serde2.MetadataTypedColumnsetSerDe).
-
-Please see serde2/Deserializer.java for details.
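
A minimal sketch of such a Deserializer, for readers of the removed section
above. Assumptions: the serde2.Deserializer interface of this era exposes
exactly initialize(Configuration, Properties), deserialize(Writable), and
getObjectInspector() (later trees add more methods), and the ObjectInspector
factory names below come from later Hive sources and may differ at this
revision. The class itself ("ColonPairDeserializer", parsing "name:value"
lines) is invented for illustration; it implements only the read half of the
flow: <key, value> --(Deserializer)--> Row object.

    // Hypothetical example, not part of this commit.
    package org.example.hive;  // made-up package

    import java.util.Arrays;
    import java.util.List;
    import java.util.Properties;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hive.serde2.Deserializer;
    import org.apache.hadoop.hive.serde2.SerDeException;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
    import org.apache.hadoop.io.Writable;

    public class ColonPairDeserializer implements Deserializer {
      private ObjectInspector inspector;
      private final Object[] row = new Object[2];

      public void initialize(Configuration conf, Properties tbl) throws SerDeException {
        // Tell Hive the row shape: struct<name:string, value:string>.
        List<String> names = Arrays.asList("name", "value");
        List<ObjectInspector> ois = Arrays.asList(
            (ObjectInspector) PrimitiveObjectInspectorFactory.javaStringObjectInspector,
            (ObjectInspector) PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        inspector = ObjectInspectorFactory.getStandardStructObjectInspector(names, ois);
      }

      public Object deserialize(Writable blob) throws SerDeException {
        // Only the "value" half of the <key, value> pair reaches us; the
        // key is ignored on the read path, as noted above.
        String line = blob.toString();
        int sep = line.indexOf(':');
        row[0] = (sep < 0) ? line : line.substring(0, sep);
        row[1] = (sep < 0) ? null : line.substring(sep + 1);
        return Arrays.asList(row);  // must match the struct inspector above
      }

      public ObjectInspector getObjectInspector() throws SerDeException {
        return inspector;
      }
    }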

Propchange: hadoop/hive/trunk/serde/README.txt
------------------------------------------------------------------------------
    svn:mergeinfo = 

Modified: hadoop/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFieldList.java
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFieldList.java?rev=727209&r1=727208&r2=727209&view=diff
==============================================================================
--- hadoop/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFieldList.java (original)
+++ hadoop/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFieldList.java Tue Dec 16 14:59:59 2008
@@ -198,7 +198,7 @@
     return struct;
   }
 
-
+  
   TField field = new TField();
   public void serialize(Object o, ObjectInspector oi, TProtocol oprot) throws TException, SerDeException, NoSuchFieldException,IllegalAccessException  {
     // Assuming the ObjectInspector represents exactly the same type as this struct.
@@ -212,7 +212,8 @@
     List<? extends StructField> fields = soi.getAllStructFieldRefs();
     if (fields.size() != ordered_types.length) {
       throw new SerDeException("Trying to serialize " + fields.size() 
-          + " fields into a struct with " + ordered_types.length);
+          + " fields into a struct with " + ordered_types.length
+          + " object=" + o + " objectinspector=" + oi.getTypeName());
     }
     for (int i=0; i<fields.size(); i++) {
       Object f = soi.getStructFieldData(o, fields.get(i));
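
For context on the hunk above: when the incoming row's field count does not
match the target struct, serialize() now reports the offending object and its
ObjectInspector type as well. With three incoming fields and a two-field
struct, the old message stopped at the bare count, while the new one reads
roughly as follows (the rendering of "object=" and the type name are
illustrative, depending on the row's toString() and getTypeName()):

    Trying to serialize 3 fields into a struct with 2 object=[a, b, c] objectinspector=struct<x:string,y:string>

That extra detail makes it possible to tell from the log alone which row and
which declared schema disagreed.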