Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/06/08 23:42:16 UTC

[Hadoop Wiki] Trivial Update of "Hive/StorageHandlers" by CarlSteinbach

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/StorageHandlers" page has been changed by CarlSteinbach.
http://wiki.apache.org/hadoop/Hive/StorageHandlers?action=diff&rev1=3&rev2=4

--------------------------------------------------

+ = Hive Storage Handlers =
+ 
+ <<TableOfContents>>
+ 
- = Introduction =
+ == Introduction ==
  
  This page documents the storage handler support being added to Hive as
  part of work on [[Hive/HBaseIntegration]].  The motivation is to make
@@ -24, +28 @@

  managing object definitions in both the Hive metastore and the
  other system's catalog simultaneously and consistently.
  
- = Terminology =
+ == Terminology ==
  
  Before storage handlers, Hive already had a concept of ''managed'' vs
  ''external'' tables.  A managed table is one for which the definition
@@ -50, +54 @@

  Note that we avoid the term ''file-based'' in these definitions, since
  the form of storage used by the other system is irrelevant.
  
- = DDL =
+ == DDL ==
  
  Storage handlers are associated with a table when it is created via
  the new STORED BY clause, an alternative to the existing ROW FORMAT
@@ -89, +93 @@

  DROP TABLE works as usual, but ALTER TABLE is not yet supported for
  non-native tables.
  
- = Storage Handler Interface =
+ == Storage Handler Interface ==
  
  The Java interface which must be implemented by a storage handler is
  reproduced below; for details, see the Javadoc in the code:
@@ -127, +131 @@

  attributes on jobProperties.  At execution time, only these jobProperties
  will be available to the input format, output format, and serde.
  
- = HiveMetaHook Interface =
+ == HiveMetaHook Interface ==
  
  The {{{HiveMetaHook}}} interface is reproduced below; for details, see
  the Javadoc in the code:
@@ -165, +169 @@

  result, there is a small window in which a crash during DDL can lead
  to the two systems getting out of sync.  
  
- = Open Issues =
+ == Open Issues ==
  
   * The storage handler class name is currently saved to the metastore via table property {{{storage_handler}}}; this should probably be a first-class attribute on MStorageDescriptor instead
   * Names of helper classes such as input format and output format are saved into the metastore based on what the storage handler returns during CREATE TABLE; it would be better to leave these null in case they are changed later as part of a handler upgrade