Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/06/08 23:43:40 UTC

[Hadoop Wiki] Trivial Update of "Hive/HBaseIntegration" by CarlSteinbach

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/HBaseIntegration" page has been changed by CarlSteinbach.
http://wiki.apache.org/hadoop/Hive/HBaseIntegration?action=diff&rev1=29&rev2=30

--------------------------------------------------

+ = Hive HBase Integration =
+ 
+ <<TableOfContents>>
+ 
- = Introduction =
+ == Introduction ==
  
  This page documents the Hive/HBase integration support originally
  introduced in
@@ -15, +19 @@

  This feature is a work in progress, and suggestions for its
  improvement are very welcome.
  
- = Storage Handlers =
+ == Storage Handlers ==
  
  Before proceeding, please read [[Hive/StorageHandlers]] for an overview
  of the generic storage handler framework on which HBase integration depends.
  
- = Usage =
+ == Usage ==
  
  The storage handler is built as an independent module,
  {{{hive_hbase_handler.jar}}}, which must be available on the Hive
@@ -132, +136 @@

  validated against the existing HBase table's column families), whereas
  {{{hbase.table.name}}} is optional.
  
- = Column Mapping =
+ == Column Mapping ==
  
  The column mapping support currently available is somewhat
  cumbersome and restrictive:
@@ -148, +152 @@

  
  The next few sections provide detailed examples of the kinds of column mappings currently possible.
  
- == Multiple Columns and Families ==
+ === Multiple Columns and Families ===
  
  Here's an example with three Hive columns and two HBase column
  families, with two of the Hive columns ({{{value1}}} and {{{value2}}})
@@ -202, +206 @@

  Time taken: 4.054 seconds
  }}}
  
- == Hive MAP to HBase Column Family ==
+ === Hive MAP to HBase Column Family ===
  
  Here's how a Hive MAP datatype can be used to access an entire column
  family.  Each row can have a different set of columns, where the
@@ -256, +260 @@

  FAILED: Error in metadata: java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.hbase.HBaseSerDe: hbase column family 'cf:' should be mapped to map<string,?> but is mapped to map<int,int>)
  }}}
  
- == Illegal:  Hive Primitive to HBase Column Family ==
+ === Illegal:  Hive Primitive to HBase Column Family ===
  
  Table definitions such as the following are illegal because a
  Hive column mapped to an entire column family must have MAP type:
@@ -271, +275 @@

  }}}
  
  
- = Key Uniqueness =
+ == Key Uniqueness ==
  
  One subtle difference between HBase tables and Hive tables is that HBase tables have a unique key, whereas Hive tables do not.  When multiple rows with the same key are inserted into HBase, only one of them is stored (the choice is arbitrary, so do not rely on HBase to pick the right one).  This is in contrast to Hive, which is happy to store multiple rows with the same key and different values.
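
As a rough illustration of the difference described above, the sketch below models a Hive-like table as a list of rows and an HBase-like table as a mapping from row key to value. It uses plain Python containers only and does not call Hive or HBase. The key 498 echoes the query used on this page, but the values are hypothetical. Which duplicate survives in the mapping is an artifact of the simulation and should not be read as HBase behavior.

```python
# Illustrative only: plain Python containers stand in for the two stores.
# The values below are hypothetical sample data.
rows = [(498, "val_498"), (498, "val_498_dup"), (86, "val_86")]

# Hive-like table: every inserted row is kept, duplicates included.
hive_like_table = list(rows)

# HBase-like table: one stored value per row key, so inserting a second
# row with the same key leaves only one of them. The survivor here is
# decided by dict insertion order, which says nothing about HBase itself.
hbase_like_table = {}
for key, value in rows:
    hbase_like_table[key] = value

# Rough analogue of SELECT COUNT(1) ... WHERE foo=498 against each model.
hive_count = sum(1 for key, _ in hive_like_table if key == 498)
hbase_count = sum(1 for key in hbase_like_table if key == 498)
print(hive_count, hbase_count)
```

Under these assumptions the same inserts yield two matching rows in the list model but only one in the keyed model, which is why row counts can differ after copying a Hive table into an HBase-backed one.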
  
@@ -299, +303 @@

  SELECT COUNT(1) FROM pokes3 WHERE foo=498;
  }}}
  
- = Potential Followups =
+ == Potential Followups ==
  
  There are a number of areas where Hive/HBase integration could definitely use more love:
  
@@ -314, +318 @@

   * replace dependencies on deprecated HBase APIs such as RowResult (HIVE-1229)
   * allow HBase WAL to be disabled (HIVE-1383)
  
- = Build =
+ == Build ==
  
  Code for the storage handler is located under
  {{{hive/trunk/hbase-handler}}}.  The Hive build automatically enables
@@ -327, +331 @@

  {{{hbase-handler/lib}}}.  We will convert this to use Ivy instead once
  the corresponding POMs are available.
  
- = Tests =
+ == Tests ==
  
  Class-level unit tests are provided under
  {{{hbase-handler/src/test/org/apache/hadoop/hive/hbase}}}.
@@ -344, +348 @@

  
  An Eclipse launch template remains to be defined.
  
- = Links =
+ == Links ==
  
   * For information on how to bulk load data from Hive into HBase, see [[Hive/HBaseBulkLoad]].
   * For another project which adds SQL-like query language support on top of HBase, see [[http://www.hbql.com|HBQL]] (unrelated to Hive).
  
- = Acknowledgements =
+ == Acknowledgements ==
  
   * Primary credit for this feature goes to Samuel Guo, who did most of the development work in the early drafts of the patch