Posted to commits@hbase.apache.org by st...@apache.org on 2011/03/10 21:04:27 UTC

svn commit: r1080332 - /hbase/trunk/src/docbkx/book.xml

Author: stack
Date: Thu Mar 10 20:04:27 2011
New Revision: 1080332

URL: http://svn.apache.org/viewvc?rev=1080332&view=rev
Log:
Added section on keeping row and column names small to schema section

Modified:
    hbase/trunk/src/docbkx/book.xml

Modified: hbase/trunk/src/docbkx/book.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/book.xml?rev=1080332&r1=1080331&r2=1080332&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/book.xml (original)
+++ hbase/trunk/src/docbkx/book.xml Thu Mar 10 20:04:27 2011
@@ -1384,6 +1384,25 @@ of all regions.
   successful example.  It has a page describing the schema it uses in
   HBase.  You might also consider just using OpenTSDB altogether.</para>
   </section>
+  <section xml:id="keysize">
+      <title>Try to minimize row and column sizes</title>
+      <para>In HBase, values are always freighted with their coordinates; as a
+          cell value passes through the system, it'll be accompanied by its
+          row, column name, and timestamp.  Always.  If your rows and column names
+          are large, especially compared to the size of the cell value, then
+          you may run up against some interesting scenarios.  One such is
+          the case described by Marc Limotte at the tail of
+          <link xlink:url="https://issues.apache.org/jira/browse/HBASE-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=13005272#comment-13005272">HBASE-3551</link>
+          (recommended!).
+          Therein, the indices that are kept on HBase storefiles (<link linkend="hfile">HFile</link>s)
+                  to facilitate random access may end up occupying large chunks of the HBase
+                  allotted RAM because the cell value coordinates are large.
+                  Marc, in the above-cited comment, suggests upping the block size so
+                  entries in the store file index happen at a larger interval, or
+                  modifying the table schema so it makes for smaller rows and column
+                  names.
+      </para>
+  </section>
   </chapter>
 
   <chapter xml:id="hbase_metrics">