Posted to commits@hbase.apache.org by st...@apache.org on 2013/04/02 20:06:21 UTC

svn commit: r1463652 [7/21] - in /hbase/hbase.apache.org/trunk: ./ book/ case_studies/ community/ configuration/ css/ developer/ external_apis/ getting_started/ hbase-assembly/ hbase-assembly/book/ hbase-assembly/xref/ images/ ops_mgt/ performance/ pre...

Modified: hbase/hbase.apache.org/trunk/book/ops_mgt.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/ops_mgt.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/ops_mgt.html (original)
+++ hbase/hbase.apache.org/trunk/book/ops_mgt.html Tue Apr  2 18:06:19 2013
@@ -1,6 +1,6 @@
 <html><head>
       <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
-   <title>Chapter&nbsp;14.&nbsp;Apache HBase (TM) Operational Management</title><link rel="stylesheet" type="text/css" href="../css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL-NS Stylesheets V1.76.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="prev" href="casestudies.perftroub.html" title="13.3.&nbsp;Performance/Troubleshooting"><link rel="next" href="ops.regionmgt.html" title="14.2.&nbsp;Region Management"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Chapter&nbsp;14.&nbsp;Apache HBase (TM) Operational Management</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="casestudies.perftroub.html">Prev</a>&nbsp;</td><th width="60%" align="center">&nbsp;</th><td width="20%" al
 ign="right">&nbsp;<a accesskey="n" href="ops.regionmgt.html">Next</a></td></tr></table><hr></div><div class="chapter" title="Chapter&nbsp;14.&nbsp;Apache HBase (TM) Operational Management"><div class="titlepage"><div><div><h2 class="title"><a name="ops_mgt"></a>Chapter&nbsp;14.&nbsp;Apache HBase (TM) Operational Management</h2></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt><span class="section"><a href="ops_mgt.html#tools">14.1. HBase Tools and Utilities</a></span></dt><dd><dl><dt><span class="section"><a href="ops_mgt.html#driver">14.1.1. Driver</a></span></dt><dt><span class="section"><a href="ops_mgt.html#hbck">14.1.2. HBase <span class="application">hbck</span></a></span></dt><dt><span class="section"><a href="ops_mgt.html#hfile_tool2">14.1.3. HFile Tool</a></span></dt><dt><span class="section"><a href="ops_mgt.html#wal_tools">14.1.4. WAL Tools</a></span></dt><dt><span class="section"><a href="ops_mgt.html#compression.tool">14.1.5. Compression 
 Tool</a></span></dt><dt><span class="section"><a href="ops_mgt.html#copytable">14.1.6. CopyTable</a></span></dt><dt><span class="section"><a href="ops_mgt.html#export">14.1.7. Export</a></span></dt><dt><span class="section"><a href="ops_mgt.html#import">14.1.8. Import</a></span></dt><dt><span class="section"><a href="ops_mgt.html#importtsv">14.1.9. ImportTsv</a></span></dt><dt><span class="section"><a href="ops_mgt.html#completebulkload">14.1.10. CompleteBulkLoad</a></span></dt><dt><span class="section"><a href="ops_mgt.html#walplayer">14.1.11. WALPlayer</a></span></dt><dt><span class="section"><a href="ops_mgt.html#rowcounter">14.1.12. RowCounter and CellCounter</a></span></dt></dl></dd><dt><span class="section"><a href="ops.regionmgt.html">14.2. Region Management</a></span></dt><dd><dl><dt><span class="section"><a href="ops.regionmgt.html#ops.regionmgt.majorcompact">14.2.1. Major Compaction</a></span></dt><dt><span class="section"><a href="ops.regionmgt.html#ops.regionmgt.
 merge">14.2.2. Merge</a></span></dt></dl></dd><dt><span class="section"><a href="node.management.html">14.3. Node Management</a></span></dt><dd><dl><dt><span class="section"><a href="node.management.html#decommission">14.3.1. Node Decommission</a></span></dt><dt><span class="section"><a href="node.management.html#rolling">14.3.2. Rolling Restart</a></span></dt></dl></dd><dt><span class="section"><a href="hbase_metrics.html">14.4. HBase Metrics</a></span></dt><dd><dl><dt><span class="section"><a href="hbase_metrics.html#metric_setup">14.4.1. Metric Setup</a></span></dt><dt><span class="section"><a href="hbase_metrics.html#rs_metrics">14.4.2. RegionServer Metrics</a></span></dt></dl></dd><dt><span class="section"><a href="ops.monitoring.html">14.5. HBase Monitoring</a></span></dt><dd><dl><dt><span class="section"><a href="ops.monitoring.html#ops.monitoring.overview">14.5.1. Overview</a></span></dt><dt><span class="section"><a href="ops.monitoring.html#ops.slow.query">14.5.2. S
 low Query Log</a></span></dt></dl></dd><dt><span class="section"><a href="cluster_replication.html">14.6. Cluster Replication</a></span></dt><dt><span class="section"><a href="ops.backup.html">14.7. HBase Backup</a></span></dt><dd><dl><dt><span class="section"><a href="ops.backup.html#ops.backup.fullshutdown">14.7.1. Full Shutdown Backup</a></span></dt><dt><span class="section"><a href="ops.backup.html#ops.backup.live.replication">14.7.2. Live Cluster Backup - Replication</a></span></dt><dt><span class="section"><a href="ops.backup.html#ops.backup.live.copytable">14.7.3. Live Cluster Backup - CopyTable</a></span></dt><dt><span class="section"><a href="ops.backup.html#ops.backup.live.export">14.7.4. Live Cluster Backup - Export</a></span></dt></dl></dd><dt><span class="section"><a href="ops.capacity.html">14.8. Capacity Planning</a></span></dt><dd><dl><dt><span class="section"><a href="ops.capacity.html#ops.capacity.storage">14.8.1. Storage</a></span></dt><dt><span class="sec
 tion"><a href="ops.capacity.html#ops.capacity.regions">14.8.2. Regions</a></span></dt></dl></dd></dl></div>
+   <title>Chapter&nbsp;14.&nbsp;Apache HBase (TM) Operational Management</title><link rel="stylesheet" type="text/css" href="../css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL-NS Stylesheets V1.76.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="prev" href="casestudies.perftroub.html" title="13.3.&nbsp;Performance/Troubleshooting"><link rel="next" href="ops.regionmgt.html" title="14.2.&nbsp;Region Management"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Chapter&nbsp;14.&nbsp;Apache HBase (TM) Operational Management</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="casestudies.perftroub.html">Prev</a>&nbsp;</td><th width="60%" align="center">&nbsp;</th><td width="20%" al
 ign="right">&nbsp;<a accesskey="n" href="ops.regionmgt.html">Next</a></td></tr></table><hr></div><div class="chapter" title="Chapter&nbsp;14.&nbsp;Apache HBase (TM) Operational Management"><div class="titlepage"><div><div><h2 class="title"><a name="ops_mgt"></a>Chapter&nbsp;14.&nbsp;Apache HBase (TM) Operational Management</h2></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt><span class="section"><a href="ops_mgt.html#tools">14.1. HBase Tools and Utilities</a></span></dt><dd><dl><dt><span class="section"><a href="ops_mgt.html#driver">14.1.1. Driver</a></span></dt><dt><span class="section"><a href="ops_mgt.html#hbck">14.1.2. HBase <span class="application">hbck</span></a></span></dt><dt><span class="section"><a href="ops_mgt.html#hfile_tool2">14.1.3. HFile Tool</a></span></dt><dt><span class="section"><a href="ops_mgt.html#wal_tools">14.1.4. WAL Tools</a></span></dt><dt><span class="section"><a href="ops_mgt.html#compression.tool">14.1.5. Compression 
 Tool</a></span></dt><dt><span class="section"><a href="ops_mgt.html#copytable">14.1.6. CopyTable</a></span></dt><dt><span class="section"><a href="ops_mgt.html#export">14.1.7. Export</a></span></dt><dt><span class="section"><a href="ops_mgt.html#import">14.1.8. Import</a></span></dt><dt><span class="section"><a href="ops_mgt.html#importtsv">14.1.9. ImportTsv</a></span></dt><dt><span class="section"><a href="ops_mgt.html#completebulkload">14.1.10. CompleteBulkLoad</a></span></dt><dt><span class="section"><a href="ops_mgt.html#walplayer">14.1.11. WALPlayer</a></span></dt><dt><span class="section"><a href="ops_mgt.html#rowcounter">14.1.12. RowCounter and CellCounter</a></span></dt><dt><span class="section"><a href="ops_mgt.html#mlockall">14.1.13. mlockall</a></span></dt></dl></dd><dt><span class="section"><a href="ops.regionmgt.html">14.2. Region Management</a></span></dt><dd><dl><dt><span class="section"><a href="ops.regionmgt.html#ops.regionmgt.majorcompact">14.2.1. Major Com
 paction</a></span></dt><dt><span class="section"><a href="ops.regionmgt.html#ops.regionmgt.merge">14.2.2. Merge</a></span></dt></dl></dd><dt><span class="section"><a href="node.management.html">14.3. Node Management</a></span></dt><dd><dl><dt><span class="section"><a href="node.management.html#decommission">14.3.1. Node Decommission</a></span></dt><dt><span class="section"><a href="node.management.html#rolling">14.3.2. Rolling Restart</a></span></dt></dl></dd><dt><span class="section"><a href="hbase_metrics.html">14.4. HBase Metrics</a></span></dt><dd><dl><dt><span class="section"><a href="hbase_metrics.html#metric_setup">14.4.1. Metric Setup</a></span></dt><dt><span class="section"><a href="hbase_metrics.html#rs_metrics">14.4.2. RegionServer Metrics</a></span></dt></dl></dd><dt><span class="section"><a href="ops.monitoring.html">14.5. HBase Monitoring</a></span></dt><dd><dl><dt><span class="section"><a href="ops.monitoring.html#ops.monitoring.overview">14.5.1. Overview</a><
 /span></dt><dt><span class="section"><a href="ops.monitoring.html#ops.slow.query">14.5.2. Slow Query Log</a></span></dt></dl></dd><dt><span class="section"><a href="cluster_replication.html">14.6. Cluster Replication</a></span></dt><dt><span class="section"><a href="ops.backup.html">14.7. HBase Backup</a></span></dt><dd><dl><dt><span class="section"><a href="ops.backup.html#ops.backup.fullshutdown">14.7.1. Full Shutdown Backup</a></span></dt><dt><span class="section"><a href="ops.backup.html#ops.backup.live.replication">14.7.2. Live Cluster Backup - Replication</a></span></dt><dt><span class="section"><a href="ops.backup.html#ops.backup.live.copytable">14.7.3. Live Cluster Backup - CopyTable</a></span></dt><dt><span class="section"><a href="ops.backup.html#ops.backup.live.export">14.7.4. Live Cluster Backup - Export</a></span></dt></dl></dd><dt><span class="section"><a href="ops.capacity.html">14.8. Capacity Planning</a></span></dt><dd><dl><dt><span class="section"><a href="
 ops.capacity.html#ops.capacity.storage">14.8.1. Storage</a></span></dt><dt><span class="section"><a href="ops.capacity.html#ops.capacity.regions">14.8.2. Regions</a></span></dt></dl></dd></dl></div>
   This chapter will cover operational tools and practices required of a running Apache HBase cluster.
   The subject of operations is related to the topics of <a class="xref" href="trouble.html" title="Chapter&nbsp;12.&nbsp;Troubleshooting and Debugging Apache HBase (TM)">Chapter&nbsp;12, <i>Troubleshooting and Debugging Apache HBase (TM)</i></a>, <a class="xref" href="performance.html" title="Chapter&nbsp;11.&nbsp;Apache HBase (TM) Performance Tuning">Chapter&nbsp;11, <i>Apache HBase (TM) Performance Tuning</i></a>,
   and <a class="xref" href="configuration.html" title="Chapter&nbsp;2.&nbsp;Apache HBase (TM) Configuration">Chapter&nbsp;2, <i>Apache HBase (TM) Configuration</i></a> but is a distinct topic in itself.
@@ -33,7 +33,7 @@ Valid program names are:
         Passing <span class="command"><strong>-fix</strong></span> may correct the inconsistency (This latter
         is an experimental feature).
         </p><p>For more information, see <a class="xref" href="hbck.in.depth.html" title="Appendix&nbsp;B.&nbsp;hbck In Depth">Appendix&nbsp;B, <i>hbck In Depth</i></a>.
-        </p></div><div class="section" title="14.1.3.&nbsp;HFile Tool"><div class="titlepage"><div><div><h3 class="title"><a name="hfile_tool2"></a>14.1.3.&nbsp;HFile Tool</h3></div></div></div><p>See <a class="xref" href="regions.arch.html#hfile_tool" title="9.7.5.2.2.&nbsp;HFile Tool">Section&nbsp;9.7.5.2.2, &#8220;HFile Tool&#8221;</a>.</p></div><div class="section" title="14.1.4.&nbsp;WAL Tools"><div class="titlepage"><div><div><h3 class="title"><a name="wal_tools"></a>14.1.4.&nbsp;WAL Tools</h3></div></div></div><div class="section" title="14.1.4.1.&nbsp;HLog tool"><div class="titlepage"><div><div><h4 class="title"><a name="hlog_tool"></a>14.1.4.1.&nbsp;<code class="classname">HLog</code> tool</h4></div></div></div><p>The main method on <code class="classname">HLog</code> offers manual
+        </p></div><div class="section" title="14.1.3.&nbsp;HFile Tool"><div class="titlepage"><div><div><h3 class="title"><a name="hfile_tool2"></a>14.1.3.&nbsp;HFile Tool</h3></div></div></div><p>See <a class="xref" href="regions.arch.html#hfile_tool" title="9.7.6.2.2.&nbsp;HFile Tool">Section&nbsp;9.7.6.2.2, &#8220;HFile Tool&#8221;</a>.</p></div><div class="section" title="14.1.4.&nbsp;WAL Tools"><div class="titlepage"><div><div><h3 class="title"><a name="wal_tools"></a>14.1.4.&nbsp;WAL Tools</h3></div></div></div><div class="section" title="14.1.4.1.&nbsp;HLog tool"><div class="titlepage"><div><div><h4 class="title"><a name="hlog_tool"></a>14.1.4.1.&nbsp;<code class="classname">HLog</code> tool</h4></div></div></div><p>The main method on <code class="classname">HLog</code> offers manual
         split and dump facilities. Pass it WALs or the product of a split, the
        content of the <code class="filename">recovered.edits</code> directory.</p><p>You can get a textual dump of a WAL file's content by doing the
         following:</p><pre class="programlisting"> <code class="code">$ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.HLog --dump hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012</code> </pre><p>The
@@ -160,7 +160,12 @@ row10	c1	c2
          <code class="code">hbase.mapreduce.scan.column.family</code> to specify scanning a single column family.
          </p><pre class="programlisting">$ bin/hbase org.apache.hadoop.hbase.mapreduce.CellCounter &lt;tablename&gt; &lt;outputDir&gt; [regex or prefix]</pre><p>
        </p><p>Note: just like RowCounter, caching for the input Scan is configured via <code class="code">hbase.client.scanner.caching</code> in the
-       job configuration. </p></div></div></div><div id="disqus_thread"></div><script type="text/javascript">
+       job configuration. </p></div><div class="section" title="14.1.13.&nbsp;mlockall"><div class="titlepage"><div><div><h3 class="title"><a name="mlockall"></a>14.1.13.&nbsp;mlockall</h3></div></div></div><p>You can optionally pin your servers in physical memory, making them less likely
+            to be swapped out in oversubscribed environments, by having the servers call
+            <a class="link" href="http://linux.die.net/man/2/mlockall" target="_top">mlockall</a> on startup.
+            See <a class="link" href="https://issues.apache.org/jira/browse/HBASE-4391" target="_top">HBASE-4391 Add ability to start RS as root and call mlockall</a>
+            for how to build the optional library and have it run on startup.
+        </p></div></div></div><div id="disqus_thread"></div><script type="text/javascript">
     var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
     var disqus_url = 'http://hbase.apache.org/book';
     var disqus_identifier = 'ops_mgt';
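
The mlockall section added above points at HBASE-4391 for the actual mechanism (an optional native library run at startup). As a purely illustrative aside, a minimal sketch of the underlying call, assuming JNA were on the classpath and the process had root or CAP_IPC_LOCK, might look like the following; the class, method, and constant names below are not from HBase and are hypothetical:

    import com.sun.jna.Library;
    import com.sun.jna.Native;

    public class MemoryPinner {
      // Minimal libc binding; MCL_* values are the Linux <sys/mman.h> constants.
      public interface CLib extends Library {
        CLib INSTANCE = (CLib) Native.loadLibrary("c", CLib.class);
        int MCL_CURRENT = 1;
        int MCL_FUTURE  = 2;
        int mlockall(int flags);
      }

      public static void main(String[] args) {
        // Lock current and future pages so the process is far less likely to be swapped out.
        int rc = CLib.INSTANCE.mlockall(CLib.MCL_CURRENT | CLib.MCL_FUTURE);
        System.out.println(rc == 0 ? "memory pinned" : "mlockall failed (insufficient privileges?)");
      }
    }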

Modified: hbase/hbase.apache.org/trunk/book/perf.deleting.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/perf.deleting.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/perf.deleting.html (original)
+++ hbase/hbase.apache.org/trunk/book/perf.deleting.html Tue Apr  2 18:06:19 2013
@@ -3,7 +3,7 @@
    <title>11.10.&nbsp;Deleting from HBase</title><link rel="stylesheet" type="text/css" href="../css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL-NS Stylesheets V1.76.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="performance.html" title="Chapter&nbsp;11.&nbsp;Apache HBase (TM) Performance Tuning"><link rel="prev" href="perf.reading.html" title="11.9.&nbsp;Reading from HBase"><link rel="next" href="perf.hdfs.html" title="11.11.&nbsp;HDFS"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">11.10.&nbsp;Deleting from HBase</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="perf.reading.html">Prev</a>&nbsp;</td><th width="60%" align="center">Chapter&nbsp;11.&nbsp;Apache HBase (TM) Performance Tuning</th><td width="20%" align="right">&nbsp;<a acces
 skey="n" href="perf.hdfs.html">Next</a></td></tr></table><hr></div><div class="section" title="11.10.&nbsp;Deleting from HBase"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="perf.deleting"></a>11.10.&nbsp;Deleting from HBase</h2></div></div></div><div class="section" title="11.10.1.&nbsp;Using HBase Tables as Queues"><div class="titlepage"><div><div><h3 class="title"><a name="perf.deleting.queue"></a>11.10.1.&nbsp;Using HBase Tables as Queues</h3></div></div></div><p>HBase tables are sometimes used as queues.  In this case, special care must be taken to regularly perform major compactions on tables used in
        this manner.  As is documented in <a class="xref" href="datamodel.html" title="Chapter&nbsp;5.&nbsp;Data Model">Chapter&nbsp;5, <i>Data Model</i></a>, marking rows as deleted creates additional StoreFiles which then need to be processed
        on reads.  Tombstones only get cleaned up with major compactions.
-       </p><p>See also <a class="xref" href="regions.arch.html#compaction" title="9.7.5.5.&nbsp;Compaction">Section&nbsp;9.7.5.5, &#8220;Compaction&#8221;</a> and <a class="link" href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#majorCompact%28java.lang.String%29" target="_top">HBaseAdmin.majorCompact</a>.
+       </p><p>See also <a class="xref" href="regions.arch.html#compaction" title="9.7.6.5.&nbsp;Compaction">Section&nbsp;9.7.6.5, &#8220;Compaction&#8221;</a> and <a class="link" href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#majorCompact%28java.lang.String%29" target="_top">HBaseAdmin.majorCompact</a>.
       </p></div><div class="section" title="11.10.2.&nbsp;Delete RPC Behavior"><div class="titlepage"><div><div><h3 class="title"><a name="perf.deleting.rpc"></a>11.10.2.&nbsp;Delete RPC Behavior</h3></div></div></div><p>Be aware that <code class="code">htable.delete(Delete)</code> doesn't use the writeBuffer.  It will execute a RegionServer RPC with each invocation.
        For a large number of deletes, consider <code class="code">htable.delete(List)</code>.
        </p><p>See <a class="link" href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#delete%28org.apache.hadoop.hbase.client.Delete%29" target="_top">http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#delete%28org.apache.hadoop.hbase.client.Delete%29</a>
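
To make the note above concrete, here is a minimal sketch (0.94-era client API; the table and row names are invented) of batching deletes through HTable.delete(List) instead of issuing one single-Delete RPC per row:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BatchDeleteExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "myTable");          // hypothetical table name
        try {
          List<Delete> deletes = new ArrayList<Delete>();
          for (int i = 0; i < 1000; i++) {
            deletes.add(new Delete(Bytes.toBytes("row-" + i)));
          }
          table.delete(deletes);   // batched in one client call rather than 1000 delete(Delete) invocations
        } finally {
          table.close();
        }
      }
    }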

Modified: hbase/hbase.apache.org/trunk/book/perf.hdfs.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/perf.hdfs.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/perf.hdfs.html (original)
+++ hbase/hbase.apache.org/trunk/book/perf.hdfs.html Tue Apr  2 18:06:19 2013
@@ -13,7 +13,7 @@ read directly from disk instead of going
 data is local. What this means for HBase is that the RegionServers can
 read directly off their machine's disks instead of having to open a
 socket to talk to the DataNode, the former being generally much
-faster<sup>[<a name="d2279e7533" href="#ftn.d2279e7533" class="footnote">27</a>]</sup>.
+faster<sup>[<a name="d2475e7777" href="#ftn.d2475e7777" class="footnote">27</a>]</sup>.
 Also see <a class="link" href="http://search-hadoop.com/m/zV6dKrLCVh1" target="_top">HBase, mail # dev - read short circuit</a> thread for
 more discussion around short circuit reads.
 </p><p>To enable "short circuit" reads, you must set two configurations.
@@ -37,7 +37,7 @@ the data will still be read.
      returning the most current row or specified timestamps, etc.), and as such HBase is 4-5 times slower than HDFS in this
      processing context.  Not that there isn't room for improvement (and this gap will, over time, be reduced), but HDFS
       will always be faster in this use-case.
-     </p></div><div class="footnotes"><br><hr width="100" align="left"><div class="footnote"><p><sup>[<a id="ftn.d2279e7533" href="#d2279e7533" class="para">27</a>] </sup>See JD's <a class="link" href="http://files.meetup.com/1350427/hug_ebay_jdcryans.pdf" target="_top">Performance Talk</a></p></div></div></div><div id="disqus_thread"></div><script type="text/javascript">
+     </p></div><div class="footnotes"><br><hr width="100" align="left"><div class="footnote"><p><sup>[<a id="ftn.d2475e7777" href="#d2475e7777" class="para">27</a>] </sup>See JD's <a class="link" href="http://files.meetup.com/1350427/hug_ebay_jdcryans.pdf" target="_top">Performance Talk</a></p></div></div></div><div id="disqus_thread"></div><script type="text/javascript">
     var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
     var disqus_url = 'http://hbase.apache.org/book';
     var disqus_identifier = 'perf.hdfs';

Modified: hbase/hbase.apache.org/trunk/book/perf.reading.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/perf.reading.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/perf.reading.html (original)
+++ hbase/hbase.apache.org/trunk/book/perf.reading.html Tue Apr  2 18:06:19 2013
@@ -53,34 +53,34 @@ htable.close();</pre></div><div class="s
     Table Creation: Pre-Creating Regions
    &#8221;</a>, as well as <a class="xref" href="perf.configurations.html" title="11.4.&nbsp;HBase Configurations">Section&nbsp;11.4, &#8220;HBase Configurations&#8221;</a> </p></div><div class="section" title="11.9.8.&nbsp;Bloom Filters"><div class="titlepage"><div><div><h3 class="title"><a name="blooms"></a>11.9.8.&nbsp;Bloom Filters</h3></div></div></div><p>Enabling Bloom Filters can save you from having to go to disk and
         can help improve read latencies.</p><p><a class="link" href="http://en.wikipedia.org/wiki/Bloom_filter" target="_top">Bloom filters</a> were developed over in <a class="link" href="https://issues.apache.org/jira/browse/HBASE-1200" target="_top">HBase-1200
-    Add bloomfilters</a>.<sup>[<a name="d2279e7358" href="#ftn.d2279e7358" class="footnote">25</a>]</sup><sup>[<a name="d2279e7370" href="#ftn.d2279e7370" class="footnote">26</a>]</sup></p><p>See also <a class="xref" href="perf.schema.html#schema.bloom" title="11.6.4.&nbsp;Bloom Filters">Section&nbsp;11.6.4, &#8220;Bloom Filters&#8221;</a>.
+    Add bloomfilters</a>.<sup>[<a name="d2475e7602" href="#ftn.d2475e7602" class="footnote">25</a>]</sup><sup>[<a name="d2475e7614" href="#ftn.d2475e7614" class="footnote">26</a>]</sup></p><p>See also <a class="xref" href="perf.schema.html#schema.bloom" title="11.6.4.&nbsp;Bloom Filters">Section&nbsp;11.6.4, &#8220;Bloom Filters&#8221;</a>.
         </p><div class="section" title="11.9.8.1.&nbsp;Bloom StoreFile footprint"><div class="titlepage"><div><div><h4 class="title"><a name="bloom_footprint"></a>11.9.8.1.&nbsp;Bloom StoreFile footprint</h4></div></div></div><p>Bloom filters add an entry to the <code class="classname">StoreFile</code>
       general <code class="classname">FileInfo</code> data structure and then two
       extra entries to the <code class="classname">StoreFile</code> metadata
-      section.</p><div class="section" title="11.9.8.1.1.&nbsp;BloomFilter in the StoreFile FileInfo data structure"><div class="titlepage"><div><div><h5 class="title"><a name="d2279e7394"></a>11.9.8.1.1.&nbsp;BloomFilter in the <code class="classname">StoreFile</code>
+      section.</p><div class="section" title="11.9.8.1.1.&nbsp;BloomFilter in the StoreFile FileInfo data structure"><div class="titlepage"><div><div><h5 class="title"><a name="d2475e7638"></a>11.9.8.1.1.&nbsp;BloomFilter in the <code class="classname">StoreFile</code>
         <code class="classname">FileInfo</code> data structure</h5></div></div></div><p><code class="classname">FileInfo</code> has a
           <code class="varname">BLOOM_FILTER_TYPE</code> entry which is set to
           <code class="varname">NONE</code>, <code class="varname">ROW</code> or
-          <code class="varname">ROWCOL.</code></p></div><div class="section" title="11.9.8.1.2.&nbsp;BloomFilter entries in StoreFile metadata"><div class="titlepage"><div><div><h5 class="title"><a name="d2279e7418"></a>11.9.8.1.2.&nbsp;BloomFilter entries in <code class="classname">StoreFile</code>
+          <code class="varname">ROWCOL.</code></p></div><div class="section" title="11.9.8.1.2.&nbsp;BloomFilter entries in StoreFile metadata"><div class="titlepage"><div><div><h5 class="title"><a name="d2475e7662"></a>11.9.8.1.2.&nbsp;BloomFilter entries in <code class="classname">StoreFile</code>
         metadata</h5></div></div></div><p><code class="varname">BLOOM_FILTER_META</code> holds Bloom Size, Hash
          Function used, etc. It's small in size and is cached on
           <code class="classname">StoreFile.Reader</code> load</p><p><code class="varname">BLOOM_FILTER_DATA</code> is the actual bloomfilter
           data. Obtained on-demand. Stored in the LRU cache, if it is enabled
-          (Its enabled by default).</p></div></div><div class="section" title="11.9.8.2.&nbsp;Bloom Filter Configuration"><div class="titlepage"><div><div><h4 class="title"><a name="config.bloom"></a>11.9.8.2.&nbsp;Bloom Filter Configuration</h4></div></div></div><div class="section" title="11.9.8.2.1.&nbsp;io.hfile.bloom.enabled global kill switch"><div class="titlepage"><div><div><h5 class="title"><a name="d2279e7438"></a>11.9.8.2.1.&nbsp;<code class="varname">io.hfile.bloom.enabled</code> global kill
+          (It's enabled by default).</p></div></div><div class="section" title="11.9.8.2.&nbsp;Bloom Filter Configuration"><div class="titlepage"><div><div><h4 class="title"><a name="config.bloom"></a>11.9.8.2.&nbsp;Bloom Filter Configuration</h4></div></div></div><div class="section" title="11.9.8.2.1.&nbsp;io.hfile.bloom.enabled global kill switch"><div class="titlepage"><div><div><h5 class="title"><a name="d2475e7682"></a>11.9.8.2.1.&nbsp;<code class="varname">io.hfile.bloom.enabled</code> global kill
         switch</h5></div></div></div><p><code class="code">io.hfile.bloom.enabled</code> in
         <code class="classname">Configuration</code> serves as the kill switch in case
-        something goes wrong. Default = <code class="varname">true</code>.</p></div><div class="section" title="11.9.8.2.2.&nbsp;io.hfile.bloom.error.rate"><div class="titlepage"><div><div><h5 class="title"><a name="d2279e7453"></a>11.9.8.2.2.&nbsp;<code class="varname">io.hfile.bloom.error.rate</code></h5></div></div></div><p><code class="varname">io.hfile.bloom.error.rate</code> = average false
+        something goes wrong. Default = <code class="varname">true</code>.</p></div><div class="section" title="11.9.8.2.2.&nbsp;io.hfile.bloom.error.rate"><div class="titlepage"><div><div><h5 class="title"><a name="d2475e7697"></a>11.9.8.2.2.&nbsp;<code class="varname">io.hfile.bloom.error.rate</code></h5></div></div></div><p><code class="varname">io.hfile.bloom.error.rate</code> = average false
         positive rate. Default = 1%. Decrease rate by &frac12; (e.g. to .5%) == +1
-        bit per bloom entry.</p></div><div class="section" title="11.9.8.2.3.&nbsp;io.hfile.bloom.max.fold"><div class="titlepage"><div><div><h5 class="title"><a name="d2279e7461"></a>11.9.8.2.3.&nbsp;<code class="varname">io.hfile.bloom.max.fold</code></h5></div></div></div><p><code class="varname">io.hfile.bloom.max.fold</code> = guaranteed minimum
+        bit per bloom entry.</p></div><div class="section" title="11.9.8.2.3.&nbsp;io.hfile.bloom.max.fold"><div class="titlepage"><div><div><h5 class="title"><a name="d2475e7705"></a>11.9.8.2.3.&nbsp;<code class="varname">io.hfile.bloom.max.fold</code></h5></div></div></div><p><code class="varname">io.hfile.bloom.max.fold</code> = guaranteed minimum
         fold rate. Most people should leave this alone. Default = 7, or can
         collapse to at least 1/128th of original size. See the
         <span class="emphasis"><em>Development Process</em></span> section of the document <a class="link" href="https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf" target="_top">BloomFilters
-        in HBase</a> for more on what this option means.</p></div></div></div><div class="footnotes"><br><hr width="100" align="left"><div class="footnote"><p><sup>[<a id="ftn.d2279e7358" href="#d2279e7358" class="para">25</a>] </sup>For description of the development process -- why static blooms
+        in HBase</a> for more on what this option means.</p></div></div></div><div class="footnotes"><br><hr width="100" align="left"><div class="footnote"><p><sup>[<a id="ftn.d2475e7602" href="#d2475e7602" class="para">25</a>] </sup>For description of the development process -- why static blooms
         rather than dynamic -- and for an overview of the unique properties
         that pertain to blooms in HBase, as well as possible future
         directions, see the <span class="emphasis"><em>Development Process</em></span> section
         of the document <a class="link" href="https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf" target="_top">BloomFilters
-        in HBase</a> attached to <a class="link" href="https://issues.apache.org/jira/browse/HBASE-1200" target="_top">HBase-1200</a>.</p></div><div class="footnote"><p><sup>[<a id="ftn.d2279e7370" href="#d2279e7370" class="para">26</a>] </sup>The bloom filters described here are actually version two of
+        in HBase</a> attached to <a class="link" href="https://issues.apache.org/jira/browse/HBASE-1200" target="_top">HBase-1200</a>.</p></div><div class="footnote"><p><sup>[<a id="ftn.d2475e7614" href="#d2475e7614" class="para">26</a>] </sup>The bloom filters described here are actually version two of
         blooms in HBase. In versions up to 0.19.x, HBase had a dynamic bloom
         option based on work done by the <a class="link" href="http://www.one-lab.org" target="_top">European Commission One-Lab
         Project 034819</a>. The core of the HBase bloom work was later
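
As a concrete illustration of the Bloom filter sections above, the following sketch enables a ROW bloom on a column family at table-creation time; the table and family names are invented, and the HColumnDescriptor call shown is assumed from the 0.92/0.94-era API. The io.hfile.bloom.enabled kill switch and io.hfile.bloom.error.rate discussed above are server-side settings, not something this client code controls.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.regionserver.StoreFile;

    public class BloomConfigExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        HTableDescriptor htd = new HTableDescriptor("myTable");   // hypothetical table
        HColumnDescriptor hcd = new HColumnDescriptor("cf");      // hypothetical family
        hcd.setBloomFilterType(StoreFile.BloomType.ROW);          // NONE, ROW or ROWCOL
        htd.addFamily(hcd);

        HBaseAdmin admin = new HBaseAdmin(conf);
        admin.createTable(htd);
        admin.close();
      }
    }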

Modified: hbase/hbase.apache.org/trunk/book/perf.schema.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/perf.schema.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/perf.schema.html (original)
+++ hbase/hbase.apache.org/trunk/book/perf.schema.html Tue Apr  2 18:06:19 2013
@@ -21,7 +21,7 @@
     There is an inverse relationship between blocksize and the resulting StoreFile indexes (i.e., if the blocksize is doubled then the resulting
     indexes should be roughly halved).
     </p><p>See <a class="link" href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html" target="_top">HColumnDescriptor</a>
-    and <a class="xref" href="regions.arch.html#store" title="9.7.5.&nbsp;Store">Section&nbsp;9.7.5, &#8220;Store&#8221;</a>for more information.
+    and <a class="xref" href="regions.arch.html#store" title="9.7.6.&nbsp;Store">Section&nbsp;9.7.6, &#8220;Store&#8221;</a> for more information.
     </p></div><div class="section" title="11.6.6.&nbsp;In-Memory ColumnFamilies"><div class="titlepage"><div><div><h3 class="title"><a name="cf.in.memory"></a>11.6.6.&nbsp;In-Memory ColumnFamilies</h3></div></div></div><p>ColumnFamilies can optionally be defined as in-memory.  Data is still persisted to disk, just like any other ColumnFamily.
     In-memory blocks have the highest priority in the <a class="xref" href="regionserver.arch.html#block.cache" title="9.6.4.&nbsp;Block Cache">Section&nbsp;9.6.4, &#8220;Block Cache&#8221;</a>, but it is not a guarantee that the entire table
     will be in memory.
@@ -31,7 +31,7 @@
          MemStore) or on the wire (e.g., transferring between RegionServer and Client) it's inflated.
         So while using ColumnFamily compression is a best practice, it's not going to completely eliminate
          the impact of over-sized Keys, over-sized ColumnFamily names, or over-sized Column names.
-         </p><p>See <a class="xref" href="rowkey.design.html#keysize" title="6.3.2.&nbsp;Try to minimize row and column sizes">Section&nbsp;6.3.2, &#8220;Try to minimize row and column sizes&#8221;</a> on for schema design tips, and <a class="xref" href="regions.arch.html#keyvalue" title="9.7.5.4.&nbsp;KeyValue">Section&nbsp;9.7.5.4, &#8220;KeyValue&#8221;</a> for more information on HBase stores data internally.
+         </p><p>See <a class="xref" href="rowkey.design.html#keysize" title="6.3.2.&nbsp;Try to minimize row and column sizes">Section&nbsp;6.3.2, &#8220;Try to minimize row and column sizes&#8221;</a> for schema design tips, and <a class="xref" href="regions.arch.html#keyvalue" title="9.7.6.4.&nbsp;KeyValue">Section&nbsp;9.7.6.4, &#8220;KeyValue&#8221;</a> for more information on how HBase stores data internally.
          </p></div></div></div><div id="disqus_thread"></div><script type="text/javascript">
     var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
     var disqus_url = 'http://hbase.apache.org/book';
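
Tying the blocksize and in-memory notes above together: both are per-ColumnFamily attributes on HColumnDescriptor. A small, hypothetical sketch (invented family name, 0.94-era API):

    import org.apache.hadoop.hbase.HColumnDescriptor;

    public class FamilyTuningExample {
      public static void main(String[] args) {
        HColumnDescriptor hcd = new HColumnDescriptor("cf");   // hypothetical family
        hcd.setBlocksize(128 * 1024);   // larger blocks -> roughly smaller StoreFile indexes
        hcd.setInMemory(true);          // highest block-cache priority, not a pinning guarantee
        System.out.println(hcd);
      }
    }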

Modified: hbase/hbase.apache.org/trunk/book/physical.view.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/physical.view.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/physical.view.html (original)
+++ hbase/hbase.apache.org/trunk/book/physical.view.html Tue Apr  2 18:06:19 2013
@@ -5,8 +5,8 @@
         Physically they are stored on a per-column family basis.  New columns
         (i.e., <code class="varname">columnfamily:column</code>) can be added to any
         column family without pre-announcing them.
-        </p><div class="table"><a name="d2279e3126"></a><p class="title"><b>Table&nbsp;5.2.&nbsp;ColumnFamily <code class="varname">anchor</code></b></p><div class="table-contents"><table summary="ColumnFamily anchor" border="1"><colgroup><col align="left" class="c1"><col align="left" class="c2"><col align="left" class="c3"></colgroup><thead><tr><th align="left">Row Key</th><th align="left">Time Stamp</th><th align="left">Column Family <code class="varname">anchor</code></th></tr></thead><tbody><tr><td align="left">"com.cnn.www"</td><td align="left">t9</td><td align="left"><code class="varname">anchor:cnnsi.com</code> = "CNN"</td></tr><tr><td align="left">"com.cnn.www"</td><td align="left">t8</td><td align="left"><code class="varname">anchor:my.look.ca</code> = "CNN.com"</td></tr></tbody></table></div></div><p><br class="table-break">
-    </p><div class="table"><a name="d2279e3165"></a><p class="title"><b>Table&nbsp;5.3.&nbsp;ColumnFamily <code class="varname">contents</code></b></p><div class="table-contents"><table summary="ColumnFamily contents" border="1"><colgroup><col align="left" class="c1"><col align="left" class="c2"><col align="left" class="c3"></colgroup><thead><tr><th align="left">Row Key</th><th align="left">Time Stamp</th><th align="left">ColumnFamily "contents:"</th></tr></thead><tbody><tr><td align="left">"com.cnn.www"</td><td align="left">t6</td><td align="left"><code class="varname">contents:html</code> = "&lt;html&gt;..."</td></tr><tr><td align="left">"com.cnn.www"</td><td align="left">t5</td><td align="left"><code class="varname">contents:html</code> = "&lt;html&gt;..."</td></tr><tr><td align="left">"com.cnn.www"</td><td align="left">t3</td><td align="left"><code class="varname">contents:html</code> = "&lt;html&gt;..."</td></tr></tbody></table></div></div><p><br class="table-break">
+        </p><div class="table"><a name="d2475e3130"></a><p class="title"><b>Table&nbsp;5.2.&nbsp;ColumnFamily <code class="varname">anchor</code></b></p><div class="table-contents"><table summary="ColumnFamily anchor" border="1"><colgroup><col align="left" class="c1"><col align="left" class="c2"><col align="left" class="c3"></colgroup><thead><tr><th align="left">Row Key</th><th align="left">Time Stamp</th><th align="left">Column Family <code class="varname">anchor</code></th></tr></thead><tbody><tr><td align="left">"com.cnn.www"</td><td align="left">t9</td><td align="left"><code class="varname">anchor:cnnsi.com</code> = "CNN"</td></tr><tr><td align="left">"com.cnn.www"</td><td align="left">t8</td><td align="left"><code class="varname">anchor:my.look.ca</code> = "CNN.com"</td></tr></tbody></table></div></div><p><br class="table-break">
+    </p><div class="table"><a name="d2475e3169"></a><p class="title"><b>Table&nbsp;5.3.&nbsp;ColumnFamily <code class="varname">contents</code></b></p><div class="table-contents"><table summary="ColumnFamily contents" border="1"><colgroup><col align="left" class="c1"><col align="left" class="c2"><col align="left" class="c3"></colgroup><thead><tr><th align="left">Row Key</th><th align="left">Time Stamp</th><th align="left">ColumnFamily "contents:"</th></tr></thead><tbody><tr><td align="left">"com.cnn.www"</td><td align="left">t6</td><td align="left"><code class="varname">contents:html</code> = "&lt;html&gt;..."</td></tr><tr><td align="left">"com.cnn.www"</td><td align="left">t5</td><td align="left"><code class="varname">contents:html</code> = "&lt;html&gt;..."</td></tr><tr><td align="left">"com.cnn.www"</td><td align="left">t3</td><td align="left"><code class="varname">contents:html</code> = "&lt;html&gt;..."</td></tr></tbody></table></div></div><p><br class="table-break">
     It is important to note in the diagram above that the empty cells shown in the
     conceptual view are not stored since they need not be in a column-oriented
     storage format. Thus a request for the value of the <code class="varname">contents:html</code>

Modified: hbase/hbase.apache.org/trunk/book/preface.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/preface.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/preface.html (original)
+++ hbase/hbase.apache.org/trunk/book/preface.html Tue Apr  2 18:06:19 2013
@@ -1,7 +1,7 @@
 <html><head>
       <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>Preface</title><link rel="stylesheet" type="text/css" href="../css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL-NS Stylesheets V1.76.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="prev" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="next" href="getting_started.html" title="Chapter&nbsp;1.&nbsp;Getting Started"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Preface</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="book.html">Prev</a>&nbsp;</td><th width="60%" align="center">&nbsp;</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="getting_started.html">Next</a></td></tr></table><hr></div><div class="preface" title="Preface
 "><div class="titlepage"><div><div><h2 class="title"><a name="preface"></a>Preface</h2></div></div></div><p>This is the official reference guide for the <a class="link" href="http://hbase.apache.org/" target="_top">HBase</a> version it ships with.
-  This document describes HBase version <span class="emphasis"><em>0.97-SNAPSHOT</em></span>.
+  This document describes HBase version <span class="emphasis"><em>0.97.0-SNAPSHOT</em></span>.
   Herein you will find either the definitive documentation on an HBase topic
   as of its standing when the referenced HBase version shipped, or it
   will point to the location in <a class="link" href="http://hbase.apache.org/apidocs/index.html" target="_top">javadoc</a>,

Modified: hbase/hbase.apache.org/trunk/book/quickstart.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/quickstart.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/quickstart.html (original)
+++ hbase/hbase.apache.org/trunk/book/quickstart.html Tue Apr  2 18:06:19 2013
@@ -11,14 +11,14 @@
             127.0.0.1 localhost
             127.0.0.1 ubuntu.ubuntu-domain ubuntu
 </pre><p>
-        </p></div><div class="section" title="1.2.1.&nbsp;Download and unpack the latest stable release."><div class="titlepage"><div><div><h3 class="title"><a name="d2279e105"></a>1.2.1.&nbsp;Download and unpack the latest stable release.</h3></div></div></div><p>Choose a download site from this list of <a class="link" href="http://www.apache.org/dyn/closer.cgi/hbase/" target="_top">Apache Download
+        </p></div><div class="section" title="1.2.1.&nbsp;Download and unpack the latest stable release."><div class="titlepage"><div><div><h3 class="title"><a name="d2475e105"></a>1.2.1.&nbsp;Download and unpack the latest stable release.</h3></div></div></div><p>Choose a download site from this list of <a class="link" href="http://www.apache.org/dyn/closer.cgi/hbase/" target="_top">Apache Download
       Mirrors</a>. Click on the suggested top link. This will take you to a
       mirror of <span class="emphasis"><em>HBase Releases</em></span>. Click on the folder named
       <code class="filename">stable</code> and then download the file that ends in
       <code class="filename">.tar.gz</code> to your local filesystem; e.g.
       <code class="filename">hbase-0.94.2.tar.gz</code>.</p><p>Decompress and untar your download and then change into the
-      unpacked directory.</p><pre class="programlisting">$ tar xfz hbase-0.97-SNAPSHOT.tar.gz
-$ cd hbase-0.97-SNAPSHOT
+      unpacked directory.</p><pre class="programlisting">$ tar xfz hbase-0.97.0-SNAPSHOT.tar.gz
+$ cd hbase-0.97.0-SNAPSHOT
 </pre><p>At this point, you are ready to start HBase. But before starting
       it, edit <code class="filename">conf/hbase-site.xml</code>, the file you write
       your site-specific configurations into. Set
@@ -93,7 +93,7 @@ cf:a        timestamp=1288380727188, val
 0 row(s) in 1.0930 seconds
 hbase(main):013:0&gt; drop 'test'
 0 row(s) in 0.0770 seconds </pre><p>Exit the shell by typing exit.</p><pre class="programlisting">hbase(main):014:0&gt; exit</pre></div><div class="section" title="1.2.4.&nbsp;Stopping HBase"><div class="titlepage"><div><div><h3 class="title"><a name="stopping"></a>1.2.4.&nbsp;Stopping HBase</h3></div></div></div><p>Stop your hbase instance by running the stop script.</p><pre class="programlisting">$ ./bin/stop-hbase.sh
-stopping hbase...............</pre></div><div class="section" title="1.2.5.&nbsp;Where to go next"><div class="titlepage"><div><div><h3 class="title"><a name="d2279e265"></a>1.2.5.&nbsp;Where to go next</h3></div></div></div><p>The above described standalone setup is good for testing and
+stopping hbase...............</pre></div><div class="section" title="1.2.5.&nbsp;Where to go next"><div class="titlepage"><div><div><h3 class="title"><a name="d2475e265"></a>1.2.5.&nbsp;Where to go next</h3></div></div></div><p>The standalone setup described above is good for testing and
           experiments only. In the next chapter, <a class="xref" href="configuration.html" title="Chapter&nbsp;2.&nbsp;Apache HBase (TM) Configuration">Chapter&nbsp;2, <i>Apache HBase (TM) Configuration</i></a>,
      we'll go into depth on the different HBase run modes, the system requirements for
      running HBase, and the critical configurations for setting up a distributed HBase deploy.</p></div></div><div id="disqus_thread"></div><script type="text/javascript">

Modified: hbase/hbase.apache.org/trunk/book/regions.arch.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/regions.arch.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/regions.arch.html (original)
+++ hbase/hbase.apache.org/trunk/book/regions.arch.html Tue Apr  2 18:06:19 2013
@@ -57,24 +57,39 @@
           to the RegionServer.
         </p><p>For more information, see <a class="link" href="http://hadoop.apache.org/common/docs/r0.20.205.0/hdfs_design.html#Replica+Placement%3A+The+First+Baby+Steps" target="_top">HDFS Design on Replica Placement</a>
         and also Lars George's blog on <a class="link" href="http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html" target="_top">HBase and HDFS locality</a>.
-        </p></div><div class="section" title="9.7.4.&nbsp;Region Splits"><div class="titlepage"><div><div><h3 class="title"><a name="d2279e5530"></a>9.7.4.&nbsp;Region Splits</h3></div></div></div><p>Splits run unaided on the RegionServer; i.e. the Master does not
+        </p></div><div class="section" title="9.7.4.&nbsp;Region Splits"><div class="titlepage"><div><div><h3 class="title"><a name="d2475e5763"></a>9.7.4.&nbsp;Region Splits</h3></div></div></div><p>Splits run unaided on the RegionServer; i.e. the Master does not
         participate. The RegionServer splits a region, offlines the split
         region and then adds the daughter regions to META, opens daughters on
         the parent's hosting RegionServer and then reports the split to the
         Master. See <a class="xref" href="important_configurations.html#disable.splitting" title="2.5.2.7.&nbsp;Managed Splitting">Section&nbsp;2.5.2.7, &#8220;Managed Splitting&#8221;</a> for how to manually manage
-        splits (and for why you might do this)</p><div class="section" title="9.7.4.1.&nbsp;Custom Split Policies"><div class="titlepage"><div><div><h4 class="title"><a name="d2279e5537"></a>9.7.4.1.&nbsp;Custom Split Policies</h4></div></div></div><p>The default split policy can be overwritten using a custom <a class="link" href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/RegionSplitPolicy.html" target="_top">RegionSplitPolicy</a> (HBase 0.94+).
+        splits (and for why you might do this).</p><div class="section" title="9.7.4.1.&nbsp;Custom Split Policies"><div class="titlepage"><div><div><h4 class="title"><a name="d2475e5770"></a>9.7.4.1.&nbsp;Custom Split Policies</h4></div></div></div><p>The default split policy can be overridden using a custom <a class="link" href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/RegionSplitPolicy.html" target="_top">RegionSplitPolicy</a> (HBase 0.94+).
           Typically a custom split policy should extend HBase's default split policy: <a class="link" href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.html" target="_top">ConstantSizeRegionSplitPolicy</a>.
          </p><p>The policy can be set globally through the HBaseConfiguration used, or on a per-table basis:
 </p><pre class="programlisting">
 HTableDescriptor myHtd = ...;
 myHtd.setValue(HTableDescriptor.SPLIT_POLICY, MyCustomSplitPolicy.class.getName());
 </pre><p>
-          </p></div></div><div class="section" title="9.7.5.&nbsp;Store"><div class="titlepage"><div><div><h3 class="title"><a name="store"></a>9.7.5.&nbsp;Store</h3></div></div></div><p>A Store hosts a MemStore and 0 or more StoreFiles (HFiles). A Store corresponds to a column family for a table for a given region.
-          </p><div class="section" title="9.7.5.1.&nbsp;MemStore"><div class="titlepage"><div><div><h4 class="title"><a name="store.memstore"></a>9.7.5.1.&nbsp;MemStore</h4></div></div></div><p>The MemStore holds in-memory modifications to the Store.  Modifications are KeyValues.
+          </p></div></div><div class="section" title="9.7.5.&nbsp;Online Region Merges"><div class="titlepage"><div><div><h3 class="title"><a name="d2475e5786"></a>9.7.5.&nbsp;Online Region Merges</h3></div></div></div><p>Both the Master and the RegionServer participate in online region merges.
+        The client sends a merge RPC to the Master; the Master then moves the regions to the
+        RegionServer hosting the more heavily loaded of the two regions, and finally sends the
+        merge request to that RegionServer, which runs the merge.
+        As with region splits, the merge runs as a local transaction on the RegionServer:
+        it offlines the regions, merges them on the file system, atomically deletes the merging
+        regions from META and adds the merged region, opens the merged region on the
+        RegionServer, and finally reports the merge to the Master.
+        </p><p>An example of merging two regions in the HBase shell:
+          </p><pre class="programlisting">hbase&gt; merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME'
+          hbase&gt; merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME', true
+          </pre><p>
+          This is an asynchronous operation; the call returns immediately, without waiting for the merge to complete.
+          Passing 'true' as the optional third parameter forces the merge; without it, the merge
+          will fail unless the two regions are adjacent. 'force' is for expert use only.
+        </p></div><div class="section" title="9.7.6.&nbsp;Store"><div class="titlepage"><div><div><h3 class="title"><a name="store"></a>9.7.6.&nbsp;Store</h3></div></div></div><p>A Store hosts a MemStore and 0 or more StoreFiles (HFiles). A Store corresponds to a column family for a table for a given region.
+          </p><div class="section" title="9.7.6.1.&nbsp;MemStore"><div class="titlepage"><div><div><h4 class="title"><a name="store.memstore"></a>9.7.6.1.&nbsp;MemStore</h4></div></div></div><p>The MemStore holds in-memory modifications to the Store.  Modifications are KeyValues.
        When asked to flush, current memstore is moved to snapshot and is cleared.
        HBase continues to serve edits out of new memstore and backing snapshot until flusher reports in that the
-       flush succeeded. At this point the snapshot is let go.</p></div><div class="section" title="9.7.5.2.&nbsp;StoreFile (HFile)"><div class="titlepage"><div><div><h4 class="title"><a name="hfile"></a>9.7.5.2.&nbsp;StoreFile (HFile)</h4></div></div></div><p>StoreFiles are where your data lives.
-      </p><div class="section" title="9.7.5.2.1.&nbsp;HFile Format"><div class="titlepage"><div><div><h5 class="title"><a name="d2279e5568"></a>9.7.5.2.1.&nbsp;HFile Format</h5></div></div></div><p>The <span class="emphasis"><em>hfile</em></span> file format is based on
+       flush succeeded. At this point the snapshot is let go.</p></div><div class="section" title="9.7.6.2.&nbsp;StoreFile (HFile)"><div class="titlepage"><div><div><h4 class="title"><a name="hfile"></a>9.7.6.2.&nbsp;StoreFile (HFile)</h4></div></div></div><p>StoreFiles are where your data lives.
+      </p><div class="section" title="9.7.6.2.1.&nbsp;HFile Format"><div class="titlepage"><div><div><h5 class="title"><a name="d2475e5811"></a>9.7.6.2.1.&nbsp;HFile Format</h5></div></div></div><p>The <span class="emphasis"><em>hfile</em></span> file format is based on
               the SSTable file described in the <a class="link" href="http://research.google.com/archive/bigtable.html" target="_top">BigTable [2006]</a> paper and on
               Hadoop's <a class="link" href="http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/file/tfile/TFile.html" target="_top">tfile</a>
               (The unit test suite and the compression harness were taken directly from tfile).
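
Returning to the custom split policy in 9.7.4.1 above: a minimal, hypothetical sketch of a policy extending ConstantSizeRegionSplitPolicy. The class name and the "never split" rule are invented; the protected shouldSplit() hook and region field are assumed from the 0.94-era RegionSplitPolicy base class.

    import org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy;

    public class MyCustomSplitPolicy extends ConstantSizeRegionSplitPolicy {
      @Override
      protected boolean shouldSplit() {
        // Hypothetical rule: never split tables whose name starts with "static_".
        String tableName = region.getTableDesc().getNameAsString();
        if (tableName.startsWith("static_")) {
          return false;
        }
        return super.shouldSplit();   // otherwise defer to the size-based default
      }
    }
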
@@ -82,7 +97,7 @@ myHtd.setValue(HTableDescriptor.SPLIT_PO
               helpful description, <a class="link" href="http://th30z.blogspot.com/2011/02/hbase-io-hfile.html?spref=tw" target="_top">HBase I/O: HFile</a>.
           </p><p>For more information, see the <a class="link" href="http://hbase.apache.org/xref/org/apache/hadoop/hbase/io/hfile/HFile.html" target="_top">HFile source code</a>.
           Also see <a class="xref" href="hfilev2.html" title="Appendix&nbsp;E.&nbsp;HFile format version 2">Appendix&nbsp;E, <i>HFile format version 2</i></a> for information about the HFile v2 format that was included in 0.92.
-          </p></div><div class="section" title="9.7.5.2.2.&nbsp;HFile Tool"><div class="titlepage"><div><div><h5 class="title"><a name="hfile_tool"></a>9.7.5.2.2.&nbsp;HFile Tool</h5></div></div></div><p>To view a textualized version of hfile content, you can do use
+          </p></div><div class="section" title="9.7.6.2.2.&nbsp;HFile Tool"><div class="titlepage"><div><div><h5 class="title"><a name="hfile_tool"></a>9.7.6.2.2.&nbsp;HFile Tool</h5></div></div></div><p>To view a textualized version of hfile content, you can use
         the <code class="classname">org.apache.hadoop.hbase.io.hfile.HFile
         </code>tool. Type the following to see usage:</p><pre class="programlisting"><code class="code">$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile </code> </pre><p>For
         example, to view the content of the file
@@ -90,11 +105,11 @@ myHtd.setValue(HTableDescriptor.SPLIT_PO
         type the following:</p><pre class="programlisting"> <code class="code">$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f hdfs://10.81.47.41:8020/hbase/TEST/1418428042/DSMP/4759508618286845475 </code> </pre><p>If
        you leave off the option -v, you will see just a summary of the hfile. See
         usage for other things to do with the <code class="classname">HFile</code>
-        tool.</p></div><div class="section" title="9.7.5.2.3.&nbsp;StoreFile Directory Structure on HDFS"><div class="titlepage"><div><div><h5 class="title"><a name="store.file.dir"></a>9.7.5.2.3.&nbsp;StoreFile Directory Structure on HDFS</h5></div></div></div><p>For more information of what StoreFiles look like on HDFS with respect to the directory structure, see <a class="xref" href="trouble.namenode.html#trouble.namenode.hbase.objects" title="12.7.2.&nbsp;Browsing HDFS for HBase Objects">Section&nbsp;12.7.2, &#8220;Browsing HDFS for HBase Objects&#8221;</a>.
-        </p></div></div><div class="section" title="9.7.5.3.&nbsp;Blocks"><div class="titlepage"><div><div><h4 class="title"><a name="hfile.blocks"></a>9.7.5.3.&nbsp;Blocks</h4></div></div></div><p>StoreFiles are composed of blocks.  The blocksize is configured on a per-ColumnFamily basis.
+        tool.</p></div><div class="section" title="9.7.6.2.3.&nbsp;StoreFile Directory Structure on HDFS"><div class="titlepage"><div><div><h5 class="title"><a name="store.file.dir"></a>9.7.6.2.3.&nbsp;StoreFile Directory Structure on HDFS</h5></div></div></div><p>For more information on what StoreFiles look like on HDFS with respect to the directory structure, see <a class="xref" href="trouble.namenode.html#trouble.namenode.hbase.objects" title="12.7.2.&nbsp;Browsing HDFS for HBase Objects">Section&nbsp;12.7.2, &#8220;Browsing HDFS for HBase Objects&#8221;</a>.
+        </p></div></div><div class="section" title="9.7.6.3.&nbsp;Blocks"><div class="titlepage"><div><div><h4 class="title"><a name="hfile.blocks"></a>9.7.6.3.&nbsp;Blocks</h4></div></div></div><p>StoreFiles are composed of blocks.  The blocksize is configured on a per-ColumnFamily basis.
         </p><p>Compression happens at the block level within StoreFiles.  For more information on compression, see <a class="xref" href="compression.html" title="Appendix&nbsp;C.&nbsp;Compression In HBase">Appendix&nbsp;C, <i>Compression In HBase</i></a>.
         </p><p>For more information on blocks, see the <a class="link" href="http://hbase.apache.org/xref/org/apache/hadoop/hbase/io/hfile/HFileBlock.html" target="_top">HFileBlock source code</a>.
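          Since the block size is a per-ColumnFamily setting, a minimal sketch of adjusting it might look like the following (assuming the 0.94-era Java client API; the table name "t1" and family name "d" are illustrative):
<pre class="programlisting">
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;

public class BlockSizeSketch {
  public static void main(String[] args) {
    // Illustrative names only; the block size is configured per ColumnFamily.
    HTableDescriptor htd = new HTableDescriptor("t1");
    HColumnDescriptor hcd = new HColumnDescriptor("d");
    hcd.setBlocksize(32 * 1024);   // 32 KB blocks instead of the 64 KB default
    htd.addFamily(hcd);
    System.out.println(htd);       // the descriptor now carries BLOCKSIZE for "d"
  }
}
</pre>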
-        </p></div><div class="section" title="9.7.5.4.&nbsp;KeyValue"><div class="titlepage"><div><div><h4 class="title"><a name="keyvalue"></a>9.7.5.4.&nbsp;KeyValue</h4></div></div></div><p>The KeyValue class is the heart of data storage in HBase.  KeyValue wraps a byte array and takes offsets and lengths into the passed array
+        </p></div><div class="section" title="9.7.6.4.&nbsp;KeyValue"><div class="titlepage"><div><div><h4 class="title"><a name="keyvalue"></a>9.7.6.4.&nbsp;KeyValue</h4></div></div></div><p>The KeyValue class is the heart of data storage in HBase.  KeyValue wraps a byte array and takes offsets and lengths into the passed array
          that indicate where to start interpreting the content as a KeyValue.
         </p><p>The KeyValue format inside a byte array is:
            </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">keylength</li><li class="listitem">valuelength</li><li class="listitem">key</li><li class="listitem">value</li></ul></div><p>
@@ -103,19 +118,19 @@ myHtd.setValue(HTableDescriptor.SPLIT_PO
         </p><p>KeyValue instances are <span class="emphasis"><em>not</em></span> split across blocks.
         For example, if there is an 8 MB KeyValue, even if the block size is 64 KB, this KeyValue will be read
          in as a coherent block.  For more information, see the <a class="link" href="http://hbase.apache.org/xref/org/apache/hadoop/hbase/KeyValue.html" target="_top">KeyValue source code</a>.
-        </p><div class="section" title="9.7.5.4.1.&nbsp;Example"><div class="titlepage"><div><div><h5 class="title"><a name="keyvalue.example"></a>9.7.5.4.1.&nbsp;Example</h5></div></div></div><p>To emphasize the points above, examine what happens with two Puts for two different columns for the same row:</p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">Put #1:  <code class="code">rowkey=row1, cf:attr1=value1</code></li><li class="listitem">Put #2:  <code class="code">rowkey=row1, cf:attr2=value2</code></li></ul></div><p>Even though these are for the same row, a KeyValue is created for each column:</p><p>Key portion for Put #1:
+        </p><div class="section" title="9.7.6.4.1.&nbsp;Example"><div class="titlepage"><div><div><h5 class="title"><a name="keyvalue.example"></a>9.7.6.4.1.&nbsp;Example</h5></div></div></div><p>To emphasize the points above, examine what happens with two Puts for two different columns for the same row:</p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">Put #1:  <code class="code">rowkey=row1, cf:attr1=value1</code></li><li class="listitem">Put #2:  <code class="code">rowkey=row1, cf:attr2=value2</code></li></ul></div><p>Even though these are for the same row, a KeyValue is created for each column:</p><p>Key portion for Put #1:
            </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">rowlength <code class="code">------------&gt; 4</code></li><li class="listitem">row <code class="code">-----------------&gt; row1</code></li><li class="listitem">columnfamilylength <code class="code">---&gt; 2</code></li><li class="listitem">columnfamily <code class="code">--------&gt; cf</code></li><li class="listitem">columnqualifier <code class="code">------&gt; attr1</code></li><li class="listitem">timestamp <code class="code">-----------&gt; server time of Put</code></li><li class="listitem">keytype <code class="code">-------------&gt; Put</code></li></ul></div><p>
           </p><p>Key portion for Put #2:
            </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">rowlength <code class="code">------------&gt; 4</code></li><li class="listitem">row <code class="code">-----------------&gt; row1</code></li><li class="listitem">columnfamilylength <code class="code">---&gt; 2</code></li><li class="listitem">columnfamily <code class="code">--------&gt; cf</code></li><li class="listitem">columnqualifier <code class="code">------&gt; attr2</code></li><li class="listitem">timestamp <code class="code">-----------&gt; server time of Put</code></li><li class="listitem">keytype <code class="code">-------------&gt; Put</code></li></ul></div><p>
            
           </p></div><p>It is critical to understand that the rowkey, ColumnFamily, and column (aka columnqualifier) are embedded within
-       the KeyValue instance.  The longer these identifiers are, the bigger the KeyValue is.</p></div><div class="section" title="9.7.5.5.&nbsp;Compaction"><div class="titlepage"><div><div><h4 class="title"><a name="compaction"></a>9.7.5.5.&nbsp;Compaction</h4></div></div></div><p>There are two types of compactions:  minor and major.  Minor compactions will usually pick up a couple of the smaller adjacent
+       the KeyValue instance.  The longer these identifiers are, the bigger the KeyValue is.</p></div><div class="section" title="9.7.6.5.&nbsp;Compaction"><div class="titlepage"><div><div><h4 class="title"><a name="compaction"></a>9.7.6.5.&nbsp;Compaction</h4></div></div></div><p>There are two types of compactions:  minor and major.  Minor compactions will usually pick up a couple of the smaller adjacent
         StoreFiles and rewrite them as one.  Minors do not drop deletes or expired cells; only major compactions do this.  Sometimes a minor compaction
         will pick up all the StoreFiles in the Store, and in this case it actually promotes itself to a major compaction.
         </p><p>After a major compaction runs there will be a single StoreFile per Store, and this usually helps performance.  Caution:  major compactions rewrite all of the Store's data, and on a loaded system this may not be tenable;
              major compactions will usually have to be done manually on large systems.  See <a class="xref" href="important_configurations.html#managed.compactions" title="2.5.2.8.&nbsp;Managed Compactions">Section&nbsp;2.5.2.8, &#8220;Managed Compactions&#8221;</a>.
         </p><p>Compactions will <span class="emphasis"><em>not</em></span> perform region merges.  See <a class="xref" href="ops.regionmgt.html#ops.regionmgt.merge" title="14.2.2.&nbsp;Merge">Section&nbsp;14.2.2, &#8220;Merge&#8221;</a> for more information on region merging.
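         A hedged sketch of requesting such a manual major compaction from client code, assuming the 0.94-era HBaseAdmin API and an illustrative table name (the shell's major_compact command does the same thing):
<pre class="programlisting">
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class MajorCompactSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    // The request is asynchronous: the compaction is queued on the RegionServers.
    admin.majorCompact("myTable");
    admin.close();
  }
}
</pre>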
-        </p><div class="section" title="9.7.5.5.1.&nbsp;Compaction File Selection"><div class="titlepage"><div><div><h5 class="title"><a name="compaction.file.selection"></a>9.7.5.5.1.&nbsp;Compaction File Selection</h5></div></div></div><p>To understand the core algorithm for StoreFile selection, there is some ASCII-art in the <a class="link" href="http://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/Store.html#836" target="_top">Store source code</a> that
+        </p><div class="section" title="9.7.6.5.1.&nbsp;Compaction File Selection"><div class="titlepage"><div><div><h5 class="title"><a name="compaction.file.selection"></a>9.7.6.5.1.&nbsp;Compaction File Selection</h5></div></div></div><p>To understand the core algorithm for StoreFile selection, there is some ASCII-art in the <a class="link" href="http://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/Store.html#836" target="_top">Store source code</a> that
          will serve as a useful reference.  It has been copied below:
 </p><pre class="programlisting">
 /* normal skew:
@@ -139,7 +154,7 @@ myHtd.setValue(HTableDescriptor.SPLIT_PO
            Any StoreFile larger than this setting will automatically be excluded from compaction (default Long.MAX_VALUE). </li></ul></div><p>
          </p><p>The minor compaction StoreFile selection logic is size based: a file is selected for compaction when its size is
            &lt;= sum(smaller_files) * <code class="code">hbase.hstore.compaction.ratio</code>.
-          </p></div><div class="section" title="9.7.5.5.2.&nbsp;Minor Compaction File Selection - Example #1 (Basic Example)"><div class="titlepage"><div><div><h5 class="title"><a name="compaction.file.selection.example1"></a>9.7.5.5.2.&nbsp;Minor Compaction File Selection - Example #1 (Basic Example)</h5></div></div></div><p>This example mirrors an example from the unit test <code class="code">TestCompactSelection</code>.
+          </p></div><div class="section" title="9.7.6.5.2.&nbsp;Minor Compaction File Selection - Example #1 (Basic Example)"><div class="titlepage"><div><div><h5 class="title"><a name="compaction.file.selection.example1"></a>9.7.6.5.2.&nbsp;Minor Compaction File Selection - Example #1 (Basic Example)</h5></div></div></div><p>This example mirrors an example from the unit test <code class="code">TestCompactSelection</code>.
           </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><code class="code">hbase.store.compaction.ratio</code> = 1.0f </li><li class="listitem"><code class="code">hbase.hstore.compaction.min</code> = 3 (files) </li><li class="listitem"><code class="code">hbase.hstore.compaction.max</code> = 5 (files) </li><li class="listitem"><code class="code">hbase.hstore.compaction.min.size</code> = 10 (bytes) </li><li class="listitem"><code class="code">hbase.hstore.compaction.max.size</code> = 1000 (bytes) </li></ul></div><p>
           The following StoreFiles exist: 100, 50, 23, 12, and 12 bytes apiece (oldest to newest).
           With the above parameters, the files that would be selected for minor compaction are 23, 12, and 12.
@@ -147,19 +162,19 @@ myHtd.setValue(HTableDescriptor.SPLIT_PO
           </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">100 --&gt;  No, because sum(50, 23, 12, 12) * 1.0 = 97. </li><li class="listitem">50 --&gt;  No, because sum(23, 12, 12) * 1.0 = 47. </li><li class="listitem">23 --&gt;  Yes, because sum(12, 12) * 1.0 = 24. </li><li class="listitem">12 --&gt;  Yes, because the previous file has been included, and because this
          does not exceed the max-file limit of 5. </li><li class="listitem">12 --&gt;  Yes, because the previous file has been included, and because this
          does not exceed the max-file limit of 5.</li></ul></div><p>
-          </p></div><div class="section" title="9.7.5.5.3.&nbsp;Minor Compaction File Selection - Example #2 (Not Enough Files To Compact)"><div class="titlepage"><div><div><h5 class="title"><a name="compaction.file.selection.example2"></a>9.7.5.5.3.&nbsp;Minor Compaction File Selection - Example #2 (Not Enough Files To Compact)</h5></div></div></div><p>This example mirrors an example from the unit test <code class="code">TestCompactSelection</code>.
+          </p></div><div class="section" title="9.7.6.5.3.&nbsp;Minor Compaction File Selection - Example #2 (Not Enough Files To Compact)"><div class="titlepage"><div><div><h5 class="title"><a name="compaction.file.selection.example2"></a>9.7.6.5.3.&nbsp;Minor Compaction File Selection - Example #2 (Not Enough Files To Compact)</h5></div></div></div><p>This example mirrors an example from the unit test <code class="code">TestCompactSelection</code>.
           </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><code class="code">hbase.store.compaction.ratio</code> = 1.0f </li><li class="listitem"><code class="code">hbase.hstore.compaction.min</code> = 3 (files) </li><li class="listitem"><code class="code">hbase.hstore.compaction.max</code> = 5 (files) </li><li class="listitem"><code class="code">hbase.hstore.compaction.min.size</code> = 10 (bytes) </li><li class="listitem"><code class="code">hbase.hstore.compaction.max.size</code> = 1000 (bytes) </li></ul></div><p>
           </p><p>The following StoreFiles exist: 100, 25, 12, and 12 bytes apiece (oldest to newest).
           With the above parameters, no compaction will be started.
           </p><p>Why?
           </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">100 --&gt; No, because sum(25, 12, 12) * 1.0 = 47</li><li class="listitem">25 --&gt;  No, because sum(12, 12) * 1.0 = 24</li><li class="listitem">12 --&gt;  No. Candidate because sum(12) * 1.0 = 12, there are only 2 files to compact and that is less than the threshold of 3</li><li class="listitem">12 --&gt;  No. Candidate because the previous StoreFile was, but there are not enough files to compact</li></ul></div><p>
-          </p></div><div class="section" title="9.7.5.5.4.&nbsp;Minor Compaction File Selection - Example #3 (Limiting Files To Compact)"><div class="titlepage"><div><div><h5 class="title"><a name="compaction.file.selection.example2"></a>9.7.5.5.4.&nbsp;Minor Compaction File Selection - Example #3 (Limiting Files To Compact)</h5></div></div></div><p>This example mirrors an example from the unit test <code class="code">TestCompactSelection</code>.
+          </p></div><div class="section" title="9.7.6.5.4.&nbsp;Minor Compaction File Selection - Example #3 (Limiting Files To Compact)"><div class="titlepage"><div><div><h5 class="title"><a name="compaction.file.selection.example2"></a>9.7.6.5.4.&nbsp;Minor Compaction File Selection - Example #3 (Limiting Files To Compact)</h5></div></div></div><p>This example mirrors an example from the unit test <code class="code">TestCompactSelection</code>.
           </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><code class="code">hbase.store.compaction.ratio</code> = 1.0f </li><li class="listitem"><code class="code">hbase.hstore.compaction.min</code> = 3 (files) </li><li class="listitem"><code class="code">hbase.hstore.compaction.max</code> = 5 (files) </li><li class="listitem"><code class="code">hbase.hstore.compaction.min.size</code> = 10 (bytes) </li><li class="listitem"><code class="code">hbase.hstore.compaction.max.size</code> = 1000 (bytes) </li></ul></div><p>
           The following StoreFiles exist: 7, 6, 5, 4, 3, 2, and 1 bytes apiece (oldest to newest).
           With the above parameters, the files that would be selected for minor compaction are 7, 6, 5, 4, 3.
           </p><p>Why?
           </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">7 --&gt;  Yes, because sum(6, 5, 4, 3, 2, 1) * 1.0 = 21.  Also, 7 is less than the min-size</li><li class="listitem">6 --&gt;  Yes, because sum(5, 4, 3, 2, 1) * 1.0 = 15.  Also, 6 is less than the min-size. </li><li class="listitem">5 --&gt;  Yes, because sum(4, 3, 2, 1) * 1.0 = 10.  Also, 5 is less than the min-size. </li><li class="listitem">4 --&gt;  Yes, because sum(3, 2, 1) * 1.0 = 6.  Also, 4 is less than the min-size. </li><li class="listitem">3 --&gt;  Yes, because sum(2, 1) * 1.0 = 3.  Also, 3 is less than the min-size. </li><li class="listitem">2 --&gt;  No.  Candidate because previous file was selected and 2 is less than the min-size, but the max-number of files to compact has been reached. </li><li class="listitem">1 --&gt;  No.  Candidate because previous file was selected and 1 is less than the min-size, but max-number of files to compact has been reached. </li></u
 l></div><p>
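          For readers who prefer code to prose, here is a toy Java model of the size-based selection rule -- a sketch only, not the actual Store implementation -- that reproduces the three examples above:
<pre class="programlisting">
import java.util.Arrays;

public class CompactSelectionSketch {
  // Toy model: walk from oldest to newest, skipping files that are too large
  // relative to the sum of the newer files; once a file fits, it and everything
  // newer is selected, capped at maxFiles and requiring at least minFiles.
  static long[] select(long[] sizesOldestFirst, double ratio,
      int minFiles, int maxFiles, long minSize, long maxSize) {
    int n = sizesOldestFirst.length;
    int start = 0;
    while (start < n) {
      long size = sizesOldestFirst[start];
      long sumNewer = 0;
      for (int j = start + 1; j < n; j++) {
        sumNewer += sizesOldestFirst[j];
      }
      boolean fits = size <= minSize                         // automatic include
          || (size <= maxSize && size <= sumNewer * ratio);  // size-based rule
      if (fits) {
        break;
      }
      start++;
    }
    long[] selected = Arrays.copyOfRange(sizesOldestFirst, start, Math.min(n, start + maxFiles));
    return selected.length >= minFiles ? selected : new long[0];
  }

  public static void main(String[] args) {
    // Example #1: prints [23, 12, 12]
    System.out.println(Arrays.toString(select(new long[] {100, 50, 23, 12, 12}, 1.0, 3, 5, 10, 1000)));
    // Example #2: prints [] (not enough files to compact)
    System.out.println(Arrays.toString(select(new long[] {100, 25, 12, 12}, 1.0, 3, 5, 10, 1000)));
    // Example #3: prints [7, 6, 5, 4, 3]
    System.out.println(Arrays.toString(select(new long[] {7, 6, 5, 4, 3, 2, 1}, 1.0, 3, 5, 10, 1000)));
  }
}
</pre>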
-          </p></div><div class="section" title="9.7.5.5.5.&nbsp;Impact of Key Configuration Options"><div class="titlepage"><div><div><h5 class="title"><a name="compaction.config.impact"></a>9.7.5.5.5.&nbsp;Impact of Key Configuration Options</h5></div></div></div><p><code class="code">hbase.hstore.compaction.ratio</code>.  A large ratio (e.g., 10) will produce a single giant file.  Conversely, a value of .25 will
+          </p></div><div class="section" title="9.7.6.5.5.&nbsp;Impact of Key Configuration Options"><div class="titlepage"><div><div><h5 class="title"><a name="compaction.config.impact"></a>9.7.6.5.5.&nbsp;Impact of Key Configuration Options</h5></div></div></div><p><code class="code">hbase.hstore.compaction.ratio</code>.  A large ratio (e.g., 10) will produce a single giant file.  Conversely, a value of .25 will
           produce behavior similar to the BigTable compaction algorithm - resulting in 4 StoreFiles.
           </p><p><code class="code">hbase.hstore.compaction.min.size</code>.  Because
           this limit represents the "automatic include" limit for all StoreFiles smaller than this value, this value may need to

Modified: hbase/hbase.apache.org/trunk/book/regionserver.arch.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/regionserver.arch.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/regionserver.arch.html (original)
+++ hbase/hbase.apache.org/trunk/book/regionserver.arch.html Tue Apr  2 18:06:19 2013
@@ -52,7 +52,7 @@
            the option of turning this off via the setCacheBlocks method (set it to false). You can still keep block caching turned on for this table if you need fast random read access. An example would be
            counting the number of rows in a table that serves live traffic; caching every block of that table would create massive churn and would surely evict data that's currently in use.
             </li></ul></div></div></div><div class="section" title="9.6.5.&nbsp;Write Ahead Log (WAL)"><div class="titlepage"><div><div><h3 class="title"><a name="wal"></a>9.6.5.&nbsp;Write Ahead Log (WAL)</h3></div></div></div><div class="section" title="9.6.5.1.&nbsp;Purpose"><div class="titlepage"><div><div><h4 class="title"><a name="purpose.wal"></a>9.6.5.1.&nbsp;Purpose</h4></div></div></div><p>Each RegionServer adds updates (Puts, Deletes) to its write-ahead log (WAL)
-            first, and then to the <a class="xref" href="regions.arch.html#store.memstore" title="9.7.5.1.&nbsp;MemStore">Section&nbsp;9.7.5.1, &#8220;MemStore&#8221;</a> for the affected <a class="xref" href="regions.arch.html#store" title="9.7.5.&nbsp;Store">Section&nbsp;9.7.5, &#8220;Store&#8221;</a>.
+            first, and then to the <a class="xref" href="regions.arch.html#store.memstore" title="9.7.6.1.&nbsp;MemStore">Section&nbsp;9.7.6.1, &#8220;MemStore&#8221;</a> for the affected <a class="xref" href="regions.arch.html#store" title="9.7.6.&nbsp;Store">Section&nbsp;9.7.6, &#8220;Store&#8221;</a>.
         This ensures that HBase has durable writes. Without WAL, there is the possibility of data loss in the case of a RegionServer failure
         before each MemStore is flushed and new StoreFiles are written.  <a class="link" href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/wal/HLog.html" target="_top">HLog</a>
         is the HBase WAL implementation, and there is one HLog instance per RegionServer.
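        As an aside, a client can opt out of the WAL per mutation, trading the durability described above for write throughput; a minimal sketch, assuming the 0.94-era client API and illustrative table/family names:
<pre class="programlisting">
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SkipWalSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "myTable");   // illustrative table name
    Put p = new Put(Bytes.toBytes("row1"));
    p.add(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("value"));
    // Skipping the WAL means a RegionServer crash before the MemStore flush
    // silently loses this edit; only do this for data you can afford to lose.
    p.setWriteToWAL(false);
    table.put(p);
    table.close();
  }
}
</pre>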
@@ -61,22 +61,22 @@
         For more general information about the concept of write ahead logs, see the Wikipedia
         <a class="link" href="http://en.wikipedia.org/wiki/Write-ahead_logging" target="_top">Write-Ahead Log</a> article.
        </p></div><div class="section" title="9.6.5.2.&nbsp;WAL Flushing"><div class="titlepage"><div><div><h4 class="title"><a name="wal_flush"></a>9.6.5.2.&nbsp;WAL Flushing</h4></div></div></div><p>TODO (describe).
-          </p></div><div class="section" title="9.6.5.3.&nbsp;WAL Splitting"><div class="titlepage"><div><div><h4 class="title"><a name="wal_splitting"></a>9.6.5.3.&nbsp;WAL Splitting</h4></div></div></div><div class="section" title="9.6.5.3.1.&nbsp;How edits are recovered from a crashed RegionServer"><div class="titlepage"><div><div><h5 class="title"><a name="d2279e5359"></a>9.6.5.3.1.&nbsp;How edits are recovered from a crashed RegionServer</h5></div></div></div><p>When a RegionServer crashes, it will lose its ephemeral lease in
-         ZooKeeper...TODO</p></div><div class="section" title="9.6.5.3.2.&nbsp;hbase.hlog.split.skip.errors"><div class="titlepage"><div><div><h5 class="title"><a name="d2279e5364"></a>9.6.5.3.2.&nbsp;<code class="varname">hbase.hlog.split.skip.errors</code></h5></div></div></div><p>When set to <code class="constant">true</code>, any error
+          </p></div><div class="section" title="9.6.5.3.&nbsp;WAL Splitting"><div class="titlepage"><div><div><h4 class="title"><a name="wal_splitting"></a>9.6.5.3.&nbsp;WAL Splitting</h4></div></div></div><div class="section" title="9.6.5.3.1.&nbsp;How edits are recovered from a crashed RegionServer"><div class="titlepage"><div><div><h5 class="title"><a name="d2475e5592"></a>9.6.5.3.1.&nbsp;How edits are recovered from a crashed RegionServer</h5></div></div></div><p>When a RegionServer crashes, it will lose its ephemeral lease in
+         ZooKeeper...TODO</p></div><div class="section" title="9.6.5.3.2.&nbsp;hbase.hlog.split.skip.errors"><div class="titlepage"><div><div><h5 class="title"><a name="d2475e5597"></a>9.6.5.3.2.&nbsp;<code class="varname">hbase.hlog.split.skip.errors</code></h5></div></div></div><p>When set to <code class="constant">true</code>, any error
         encountered splitting will be logged, the problematic WAL will be
         moved into the <code class="filename">.corrupt</code> directory under the hbase
         <code class="varname">rootdir</code>, and processing will continue. If set to
         <code class="constant">false</code>, the default, the exception will be propagated and the
-        split logged as failed.<sup>[<a name="d2279e5382" href="#ftn.d2279e5382" class="footnote">22</a>]</sup></p></div><div class="section" title="9.6.5.3.3.&nbsp;How EOFExceptions are treated when splitting a crashed RegionServers' WALs"><div class="titlepage"><div><div><h5 class="title"><a name="d2279e5388"></a>9.6.5.3.3.&nbsp;How EOFExceptions are treated when splitting a crashed
+        split logged as failed.<sup>[<a name="d2475e5615" href="#ftn.d2475e5615" class="footnote">22</a>]</sup></p></div><div class="section" title="9.6.5.3.3.&nbsp;How EOFExceptions are treated when splitting a crashed RegionServers' WALs"><div class="titlepage"><div><div><h5 class="title"><a name="d2475e5621"></a>9.6.5.3.3.&nbsp;How EOFExceptions are treated when splitting a crashed
         RegionServers' WALs</h5></div></div></div><p>If we get an EOF while splitting logs, we proceed with the split
         even when <code class="varname">hbase.hlog.split.skip.errors</code> ==
         <code class="constant">false</code>. An EOF while reading the last log in the
         set of files to split is near-guaranteed since the RegionServer likely
         crashed mid-write of a record. But we'll continue even if we got an
-        EOF while reading a file other than the last one in the set.<sup>[<a name="d2279e5399" href="#ftn.d2279e5399" class="footnote">23</a>]</sup></p></div></div></div><div class="footnotes"><br><hr width="100" align="left"><div class="footnote"><p><sup>[<a id="ftn.d2279e5382" href="#d2279e5382" class="para">22</a>] </sup>See <a class="link" href="https://issues.apache.org/jira/browse/HBASE-2958" target="_top">HBASE-2958
+        EOF while reading a file other than the last one in the set.<sup>[<a name="d2475e5632" href="#ftn.d2475e5632" class="footnote">23</a>]</sup></p></div></div></div><div class="footnotes"><br><hr width="100" align="left"><div class="footnote"><p><sup>[<a id="ftn.d2475e5615" href="#d2475e5615" class="para">22</a>] </sup>See <a class="link" href="https://issues.apache.org/jira/browse/HBASE-2958" target="_top">HBASE-2958
             When hbase.hlog.split.skip.errors is set to false, we fail the
             split but thats it</a>. We need to do more than just fail split
-            if this flag is set.</p></div><div class="footnote"><p><sup>[<a id="ftn.d2279e5399" href="#d2279e5399" class="para">23</a>] </sup>For background, see <a class="link" href="https://issues.apache.org/jira/browse/HBASE-2643" target="_top">HBASE-2643
+            if this flag is set.</p></div><div class="footnote"><p><sup>[<a id="ftn.d2475e5632" href="#d2475e5632" class="para">23</a>] </sup>For background, see <a class="link" href="https://issues.apache.org/jira/browse/HBASE-2643" target="_top">HBASE-2643
             Figure how to deal with eof splitting logs</a></p></div></div></div><div id="disqus_thread"></div><script type="text/javascript">
     var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
     var disqus_url = 'http://hbase.apache.org/book';

Modified: hbase/hbase.apache.org/trunk/book/rowkey.design.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/rowkey.design.html?rev=1463652&r1=1463651&r2=1463652&view=diff
==============================================================================
--- hbase/hbase.apache.org/trunk/book/rowkey.design.html (original)
+++ hbase/hbase.apache.org/trunk/book/rowkey.design.html Tue Apr  2 18:06:19 2013
@@ -10,6 +10,7 @@
     study <a class="link" href="http://opentsdb.net/" target="_top">OpenTSDB</a> as a
     successful example.  It has a page describing the <a class="link" href=" http://opentsdb.net/schema.html" target="_top">schema</a> it uses in
    HBase.  The key format in OpenTSDB is effectively [metric_type][event_timestamp], which would appear at first glance to contradict the previous advice about not using a timestamp as the key.  However, the difference is that the timestamp is not in the <span class="emphasis"><em>lead</em></span> position of the key, and the design assumption is that there are dozens or hundreds (or more) of different metric types.  Thus, even with a continual stream of input data with a mix of metric types, the Puts are distributed across many regions in the table.
+   </p><p>See <a class="xref" href="schema.casestudies.html" title="6.11.&nbsp;Schema Design Case Studies">Section&nbsp;6.11, &#8220;Schema Design Case Studies&#8221;</a> for some rowkey design examples.
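   A minimal sketch of building such a composite key with HBase's Bytes utility (the field names and fixed widths here are illustrative assumptions, not OpenTSDB's actual on-disk format):
<pre class="programlisting">
import org.apache.hadoop.hbase.util.Bytes;

public class CompositeKeySketch {
  // [metric_type][event_timestamp]: a fixed-width metric id in the lead position
  // keeps writes spread across metric types, while the trailing timestamp keeps
  // events for one metric adjacent and scannable by time range.
  static byte[] rowkey(int metricId, long eventTimestampMillis) {
    return Bytes.add(Bytes.toBytes(metricId), Bytes.toBytes(eventTimestampMillis));
  }

  public static void main(String[] args) {
    byte[] key = rowkey(42, System.currentTimeMillis());
    System.out.println(Bytes.toStringBinary(key));   // 4 id bytes + 8 timestamp bytes
  }
}
</pre>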
    </p></div><div class="section" title="6.3.2.&nbsp;Try to minimize row and column sizes"><div class="titlepage"><div><div><h3 class="title"><a name="keysize"></a>6.3.2.&nbsp;Try to minimize row and column sizes</h3></div><div><h4 class="subtitle">Or why are my StoreFile indices large?</h4></div></div></div><p>In HBase, values are always freighted with their coordinates; as a
           cell value passes through the system, it'll be accompanied by its
           row, column name, and timestamp - always.  If your rows and column names
@@ -18,7 +19,7 @@
           the case described by Marc Limotte at the tail of
           HBASE-3551
           (recommended!).
-          Therein, the indices that are kept on HBase storefiles (<a class="xref" href="regions.arch.html#hfile" title="9.7.5.2.&nbsp;StoreFile (HFile)">Section&nbsp;9.7.5.2, &#8220;StoreFile (HFile)&#8221;</a>)
+          Therein, the indices that are kept on HBase storefiles (<a class="xref" href="regions.arch.html#hfile" title="9.7.6.2.&nbsp;StoreFile (HFile)">Section&nbsp;9.7.6.2, &#8220;StoreFile (HFile)&#8221;</a>)
                  to facilitate random access may end up occupying large chunks of the HBase
                   allotted RAM because the cell value coordinates are large.
                  Marc, in the above-cited comment, suggests upping the block size so
@@ -30,10 +31,10 @@
                   up on the user mailing list.
        </p><p>Most of the time small inefficiencies don't matter all that much.  Unfortunately,
         this is a case where they do.  Whatever patterns are selected for ColumnFamilies, attributes, and rowkeys, they could be repeated
-       several billion times in your data. </p><p>See <a class="xref" href="regions.arch.html#keyvalue" title="9.7.5.4.&nbsp;KeyValue">Section&nbsp;9.7.5.4, &#8220;KeyValue&#8221;</a> for more information on how HBase stores data internally and why this is important.</p><div class="section" title="6.3.2.1.&nbsp;Column Families"><div class="titlepage"><div><div><h4 class="title"><a name="keysize.cf"></a>6.3.2.1.&nbsp;Column Families</h4></div></div></div><p>Try to keep the ColumnFamily names as small as possible, preferably one character (e.g. "d" for data/default).
-         </p><p>See <a class="xref" href="regions.arch.html#keyvalue" title="9.7.5.4.&nbsp;KeyValue">Section&nbsp;9.7.5.4, &#8220;KeyValue&#8221;</a> for more information on how HBase stores data internally and why this is important.</p></div><div class="section" title="6.3.2.2.&nbsp;Attributes"><div class="titlepage"><div><div><h4 class="title"><a name="keysize.atttributes"></a>6.3.2.2.&nbsp;Attributes</h4></div></div></div><p>Although verbose attribute names (e.g., "myVeryImportantAttribute") are easier to read, prefer shorter attribute names (e.g., "via")
+       several billion times in your data. </p><p>See <a class="xref" href="regions.arch.html#keyvalue" title="9.7.6.4.&nbsp;KeyValue">Section&nbsp;9.7.6.4, &#8220;KeyValue&#8221;</a> for more information on how HBase stores data internally and why this is important.</p><div class="section" title="6.3.2.1.&nbsp;Column Families"><div class="titlepage"><div><div><h4 class="title"><a name="keysize.cf"></a>6.3.2.1.&nbsp;Column Families</h4></div></div></div><p>Try to keep the ColumnFamily names as small as possible, preferably one character (e.g. "d" for data/default).
+         </p><p>See <a class="xref" href="regions.arch.html#keyvalue" title="9.7.6.4.&nbsp;KeyValue">Section&nbsp;9.7.6.4, &#8220;KeyValue&#8221;</a> for more information on how HBase stores data internally and why this is important.</p></div><div class="section" title="6.3.2.2.&nbsp;Attributes"><div class="titlepage"><div><div><h4 class="title"><a name="keysize.atttributes"></a>6.3.2.2.&nbsp;Attributes</h4></div></div></div><p>Although verbose attribute names (e.g., "myVeryImportantAttribute") are easier to read, prefer shorter attribute names (e.g., "via")
          to store in HBase.
-         </p><p>See <a class="xref" href="regions.arch.html#keyvalue" title="9.7.5.4.&nbsp;KeyValue">Section&nbsp;9.7.5.4, &#8220;KeyValue&#8221;</a> for more information on how HBase stores data internally and why this is important.</p></div><div class="section" title="6.3.2.3.&nbsp;Rowkey Length"><div class="titlepage"><div><div><h4 class="title"><a name="keysize.row"></a>6.3.2.3.&nbsp;Rowkey Length</h4></div></div></div><p>Keep them as short as is reasonable such that they can still be useful for required data access (e.g., Get vs. Scan).
+         </p><p>See <a class="xref" href="regions.arch.html#keyvalue" title="9.7.6.4.&nbsp;KeyValue">Section&nbsp;9.7.6.4, &#8220;KeyValue&#8221;</a> for more information on how HBase stores data internally and why this is important.</p></div><div class="section" title="6.3.2.3.&nbsp;Rowkey Length"><div class="titlepage"><div><div><h4 class="title"><a name="keysize.row"></a>6.3.2.3.&nbsp;Rowkey Length</h4></div></div></div><p>Keep them as short as is reasonable such that they can still be useful for required data access (e.g., Get vs. Scan).
          A short key that is useless for data access is not better than a longer key with better get/scan properties.  Expect tradeoffs
          when designing rowkeys.
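         Pulling the naming advice above together, a hedged sketch (0.94-era client API, illustrative names) of a one-character ColumnFamily, a short attribute name, and a short rowkey:
<pre class="programlisting">
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class ShortNamesSketch {
  public static void main(String[] args) {
    // One-character ColumnFamily ("d") and a short attribute name ("via"):
    // these strings ride along with every cell, so brevity adds up.
    HTableDescriptor htd = new HTableDescriptor("events");
    htd.addFamily(new HColumnDescriptor("d"));

    Put put = new Put(Bytes.toBytes("u123"));   // short but still meaningful rowkey
    put.add(Bytes.toBytes("d"), Bytes.toBytes("via"), Bytes.toBytes("web"));
    System.out.println(put);
  }
}
</pre>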
          </p></div><div class="section" title="6.3.2.4.&nbsp;Byte Patterns"><div class="titlepage"><div><div><h4 class="title"><a name="keysize.patterns"></a>6.3.2.4.&nbsp;Byte Patterns</h4></div></div></div><p>A long is 8 bytes.  You can store an unsigned number up to 18,446,744,073,709,551,615 in those eight bytes.