Posted to commits@hbase.apache.org by st...@apache.org on 2014/08/09 00:19:17 UTC

svn commit: r1616896 [2/5] - in /hbase/hbase.apache.org/trunk: ./ apidocs/org/apache/hadoop/hbase/io/util/ apidocs/org/apache/hadoop/hbase/io/util/class-use/ book/ devapidocs/org/apache/hadoop/hbase/io/util/ devapidocs/org/apache/hadoop/hbase/io/util/c...

Added: hbase/hbase.apache.org/trunk/book/apas09.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/apas09.html?rev=1616896&view=auto
==============================================================================
--- hbase/hbase.apache.org/trunk/book/apas09.html (added)
+++ hbase/hbase.apache.org/trunk/book/apas09.html Fri Aug  8 22:19:16 2014
@@ -0,0 +1,82 @@
+<html><head>
+      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+   <title>A.9.&nbsp;Docbook Common Issues</title><link rel="stylesheet" type="text/css" href="${baserdir}/src/main/site/resources/css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="appendix_contributing_to_documentation.html" title="Appendix&nbsp;A.&nbsp;Contributing to Documentation"><link rel="up" href="appendix_contributing_to_documentation.html" title="Appendix&nbsp;A.&nbsp;Contributing to Documentation"><link rel="prev" href="apas08.html" title="A.8.&nbsp;Adding a New Chapter to the HBase Reference Guide"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">A.9.&nbsp;Docbook Common Issues</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="apas08.html">Prev</a>&nbsp;</td><th width="60%" align="center">&nbsp;</th><td width="20%" align="right">&nbsp;</td></tr></table><hr></div><script type="text/javascript">
+    var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
+    var disqus_url = 'http://hbase.apache.org/book/.html';
+    </script><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="d9580e239"></a>A.9.&nbsp;Docbook Common Issues</h2></div></div></div><p>The following Docbook issues come up often. Some of these are preferences, but others
+            can create mysterious build errors or other problems.</p><div class="qandaset"><a name="d9580e244"></a><dl><dt>A.9.1. <a href="apas09.html#d9580e245">What can go where?</a></dt><dt>A.9.2. <a href="apas09.html#d9580e255">Paragraphs and Admonitions</a></dt><dt>A.9.3. <a href="apas09.html#d9580e264">Wrap textual &lt;listitem&gt; and &lt;entry&gt; contents in &lt;para&gt;
+                        elements.</a></dt><dt>A.9.4. <a href="apas09.html#d9580e273">When to use &lt;command&gt;, &lt;code&gt;, &lt;programlisting&gt;,
+                        &lt;screen&gt;</a></dt><dt>A.9.5. <a href="apas09.html#d9580e290">How to escape XML elements so that they show up as XML</a></dt><dt>A.9.6. <a href="apas09.html#d9580e297">Tips and tricks for making screen output look good</a></dt><dt>A.9.7. <a href="apas09.html#d9580e316">Isolate Changes for Easy Diff Review.</a></dt><dt>A.9.8. <a href="apas09.html#d9580e323">Syntax Highlighting</a></dt></dl><table border="0" style="width: 100%;"><colgroup><col align="left" width="1%"><col></colgroup><tbody><tr class="question"><td align="left" valign="top"><a name="d9580e245"></a><a name="d9580e246"></a><p><b>A.9.1.</b></p></td><td align="left" valign="top"><p>What can go where?</p></td></tr><tr class="answer"><td align="left" valign="top"></td><td align="left" valign="top"><p>There is often confusion about which child elements are valid in a given
+                        context. When in doubt, <a class="link" href="http://docbook.org/tdg/en/html/docbook.html" target="_top">Docbook: The
+                            Definitive Guide</a> is the best resource. It has an appendix which
+                        is indexed by element and contains all valid child and parent elements of
+                        any given element. If you edit Docbook often, a schema-aware XML editor
+                        makes things easier.</p></td></tr><tr class="question"><td align="left" valign="top"><a name="d9580e255"></a><a name="d9580e256"></a><p><b>A.9.2.</b></p></td><td align="left" valign="top"><p>Paragraphs and Admonitions</p></td></tr><tr class="answer"><td align="left" valign="top"></td><td align="left" valign="top"><p>It is a common pattern, and it is technically valid, to put an admonition
+                        such as a &lt;note&gt; inside a &lt;para&gt; element. Because admonitions
+                        render as block-level elements (they take the whole width of the page), it
+                        is better to mark them up as siblings to the paragraphs around them, like
+                        this:</p><pre class="programlisting"><strong class="hl-tag" style="color: #000096">&lt;para&gt;</strong>This is the paragraph.<strong class="hl-tag" style="color: #000096">&lt;/para&gt;</strong>
+<strong class="hl-tag" style="color: #000096">&lt;note&gt;</strong>
+    <strong class="hl-tag" style="color: #000096">&lt;para&gt;</strong>This is an admonition which occurs after the paragraph.<strong class="hl-tag" style="color: #000096">&lt;/para&gt;</strong>
+<strong class="hl-tag" style="color: #000096">&lt;/note&gt;</strong></pre></td></tr><tr class="question"><td align="left" valign="top"><a name="d9580e264"></a><a name="d9580e265"></a><p><b>A.9.3.</b></p></td><td align="left" valign="top"><p>Wrap textual &lt;listitem&gt; and &lt;entry&gt; contents in &lt;para&gt;
+                        elements.</p></td></tr><tr class="answer"><td align="left" valign="top"></td><td align="left" valign="top"><p>Because the contents of a &lt;listitem&gt; (an element in an itemized,
+                        ordered, or variable list) or an &lt;entry&gt; (a cell in a table) can
+                        consist of things other than plain text, they need to be wrapped in some
+                        element. If they are plain text, they need to be enclosed in &lt;para&gt;
+                        tags. This is tedious but necessary for validity.</p><pre class="programlisting"><strong class="hl-tag" style="color: #000096">&lt;itemizedlist&gt;</strong>
+    <strong class="hl-tag" style="color: #000096">&lt;listitem&gt;</strong>
+        <strong class="hl-tag" style="color: #000096">&lt;para&gt;</strong>This is a paragraph.<strong class="hl-tag" style="color: #000096">&lt;/para&gt;</strong>
+    <strong class="hl-tag" style="color: #000096">&lt;/listitem&gt;</strong>
+    <strong class="hl-tag" style="color: #000096">&lt;listitem&gt;</strong>
+        <strong class="hl-tag" style="color: #000096">&lt;screen&gt;</strong>This is screen output.<strong class="hl-tag" style="color: #000096">&lt;/screen&gt;</strong>
+    <strong class="hl-tag" style="color: #000096">&lt;/listitem&gt;</strong>
+<strong class="hl-tag" style="color: #000096">&lt;/itemizedlist&gt;</strong></pre></td></tr><tr class="question"><td align="left" valign="top"><a name="d9580e273"></a><a name="d9580e274"></a><p><b>A.9.4.</b></p></td><td align="left" valign="top"><p>When to use &lt;command&gt;, &lt;code&gt;, &lt;programlisting&gt;,
+                        &lt;screen&gt;</p></td></tr><tr class="answer"><td align="left" valign="top"></td><td align="left" valign="top"><p>The first two are in-line tags, which can occur within the flow of
+                        paragraphs or titles. The last two are block elements.</p><p>Use &lt;command&gt; to mention a command such as <span class="command"><strong>hbase
+                            shell</strong></span> in the flow of a sentence. Use &lt;code&gt; for other
+                        inline text referring to code. Incidentally, use &lt;literal&gt; to specify
+                        literal strings that should be typed or entered exactly as shown. Within a
+                        &lt;screen&gt; listing, it can be helpful to use the &lt;userinput&gt; and
+                        &lt;computeroutput&gt; elements to mark up the text further.</p><p>Use &lt;screen&gt; to display input and output as the user would
+                            <span class="emphasis"><em>see</em></span> it on the screen, in a log file, etc. Use
+                        &lt;programlisting&gt; only for blocks of code that occur within a file,
+                        such as Java or XML code, or a Bash shell script.</p></td></tr><tr class="question"><td align="left" valign="top"><a name="d9580e290"></a><a name="d9580e291"></a><p><b>A.9.5.</b></p></td><td align="left" valign="top"><p>How to escape XML elements so that they show up as XML</p></td></tr><tr class="answer"><td align="left" valign="top"></td><td align="left" valign="top"><p>For one-off instances or short in-line mentions, use the &amp;lt; and
+                        &amp;gt; encoded characters. For longer mentions, or blocks of code, enclose
+                        them in &amp;lt;![CDATA[]]&amp;gt;, which is much easier to maintain and
+                        parse in the source files.</p></td></tr><tr class="question"><td align="left" valign="top"><a name="d9580e297"></a><a name="d9580e298"></a><p><b>A.9.6.</b></p></td><td align="left" valign="top"><p>Tips and tricks for making screen output look good</p></td></tr><tr class="answer"><td align="left" valign="top"></td><td align="left" valign="top"><p>Text within &lt;screen&gt; and &lt;programlisting&gt; elements is shown
+                        exactly as it appears in the source, including indentation, tabs, and line
+                        wrap.</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>Indent the starting and closing XML elements, but do not indent
+                                the content. Also, to avoid having an extra blank line at the
+                                beginning of the programlisting output, do not put the CDATA
+                                element on its own line. For example:</p><pre class="programlisting">        &lt;programlisting&gt;
+<strong class="hl-keyword">case</strong> $<span class="hl-number">1</span> in
+  --cleanZk|--cleanHdfs|--cleanAll)
+    matches=<strong class="hl-string"><em style="color:red">"yes"</em></strong> ;;
+  *) ;;
+<strong class="hl-keyword">esac</strong>
+        &lt;/programlisting&gt;</pre></li><li class="listitem"><p>After pasting code into a programlisting, fix the indentation
+                                manually, using two <span class="emphasis"><em>spaces</em></span> per desired
+                                indentation. For screen output, be sure to include line breaks so
+                                that the text is no longer than 100 characters.</p></li></ul></div></td></tr><tr class="question"><td align="left" valign="top"><a name="d9580e316"></a><a name="d9580e317"></a><p><b>A.9.7.</b></p></td><td align="left" valign="top"><p>Isolate Changes for Easy Diff Review.</p></td></tr><tr class="answer"><td align="left" valign="top"></td><td align="left" valign="top"><p>Be careful with pretty-printing or re-formatting an entire XML file, even
+                        if the formatting has degraded over time. If you need to reformat a file, do
+                        that in a separate JIRA where you do not change any content. Be careful
+                        because some XML editors do a bulk-reformat when you open a new file,
+                        especially if you use GUI mode in the editor.</p></td></tr><tr class="question"><td align="left" valign="top"><a name="d9580e323"></a><a name="d9580e324"></a><p><b>A.9.8.</b></p></td><td align="left" valign="top"><p>Syntax Highlighting</p></td></tr><tr class="answer"><td align="left" valign="top"></td><td align="left" valign="top"><p>The HBase Reference Guide uses the <a class="link" href="http://sourceforge.net/projects/xslthl/files/xslthl/2.1.0/" target="_top">XSLT Syntax Highlighting</a> Maven module for syntax highlighting.
+                        To enable syntax highlighting for a given &lt;programlisting&gt; or
+                        &lt;screen&gt; (or possibly other elements), add the attribute
+                                <code class="literal">language=<em class="replaceable"><code>LANGUAGE_OF_CHOICE</code></em></code>
+                        to the element, as in the following example:</p><pre class="programlisting">
+<strong class="hl-tag" style="color: #000096">&lt;programlisting</strong> <span class="hl-attribute" style="color: #F5844C">language</span>=<span class="hl-value" style="color: #993300">"xml"</span><strong class="hl-tag" style="color: #000096">&gt;</strong>
+    <strong class="hl-tag" style="color: #000096">&lt;foo&gt;</strong>bar<strong class="hl-tag" style="color: #000096">&lt;/foo&gt;</strong>
+    <strong class="hl-tag" style="color: #000096">&lt;bar&gt;</strong>foo<strong class="hl-tag" style="color: #000096">&lt;/bar&gt;</strong>
+<strong class="hl-tag" style="color: #000096">&lt;/programlisting&gt;</strong></pre><p>Several syntax types are supported. The most interesting ones for the
+                        HBase Reference Guide are <code class="literal">java</code>, <code class="literal">xml</code>,
+                            <code class="literal">sql</code>, and <code class="literal">bourne</code> (for BASH shell
+                        output or Linux command-line examples).</p></td></tr></tbody></table></div></div><div id="disqus_thread"></div><script type="text/javascript">
+    /* * * DON'T EDIT BELOW THIS LINE * * */
+    (function() {
+        var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+        dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
+        (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+    })();
+</script><noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript><a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="apas08.html">Prev</a>&nbsp;</td><td width="20%" align="center">&nbsp;</td><td width="40%" align="right">&nbsp;</td></tr><tr><td width="40%" align="left" valign="top">A.8.&nbsp;Adding a New Chapter to the HBase Reference Guide&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="appendix_contributing_to_documentation.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;</td></tr></table></div></body></html>
\ No newline at end of file

Added: hbase/hbase.apache.org/trunk/book/apcs03.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/apcs03.html?rev=1616896&view=auto
==============================================================================
--- hbase/hbase.apache.org/trunk/book/apcs03.html (added)
+++ hbase/hbase.apache.org/trunk/book/apcs03.html Fri Aug  8 22:19:16 2014
@@ -0,0 +1,39 @@
+<html><head>
+      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+   <title>C.3.&nbsp;Localized repairs</title><link rel="stylesheet" type="text/css" href="${baserdir}/src/main/site/resources/css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="hbck.in.depth.html" title="Appendix&nbsp;C.&nbsp;hbck In Depth"><link rel="prev" href="apcs02.html" title="C.2.&nbsp;Inconsistencies"><link rel="next" href="apcs04.html" title="C.4.&nbsp;Region Overlap Repairs"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">C.3.&nbsp;Localized repairs</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="apcs02.html">Prev</a>&nbsp;</td><th width="60%" align="center">Appendix&nbsp;C.&nbsp;hbck In Depth</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="apcs04.html">Next</a></td></tr></table><hr></div><script type="text/javascript">
+    var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
+    var disqus_url = 'http://hbase.apache.org/book/.html';
+    </script><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="d4029e21069"></a>C.3.&nbsp;Localized repairs</h2></div></div></div><p>
+	When repairing a corrupted HBase, it is best to repair the lowest risk inconsistencies first.
These are generally region consistency repairs: localized, single-region repairs that only modify
in-memory data, ephemeral ZooKeeper data, or patch holes in the META table.
Region consistency requires that the state of the region&#8217;s data in HDFS
(.regioninfo files), the region&#8217;s row in the hbase:meta table, and the region&#8217;s deployment/assignments on
region servers and the master all agree. Options for repairing region consistency include:
+	</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><code class="code">-fixAssignments</code> (equivalent to the 0.90 <code class="code">-fix</code> option) repairs unassigned, incorrectly
+assigned or multiply assigned regions.</p></li><li class="listitem"><p><code class="code">-fixMeta</code> removes meta rows when corresponding regions are not present in
+		  HDFS and adds new meta rows if the regions are present in HDFS but not in META.</p></li></ul></div><p>
+	To fix deployment and assignment problems you can run this command:
+</p><pre class="programlisting">
+$ ./bin/hbase hbck -fixAssignments
+</pre><p>To fix deployment and assignment problems as well as repairing incorrect meta rows you can
+run this command:</p><pre class="programlisting">
+$ ./bin/hbase hbck -fixAssignments -fixMeta
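+# Optional follow-up (a sketch, not part of the original procedure): re-run hbck
+# with no repair options, which only reports inconsistencies, to confirm the fixes.
+$ ./bin/hbase hbck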
+</pre><p>There are a few classes of table integrity problems that are low risk repairs. The first two are
+degenerate (startkey == endkey) regions and backwards regions (startkey &gt; endkey). These are
+automatically handled by sidelining the data to a temporary directory (/hbck/xxxx).
The third low-risk class is HDFS region holes. These can be repaired by using the:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><code class="code">-fixHdfsHoles</code> option for fabricating new empty regions on the file system.
+If holes are detected you can use -fixHdfsHoles and should include -fixMeta and -fixAssignments to make the new region consistent.</p></li></ul></div><pre class="programlisting">
+$ ./bin/hbase hbck -fixAssignments -fixMeta -fixHdfsHoles
+</pre><p>Since this is a common operation, we&#8217;ve added the <code class="code">-repairHoles</code> flag that is equivalent to the
+previous command:</p><pre class="programlisting">
+$ ./bin/hbase hbck -repairHoles
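+# Hedged variant: hbck accepts table names after its options, so the same repair
+# should be limitable to a single table. TableFoo is a placeholder table name.
+$ ./bin/hbase hbck -repairHoles TableFoo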
+</pre><p>If inconsistencies still remain after these steps, you most likely have table integrity problems
+related to orphaned or overlapping regions.</p></div><div id="disqus_thread"></div><script type="text/javascript">
+    /* * * DON'T EDIT BELOW THIS LINE * * */
+    (function() {
+        var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+        dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
+        (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+    })();
+</script><noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript><a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="apcs02.html">Prev</a>&nbsp;</td><td width="20%" align="center"><a accesskey="u" href="hbck.in.depth.html">Up</a></td><td width="40%" align="right">&nbsp;<a accesskey="n" href="apcs04.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">C.2.&nbsp;Inconsistencies&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="book.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;C.4.&nbsp;Region Overlap Repairs</td></tr></table></div></body></html>
\ No newline at end of file

Added: hbase/hbase.apache.org/trunk/book/apcs04.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/apcs04.html?rev=1616896&view=auto
==============================================================================
--- hbase/hbase.apache.org/trunk/book/apcs04.html (added)
+++ hbase/hbase.apache.org/trunk/book/apcs04.html Fri Aug  8 22:19:16 2014
@@ -0,0 +1,64 @@
+<html><head>
+      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+   <title>C.4.&nbsp;Region Overlap Repairs</title><link rel="stylesheet" type="text/css" href="${baserdir}/src/main/site/resources/css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="hbck.in.depth.html" title="Appendix&nbsp;C.&nbsp;hbck In Depth"><link rel="prev" href="apcs03.html" title="C.3.&nbsp;Localized repairs"><link rel="next" href="compression.html" title="Appendix&nbsp;D.&nbsp;Compression and Data Block Encoding In HBase"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">C.4.&nbsp;Region Overlap Repairs</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="apcs03.html">Prev</a>&nbsp;</td><th width="60%" align="center">Appendix&nbsp;C.&nbsp;hbck In Depth</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="compression.html">Next</a></td></tr></table><hr></div><script type="text/javascript">
+    var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
+    var disqus_url = 'http://hbase.apache.org/book/.html';
+    </script><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="d4029e21114"></a>C.4.&nbsp;Region Overlap Repairs</h2></div></div></div><p>Table integrity problems can require repairs that deal with overlaps. This is a riskier operation
+because it requires modifications to the file system, requires some decision making, and may
+require some manual steps. For these repairs it is best to analyze the output of a <code class="code">hbck -details</code>
run so that you limit repair attempts to the problems the checks identify. Because this is
riskier, there are safeguards that should be used to limit the scope of the repairs.
WARNING: This is a relatively new feature and has only been tested on online but idle HBase instances
+(no reads/writes). Use at your own risk in an active production environment!
+The options for repairing table integrity violations include:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><code class="code">-fixHdfsOrphans</code> option for &#8220;adopting&#8221; a region directory that is missing a region
+metadata file (the .regioninfo file).</p></li><li class="listitem"><p><code class="code">-fixHdfsOverlaps</code> ability for fixing overlapping regions</p></li></ul></div><p>When repairing overlapping regions, a region&#8217;s data can be modified on the file system in two
ways: 1) by merging regions into a larger region or 2) by sidelining regions by moving data to a
&#8220;sideline&#8221; directory where data could be restored later. Merging a large number of regions is
technically correct but could result in an extremely large region that requires a series of costly
+compactions and splitting operations. In these cases, it is probably better to sideline the regions
+that overlap with the most other regions (likely the largest ranges) so that merges can happen on
+a more reasonable scale. Since these sidelined regions are already laid out in HBase&#8217;s native
+directory and HFile format, they can be restored by using HBase&#8217;s bulk load mechanism.
+The default safeguard thresholds are conservative. These options let you override the default
thresholds and enable the large region sidelining feature.</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><code class="code">-maxMerge &lt;n&gt;</code> maximum number of overlapping regions to merge</p></li><li class="listitem"><p><code class="code">-sidelineBigOverlaps</code> if more than maxMerge regions are overlapping, attempt
to sideline the regions overlapping with the most other regions.</p></li><li class="listitem"><p><code class="code">-maxOverlapsToSideline &lt;n&gt;</code> if sidelining large overlapping regions, sideline at most n
regions.</p></li></ul></div><p>Since you often just want to get the tables repaired, you can use this option to turn
+on all repair options:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><code class="code">-repair</code> includes all the region consistency options and only the hole repairing table
+integrity options.</p></li></ul></div><p>Finally, there are safeguards to limit repairs to only specific tables. For example the following
command would only attempt to check and repair the tables TableFoo and TableBar.</p><pre class="screen">
+$ ./bin/hbase hbck -repair TableFoo TableBar
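+# Illustrative only: the same table scoping combined with the overlap options described
+# above. TableFoo and the thresholds 5 and 2 are placeholders, not recommendations.
+# Inspect the report first, then attempt the scoped overlap repair.
+$ ./bin/hbase hbck -details TableFoo
+$ ./bin/hbase hbck -fixHdfsOverlaps -maxMerge 5 \
+    -sidelineBigOverlaps -maxOverlapsToSideline 2 TableFoo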
+</pre><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21163"></a>C.4.1.&nbsp;Special cases: Meta is not properly assigned</h3></div></div></div><p>There are a few special cases that hbck can handle as well.
+Sometimes the meta table&#8217;s only region is inconsistently assigned or deployed. In this case
+there is a special <code class="code">-fixMetaOnly</code> option that can try to fix meta assignments.</p><pre class="screen">
+$ ./bin/hbase hbck -fixMetaOnly -fixAssignments
+</pre></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21173"></a>C.4.2.&nbsp;Special cases: HBase version file is missing</h3></div></div></div><p>HBase&#8217;s data on the file system requires a version file in order to start. If this file is missing, you
can use the <code class="code">-fixVersionFile</code> option to fabricate a new HBase version file. This assumes that
+the version of hbck you are running is the appropriate version for the HBase cluster.</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21181"></a>C.4.3.&nbsp;Special case: Root and META are corrupt.</h3></div></div></div><p>The most drastic corruption scenario is the case where the ROOT or META is corrupted and
HBase will not start. In this case you can use the OfflineMetaRepair tool to create new ROOT
and META regions and tables.
This tool assumes that HBase is offline. It then marches through the existing HBase home
directory and loads as much information from region metadata files (.regioninfo files) as possible
+from the file system. If the region metadata has proper table integrity, it sidelines the original root
+and meta table directories, and builds new ones with pointers to the region directories and their
+data.</p><pre class="screen">
+$ ./bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
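+# If the tool succeeds, one possible follow-up (a sketch that assumes the stock
+# start script in bin/): start HBase, then run an online check before any repairs.
+$ ./bin/start-hbase.sh
+$ ./bin/hbase hbck -details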
+</pre><p>NOTE: This tool is not as clever as uberhbck but can be used to bootstrap repairs that uberhbck
+can complete.
+If the tool succeeds you should be able to start hbase and run online repairs if necessary.</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21190"></a>C.4.4.&nbsp;Special cases: Offline split parent</h3></div></div></div><p>
+Once a region is split, the offline parent will be cleaned up automatically. Sometimes, daughter regions
+are split again before their parents are cleaned up. HBase can clean up parents in the right order. However,
+some offline split parents can linger. They are in META and in HDFS but not deployed,
+and HBase can't clean them up. In this case, you can use the <code class="code">-fixSplitParents</code> option to reset
+them in META to be online and not split. hbck can then merge them with other regions if the
+option for fixing overlapping regions is used.
+    </p><p>
+This option should not normally be used, and it is not in <code class="code">-fixAll</code>.
+    </p></div></div><div id="disqus_thread"></div><script type="text/javascript">
+    /* * * DON'T EDIT BELOW THIS LINE * * */
+    (function() {
+        var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+        dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
+        (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+    })();
+</script><noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript><a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="apcs03.html">Prev</a>&nbsp;</td><td width="20%" align="center"><a accesskey="u" href="hbck.in.depth.html">Up</a></td><td width="40%" align="right">&nbsp;<a accesskey="n" href="compression.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">C.3.&nbsp;Localized repairs&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="book.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;Appendix&nbsp;D.&nbsp;Compression and Data Block Encoding In
+          HBase</td></tr></table></div></body></html>
\ No newline at end of file

Added: hbase/hbase.apache.org/trunk/book/apds02.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/apds02.html?rev=1616896&view=auto
==============================================================================
--- hbase/hbase.apache.org/trunk/book/apds02.html (added)
+++ hbase/hbase.apache.org/trunk/book/apds02.html Fri Aug  8 22:19:16 2014
@@ -0,0 +1,145 @@
+<html><head>
+      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+   <title>D.2.&nbsp;Compressor Configuration, Installation, and Use</title><link rel="stylesheet" type="text/css" href="${baserdir}/src/main/site/resources/css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="compression.html" title="Appendix&nbsp;D.&nbsp;Compression and Data Block Encoding In HBase"><link rel="prev" href="compression.html" title="Appendix&nbsp;D.&nbsp;Compression and Data Block Encoding In HBase"><link rel="next" href="data.block.encoding.enable.html" title="D.3.&nbsp;Enable Data Block Encoding"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">D.2.&nbsp;Compressor Configuration, Installation, and Use</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="compression.html">Prev</a>&nbsp;</td><th width="60%" align="center">Appendix&nbsp;D.&nbsp;Compression and Data Block Encoding In
+          HBase</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="data.block.encoding.enable.html">Next</a></td></tr></table><hr></div><script type="text/javascript">
+    var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
+    var disqus_url = 'http://hbase.apache.org/book/.html';
+    </script><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="d4029e21361"></a>D.2.&nbsp;Compressor Configuration, Installation, and Use</h2></div></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="compressor.install"></a>D.2.1.&nbsp;Configure HBase For Compressors</h3></div></div></div><p>Before HBase can use a given compressor, its libraries need to be available. Due to
+          licensing issues, only GZ compression is available to HBase (via native Java libraries) in
+          a default installation.</p><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="d4029e21369"></a>D.2.1.1.&nbsp;Compressor Support On the Master</h4></div></div></div><p>A new configuration setting was introduced in HBase 0.95, to check the Master to
+            determine which data block encoders are installed and configured on it, and assume that
+            the entire cluster is configured the same. This option,
+              <code class="code">hbase.master.check.compression</code>, defaults to <code class="literal">true</code>. This
+            prevents the situation described in <a class="link" href="https://issues.apache.org/jira/browse/HBASE-6370" target="_top">HBASE-6370</a>, where
+            a table is created or modified to support a codec that a region server does not support,
+            leading to failures that take a long time to occur and are difficult to debug. </p><p>If <code class="code">hbase.master.check.compression</code> is enabled, libraries for all desired
+            compressors need to be installed and configured on the Master, even if the Master does
+            not run a region server.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="d4029e21388"></a>D.2.1.2.&nbsp;Install GZ Support Via Native Libraries</h4></div></div></div><p>HBase uses Java's built-in GZip support unless the native Hadoop libraries are
+            available on the CLASSPATH. The recommended way to add libraries to the CLASSPATH is to
+            set the environment variable <code class="envar">HBASE_LIBRARY_PATH</code> for the user running
+            HBase. If native libraries are not available and Java's GZIP is used, <code class="literal">Got
+              brand-new compressor</code> reports will be present in the logs. See <a class="xref" href="trouble.rs.html#brand.new.compressor" title="15.9.2.10.&nbsp;Logs flooded with '2011-01-10 12:40:48,407 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor' messages">Section&nbsp;15.9.2.10, &#8220;Logs flooded with '2011-01-10 12:40:48,407 INFO org.apache.hadoop.io.compress.CodecPool: Got
+            brand-new compressor' messages&#8221;</a>.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="lzo.compression"></a>D.2.1.3.&nbsp;Install LZO Support</h4></div></div></div><p>HBase cannot ship with LZO because of a license incompatibility between HBase, which uses the
+            Apache Software License (ASL), and LZO, which uses the GPL. See the <a class="link" href="http://wiki.apache.org/hadoop/UsingLzoCompression" target="_top">Using LZO
+              Compression</a> wiki page for information on configuring LZO support for HBase. </p><p>If you depend upon LZO compression, consider configuring your RegionServers to fail
+            to start if LZO is not available. See <a class="xref" href="apds02.html#hbase.regionserver.codecs" title="D.2.1.7.&nbsp;Enforce Compression Settings On a RegionServer">Section&nbsp;D.2.1.7, &#8220;Enforce Compression Settings On a RegionServer&#8221;</a>.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="lz4.compression"></a>D.2.1.4.&nbsp;Configure LZ4 Support</h4></div></div></div><p>LZ4 support is bundled with Hadoop. Make sure the hadoop shared library
+            (libhadoop.so) is accessible when you start
+            HBase. After configuring your platform (see <a class="xref" href="">???</a>), you can make a symbolic link from HBase to the native Hadoop
+            libraries. This assumes that HBase and Hadoop are installed on the same host. For example, if your
+            platform is Linux-amd64-64:
+            </p><pre class="programlisting">$ <strong class="hl-keyword">cd</strong> $HBASE_HOME
+$ mkdir lib/native
+$ ln -s $HADOOP_HOME/lib/native lib/native/Linux-amd64-<span class="hl-number">64</span></pre><p>
+            Use the compression tool to check that LZ4 is installed on all nodes. Start up (or restart)
+            HBase. Afterward, you can create and alter tables to enable LZ4 as a
+            compression codec:
+            </p><pre class="screen">
+hbase(main):003:0&gt; <strong class="userinput"><code>alter 'TestTable', {NAME =&gt; 'info', COMPRESSION =&gt; 'LZ4'}</code></strong>
+            </pre><p>
+          </p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="snappy.compression.installation"></a>D.2.1.5.&nbsp;Install Snappy Support</h4></div></div></div><p>HBase does not ship with Snappy support because of licensing issues. You can install
+            Snappy binaries (for instance, by using <span class="command"><strong>yum install snappy</strong></span> on CentOS)
+            or build Snappy from source. After installing Snappy, search for the shared library,
+            which will be called <code class="filename">libsnappy.so.X</code> where X is a number. If you
+            built from source, copy the shared library to a known location on your system, such as
+              <code class="filename">/opt/snappy/lib/</code>.</p><p>In addition to the Snappy library, HBase also needs access to the Hadoop shared
+            library, which will be called something like <code class="filename">libhadoop.so.X.Y</code>,
+            where X and Y are both numbers. Make note of the location of the Hadoop library, or copy
+            it to the same location as the Snappy library.</p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>The Snappy and Hadoop libraries need to be available on each node of your cluster.
+              See <a class="xref" href="apds02.html#compression.test" title="D.2.1.6.&nbsp;CompressionTest">Section&nbsp;D.2.1.6, &#8220;CompressionTest&#8221;</a> to find out how to test that this is the case.</p><p>See <a class="xref" href="apds02.html#hbase.regionserver.codecs" title="D.2.1.7.&nbsp;Enforce Compression Settings On a RegionServer">Section&nbsp;D.2.1.7, &#8220;Enforce Compression Settings On a RegionServer&#8221;</a> to configure your RegionServers to fail to
+              start if a given compressor is not available.</p></div><p>Each of these library locations needs to be added to the environment variable
+              <code class="envar">HBASE_LIBRARY_PATH</code> for the operating system user that runs HBase. You
+            need to restart the RegionServer for the changes to take effect.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="compression.test"></a>D.2.1.6.&nbsp;CompressionTest</h4></div></div></div><p>You can use the CompressionTest tool to verify that your compressor is available to
+            HBase:</p><pre class="screen">
+ $ hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://<em class="replaceable"><code>host/path/to/hbase</code></em> snappy       
+          </pre></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="hbase.regionserver.codecs"></a>D.2.1.7.&nbsp;Enforce Compression Settings On a RegionServer</h4></div></div></div><p>You can configure a RegionServer so that it will fail to restart if compression is
+            configured incorrectly, by adding the option <code class="code">hbase.regionserver.codecs</code> to
+              <code class="filename">hbase-site.xml</code>, and setting its value to a comma-separated list
+            of codecs that need to be available. For example, if you set this property to
+              <code class="literal">lzo,gz</code>, the RegionServer fails to start unless both compressors
+            are available. This prevents a new server from being added to the cluster
+            without having codecs configured properly.</p></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="changing.compression"></a>D.2.2.&nbsp;Enable Compression On a ColumnFamily</h3></div></div></div><p>To enable compression for a ColumnFamily, use an <code class="code">alter</code> command. You do
+          not need to re-create the table or copy data. If you are changing codecs, be sure the old
+          codec is still available until all the old StoreFiles have been compacted.</p><div class="example"><a name="d4029e21491"></a><p class="title"><b>Example&nbsp;D.1.&nbsp;Enabling Compression on a ColumnFamily of an Existing Table using HBase
+            Shell</b></p><div class="example-contents"><pre class="screen">
+hbase&gt; disable 'test'
+hbase&gt; alter 'test', {NAME =&gt; 'cf', COMPRESSION =&gt; 'GZ'}
+hbase&gt; enable 'test'
+        </pre></div></div><br class="example-break"><div class="example"><a name="d4029e21496"></a><p class="title"><b>Example&nbsp;D.2.&nbsp;Creating a New Table with Compression On a ColumnFamily</b></p><div class="example-contents"><pre class="screen">
+hbase&gt; create 'test2', { NAME =&gt; 'cf2', COMPRESSION =&gt; 'SNAPPY' }         
+          </pre></div></div><br class="example-break"><div class="example"><a name="d4029e21501"></a><p class="title"><b>Example&nbsp;D.3.&nbsp;Verifying a ColumnFamily's Compression Settings</b></p><div class="example-contents"><pre class="screen">
+hbase&gt; describe 'test'
+DESCRIPTION                                          ENABLED
+ 'test', {NAME =&gt; 'cf', DATA_BLOCK_ENCODING =&gt; 'NONE false
+ ', BLOOMFILTER =&gt; 'ROW', REPLICATION_SCOPE =&gt; '0',
+ VERSIONS =&gt; '1', COMPRESSION =&gt; 'GZ', MIN_VERSIONS
+ =&gt; '0', TTL =&gt; 'FOREVER', KEEP_DELETED_CELLS =&gt; 'fa
+ lse', BLOCKSIZE =&gt; '65536', IN_MEMORY =&gt; 'false', B
+ LOCKCACHE =&gt; 'true'}
+1 row(s) in 0.1070 seconds
+          </pre></div></div><br class="example-break"></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21506"></a>D.2.3.&nbsp;Testing Compression Performance</h3></div></div></div><p>HBase includes a tool called LoadTestTool which provides mechanisms to test your
+          compression performance. You must specify either <code class="literal">-write</code> or
+          <code class="literal">-update-read</code> as your first parameter, and if you do not specify another
+        parameter, usage advice is printed for each option.</p><div class="example"><a name="d4029e21517"></a><p class="title"><b>Example&nbsp;D.4.&nbsp;<span class="command">LoadTestTool</span> Usage</b></p><div class="example-contents"><pre class="screen">
+$ bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -h            
+usage: bin/hbase org.apache.hadoop.hbase.util.LoadTestTool &lt;options&gt;
+Options:
+ -batchupdate                 Whether to use batch as opposed to separate
+                              updates <strong class="hl-keyword">for</strong> every column in a row
+ -bloom &lt;arg&gt;                 Bloom filter <strong class="hl-keyword">type</strong>, one of [NONE, ROW, ROWCOL]
+ -compression &lt;arg&gt;           Compression <strong class="hl-keyword">type</strong>, one of [LZO, GZ, NONE, SNAPPY,
+                              LZ4]
+ -data_block_encoding &lt;arg&gt;   Encoding algorithm (e.g. prefix compression) to
+                              use <strong class="hl-keyword">for</strong> data blocks in the <strong class="hl-keyword">test</strong> column family, one
+                              of [NONE, PREFIX, DIFF, FAST_DIFF, PREFIX_TREE].
+ -encryption &lt;arg&gt;            Enables transparent encryption on the <strong class="hl-keyword">test</strong> table,
+                              one of [AES]
+ -generator &lt;arg&gt;             The class which generates load <strong class="hl-keyword">for</strong> the tool. Any
+                              args <strong class="hl-keyword">for</strong> this class can be passed as colon
+                              separated after class name
+ -h,--help                    Show usage
+ -in_memory                   Tries to keep the HFiles of the CF inmemory as far
+                              as possible.  Not guaranteed that reads are always
+                              served from inmemory
+ -init_only                   Initialize the <strong class="hl-keyword">test</strong> table only, don't do any
+                              loading
+ -key_window &lt;arg&gt;            The 'key window' to maintain between reads and
+                              writes for concurrent write/read workload. The
+                              default is 0.
+ -max_read_errors &lt;arg&gt;       The maximum number of read errors to tolerate
+                              before terminating all reader threads. The default
+                              is 10.
+ -multiput                    Whether to use multi-puts as opposed to separate
+                              puts for every column in a row
+ -num_keys &lt;arg&gt;              The number of keys to read/write
+ -num_tables &lt;arg&gt;            A positive integer number. When a number n is
+                              speicfied, load test tool  will load n table
+                              parallely. -tn parameter value becomes table name
+                              prefix. Each table name is in format
+                              &lt;tn&gt;_1...&lt;tn&gt;_n
+ -read &lt;arg&gt;                  &lt;verify_percent&gt;[:&lt;#threads=20&gt;]
+ -regions_per_server &lt;arg&gt;    A positive integer number. When a number n is
+                              specified, load test tool will create the test
+                              table with n regions per server
+ -skip_init                   Skip the initialization; assume test table already
+                              exists
+ -start_key &lt;arg&gt;             The first key to read/write (a 0-based index). The
+                              default value is 0.
+ -tn &lt;arg&gt;                    The name of the table to read or write
+ -update &lt;arg&gt;                &lt;update_percent&gt;[:&lt;#threads=20&gt;][:&lt;#whether to
+                              ignore nonce collisions=0&gt;]
+ -write &lt;arg&gt;                 &lt;avg_cols_per_key&gt;:&lt;avg_data_size&gt;[:&lt;#threads=20&gt;]
+ -zk &lt;arg&gt;                    ZK quorum as comma-separated host names without
+                              port numbers
+ -zk_root &lt;arg&gt;               name of parent znode in zookeeper            
+          </pre></div></div><br class="example-break"><div class="example"><a name="d4029e21524"></a><p class="title"><b>Example&nbsp;D.5.&nbsp;Example Usage of LoadTestTool</b></p><div class="example-contents"><pre class="screen">
+$ hbase org.apache.hadoop.hbase.util.LoadTestTool -write <span class="hl-number">1</span>:<span class="hl-number">10</span>:<span class="hl-number">100</span> -num_keys <span class="hl-number">1000000</span> \
+          -<strong class="hl-keyword">read</strong> <span class="hl-number">100</span>:<span class="hl-number">30</span> -num_tables <span class="hl-number">1</span> -data_block_encoding NONE -tn load_test_tool_NONE
+          </pre></div></div><br class="example-break"></div></div><div id="disqus_thread"></div><script type="text/javascript">
+    /* * * DON'T EDIT BELOW THIS LINE * * */
+    (function() {
+        var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+        dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
+        (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+    })();
+</script><noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript><a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="compression.html">Prev</a>&nbsp;</td><td width="20%" align="center"><a accesskey="u" href="compression.html">Up</a></td><td width="40%" align="right">&nbsp;<a accesskey="n" href="data.block.encoding.enable.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Appendix&nbsp;D.&nbsp;Compression and Data Block Encoding In
+          HBase&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="book.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;D.3.&nbsp;Enable Data Block Encoding</td></tr></table></div></body></html>
\ No newline at end of file
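
Reviewer note on Example D.5 above: the colon-separated arguments to -write and -read follow the formats printed in the LoadTestTool usage text (<avg_cols_per_key>:<avg_data_size>[:<#threads=20>] and <verify_percent>[:<#threads=20>]). The Python sketch below only illustrates how to read those specs. The function names are hypothetical, and the tool itself parses its arguments in Java, so treat this as a reading aid under that assumption rather than the tool's implementation.

```python
def parse_write_spec(spec, default_threads=20):
    """Split a -write value: <avg_cols_per_key>:<avg_data_size>[:<#threads>].

    Hypothetical helper; the default of 20 threads is taken from the
    usage text quoted in Example D.4.
    """
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError("expected 2 or 3 colon-separated fields: %r" % spec)
    return {
        "avg_cols_per_key": int(parts[0]),
        "avg_data_size": int(parts[1]),
        "threads": int(parts[2]) if len(parts) == 3 else default_threads,
    }


def parse_read_spec(spec, default_threads=20):
    """Split a -read value: <verify_percent>[:<#threads>] (hypothetical helper)."""
    parts = spec.split(":")
    if len(parts) not in (1, 2):
        raise ValueError("expected 1 or 2 colon-separated fields: %r" % spec)
    return {
        "verify_percent": int(parts[0]),
        "threads": int(parts[1]) if len(parts) == 2 else default_threads,
    }


if __name__ == "__main__":
    # The values used in Example D.5.
    print(parse_write_spec("1:10:100"))
    print(parse_read_spec("100:30"))
```

Read this way, the command in Example D.5 asks for an average of 1 column per key, an average data size of 10, and 100 writer threads, and for 100 percent read verification using 30 reader threads.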

Added: hbase/hbase.apache.org/trunk/book/ape.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/ape.html?rev=1616896&view=auto
==============================================================================
--- hbase/hbase.apache.org/trunk/book/ape.html (added)
+++ hbase/hbase.apache.org/trunk/book/ape.html Fri Aug  8 22:19:16 2014
@@ -0,0 +1,15 @@
+<html><head>
+      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+   <title>Appendix&nbsp;E.&nbsp;YCSB: The Yahoo! Cloud Serving Benchmark and HBase</title><link rel="stylesheet" type="text/css" href="${baserdir}/src/main/site/resources/css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="prev" href="data.block.encoding.enable.html" title="D.3.&nbsp;Enable Data Block Encoding"><link rel="next" href="hfilev2.html" title="Appendix&nbsp;F.&nbsp;HFile format version 2"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Appendix&nbsp;E.&nbsp;YCSB: The Yahoo! Cloud Serving Benchmark and HBase</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="data.block.encoding.enable.html">Prev</a>&nbsp;</td>
 <th width="60%" align="center">&nbsp;</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="hfilev2.html">Next</a></td></tr></table><hr></div><script type="text/javascript">
+    var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
+    var disqus_url = 'http://hbase.apache.org/book/.html';
+    </script><div class="appendix"><div class="titlepage"><div><div><h1 class="title"><a name="d4029e21547"></a>Appendix&nbsp;E.&nbsp;<a class="link" href="https://github.com/brianfrankcooper/YCSB/" target="_top">YCSB: The Yahoo! Cloud Serving Benchmark</a> and HBase</h1></div></div></div><p>TODO: Describe how YCSB is poor for putting up a decent cluster load.</p><p>TODO: Describe setup of YCSB for HBase.  In particular, presplit your tables before you start
+          a run.  See <a class="link" href="https://issues.apache.org/jira/browse/HBASE-4163" target="_top">HBASE-4163 Create Split Strategy for YCSB Benchmark</a>
+          for why and a little shell command for how to do it.</p><p>Ted Dunning redid YCSB so it's mavenized and added facility for verifying workloads.  See <a class="link" href="https://github.com/tdunning/YCSB" target="_top">Ted Dunning's YCSB</a>.</p></div><div id="disqus_thread"></div><script type="text/javascript">
+    /* * * DON'T EDIT BELOW THIS LINE * * */
+    (function() {
+        var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+        dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
+        (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+    })();
+</script><noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript><a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="data.block.encoding.enable.html">Prev</a>&nbsp;</td><td width="20%" align="center">&nbsp;</td><td width="40%" align="right">&nbsp;<a accesskey="n" href="hfilev2.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">D.3.&nbsp;Enable Data Block Encoding&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="book.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;Appendix&nbsp;F.&nbsp;HFile format version 2</td></tr></table></div></body></html>
\ No newline at end of file

Added: hbase/hbase.apache.org/trunk/book/apfs02.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/apfs02.html?rev=1616896&view=auto
==============================================================================
--- hbase/hbase.apache.org/trunk/book/apfs02.html (added)
+++ hbase/hbase.apache.org/trunk/book/apfs02.html Fri Aug  8 22:19:16 2014
@@ -0,0 +1,21 @@
+<html><head>
+      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+   <title>F.2.&nbsp;HFile format version 1 overview</title><link rel="stylesheet" type="text/css" href="${baserdir}/src/main/site/resources/css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="hfilev2.html" title="Appendix&nbsp;F.&nbsp;HFile format version 2"><link rel="prev" href="hfilev2.html" title="Appendix&nbsp;F.&nbsp;HFile format version 2"><link rel="next" href="apfs03.html" title="F.3.&nbsp; HBase file format with inline blocks (version 2)"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">F.2.&nbsp;HFile format version 1 overview </th></tr><tr><td width="20%" align="left"><a accesskey="p" href="hfilev2.html">Prev</a>&nbsp;</td><th width="60%" align="center">Appendix&nbsp;F.&nbsp;HFile format ve
 rsion 2</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="apfs03.html">Next</a></td></tr></table><hr></div><script type="text/javascript">
+    var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
+    var disqus_url = 'http://hbase.apache.org/book/.html';
+    </script><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="d4029e21580"></a>F.2.&nbsp;HFile format version 1 overview </h2></div></div></div><p>As we will be discussing the changes we are making to the HFile format, it is useful to give a short overview of the previous (HFile version 1) format. An HFile in the existing format is structured as follows:
+           <span class="inlinemediaobject"><img src="/Users/stack/checkouts/hbase.git.commit/target/docbkx/book/images/hfile.png" align="middle" alt="HFile Version 1"></span>
+           <a href="#ftn.d4029e21592" class="footnote" name="d4029e21592"><sup class="footnote">[34]</sup></a>
+       </p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21599"></a>F.2.1.&nbsp; Block index format in version 1 </h3></div></div></div><p>The block index in version 1 is very straightforward. For each entry, it contains: </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>Offset (long)</p></li><li class="listitem"><p>Uncompressed size (int)</p></li><li class="listitem"><p>Key (a serialized byte array written using Bytes.writeByteArray) </p><div class="orderedlist"><ol class="orderedlist" type="a"><li class="listitem"><p>Key length as a variable-length integer (VInt)
+                  </p></li><li class="listitem"><p>
+                     Key bytes
+                 </p></li></ol></div></li></ol></div><p>The number of entries in the block index is stored in the fixed file trailer, and has to be passed in to the method that reads the block index. One of the limitations of the block index in version 1 is that it does not provide the compressed size of a block, which turns out to be necessary for decompression. Therefore, the HFile reader has to infer this compressed size from the offset difference between blocks. We fix this limitation in version 2, where we store on-disk block size instead of uncompressed size, and get uncompressed size from the block header.</p></div><div class="footnotes"><br><hr style="width:100; text-align:left;margin-left: 0"><div id="ftn.d4029e21592" class="footnote"><p><a href="#d4029e21592" class="para"><sup class="para">[34] </sup></a>Image courtesy of Lars George, <a class="link" href="http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html" target="_top">hbase-architecture-101-storage.ht
 ml</a>.</p></div></div></div><div id="disqus_thread"></div><script type="text/javascript">
+    /* * * DON'T EDIT BELOW THIS LINE * * */
+    (function() {
+        var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+        dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
+        (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+    })();
+</script><noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript><a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="hfilev2.html">Prev</a>&nbsp;</td><td width="20%" align="center"><a accesskey="u" href="hfilev2.html">Up</a></td><td width="40%" align="right">&nbsp;<a accesskey="n" href="apfs03.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Appendix&nbsp;F.&nbsp;HFile format version 2&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="book.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;F.3.&nbsp;
+      HBase file format with inline blocks (version 2)
+      </td></tr></table></div></body></html>
\ No newline at end of file
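
Reviewer note on section F.2.1 above: the text says a version 1 reader has to infer each block's compressed size from the offset difference between blocks, because the index stores only the offset and the uncompressed size. The Python sketch below shows that arithmetic. The function name and the data_section_end parameter (the offset at which the last block ends) are assumptions for illustration; the section does not say how the reader bounds the final block.

```python
def infer_on_disk_sizes(block_offsets, data_section_end):
    """Infer per-block on-disk (compressed) sizes from consecutive offsets.

    block_offsets: ascending file offsets taken from the block index.
    data_section_end: assumed offset just past the last block (hypothetical
    parameter, needed because the last block has no successor).
    """
    if not block_offsets:
        return []
    bounds = list(block_offsets) + [data_section_end]
    sizes = []
    for start, end in zip(bounds, bounds[1:]):
        if end <= start:
            raise ValueError("offsets must be strictly increasing")
        # Size of a block is the distance to the start of the next one.
        sizes.append(end - start)
    return sizes


if __name__ == "__main__":
    # Three blocks starting at 0, 4096 and 9000; data section ends at 12000.
    print(infer_on_disk_sizes([0, 4096, 9000], 12000))
```

Version 2 avoids this step by storing the on-disk block size in the index directly, as the section describes.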

Added: hbase/hbase.apache.org/trunk/book/apfs03.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/apfs03.html?rev=1616896&view=auto
==============================================================================
--- hbase/hbase.apache.org/trunk/book/apfs03.html (added)
+++ hbase/hbase.apache.org/trunk/book/apfs03.html Fri Aug  8 22:19:16 2014
@@ -0,0 +1,145 @@
+<html><head>
+      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+   <title>F.3.&nbsp; HBase file format with inline blocks (version 2)</title><link rel="stylesheet" type="text/css" href="${baserdir}/src/main/site/resources/css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="hfilev2.html" title="Appendix&nbsp;F.&nbsp;HFile format version 2"><link rel="prev" href="apfs02.html" title="F.2.&nbsp;HFile format version 1 overview"><link rel="next" href="other.info.html" title="Appendix&nbsp;G.&nbsp;Other Information About HBase"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">F.3.&nbsp;
+      HBase file format with inline blocks (version 2)
+      </th></tr><tr><td width="20%" align="left"><a accesskey="p" href="apfs02.html">Prev</a>&nbsp;</td><th width="60%" align="center">Appendix&nbsp;F.&nbsp;HFile format version 2</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="other.info.html">Next</a></td></tr></table><hr></div><script type="text/javascript">
+    var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
+    var disqus_url = 'http://hbase.apache.org/book/.html';
+    </script><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="d4029e21623"></a>F.3.&nbsp;
+      HBase file format with inline blocks (version 2)
+      </h2></div></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21626"></a>F.3.1.&nbsp; Overview</h3></div></div></div><p>The version of HBase introducing the above features reads both version 1 and 2 HFiles, but only writes version 2 HFiles. A version 2 HFile is structured as follows:
+           <span class="inlinemediaobject"><img src="/Users/stack/checkouts/hbase.git.commit/target/docbkx/book/images/hfilev2.png" align="middle" alt="HFile Version 2"></span>
+
+   </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21638"></a>F.3.2.&nbsp;Unified version 2 block format</h3></div></div></div><p>In version 2, every block in the data section contains the following fields: </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>8 bytes: Block type, a sequence of bytes equivalent to version 1's "magic records". Supported block types are: </p><div class="orderedlist"><ol class="orderedlist" type="a"><li class="listitem"><p>DATA &#8211; data blocks
+                  </p></li><li class="listitem"><p>
+                     LEAF_INDEX &#8211; leaf-level index blocks in a multi-level block index
+                 </p></li><li class="listitem"><p>
+                     BLOOM_CHUNK &#8211; Bloom filter chunks
+                  </p></li><li class="listitem"><p>
+                     META &#8211; meta blocks (not used for Bloom filters in version 2 anymore)
+                  </p></li><li class="listitem"><p>
+                     INTERMEDIATE_INDEX &#8211; intermediate-level index blocks in a multi-level block index
+                  </p></li><li class="listitem"><p>
+                     ROOT_INDEX &#8211; root-level index blocks in a multi-level block index
+                  </p></li><li class="listitem"><p>
+                     FILE_INFO &#8211; the &#8220;file info&#8221; block, a small key-value map of metadata
+                  </p></li><li class="listitem"><p>
+                     BLOOM_META &#8211; a Bloom filter metadata block in the load-on-open section
+                  </p></li><li class="listitem"><p>
+                     TRAILER &#8211; a fixed-size file trailer. As opposed to the above, this is not an
+                     HFile v2 block but a fixed-size (for each HFile version) data structure
+                  </p></li><li class="listitem"><p>
+                      INDEX_V1 &#8211; this block type is only used for legacy HFile v1 blocks
+                  </p></li></ol></div></li><li class="listitem"><p>Compressed size of the block's data, not including the header (int).
+         </p><p>
+Can be used for skipping the current data block when scanning HFile data.
+                  </p></li><li class="listitem"><p>Uncompressed size of the block's data, not including the header (int)</p><p>
+ This is equal to the compressed size if the compression algorithm is NONE.
+                  </p></li><li class="listitem"><p>File offset of the previous block of the same type (long)</p><p>
+ Can be used for seeking to the previous data/index block
+                  </p></li><li class="listitem"><p>Compressed data (or uncompressed data if the compression algorithm is NONE).</p></li></ol></div><p>The above format of blocks is used in the following HFile sections:</p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>Scanned block section. The section is named so because it contains all data blocks that need to be read when an HFile is scanned sequentially. It also contains leaf index blocks and Bloom chunk blocks. </p></li><li class="listitem"><p>Non-scanned block section. This section still contains unified-format v2 blocks, but it does not have to be read when doing a sequential scan. This section contains &#8220;meta&#8221; blocks and intermediate-level index blocks.
+         </p></li></ol></div><p>We are supporting &#8220;meta&#8221; blocks in version 2 the same way they were supported in version 1, even though we do not store Bloom filter data in these blocks anymore. </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21707"></a>F.3.3.&nbsp; Block index in version 2</h3></div></div></div><p>There are three types of block indexes in HFile version 2, stored in two different formats (root and non-root): </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>Data index &#8212; version 2 multi-level block index, consisting of:</p><div class="orderedlist"><ol class="orderedlist" type="a"><li class="listitem"><p>
+ Version 2 root index, stored in the data block index section of the file
+             </p></li><li class="listitem"><p>
+Optionally, version 2 intermediate levels, stored in the non-root format in the data index section of the file. Intermediate levels can only be present if leaf-level blocks are present
+             </p></li><li class="listitem"><p>
+Optionally, version 2 leaf levels, stored in the non-root format inline with data blocks
+             </p></li></ol></div></li><li class="listitem"><p>Meta index &#8212; version 2 root index format only, stored in the meta index section of the file</p></li><li class="listitem"><p>Bloom index &#8212; version 2 root index format only, stored in the &#8220;load-on-open&#8221; section as part of Bloom filter metadata.</p></li></ol></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21732"></a>F.3.4.&nbsp;
+      Root block index format in version 2</h3></div></div></div><p>This format applies to:</p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>Root level of the version 2 data index</p></li><li class="listitem"><p>Entire meta and Bloom indexes in version 2, which are always single-level. </p></li></ol></div><p>A version 2 root index block is a sequence of entries of the following format, similar to entries of a version 1 block index, but storing on-disk size instead of uncompressed size. </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>Offset (long) </p><p>
+This offset may point to a data block or to a deeper-level index block.
+             </p></li><li class="listitem"><p>On-disk size (int) </p></li><li class="listitem"><p>Key (a serialized byte array stored using Bytes.writeByteArray) </p><div class="orderedlist"><ol class="orderedlist" type="a"><li class="listitem"><p>Key length (VInt)
+             </p></li><li class="listitem"><p>Key bytes
+             </p></li></ol></div></li></ol></div><p>A single-level version 2 block index consists of just a single root index block. To read a root index block of version 2, one needs to know the number of entries. For the data index and the meta index, the number of entries is stored in the trailer, and for the Bloom index it is stored in the compound Bloom filter metadata.</p><p>For a multi-level block index, we also store the following fields in the root index block in the load-on-open section of the HFile, in addition to the data structure described above:</p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>Middle leaf index block offset</p></li><li class="listitem"><p>Middle leaf block on-disk size (meaning the leaf index block containing the reference to the &#8220;middle&#8221; data block of the file) </p></li><li class="listitem"><p>The index of the mid-key (defined below) in the middle leaf-level block.</p></li></ol></div><p></p><p>These additional fields are used to efficiently retrieve the mid-key of the HFile, which is used in HFile splits. We define the mid-key as the first key of the block with a zero-based index of (n &#8211; 1) / 2, if the total number of blocks in the HFile is n. This definition is consistent with how the mid-key was determined in HFile version 1, and is reasonable in general, because blocks are likely to be the same size on average, but we don&#8217;t have any estimates on individual key/value pair sizes. </p><p></p><p>When writing a version 2 HFile, we keep track of the total number of data blocks pointed to by every leaf-level index block. When we finish writing and the total number of leaf-level blocks is determined, it is clear which leaf-level block contains the mid-key, and the fields listed above are computed. When the HFile is being read and the mid-key is requested, we retrieve the middle leaf index block (potentially from the block cache) and get the mid-key value from the appropriate position inside
 that leaf block.</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21785"></a>F.3.5.&nbsp;
+      Non-root block index format in version 2</h3></div></div></div><p>This format applies to intermediate-level and leaf index blocks of a version 2 multi-level data block index. Every non-root index block is structured as follows. </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>numEntries: the number of entries (int). </p></li><li class="listitem"><p>entryOffsets: the &#8220;secondary index&#8221; of offsets of entries in the block, to facilitate a quick binary search on the key (numEntries + 1 int values). The last value is the total length of all entries in this index block. For example, in a non-root index block with entry sizes 60, 80, 50 the &#8220;secondary index&#8221; will contain the following int array: {0, 60, 140, 190}.</p></li><li class="listitem"><p>Entries. Each entry contains: </p><div class="orderedlist"><ol class="orderedlist" type="a"><li class="listitem"><p>
+Offset of the block referenced by this entry in the file (long)
+             </p></li><li class="listitem"><p>
+On-disk size of the referenced block (int)
+             </p></li><li class="listitem"><p>
+Key. The length can be calculated from entryOffsets.
+             </p></li></ol></div></li></ol></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21810"></a>F.3.6.&nbsp;
+      Bloom filters in version 2</h3></div></div></div><p>In contrast with version 1, in a version 2 HFile Bloom filter metadata is stored in the load-on-open section of the HFile for quick startup. </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>A compound Bloom filter. </p><div class="orderedlist"><ol class="orderedlist" type="a"><li class="listitem"><p>
+ Bloom filter version = 3 (int). There used to be a DynamicByteBloomFilter class that had the Bloom   filter version number 2
+             </p></li><li class="listitem"><p>
+The total byte size of all compound Bloom filter chunks (long)
+             </p></li><li class="listitem"><p>
+ Number of hash functions (int)
+             </p></li><li class="listitem"><p>
+Type of hash functions (int)
+             </p></li><li class="listitem"><p>
+The total key count inserted into the Bloom filter (long)
+             </p></li><li class="listitem"><p>
+The maximum total number of keys in the Bloom filter (long)
+             </p></li><li class="listitem"><p>
+The number of chunks (int)
+             </p></li><li class="listitem"><p>
+Comparator class used for Bloom filter keys, a UTF-8 encoded string stored using Bytes.writeByteArray
+             </p></li><li class="listitem"><p>
+ Bloom block index in the version 2 root block index format
+             </p></li></ol></div></li></ol></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21847"></a>F.3.7.&nbsp;File Info format in versions 1 and 2</h3></div></div></div><p>The file info block is a serialized <a class="link" href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/HbaseMapWritable.html" target="_top">HbaseMapWritable</a> (essentially a map from byte arrays to byte arrays) with the following keys, among others. StoreFile-level logic adds more keys to this.</p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><tbody><tr><td>
+               <p>hfile.LASTKEY </p>
+            </td><td>
+               <p>The last key of the file (byte array) </p>
+            </td></tr><tr><td>
+               <p>hfile.AVG_KEY_LEN </p>
+            </td><td>
+               <p>The average key length in the file (int) </p>
+            </td></tr><tr><td>
+               <p>hfile.AVG_VALUE_LEN </p>
+            </td><td>
+               <p>The average value length in the file (int) </p>
+            </td></tr></tbody></table></div><p>File info format did not change in version 2. However, we moved the file info to the final section of the file, which can be loaded as one block at the time the HFile is being opened. Also, we do not store comparator in the version 2 file info anymore. Instead, we store it in the fixed file trailer. This is because we need to know the comparator at the time of parsing the load-on-open section of the HFile.</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e21893"></a>F.3.8.&nbsp;
+      Fixed file trailer format differences between versions 1 and 2</h3></div></div></div><p>The following table shows common and different fields between fixed file trailers in versions 1 and 2. Note that the size of the trailer is different depending on the version, so it is &#8220;fixed&#8221; only within one version. However, the version is always stored as the last four-byte integer in the file. </p><p></p><div class="informaltable"><table border="1"><colgroup><col class="c1"><col class="c2"></colgroup><tbody><tr><td>
+               <p>Version 1 </p>
+            </td><td>
+               <p>Version 2 </p>
+            </td></tr><tr><td colspan="2" align="center">
+               <p>File info offset (long) </p>
+            </td></tr><tr><td>
+               <p>Data index offset (long) </p>
+            </td><td>
+                <p>loadOnOpenOffset (long)</p>
+                <p><span class="emphasis"><em>The offset of the section that we need to load when opening the file.</em></span></p>
+            </td></tr><tr><td colspan="2" align="center">
+               <p>Number of data index entries (int) </p>
+            </td></tr><tr><td>
+               <p>metaIndexOffset (long)</p>
+               <p>This field is not being used by the version 1 reader, so we removed it from version 2.</p>
+            </td><td>
+               <p>uncompressedDataIndexSize (long)</p>
+               <p>The total uncompressed size of the whole data block index, including root-level, intermediate-level, and leaf-level blocks.</p>
+            </td></tr><tr><td colspan="2" align="center">
+               <p>Number of meta index entries (int) </p>
+            </td></tr><tr><td colspan="2" align="center">
+               <p>Total uncompressed bytes (long) </p>
+            </td></tr><tr><td>
+               <p>numEntries (int) </p>
+            </td><td>
+               <p>numEntries (long) </p>
+            </td></tr><tr><td colspan="2" align="center">
+               <p>Compression codec: 0 = LZO, 1 = GZ, 2 = NONE (int) </p>
+            </td></tr><tr><td>
+               <p></p>
+            </td><td>
+               <p>The number of levels in the data block index (int) </p>
+            </td></tr><tr><td>
+               <p></p>
+            </td><td>
+               <p>firstDataBlockOffset (long)</p>
+               <p>The offset of the first data block. Used when scanning. </p>
+            </td></tr><tr><td>
+               <p></p>
+            </td><td>
+               <p>lastDataBlockEnd (long)</p>
+               <p>The offset of the first byte after the last key/value data block. We don't need to go beyond this offset when scanning. </p>
+            </td></tr><tr><td>
+               <p>Version: 1 (int) </p>
+            </td><td>
+               <p>Version: 2 (int) </p>
+            </td></tr></tbody></table></div><p></p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d4029e22036"></a>F.3.9.&nbsp;getShortMidpointKey (an optimization for data index blocks)</h3></div></div></div><p>Note: this optimization was introduced in HBase 0.95.</p><p>HFiles contain many blocks, each of which contains a range of sorted Cells. Each Cell has a key. To save IO when reading Cells, the HFile also has an index that maps a Cell's start key to the offset of the beginning of a particular block. Prior to this optimization, HBase would use the key of the first Cell in each data block as the index key.</p><p>In HBASE-7845, we generate a new key that is lexicographically larger than the last key of the previous block and lexicographically smaller than or equal to the start key of the current block. While actual keys can potentially be very long, this "fake key" or "virtual key" can be much shorter. For example, if the last key of the previous block is "the quick brown fox" and the start key of the current block is "the who", we could use "the r" as the virtual key in the HFile index.</p><p>There are two benefits to this:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>having shorter keys reduces the HFile index size (allowing us to keep more indexes in memory), and</p></li><li class="listitem"><p>using something closer to the end key of the previous block allows us to avoid a potential extra IO when the target key lives in between the "virtual key" and the key of the first element in the target block.</p></li></ul></div><p>This optimization (implemented by the getShortMidpointKey method) is inspired by LevelDB's BytewiseComparatorImpl::FindShortestSeparator() and FindShortSuccessor().</p></div></div><div id="disqus_thread"></div><script type="text/javascript">
+</script><noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript><a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="apfs02.html">Prev</a>&nbsp;</td><td width="20%" align="center"><a accesskey="u" href="hfilev2.html">Up</a></td><td width="40%" align="right">&nbsp;<a accesskey="n" href="other.info.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">F.2.&nbsp;HFile format version 1 overview &nbsp;</td><td width="20%" align="center"><a accesskey="h" href="book.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;Appendix&nbsp;G.&nbsp;Other Information About HBase</td></tr></table></div></body></html>
\ No newline at end of file
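
The "virtual key" choice described in section F.3.9 above can be sketched
roughly as follows. This is only an illustration in the style of LevelDB's
FindShortestSeparator, working on plain byte arrays; it is not HBase's
getShortMidpointKey, and the names ShortSeparatorSketch and shortestSeparator
are hypothetical. It assumes keys compare as unsigned bytes and that the last
key of the previous block sorts strictly before the first key of the current
block.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; HBase's real method works on Cell keys, not raw byte arrays.
final class ShortSeparatorSketch {

    /**
     * Returns a key k with prevLast < k <= nextFirst in unsigned byte order.
     * Assumes prevLast < nextFirst. Falls back to a copy of nextFirst when this
     * simple rule does not find a shorter key.
     */
    static byte[] shortestSeparator(byte[] prevLast, byte[] nextFirst) {
        int minLen = Math.min(prevLast.length, nextFirst.length);
        int i = 0;
        // Skip the common prefix of the two keys.
        while (i < minLen && prevLast[i] == nextFirst[i]) {
            i++;
        }
        if (i < minLen) {
            int left = prevLast[i] & 0xff;
            int right = nextFirst[i] & 0xff;
            // If there is room between the first differing bytes, bump the
            // byte from prevLast by one and cut the key off right after it.
            if (left + 1 < right) {
                byte[] separator = Arrays.copyOf(prevLast, i + 1);
                separator[i] = (byte) (left + 1);
                return separator;
            }
        }
        return nextFirst.clone();
    }

    public static void main(String[] args) {
        byte[] prev = "the quick brown fox".getBytes(StandardCharsets.UTF_8);
        byte[] next = "the who".getBytes(StandardCharsets.UTF_8);
        byte[] sep = shortestSeparator(prev, next);
        System.out.println(new String(sep, StandardCharsets.UTF_8)); // prints "the r"
    }
}
```

With the keys from the example in F.3.9, the common prefix is "the ", the
first differing bytes are 'q' and 'w', and 'q' + 1 = 'r' still sorts before
'w', so the sketch yields "the r".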

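The entryOffsets "secondary index" described in section F.3.5 above can be
illustrated with a small sketch. It assumes only what that section states: each
entry holds a block offset (long), an on-disk size (int), and the key bytes.
The class and method names (NonRootIndexSketch, buildEntryOffsets, keyLength)
are hypothetical and are not taken from the HBase code base.

```java
import java.util.Arrays;

// Hypothetical sketch of the non-root index block's secondary index.
final class NonRootIndexSketch {

    // Fixed-size fields in every entry: block offset (long) + on-disk size (int).
    static final int FIXED_ENTRY_BYTES = Long.BYTES + Integer.BYTES;

    /** Builds numEntries + 1 offsets; the last value is the total length of all entries. */
    static int[] buildEntryOffsets(int[] entrySizes) {
        int[] offsets = new int[entrySizes.length + 1];
        for (int i = 0; i < entrySizes.length; i++) {
            offsets[i + 1] = offsets[i] + entrySizes[i];
        }
        return offsets;
    }

    /** Key length of one entry: its total size minus the fixed-size fields. */
    static int keyLength(int[] entryOffsets, int entry) {
        return entryOffsets[entry + 1] - entryOffsets[entry] - FIXED_ENTRY_BYTES;
    }

    public static void main(String[] args) {
        int[] offsets = buildEntryOffsets(new int[] {60, 80, 50});
        System.out.println(Arrays.toString(offsets)); // prints [0, 60, 140, 190]
        System.out.println(keyLength(offsets, 1));    // prints 68
    }
}
```

Because the offsets are stored as fixed-width ints, a reader can jump straight
to the i-th entry during a binary search without parsing the entries before it.
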
Added: hbase/hbase.apache.org/trunk/book/apks02.html
URL: http://svn.apache.org/viewvc/hbase/hbase.apache.org/trunk/book/apks02.html?rev=1616896&view=auto
==============================================================================
--- hbase/hbase.apache.org/trunk/book/apks02.html (added)
+++ hbase/hbase.apache.org/trunk/book/apks02.html Fri Aug  8 22:19:16 2014
@@ -0,0 +1,21 @@
+<html><head>
+      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+   <title>K.2.&nbsp;TODO</title><link rel="stylesheet" type="text/css" href="${baserdir}/src/main/site/resources/css/freebsd_docbook.css"><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="book.html" title="The Apache HBase&#153; Reference Guide"><link rel="up" href="hbase.rpc.html" title="Appendix&nbsp;K.&nbsp;0.95 RPC Specification"><link rel="prev" href="hbase.rpc.html" title="Appendix&nbsp;K.&nbsp;0.95 RPC Specification"><link rel="next" href="apks03.html" title="K.3.&nbsp;RPC"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">K.2.&nbsp;TODO</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="hbase.rpc.html">Prev</a>&nbsp;</td><th width="60%" align="center">Appendix&nbsp;K.&nbsp;0.95 RPC Specification</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="apks03.html">Next</a></t
 d></tr></table><hr></div><script type="text/javascript">
+    var disqus_shortname = 'hbase'; // required: replace example with your forum shortname
+    var disqus_url = 'http://hbase.apache.org/book/.html';
+    </script><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="d4029e22378"></a>K.2.&nbsp;TODO</h2></div></div></div><p>
+            </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>List of problems with the currently specified format and where we would like
+                        to go in a version 2, etc. For example, what would we have to change, if
+                        anything, to make the server asynchronous or to support streaming/chunking?</p></li><li class="listitem"><p>Diagram of how it works</p></li><li class="listitem"><p>A grammar that succinctly describes the wire format. Currently we have
+                        these words and the content of the RPC protobuf IDL, but a grammar for the
+                        back and forth would help with grokking RPC. Also, a little state machine on
+                        client/server interactions would help with understanding (and ensuring
+                        correct implementation).</p></li></ol></div><p>
+        </p></div><div id="disqus_thread"></div><script type="text/javascript">
+</script><noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript><a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="hbase.rpc.html">Prev</a>&nbsp;</td><td width="20%" align="center"><a accesskey="u" href="hbase.rpc.html">Up</a></td><td width="40%" align="right">&nbsp;<a accesskey="n" href="apks03.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Appendix&nbsp;K.&nbsp;0.95 RPC Specification&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="book.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;K.3.&nbsp;RPC</td></tr></table></div></body></html>
\ No newline at end of file