Posted to commits@hbase.apache.org by nd...@apache.org on 2015/08/19 00:35:59 UTC

[07/15] hbase git commit: HBASE-14066 clean out old docbook docs from branch-1.

http://git-wip-us.apache.org/repos/asf/hbase/blob/0acbff24/src/main/docbkx/performance.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/performance.xml b/src/main/docbkx/performance.xml
deleted file mode 100644
index 1757d3f..0000000
--- a/src/main/docbkx/performance.xml
+++ /dev/null
@@ -1,1207 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<chapter
-  version="5.0"
-  xml:id="performance"
-  xmlns="http://docbook.org/ns/docbook"
-  xmlns:xlink="http://www.w3.org/1999/xlink"
-  xmlns:xi="http://www.w3.org/2001/XInclude"
-  xmlns:svg="http://www.w3.org/2000/svg"
-  xmlns:m="http://www.w3.org/1998/Math/MathML"
-  xmlns:html="http://www.w3.org/1999/xhtml"
-  xmlns:db="http://docbook.org/ns/docbook">
-  <!--
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
--->
-  <title>Apache HBase Performance Tuning</title>
-
-  <section
-    xml:id="perf.os">
-    <title>Operating System</title>
-    <section
-      xml:id="perf.os.ram">
-      <title>Memory</title>
-      <para>RAM, RAM, RAM. Don't starve HBase.</para>
-    </section>
-    <section
-      xml:id="perf.os.64">
-      <title>64-bit</title>
-      <para>Use a 64-bit platform (and 64-bit JVM).</para>
-    </section>
-    <section
-      xml:id="perf.os.swap">
-      <title>Swapping</title>
-      <para>Watch out for swapping. Set swappiness to 0.</para>
-    </section>
-  </section>
-  <section
-    xml:id="perf.network">
-    <title>Network</title>
-    <para> Perhaps the most important factor in avoiding network issues degrading Hadoop and HBase
-      performance is the switching hardware that is used. Decisions made early in the scope of the
-      project can cause major problems when you double or triple the size of your cluster (or more). </para>
-    <para> Important items to consider: <itemizedlist>
-        <listitem>
-          <para>Switching capacity of the device</para>
-        </listitem>
-        <listitem>
-          <para>Number of systems connected</para>
-        </listitem>
-        <listitem>
-          <para>Uplink capacity</para>
-        </listitem>
-      </itemizedlist>
-    </para>
-    <section
-      xml:id="perf.network.1switch">
-      <title>Single Switch</title>
-      <para>The single most important factor in this configuration is that the switching capacity of
-        the hardware is capable of handling the traffic which can be generated by all systems
-        connected to the switch. Some lower priced commodity hardware can have a slower switching
-        capacity than can be utilized by a fully populated switch. </para>
-    </section>
-    <section
-      xml:id="perf.network.2switch">
-      <title>Multiple Switches</title>
-      <para>Multiple switches are a potential pitfall in the architecture. The most common
-        configuration of lower priced hardware is a simple 1Gbps uplink from one switch to another.
-        This often overlooked pinch point can easily become a bottleneck for cluster communication.
-        Especially with MapReduce jobs that are both reading and writing a lot of data,
-        communication across this uplink can become saturated. </para>
-      <para>Mitigation of this issue is fairly simple and can be accomplished in multiple ways: </para>
-      <itemizedlist>
-        <listitem>
-          <para>Use appropriate hardware for the scale of the cluster which you're attempting to
-            build.</para>
-        </listitem>
-        <listitem>
-          <para>Use larger single switch configurations, i.e., a single 48-port switch as opposed
-            to two 24-port switches.</para>
-        </listitem>
-        <listitem>
-          <para>Configure port trunking for uplinks to utilize multiple interfaces to increase cross
-            switch bandwidth.</para>
-        </listitem>
-      </itemizedlist>
-    </section>
-    <section
-      xml:id="perf.network.multirack">
-      <title>Multiple Racks</title>
-      <para>Multiple rack configurations carry the same potential issues as multiple switches, and
-        can suffer performance degradation from two main areas: </para>
-      <itemizedlist>
-        <listitem>
-          <para>Poor switch capacity performance</para>
-        </listitem>
-        <listitem>
-          <para>Insufficient uplink to another rack</para>
-        </listitem>
-      </itemizedlist>
-      <para>If the switches in your rack have appropriate switching capacity to handle all the
-        hosts at full speed, the next most likely issue will be caused by homing more of your
-        cluster across racks. The easiest way to avoid issues when spanning multiple racks is to use
-        port trunking to create a bonded uplink to other racks. The downside of this method,
-        however, is in the overhead of ports that could potentially be used. For example, creating
-        an 8Gbps port channel from rack A to rack B uses 8 of your 24 ports to communicate between
-        racks, which gives you a poor ROI; using too few ports, however, can mean you're not
-        getting the most out of your cluster. </para>
-      <para>Using 10GbE links between racks will greatly increase performance, and assuming your
-        switches support a 10GbE uplink or allow for an expansion card, doing so lets you save your
-        ports for machines as opposed to uplinks. </para>
-    </section>
-    <section
-      xml:id="perf.network.ints">
-      <title>Network Interfaces</title>
-      <para>Are all the network interfaces functioning correctly? Are you sure? See the
-        Troubleshooting Case Study in <xref
-          linkend="casestudies.slownode" />. </para>
-    </section>
-  </section>
-  <!-- network -->
-
-  <section
-    xml:id="jvm">
-    <title>Java</title>
-
-    <section
-      xml:id="gc">
-      <title>The Garbage Collector and Apache HBase</title>
-
-      <section
-        xml:id="gcpause">
-        <title>Long GC pauses</title>
-
-        <para xml:id="mslab">In his presentation, <link
-            xlink:href="http://www.slideshare.net/cloudera/hbase-hug-presentation">Avoiding Full GCs
-            with MemStore-Local Allocation Buffers</link>, Todd Lipcon describes two cases of
-          stop-the-world garbage collections common in HBase, especially during loading; CMS failure
-          modes and old generation heap fragmentation brought. To address the first, start the CMS
-          earlier than default by adding <code>-XX:CMSInitiatingOccupancyFraction</code> and setting
-          it down from defaults. Start at 60 or 70 percent (The lower you bring down the threshold,
-          the more GCing is done, the more CPU used). To address the second fragmentation issue,
-          Todd added an experimental facility, <indexterm><primary>MSLAB</primary></indexterm>, that
-          must be explicitly enabled in Apache HBase 0.90.x (Its defaulted to be on in Apache 0.92.x
-          HBase). See <code>hbase.hregion.memstore.mslab.enabled</code> to true in your
-            <classname>Configuration</classname>. See the cited slides for background and detail.
-          The latest jvms do better regards fragmentation so make sure you are running a recent
-          release. Read down in the message, <link
-            xlink:href="http://osdir.com/ml/hotspot-gc-use/2011-11/msg00002.html">Identifying
-            concurrent mode failures caused by fragmentation</link>. Be aware that when enabled,
-          each MemStore instance will occupy at least an MSLAB instance of memory. If you have
-          thousands of regions or lots of regions each with many column families, this allocation of
-          MSLAB may be responsible for a good portion of your heap allocation and in an extreme case
-          cause you to OOME. Disable MSLAB in this case, or lower the amount of memory it uses or
-          float less regions per server. </para>
-        <para>If you have a write-heavy workload, check out <link
-            xlink:href="https://issues.apache.org/jira/browse/HBASE-8163">HBASE-8163
-            MemStoreChunkPool: An improvement for JAVA GC when using MSLAB</link>. It describes
-          configurations to lower the amount of young GC during write-heavy loadings. If you do not
-          have HBASE-8163 installed, and you are trying to improve your young GC times, one trick to
-          consider -- courtesy of our Liang Xie -- is to set the GC config
-            <varname>-XX:PretenureSizeThreshold</varname> in <filename>hbase-env.sh</filename> to be
-          just smaller than the size of <varname>hbase.hregion.memstore.mslab.chunksize</varname> so
-          MSLAB allocations happen directly in the tenured space rather than first in the young gen.
-          You'd do this because these MSLAB allocations are likely to make it to the old gen anyway,
-          and rather than pay the price of copies between the s0 and s1 survivor spaces followed by
-          the copy up from young to old gen after the MSLABs have achieved sufficient tenure, you
-          save a bit of YGC churn and allocate in the old gen directly. </para>
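-        <para>As a rough sketch, assuming the default 2MB MSLAB chunk size
-          (<varname>hbase.hregion.memstore.mslab.chunksize</varname> = 2097152 bytes), the relevant
-          <filename>hbase-env.sh</filename> additions might look like the following. The values
-          here are illustrative only and should be tuned against your own GC logs:</para>
-        <screen>
-# Start CMS at 70% old-gen occupancy rather than the JVM default, and
-# allocate objects of ~2MB and larger (i.e. MSLAB chunks) directly in
-# the old generation, skipping the survivor-space copying.
-export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC \
-  -XX:CMSInitiatingOccupancyFraction=70 \
-  -XX:PretenureSizeThreshold=2097088"
-        </screen>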
-        <para>For more information about GC logs, see <xref
-            linkend="trouble.log.gc" />. </para>
-    <para>Consider also enabling the offheap Block Cache. This has been shown to mitigate
-        GC pause times. See <xref linkend="block.cache" />.</para>
-      </section>
-    </section>
-  </section>
-
-  <section
-    xml:id="perf.configurations">
-    <title>HBase Configurations</title>
-
-    <para>See <xref
-        linkend="recommended_configurations" />.</para>
-
-    <section
-      xml:id="perf.compactions.and.splits">
-      <title>Managing Compactions</title>
-
-      <para>For larger systems, managing <link
-      linkend="disable.splitting">compactions and splits</link> may be
-      something you want to consider.</para>
-    </section>
-
-    <section xml:id="perf.handlers">
-        <title><varname>hbase.regionserver.handler.count</varname></title>
-        <para>See <xref linkend="hbase.regionserver.handler.count"/>.
-	    </para>
-    </section>
-    
-
-
-    <section xml:id="perf.hfile.block.cache.size">
-        <title><varname>hfile.block.cache.size</varname></title>
-        <para>See <xref linkend="hfile.block.cache.size"/>.
-        A memory setting for the RegionServer process.
-        </para>
-    </section>
-    <section xml:id="blockcache.prefetch">
-      <title>Prefetch Option for Blockcache</title>
-      <para><link xlink:href="https://issues.apache.org/jira/browse/HBASE-9857">HBASE-9857</link>
-        adds a new option to prefetch HFile contents when opening the blockcache, if a columnfamily
-        or regionserver property is set. This option is available for HBase 0.98.3 and later. The
-        purpose is to warm the blockcache as rapidly as possible after the cache is opened, using
-        in-memory table data, and not counting the prefetching as cache misses. This is great for
-        fast reads, but is not a good idea if the data to be preloaded will not fit into the
-        blockcache. It is useful for tuning the IO impact of prefetching versus the time before all
-        data blocks are in cache. </para>
-      <para>To enable prefetching on a given column family, you can use HBase Shell or use the
-        API.</para>
-      <example>
-        <title>Enable Prefetch Using HBase Shell</title>
-        <screen>hbase> create 'MyTable', { NAME => 'myCF', PREFETCH_BLOCKS_ON_OPEN => 'true' }</screen>
-      </example>
-      <example>
-        <title>Enable Prefetch Using the API</title>
-        <programlisting language="java">
-// ...
-HTableDescriptor tableDesc = new HTableDescriptor("myTable");
-HColumnDescriptor cfDesc = new HColumnDescriptor("myCF");
-cfDesc.setPrefetchBlocksOnOpen(true);
-tableDesc.addFamily(cfDesc);
-// ...        
-        </programlisting>
-      </example>
-      <para>See the API documentation for <link
-          xlink:href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html"
-          >CacheConfig</link>.</para>
-    </section>
-    <section xml:id="perf.rs.memstore.size">
-        <title><varname>hbase.regionserver.global.memstore.size</varname></title>
-        <para>See <xref linkend="hbase.regionserver.global.memstore.size"/>.
-        This memory setting is often adjusted for the RegionServer process depending on needs.
-        </para>
-    </section>
-    <section xml:id="perf.rs.memstore.size.lower.limit">
-        <title><varname>hbase.regionserver.global.memstore.size.lower.limit</varname></title>
-        <para>See <xref linkend="hbase.regionserver.global.memstore.size.lower.limit"/>.
-        This memory setting is often adjusted for the RegionServer process depending on needs.
-        </para>
-    </section>
-    <section xml:id="perf.hstore.blockingstorefiles">
-        <title><varname>hbase.hstore.blockingStoreFiles</varname></title>
-        <para>See <xref linkend="hbase.hstore.blockingStoreFiles"/>.
-        If there is blocking in the RegionServer logs, increasing this can help.
-        </para>
-    </section>
-    <section xml:id="perf.hregion.memstore.block.multiplier">
-        <title><varname>hbase.hregion.memstore.block.multiplier</varname></title>
-        <para>See <xref linkend="hbase.hregion.memstore.block.multiplier"/>.
-        If there is enough RAM, increasing this can help.
-        </para>
-    </section>
-    <section xml:id="hbase.regionserver.checksum.verify">
-        <title><varname>hbase.regionserver.checksum.verify</varname></title>
-        <para>Have HBase write the checksum into the datablock and save
-        having to do the checksum seek whenever you read.</para>
-
-        <para>See <xref linkend="hbase.regionserver.checksum.verify"/>,
-        <xref linkend="hbase.hstore.bytes.per.checksum"/> and <xref linkend="hbase.hstore.checksum.algorithm"/>
-        For more information see the
-        release note on <link xlink:href="https://issues.apache.org/jira/browse/HBASE-5074">HBASE-5074 support checksums in HBase block cache</link>.
-        </para>
-    </section>
-    <section>
-      <title>Tuning <code>callQueue</code> Options</title>
-      <para><link xlink:href="https://issues.apache.org/jira/browse/HBASE-11355">HBASE-11355</link>
-        introduces several callQueue tuning mechanisms which can increase performance. See the JIRA
-        for some benchmarking information.</para>
-      <itemizedlist>
-        <listitem>
-          <para>To increase the number of callqueues, set
-              <option>hbase.ipc.server.num.callqueue</option> to a value greater than
-              <literal>1</literal>.</para>
-        </listitem>
-        <listitem>
-          <para>To split the callqueue into separate read and write queues, set
-              <code>hbase.ipc.server.callqueue.read.ratio</code> to a value between
-              <literal>0</literal> and <literal>1</literal>. This factor weights the queues toward
-            writes (if below .5) or reads (if above .5). Another way to say this is that the factor
-            determines what percentage of the split queues are used for reads. The following
-            examples illustrate some of the possibilities. Note that you always have at least one
-            write queue, no matter what setting you use.</para>
-          <itemizedlist>
-            <listitem>
-              <para>The default value of <literal>0</literal> does not split the queue.</para>
-            </listitem>
-            <listitem>
-              <para>A value of <literal>.3</literal> uses 30% of the queues for reading and 70% for
-                writing. Given a value of <literal>10</literal> for
-                  <option>hbase.ipc.server.num.callqueue</option>, 3 queues would be used for reads
-                and 7 for writes.</para>
-            </listitem>
-            <listitem>
-              <para>A value of <literal>.5</literal> uses the same number of read queues and write
-                queues. Given a value of <literal>10</literal> for
-                <option>hbase.ipc.server.num.callqueue</option>, 5 queues would be used for reads
-                and 5 for writes.</para>
-            </listitem>
-            <listitem>
-              <para>A value of <literal>.6</literal> uses 60% of the queues for reading and 40% for
-                writing. Given a value of <literal>10</literal> for
-                <option>hbase.ipc.server.num.callqueue</option>, 6 queues would be used for reads
-                and 4 for writes.</para>
-            </listitem>
-            <listitem>
-              <para>A value of <literal>1.0</literal> uses one queue to process write requests, and
-                all other queues process read requests. A value higher than <literal>1.0</literal>
-                has the same effect as a value of <literal>1.0</literal>. Given a value of
-                  <literal>10</literal> for <option>hbase.ipc.server.num.callqueue</option>, 9
-                queues would be used for reads and 1 for writes.</para>
-            </listitem>
-          </itemizedlist>
-        </listitem>
-        <listitem>
-          <para>You can also split the read queues so that separate queues are used for short reads
-            (from Get operations) and long reads (from Scan operations), by setting the
-              <option>hbase.ipc.server.callqueue.scan.ratio</option> option. This option is a factor
-            between 0 and 1, which determines the ratio of read queues used for Gets versus Scans.
-            More queues are used for Gets if the value is below <literal>.5</literal> and more are
-            used for Scans if the value is above <literal>.5</literal>. No matter what setting you
-            use, at least one read queue is used for Get operations.</para>
-          <itemizedlist>
-            <listitem>
-              <para>A value of <literal>0</literal> does not split the read queue.</para>
-            </listitem>
-            <listitem>
-              <para>A value of <literal>.3</literal> uses 70% of the read queues for Gets and 30%
-                for Scans. Given a value of <literal>20</literal> for
-                  <option>hbase.ipc.server.num.callqueue</option> and a value of <literal>.5
-                </literal> for <option>hbase.ipc.server.callqueue.read.ratio</option>, 10 queues
-                would be used for reads; of those 10, 7 would be used for Gets and 3 for
-                Scans.</para>
-            </listitem>
-            <listitem>
-              <para>A value of <literal>.5</literal> uses half the read queues for Gets and half for
-                Scans. Given a value of <literal>20</literal> for
-                  <option>hbase.ipc.server.num.callqueue</option> and a value of <literal>.5
-                </literal> for <option>hbase.ipc.server.callqueue.read.ratio</option>, 10 queues
-                would be used for reads; of those 10, 5 would be used for Gets and 5 for
-                Scans.</para>
-            </listitem>
-            <listitem>
-              <para>A value of <literal>.6</literal> uses 40% of the read queues for Gets and 60%
-                for Scans. Given a value of <literal>20</literal> for
-                  <option>hbase.ipc.server.num.callqueue</option> and a value of <literal>.5
-                </literal> for <option>hbase.ipc.server.callqueue.read.ratio</option>, 10 queues
-                would be used for reads; of those 10, 4 would be used for Gets and 6 for
-                Scans.</para>
-            </listitem>
-            <listitem>
-              <para>A value of <literal>1.0</literal> uses all but one of the read queues for Scans.
-                Given a value of <literal>20</literal> for
-                  <option>hbase.ipc.server.num.callqueue</option> and a value of <literal>.5
-                </literal> for <option>hbase.ipc.server.callqueue.read.ratio</option>, 10 queues
-                would be used for reads; of those 10, 1 would be used for Gets and 9 for
-                Scans.</para>
-            </listitem>
-          </itemizedlist>
-        </listitem>
-        <listitem>
-          <para>You can use the new option
-              <option>hbase.ipc.server.callqueue.handler.factor</option> to programmatically tune
-            the number of queues:</para>
-          <itemizedlist>
-            <listitem>
-              <para>A value of <literal>0</literal> uses a single shared queue between all the
-                handlers.</para>
-            </listitem>
-            <listitem>
-              <para>A value of <literal>1</literal> uses a separate queue for each handler.</para>
-            </listitem>
-            <listitem>
-              <para>A value between <literal>0</literal> and <literal>1</literal> tunes the number
-                of queues against the number of handlers. For instance, a value of
-                  <literal>.5</literal> shares one queue between each two handlers.</para>
-            </listitem>
-          </itemizedlist>
-          <para>Having more queues, such as in a situation where you have one queue per handler,
-            reduces contention when adding a task to a queue or selecting it from a queue. The
-            trade-off is that if you have some queues with long-running tasks, a handler may end up
-            waiting to execute from that queue rather than processing another queue which has
-            waiting tasks.</para>
-        </listitem>
-      </itemizedlist>
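-      <para>As a minimal sketch, the following <filename>hbase-site.xml</filename> fragment splits
-        ten call queues evenly between reads and writes, and further splits the read queues between
-        Gets and Scans. The values are illustrative only:</para>
-      <screen><![CDATA[<property>
-  <name>hbase.ipc.server.num.callqueue</name>
-  <value>10</value>
-</property>
-<property>
-  <name>hbase.ipc.server.callqueue.read.ratio</name>
-  <value>0.5</value>  <!-- 5 read queues, 5 write queues -->
-</property>
-<property>
-  <name>hbase.ipc.server.callqueue.scan.ratio</name>
-  <value>0.5</value>  <!-- split the read queues between Gets and Scans -->
-</property>]]></screen>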
-      <para>For these values to take effect on a given Region Server, the Region Server must be
-        restarted. These parameters are intended for testing purposes and should be used
-        carefully.</para>
-    </section>
-  </section>
-
-
-
-
-  <section
-    xml:id="perf.zookeeper">
-    <title>ZooKeeper</title>
-    <para>See <xref
-        linkend="zookeeper" /> for information on configuring ZooKeeper, and see the part about
-      having a dedicated disk. </para>
-  </section>
-  <section
-    xml:id="perf.schema">
-    <title>Schema Design</title>
-
-    <section
-      xml:id="perf.number.of.cfs">
-      <title>Number of Column Families</title>
-      <para>See <xref
-          linkend="number.of.cfs" />.</para>
-    </section>
-    <section
-      xml:id="perf.schema.keys">
-      <title>Key and Attribute Lengths</title>
-      <para>See <xref
-          linkend="keysize" />. See also <xref
-          linkend="perf.compression.however" /> for compression caveats.</para>
-    </section>
-    <section
-      xml:id="schema.regionsize">
-      <title>Table RegionSize</title>
-      <para>The regionsize can be set on a per-table basis via <code>setMaxFileSize</code> on <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html">HTableDescriptor</link>
-        in the event that certain tables require different regionsizes than the configured default
-        regionsize. </para>
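-      <para>For example, a sketch raising the maximum region size for a single table to 10GB via
-        the client API (the table name and value are illustrative):</para>
-      <programlisting language="java">
-HTableDescriptor tableDesc = new HTableDescriptor(TableName.valueOf("myBigTable"));
-tableDesc.setMaxFileSize(10L * 1024 * 1024 * 1024);  // split regions at roughly 10GB for this table
-</programlisting>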
-      <para>See <xref
-          linkend="ops.capacity.regions" /> for more information. </para>
-    </section>
-    <section
-      xml:id="schema.bloom">
-      <title>Bloom Filters</title>
-      <para>A Bloom filter, named for its creator, Burton Howard Bloom, is a data structure which is
-        designed to predict whether a given element is a member of a set of data. A positive result
-        from a Bloom filter is not always accurate, but a negative result is guaranteed to be
-        accurate. Bloom filters are designed to be "accurate enough" for sets of data which are so
-        large that conventional hashing mechanisms would be impractical. For more information about
-        Bloom filters in general, refer to <link
-          xlink:href="http://en.wikipedia.org/wiki/Bloom_filter" />.</para>
-      <para>In terms of HBase, Bloom filters provide a lightweight in-memory structure to reduce the
-        number of disk reads for a given Get operation (Bloom filters do not work with Scans) to only the StoreFiles likely to
-        contain the desired Row. The potential performance gain increases with the number of
-        parallel reads. </para>
-      <para>The Bloom filters themselves are stored in the metadata of each HFile and never need to
-        be updated. When an HFile is opened because a region is deployed to a RegionServer, the
-        Bloom filter is loaded into memory. </para>
-      <para>HBase includes some tuning mechanisms for folding the Bloom filter to reduce the size
-        and keep the false positive rate within a desired range.</para>
-      <para>Bloom filters were introduced in <link
-          xlink:href="https://issues.apache.org/jira/browse/HBASE-1200">HBASE-1200</link>. Since
-        HBase 0.96, row-based Bloom filters are enabled by default. (<link
-          xlink:href="https://issues.apache.org/jira/browse/HBASE-8450">HBASE-8450</link>)</para>
-      <para>For more information on Bloom filters in relation to HBase, see <xref
-          linkend="blooms" /> for more information, or the following Quora discussion: <link
-          xlink:href="http://www.quora.com/How-are-bloom-filters-used-in-HBase">How are bloom
-          filters used in HBase?</link>. </para>
-      
-      <section xml:id="bloom.filters.when">
-        <title>When To Use Bloom Filters</title>
-        <para>Since HBase 0.96, row-based Bloom filters are enabled by default. You may choose to
-          disable them or to change some tables to use row+column Bloom filters, depending on the
-          characteristics of your data and how it is loaded into HBase.</para>
-
-        <para>To determine whether Bloom filters could have a positive impact, check the value of
-          <code>blockCacheHitRatio</code> in the RegionServer metrics. If Bloom filters are enabled, the value of
-          <code>blockCacheHitRatio</code> should increase, because the Bloom filter is filtering out blocks that
-          are definitely not needed. </para>
-        <para>You can choose to enable Bloom filters for a row or for a row+column combination. If
-          you generally scan entire rows, the row+column combination will not provide any benefit. A
-          row-based Bloom filter can operate on a row+column Get, but not the other way around.
-          However, if you have a large number of column-level Puts, such that a row may be present
-          in every StoreFile, a row-based filter will always return a positive result and provide no
-          benefit. Unless you have one column per row, row+column Bloom filters require more space,
-          in order to store more keys. Bloom filters work best when the size of each data entry is
-          at least a few kilobytes in size. </para>
-        <para>Bloom filter overhead is reduced when your data is stored in a few larger StoreFiles,
-          which also avoids extra disk IO during low-level scans to find a specific row. </para>
-        <para>Bloom filters need to be rebuilt upon deletion, so may not be appropriate in
-          environments with a large number of deletions.</para>
-      </section>
-      
-      <section>
-        <title>Enabling Bloom Filters</title>
-        <para>Bloom filters are enabled on a Column Family. You can do this in HBase Shell or by
-          using the setBloomFilterType method of HColumnDescriptor in the Java API. Valid values are
-            <literal>NONE</literal>, <literal>ROW</literal> (the default since HBase 0.96), or
-            <literal>ROWCOL</literal>. See <xref
-            linkend="bloom.filters.when" /> for more information on <literal>ROW</literal> versus
-            <literal>ROWCOL</literal>. See also the API documentation for <link
-            xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>.</para>
-          <para>The following example creates a table and enables a ROWCOL Bloom filter on the
-            <literal>colfam1</literal> column family.</para>
-        <screen>
-hbase> <userinput>create 'mytable',{NAME => 'colfam1', BLOOMFILTER => 'ROWCOL'}</userinput>          
-        </screen>
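-        <para>The equivalent using the Java API might look like the following sketch; the
-          <code>BloomType</code> enum lives in
-          <code>org.apache.hadoop.hbase.regionserver</code>:</para>
-        <programlisting language="java">
-HColumnDescriptor cfDesc = new HColumnDescriptor("colfam1");
-cfDesc.setBloomFilterType(BloomType.ROWCOL);  // NONE, ROW, or ROWCOL
-tableDesc.addFamily(cfDesc);                  // tableDesc is an HTableDescriptor created elsewhere
-        </programlisting>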
-      </section>
-      
-      <section>
-        <title>Configuring Server-Wide Behavior of Bloom Filters</title>
-        <para>You can configure the following settings in the <filename>hbase-site.xml</filename>.
-        </para>
-        <informaltable>
-          <tgroup cols="3">
-            <thead>
-              <row>
-                <entry>Parameter</entry>
-                <entry>Default</entry>
-                <entry>Description</entry>
-              </row>
-            </thead>
-            <tbody>
-              <row>
-                <entry><para><code>io.hfile.bloom.enabled</code></para></entry>
-                <entry><para><literal>yes</literal></para></entry>
-                <entry><para>Set to <literal>no</literal> to kill bloom filters server-wide if
-                    something goes wrong</para></entry>
-              </row>
-              <row>
-                <entry><para><code>io.hfile.bloom.error.rate</code></para></entry>
-                <entry><para><literal>.01</literal></para></entry>
-                <entry><para>The average false positive rate for bloom filters. Folding is used to
-                  maintain the false positive rate. Expressed as a decimal representation of a
-                  percentage.</para></entry>
-              </row>
-              <row>
-                <entry><para><code>io.hfile.bloom.max.fold</code></para></entry>
-                <entry><para><literal>7</literal></para></entry>
-                <entry><para>The guaranteed maximum fold rate. Changing this setting should not be
-                  necessary and is not recommended.</para></entry>
-              </row>
-              <row>
-                <entry><para><code>io.storefile.bloom.max.keys</code></para></entry>
-                <entry><para><literal>128000000</literal></para></entry>
-                <entry><para>For default (single-block) Bloom filters, this specifies the maximum
-                    number of keys.</para></entry>
-              </row>
-              <row>
-                <entry><para><code>io.storefile.delete.family.bloom.enabled</code></para></entry>
-                <entry><para><literal>true</literal></para></entry>
-                <entry><para>Master switch to enable Delete Family Bloom filters and store them in
-                  the StoreFile.</para></entry>
-              </row>
-              <row>
-                <entry><para><code>io.storefile.bloom.block.size</code></para></entry>
-                <entry><para><literal>65536</literal></para></entry>
-                <entry><para>Target Bloom block size. Bloom filter blocks of approximately this size
-                    are interleaved with data blocks.</para></entry>
-              </row>
-              <row>
-                <entry><para><code>hfile.block.bloom.cacheonwrite</code></para></entry>
-                <entry><para><literal>false</literal></para></entry>
-                <entry><para>Enables cache-on-write for inline blocks of a compound Bloom filter.</para></entry>
-              </row>
-            </tbody>
-          </tgroup>
-        </informaltable>
-      </section>
-    </section>
-    <section
-      xml:id="schema.cf.blocksize">
-      <title>ColumnFamily BlockSize</title>
-      <para>The blocksize can be configured for each ColumnFamily in a table, and this defaults to
-        64k. Larger cell values require larger blocksizes. There is an inverse relationship between
-        blocksize and the resulting StoreFile indexes (i.e., if the blocksize is doubled then the
-        resulting indexes should be roughly halved). </para>
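-      <para>For example, a column family holding larger cells might be created with a bigger
-        blocksize via HBase Shell (the 128KB value here is illustrative):</para>
-      <screen>hbase> <userinput>create 'mytable',{NAME => 'colfam1', BLOCKSIZE => '131072'}</userinput></screen>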
-      <para>See <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>
-        and <xref
-          linkend="store" /> for more information. </para>
-    </section>
-    <section
-      xml:id="cf.in.memory">
-      <title>In-Memory ColumnFamilies</title>
-      <para>ColumnFamilies can optionally be defined as in-memory. Data is still persisted to disk,
-        just like any other ColumnFamily. In-memory blocks have the highest priority in the <xref
-          linkend="block.cache" />, but it is not a guarantee that the entire table will be in
-        memory. </para>
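-      <para>For example, to mark a column family as in-memory at table creation time via HBase
-        Shell:</para>
-      <screen>hbase> <userinput>create 'mytable',{NAME => 'colfam1', IN_MEMORY => 'true'}</userinput></screen>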
-      <para>See <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>
-        for more information. </para>
-    </section>
-    <section
-      xml:id="perf.compression">
-      <title>Compression</title>
-      <para>Production systems should use compression with their ColumnFamily definitions. See <xref
-          linkend="compression" /> for more information. </para>
-      <section
-        xml:id="perf.compression.however">
-        <title>However...</title>
-        <para>Compression deflates data <emphasis>on disk</emphasis>. When it's in-memory (e.g., in
-          the MemStore) or on the wire (e.g., transferring between RegionServer and Client) it's
-          inflated. So while using ColumnFamily compression is a best practice, it's not going
-          to completely eliminate the impact of over-sized Keys, over-sized ColumnFamily names, or
-          over-sized Column names. </para>
-        <para>See <xref
-            linkend="keysize" /> for schema design tips, and <xref
-            linkend="keyvalue" /> for more information on how HBase stores data internally. </para>
-      </section>
-    </section>
-  </section>
-  <!--  perf schema -->
-
-  <section
-    xml:id="perf.general">
-    <title>HBase General Patterns</title>
-    <section
-      xml:id="perf.general.constants">
-      <title>Constants</title>
-      <para>When people get started with HBase they have a tendency to write code that looks like
-        this:</para>
-      <programlisting language="java">
-Get get = new Get(rowkey);
-Result r = htable.get(get);
-byte[] b = r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("attr"));  // returns current version of value
-      </programlisting>
-      <para>But especially when inside loops (and MapReduce jobs), converting the columnFamily and
-        column-names to byte-arrays repeatedly is surprisingly expensive. It's better to use
-        constants for the byte-arrays, like this:</para>
-      <programlisting language="java">
-public static final byte[] CF = "cf".getBytes();
-public static final byte[] ATTR = "attr".getBytes();
-...
-Get get = new Get(rowkey);
-Result r = htable.get(get);
-byte[] b = r.getValue(CF, ATTR);  // returns current version of value
-      </programlisting>
-    </section>
-
-  </section>
-  <section
-    xml:id="perf.writing">
-    <title>Writing to HBase</title>
-
-    <section
-      xml:id="perf.batch.loading">
-      <title>Batch Loading</title>
-      <para>Use the bulk load tool if you can. See <xref
-          linkend="arch.bulk.load" />. Otherwise, pay attention to the below. </para>
-    </section>
-    <!-- batch loading -->
-
-    <section
-      xml:id="precreate.regions">
-      <title> Table Creation: Pre-Creating Regions </title>
-      <para> Tables in HBase are initially created with one region by default. For bulk imports,
-        this means that all clients will write to the same region until it is large enough to split
-        and become distributed across the cluster. A useful pattern to speed up the bulk import
-        process is to pre-create empty regions. Be somewhat conservative in this, because too many
-        regions can actually degrade performance. </para>
-      <para>There are two different approaches to pre-creating splits. The first approach is to rely
-        on the default <code>HBaseAdmin</code> strategy (which is implemented in
-          <code>Bytes.split</code>)... </para>
-      <programlisting language="java">
-byte[] startKey = ...;   	// your lowest key
-byte[] endKey = ...;   		// your highest key
-int numberOfRegions = ...;	// # of regions to create
-admin.createTable(table, startKey, endKey, numberOfRegions);
-      </programlisting>
-      <para>And the other approach is to define the splits yourself... </para>
-      <programlisting language="java">
-byte[][] splits = ...;   // create your own splits
-admin.createTable(table, splits);
-</programlisting>
-      <para> See <xref linkend="rowkey.regionsplits"/> for issues related to understanding your
-        keyspace and pre-creating regions. See <xref linkend="manual_region_splitting_decisions"/>
-        for discussion on manually pre-splitting regions.</para>
-    </section>
-    <section
-      xml:id="def.log.flush">
-      <title> Table Creation: Deferred Log Flush </title>
-      <para> The default behavior for Puts using the Write Ahead Log (WAL) is that
-          <classname>WAL</classname> edits will be written immediately. If deferred log flush is
-        used, WAL edits are kept in memory until the flush period. The benefit is aggregated and
-        asynchronous <classname>WAL</classname> writes, but the potential downside is that if the
-        RegionServer goes down the yet-to-be-flushed edits are lost. This is safer, however, than
-        not using WAL at all with Puts. </para>
-      <para> Deferred log flush can be configured on tables via <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html">HTableDescriptor</link>.
-        The default value of <varname>hbase.regionserver.optionallogflushinterval</varname> is
-        1000ms. </para>
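-      <para>As a sketch, with 0.96+ era clients deferred log flush is expressed through the
-        <code>Durability</code> enum on the table descriptor:</para>
-      <programlisting language="java">
-HTableDescriptor tableDesc = new HTableDescriptor(TableName.valueOf("myTable"));
-tableDesc.addFamily(new HColumnDescriptor("cf"));
-tableDesc.setDurability(Durability.ASYNC_WAL);  // WAL edits are buffered and flushed asynchronously
-admin.createTable(tableDesc);                   // admin is an HBaseAdmin created elsewhere
-</programlisting>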
-    </section>
-
-    <section
-      xml:id="perf.hbase.client.autoflush">
-      <title>HBase Client: AutoFlush</title>
-
-      <para>When performing a lot of Puts, make sure that setAutoFlush is set to false on your <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html">HTable</link>
-        instance. Otherwise, the Puts will be sent one at a time to the RegionServer. Puts added via
-          <code>htable.put(Put)</code> and <code>htable.put(List&lt;Put&gt;)</code> wind up in
-        the same write buffer. If <code>autoFlush = false</code>, these messages are not sent until
-        the write-buffer is filled. To explicitly flush the messages, call
-          <methodname>flushCommits</methodname>. Calling <methodname>close</methodname> on the
-          <classname>HTable</classname> instance will invoke
-        <methodname>flushCommits</methodname>.</para>
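-      <para>For example, a sketch against the 0.94/0.96-era <classname>HTable</classname> API
-        (later clients replace this pattern with <classname>BufferedMutator</classname>):</para>
-      <programlisting language="java">
-HTable htable = new HTable(conf, "myTable");  // conf is a Configuration created elsewhere
-htable.setAutoFlush(false);        // buffer Puts client-side rather than one RPC per Put
-for (Put put : puts) {             // puts is a List of Puts built elsewhere
-  htable.put(put);                 // accumulates in the client-side write buffer
-}
-htable.flushCommits();             // explicitly send the buffered Puts
-htable.close();                    // close() also invokes flushCommits()
-</programlisting>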
-    </section>
-    <section
-      xml:id="perf.hbase.client.putwal">
-      <title>HBase Client: Turn off WAL on Puts</title>
-      <para>A frequent request is to disable the WAL to increase performance of Puts. This is only
-        appropriate for bulk loads, as it puts your data at risk by removing the protection of the
-        WAL in the event of a region server crash. Bulk loads can be re-run in the event of a crash,
-        with little risk of data loss.</para>
-      <warning>
-        <para>If you disable the WAL for anything other than bulk loads, your data is at
-          risk.</para></warning>
-      <para>In general, it is best to use WAL for Puts, and where loading throughput is a concern to
-        use <link linkend="perf.batch.loading">bulk loading</link> techniques instead. For normal
-        Puts, you are not likely to see a performance improvement which would outweigh the risk. To
-        disable the WAL, see <xref linkend="wal.disable"/>.</para>
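-      <para>If you nevertheless need to skip the WAL for particular Puts, a sketch using the
-        <code>Durability</code> enum available in 0.95+ clients (<varname>CF</varname> and
-        <varname>ATTR</varname> are byte-array constants as in the General Patterns example earlier
-        in this chapter):</para>
-      <programlisting language="java">
-Put put = new Put(rowkey);
-put.add(CF, ATTR, Bytes.toBytes("value"));
-put.setDurability(Durability.SKIP_WAL);  // this edit is NOT protected by the WAL
-htable.put(put);
-</programlisting>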
-    </section>
-    <section
-      xml:id="perf.hbase.client.regiongroup">
-      <title>HBase Client: Group Puts by RegionServer</title>
-      <para>In addition to using the writeBuffer, grouping <classname>Put</classname>s by
-        RegionServer can reduce the number of client RPC calls per writeBuffer flush. There is a
-        utility <classname>HTableUtil</classname> currently on TRUNK that does this, but if you are
-        still on 0.90.x or earlier you can copy that utility or implement your own version. </para>
-    </section>
-    <section
-      xml:id="perf.hbase.write.mr.reducer">
-      <title>MapReduce: Skip The Reducer</title>
-      <para>When writing a lot of data to an HBase table from a MR job (e.g., with <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html">TableOutputFormat</link>),
-        and specifically where Puts are being emitted from the Mapper, skip the Reducer step. When a
-        Reducer step is used, all of the output (Puts) from the Mapper will get spooled to disk,
-        then sorted/shuffled to other Reducers that will most likely be off-node. It's far more
-        efficient to just write directly to HBase. </para>
-      <para>For summary jobs where HBase is used as a source and a sink, writes will come
-        from the Reducer step (e.g., summarize values then write out the result). This is a
-        different processing problem than the above case. </para>
-    </section>
-
-    <section
-      xml:id="perf.one.region">
-      <title>Anti-Pattern: One Hot Region</title>
-      <para>If all your data is being written to one region at a time, then re-read the section on
-        processing <link
-          linkend="timeseries">timeseries</link> data.</para>
-      <para>Also, if you are pre-splitting regions and all your data is <emphasis>still</emphasis>
-        winding up in a single region even though your keys aren't monotonically increasing, confirm
-        that your keyspace actually works with the split strategy. There are a variety of reasons
-        that regions may appear "well split" but won't work with your data. As the HBase client
-        communicates directly with the RegionServers, the region a given row key lands in can be
-        obtained via <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#getRegionLocation%28byte[]%29">HTable.getRegionLocation</link>. </para>
-      <para>See <xref
-          linkend="precreate.regions" />, as well as <xref
-          linkend="perf.configurations" />
-      </para>
-    </section>
-
-  </section>
-  <!--  writing -->
-
-  <section
-    xml:id="perf.reading">
-    <title>Reading from HBase</title>
-    <para>The mailing list can help if you are having performance issues. For example, here is a
-      good general thread on what to look at addressing read-time issues: <link
-        xlink:href="http://search-hadoop.com/m/qOo2yyHtCC1">HBase Random Read latency >
-      100ms</link></para>
-    <section
-      xml:id="perf.hbase.client.caching">
-      <title>Scan Caching</title>
-
-      <para>If HBase is used as an input source for a MapReduce job, for example, make sure that the
-        input <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link>
-        instance to the MapReduce job has <methodname>setCaching</methodname> set to something
-        greater than the default (which is 1). Using the default value means that the map-task will
-        call back to the region-server for every record processed. Setting this value to 500,
-        for example, will transfer 500 rows at a time to the client to be processed. There is a
-        cost/benefit to having a larger cache value because it costs more memory for both
-        client and RegionServer, so bigger isn't always better.</para>
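-      <para>For example, a sketch setting a larger caching value on the Scan that will be used as
-        the MapReduce input:</para>
-      <programlisting language="java">
-Scan scan = new Scan();
-scan.setCaching(500);  // transfer 500 rows per RPC rather than the default of 1
-</programlisting>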
-      <section
-        xml:id="perf.hbase.client.caching.mr">
-        <title>Scan Caching in MapReduce Jobs</title>
-        <para>Scan settings in MapReduce jobs deserve special attention. Timeouts (e.g.,
-          UnknownScannerException) can result in Map tasks if it takes longer to process a batch of
-          records than the scanner timeout allows before the client goes back to the RegionServer
-          for the next set of data. This problem can occur because there is non-trivial processing
-          occurring per row. If you process rows quickly, set caching higher. If you process rows
-          more slowly (e.g., lots of transformations per row, writes), then set caching lower. </para>
-        <para>Timeouts can also happen in a non-MapReduce use case (i.e., single threaded HBase
-          client doing a Scan), but the processing that is often performed in MapReduce jobs tends
-          to exacerbate this issue. </para>
-      </section>
-    </section>
-    <section
-      xml:id="perf.hbase.client.selection">
-      <title>Scan Attribute Selection</title>
-
-      <para>Whenever a Scan is used to process large numbers of rows (and especially when used as a
-        MapReduce source), be aware of which attributes are selected. If <code>scan.addFamily</code>
-        is called then <emphasis>all</emphasis> of the attributes in the specified ColumnFamily will
-        be returned to the client. If only a small number of the available attributes are to be
-        processed, then only those attributes should be specified in the input scan because
-        attribute over-selection is a non-trivial performance penalty over large datasets. </para>
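-      <para>For example, rather than calling <code>scan.addFamily</code>, select just the needed
-        column (<varname>CF</varname> and <varname>ATTR</varname> are byte-array constants as in
-        the General Patterns example earlier in this chapter):</para>
-      <programlisting language="java">
-Scan scan = new Scan();
-scan.addColumn(CF, ATTR);  // return only this qualifier rather than the whole ColumnFamily
-</programlisting>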
-    </section>
-    <section
-      xml:id="perf.hbase.client.seek">
-      <title>Avoid scan seeks</title>
-      <para>When columns are selected explicitly with <code>scan.addColumn</code>, HBase will
-        schedule seek operations to seek between the selected columns. When rows have few columns
-        and each column has only a few versions this can be inefficient. A seek operation is
-        generally slower if it does not seek at least past 5-10 columns/versions or 512-1024
-        bytes.</para>
-      <para>In order to opportunistically look ahead a few columns/versions to see if the next
-        column/version can be found that way before a seek operation is scheduled, a new attribute
-          <code>Scan.HINT_LOOKAHEAD</code> can be set on the Scan object. The following code
-        instructs the RegionServer to attempt two iterations of next before a seek is
-        scheduled:</para>
-      <programlisting language="java">
-Scan scan = new Scan();
-scan.addColumn(...);
-scan.setAttribute(Scan.HINT_LOOKAHEAD, Bytes.toBytes(2));
-table.getScanner(scan);
-      </programlisting>
-    </section>
-    <section
-      xml:id="perf.hbase.mr.input">
-      <title>MapReduce - Input Splits</title>
-      <para>For MapReduce jobs that use HBase tables as a source, if there is a pattern where the
-        "slow" map tasks seem to have the same Input Split (i.e., the RegionServer serving the
-        data), see the Troubleshooting Case Study in <xref
-          linkend="casestudies.slownode" />. </para>
-    </section>
-
-    <section
-      xml:id="perf.hbase.client.scannerclose">
-      <title>Close ResultScanners</title>
-
-      <para>This isn't so much about improving performance but rather <emphasis>avoiding</emphasis>
-        performance problems. If you forget to close <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/ResultScanner.html">ResultScanners</link>
-        you can cause problems on the RegionServers. Always have ResultScanner processing enclosed
-        in try/finally blocks...</para>
-      <programlisting language="java">
-Scan scan = new Scan();
-// set attrs...
-ResultScanner rs = htable.getScanner(scan);
-try {
-  for (Result r = rs.next(); r != null; r = rs.next()) {
-  // process result...
-} finally {
-  rs.close();  // always close the ResultScanner!
-}
-htable.close();
-      </programlisting>
-    </section>
-
-    <section
-      xml:id="perf.hbase.client.blockcache">
-      <title>Block Cache</title>
-
-      <para><link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link>
-        instances can be set to use the block cache in the RegionServer via the
-          <methodname>setCacheBlocks</methodname> method. For input Scans to MapReduce jobs, this
-        should be <varname>false</varname>. For frequently accessed rows, it is advisable to use the
-        block cache.</para>
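-      <para>For example, to keep a one-time MapReduce input scan from churning the block
-        cache:</para>
-      <programlisting language="java">
-Scan scan = new Scan();
-scan.setCacheBlocks(false);  // don't evict hot data for blocks this scan reads only once
-</programlisting>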
-
-    <para>Cache more data by moving your Block Cache offheap.  See <xref linkend="offheap.blockcache" /></para>
-    </section>
-    <section
-      xml:id="perf.hbase.client.rowkeyonly">
-      <title>Optimal Loading of Row Keys</title>
-      <para>When performing a table <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">scan</link>
-        where only the row keys are needed (no families, qualifiers, values or timestamps), add a
-        FilterList with a <varname>MUST_PASS_ALL</varname> operator to the scanner using
-          <methodname>setFilter</methodname>. The filter list should include both a <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html">FirstKeyOnlyFilter</link>
-        and a <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html">KeyOnlyFilter</link>.
-        Using this filter combination will result in a worst case scenario of a RegionServer reading
-        a single value from disk and minimal network traffic to the client for a single row. </para>
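-      <para>A sketch of this filter combination:</para>
-      <programlisting language="java">
-Scan scan = new Scan();
-FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
-filters.addFilter(new FirstKeyOnlyFilter());  // stop after the first KeyValue of each row
-filters.addFilter(new KeyOnlyFilter());       // strip values; return keys only
-scan.setFilter(filters);
-</programlisting>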
-    </section>
-    <section
-      xml:id="perf.hbase.read.dist">
-      <title>Concurrency: Monitor Data Spread</title>
-      <para>When performing a high number of concurrent reads, monitor the data spread of the target
-        tables. If the target table(s) have too few regions then the reads could likely be served
-        from too few nodes. </para>
-      <para>See <xref
-          linkend="precreate.regions" />, as well as <xref
-          linkend="perf.configurations" />
-      </para>
-    </section>
-    <section
-      xml:id="blooms">
-      <title>Bloom Filters</title>
-      <para>Enabling Bloom Filters can save you having to go to disk and can help improve read
-        latencies.</para>
-      <para><link
-          xlink:href="http://en.wikipedia.org/wiki/Bloom_filter">Bloom filters</link> were developed
-        in <link xlink:href="https://issues.apache.org/jira/browse/HBASE-1200">HBase-1200 Add
-          bloomfilters</link>. For a description of the development process -- why static blooms rather than dynamic
-            -- and for an overview of the unique properties that pertain to blooms in HBase, as well
-            as possible future directions, see the <emphasis>Development Process</emphasis> section
-            of the document <link
-              xlink:href="https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf">BloomFilters
-              in HBase</link> attached to <link
-              xlink:href="https://issues.apache.org/jira/browse/HBASE-1200">HBase-1200</link>.  The bloom filters described here are actually version two of blooms in HBase. In
-            versions up to 0.19.x, HBase had a dynamic bloom option based on work done by the <link
-              xlink:href="http://www.one-lab.org">European Commission One-Lab Project 034819</link>.
-            The core of the HBase bloom work was later pulled up into Hadoop to implement
-            org.apache.hadoop.io.BloomMapFile. Version 1 of HBase blooms never worked that well.
-            Version 2 is a rewrite from scratch though again it starts with the one-lab work.</para>
-        
-      <para>See also <xref
-          linkend="schema.bloom" />. </para>
-
-      <section
-        xml:id="bloom_footprint">
-        <title>Bloom StoreFile footprint</title>
-
-        <para>Bloom filters add an entry to the <classname>StoreFile</classname> general
-            <classname>FileInfo</classname> data structure and then two extra entries to the
-            <classname>StoreFile</classname> metadata section.</para>
-
-        <section>
-          <title>BloomFilter in the <classname>StoreFile</classname>
-            <classname>FileInfo</classname> data structure</title>
-
-          <para><classname>FileInfo</classname> has a <varname>BLOOM_FILTER_TYPE</varname> entry
-            which is set to <varname>NONE</varname>, <varname>ROW</varname> or
-              <varname>ROWCOL.</varname></para>
-        </section>
-
-        <section>
-          <title>BloomFilter entries in <classname>StoreFile</classname> metadata</title>
-
-          <para><varname>BLOOM_FILTER_META</varname> holds Bloom Size, Hash Function used, etc. It
-            is small in size and is cached on <classname>StoreFile.Reader</classname> load.</para>
-          <para><varname>BLOOM_FILTER_DATA</varname> is the actual bloomfilter data. It is obtained
-            on-demand and stored in the LRU cache, if the cache is enabled (it is enabled by default).</para>
-        </section>
-      </section>
-      <section
-        xml:id="config.bloom">
-        <title>Bloom Filter Configuration</title>
-        <section>
-          <title><varname>io.hfile.bloom.enabled</varname> global kill switch</title>
-
-          <para><code>io.hfile.bloom.enabled</code> in <classname>Configuration</classname> serves
-            as the kill switch in case something goes wrong. Default =
-            <varname>true</varname>.</para>
-        </section>
-
-        <section>
-          <title><varname>io.hfile.bloom.error.rate</varname></title>
-
-          <para><varname>io.hfile.bloom.error.rate</varname> = average false positive rate. Default
-            = 1%. Decreasing the rate by ½ (e.g., to .5%) costs about one additional bit per bloom entry.</para>
-        </section>
-
-        <section>
-          <title><varname>io.hfile.bloom.max.fold</varname></title>
-
-          <para><varname>io.hfile.bloom.max.fold</varname> = guaranteed minimum fold rate. Most
-            people should leave this alone. Default = 7, or can collapse to at least 1/128th of
-            original size. See the <emphasis>Development Process</emphasis> section of the document <link
-              xlink:href="https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf">BloomFilters
-              in HBase</link> for more on what this option means.</para>
-        </section>
-      </section>
-    </section>
-    <!--  bloom  -->
-    <section>
-      <title>Hedged Reads</title>
-      <para>Hedged reads are a feature of HDFS, introduced in <link
-          xlink:href="https://issues.apache.org/jira/browse/HDFS-5776">HDFS-5776</link>. Normally, a
-        single thread is spawned for each read request. However, if hedged reads are enabled, the
-        client waits some configurable amount of time, and if the read does not return, the client
-        spawns a second read request, against a different block replica of the same data. Whichever
-        read returns first is used, and the other read request is discarded. Hedged reads can be
-        helpful when a rare slow read is caused by a transient error such as a failing
-        disk or a flaky network connection.</para>
-      <para> Because an HBase RegionServer is an HDFS client, you can enable hedged reads in HBase
-        by adding the following properties to the RegionServer's <filename>hbase-site.xml</filename>
-        and tuning the values to suit your environment.</para>
-      <itemizedlist>
-        <title>Configuration for Hedged Reads</title>
-        <listitem>
-          <para><code>dfs.client.hedged.read.threadpool.size</code> - the number of threads
-            dedicated to servicing hedged reads. If this is set to 0 (the default), hedged reads are
-            disabled.</para>
-        </listitem>
-        <listitem>
-          <para><code>dfs.client.hedged.read.threshold.millis</code> - the number of milliseconds to
-            wait before spawning a second read thread.</para>
-        </listitem>
-      </itemizedlist>
-      <example>
-        <title>Hedged Reads Configuration Example</title>
-        <screen><![CDATA[<property>
-  <name>dfs.client.hedged.read.threadpool.size</name>
-  <value>20</value>  <!-- 20 threads -->
-</property>
-<property>
-  <name>dfs.client.hedged.read.threshold.millis</name>
-  <value>10</value>  <!-- 10 milliseconds -->
-</property>]]></screen>
-      </example>
-      <para>Use the following metrics to tune the settings for hedged reads on
-        your cluster. See <xref linkend="hbase_metrics"/>  for more information.</para>
-      <itemizedlist>
-        <title>Metrics for Hedged Reads</title>
-        <listitem>
-          <para>hedgedReadOps - the number of times hedged read threads have been triggered. This
-            could indicate that read requests are often slow, or that hedged reads are triggered too
-            quickly.</para>
-        </listitem>
-        <listitem>
-          <para>hedgedReadOpsWin - the number of times the hedged read thread was faster than the
-            original thread. This could indicate that a given RegionServer is having trouble
-            servicing requests.</para>
-        </listitem>
-      </itemizedlist>
-    </section>
-
-  </section>
-  <!--  reading -->
-
-  <section
-    xml:id="perf.deleting">
-    <title>Deleting from HBase</title>
-    <section
-      xml:id="perf.deleting.queue">
-      <title>Using HBase Tables as Queues</title>
-      <para>HBase tables are sometimes used as queues. In this case, special care must be taken to
-        regularly perform major compactions on tables used in this manner. As documented in <xref
-          linkend="datamodel" />, marking rows as deleted writes tombstone markers, which create
-        extra data that must be processed on reads. Tombstones are only cleaned up by major
-        compactions. </para>
-      <para>See also <xref
-          linkend="compaction" /> and <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#majorCompact%28java.lang.String%29">HBaseAdmin.majorCompact</link>.
-      </para>
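-      <para>For illustration, here is a minimal sketch of a periodic maintenance job that triggers
-        a major compaction through the client API. The table name is hypothetical; the call is
-        asynchronous and the compaction runs in the background.</para>
-      <programlisting language="java"><![CDATA[import java.io.IOException;
-
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.hbase.HBaseConfiguration;
-import org.apache.hadoop.hbase.client.HBaseAdmin;
-
-public class CompactQueueTable {
-  public static void main(String[] args) throws IOException, InterruptedException {
-    Configuration conf = HBaseConfiguration.create();
-    HBaseAdmin admin = new HBaseAdmin(conf);
-    try {
-      // Request a major compaction so that delete tombstones are actually purged.
-      admin.majorCompact("my_queue_table");
-    } finally {
-      admin.close();
-    }
-  }
-}]]></programlisting>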
-    </section>
-    <section
-      xml:id="perf.deleting.rpc">
-      <title>Delete RPC Behavior</title>
-      <para>Be aware that <code>htable.delete(Delete)</code> does not use the writeBuffer. It
-        executes a RegionServer RPC with each invocation. For a large number of deletes, consider
-          <code>htable.delete(List)</code>. </para>
-      <para>See <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#delete%28org.apache.hadoop.hbase.client.Delete%29" />
-      </para>
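-      <para>As a sketch, batching many deletes into a single client call (the table name and row
-        keys here are hypothetical) looks something like the following:</para>
-      <programlisting language="java"><![CDATA[import java.io.IOException;
-import java.util.ArrayList;
-import java.util.List;
-
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.hbase.HBaseConfiguration;
-import org.apache.hadoop.hbase.client.Delete;
-import org.apache.hadoop.hbase.client.HTable;
-import org.apache.hadoop.hbase.util.Bytes;
-
-public class BatchDeleteExample {
-  public static void main(String[] args) throws IOException {
-    Configuration conf = HBaseConfiguration.create();
-    HTable table = new HTable(conf, "mytable");
-    try {
-      List<Delete> deletes = new ArrayList<Delete>();
-      for (int i = 0; i < 1000; i++) {
-        deletes.add(new Delete(Bytes.toBytes("row-" + i)));
-      }
-      // One call; the client groups the deletes into far fewer RegionServer
-      // RPCs than issuing table.delete(Delete) a thousand times would.
-      table.delete(deletes);
-    } finally {
-      table.close();
-    }
-  }
-}]]></programlisting>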
-    </section>
-  </section>
-  <!--  deleting -->
-
-  <section
-    xml:id="perf.hdfs">
-    <title>HDFS</title>
-    <para>Because HBase runs on <xref
-        linkend="arch.hdfs" /> it is important to understand how it works and how it affects HBase. </para>
-    <section
-      xml:id="perf.hdfs.curr">
-      <title>Current Issues With Low-Latency Reads</title>
-      <para>The original use case for HDFS was batch processing, so low-latency reads were
-        historically not a priority. With the increased adoption of Apache HBase this is changing,
-        and several improvements are already in development. See the <link
-          xlink:href="https://issues.apache.org/jira/browse/HDFS-1599">Umbrella Jira Ticket for HDFS
-          Improvements for HBase</link>. </para>
-    </section>
-    <section
-      xml:id="perf.hdfs.configs.localread">
-      <title>Leveraging local data</title>
-      <para>Since Hadoop 1.0.0 (also 0.22.1, 0.23.1, CDH3u3 and HDP 1.0) via <link
-          xlink:href="https://issues.apache.org/jira/browse/HDFS-2246">HDFS-2246</link>, it is
-        possible for the DFSClient to take a "short circuit" and read directly from the disk instead
-        of going through the DataNode when the data is local. What this means for HBase is that the
-        RegionServers can read directly off their machine's disks instead of having to open a socket
-        to talk to the DataNode, the former being generally much faster. See JD's <link
-              xlink:href="http://files.meetup.com/1350427/hug_ebay_jdcryans.pdf">Performance
-              Talk</link>. Also see <link
-          xlink:href="http://search-hadoop.com/m/zV6dKrLCVh1">HBase, mail # dev - read short
-          circuit</link> thread for more discussion around short circuit reads. </para>
-      <para>To enable "short circuit" reads, it will depend on your version of Hadoop. The original
-        shortcircuit read patch was much improved upon in Hadoop 2 in <link
-          xlink:href="https://issues.apache.org/jira/browse/HDFS-347">HDFS-347</link>. See <link
-          xlink:href="http://blog.cloudera.com/blog/2013/08/how-improved-short-circuit-local-reads-bring-better-performance-and-security-to-hadoop/" />
-        for details on the difference between the old and new implementations. See <link
-          xlink:href="http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Hadoop
-          shortcircuit reads configuration page</link> for how to enable the latter, better version
-        of shortcircuit. For example, here is a minimal config. enabling short-circuit reads added
-        to <filename>hbase-site.xml</filename>: </para>
-      <programlisting language="xml"><![CDATA[<property>
-  <name>dfs.client.read.shortcircuit</name>
-  <value>true</value>
-  <description>
-    This configuration parameter turns on short-circuit local reads.
-  </description>
-</property>
-<property>
-  <name>dfs.domain.socket.path</name>
-  <value>/home/stack/sockets/short_circuit_read_socket_PORT</value>
-  <description>
-    Optional.  This is a path to a UNIX domain socket that will be used for
-    communication between the DataNode and local HDFS clients.
-    If the string "_PORT" is present in this path, it will be replaced by the
-    TCP port of the DataNode.
-  </description>
-</property>]]></programlisting>
-      <para>Be careful about permissions for the directory that hosts the shared domain socket; the
-        DFSClient will complain if it is open to users other than the hbase user. </para>
-      <para>If you are running on an old Hadoop, one that is without <link
-          xlink:href="https://issues.apache.org/jira/browse/HDFS-347">HDFS-347</link> but that has
-          <link xlink:href="https://issues.apache.org/jira/browse/HDFS-2246">HDFS-2246</link>, you
-        must set two configurations. First, amend <filename>hdfs-site.xml</filename>: set the
-        property <varname>dfs.block.local-path-access.user</varname> to the
-          <emphasis>only</emphasis> user that can use the shortcut. This has to be the user that
-        started HBase. Then, in <filename>hbase-site.xml</filename>, set
-          <varname>dfs.client.read.shortcircuit</varname> to <varname>true</varname>.
-      </para>
-        
-      <para> Services -- at least the HBase RegionServers -- will need to be restarted in order to
-        pick up the new configurations. </para>
-      <note
-        xml:id="dfs.client.read.shortcircuit.buffer.size">
-        <title>dfs.client.read.shortcircuit.buffer.size</title>
-        <para>The default for this value is too high when running a heavily trafficked HBase. In
-          HBase, if this value has not been set, we set it down from the default of 1M to 128k
-          (since HBase 0.98.0 and 0.96.1). See <link
-            xlink:href="https://issues.apache.org/jira/browse/HBASE-8143">HBASE-8143 HBase on Hadoop
-            2 with local short circuit reads (ssr) causes OOM</link>. The Hadoop DFSClient in HBase
-          will allocate a direct byte buffer of this size for <emphasis>each</emphasis> block it has
-          open; given HBase keeps its HDFS files open all the time, this can add up quickly. For
-          example, 1,000 open blocks would pin roughly 1G of direct memory at the 1M default, versus
-          roughly 128M at 128k.</para>
-      </note>
-    </section>
-
-    <section
-      xml:id="perf.hdfs.comp">
-      <title>Performance Comparisons of HBase vs. HDFS</title>
-      <para>A fairly common question on the dist-list is why HBase isn't as performant as HDFS files
-        in a batch context (e.g., as a MapReduce source or sink). The short answer is that HBase is
-        doing a lot more than HDFS (e.g., reading the KeyValues, returning the most current row or
-        specified timestamps, etc.), and as such HBase is 4-5 times slower than HDFS in this
-        processing context. There is room for improvement and this gap will, over time, be reduced,
-        but HDFS will always be faster in this use-case. </para>
-    </section>
-  </section>
-
-  <section
-    xml:id="perf.ec2">
-    <title>Amazon EC2</title>
-    <para>Performance questions are common in Amazon EC2 environments because EC2 is a shared
-      environment. You will not see the same throughput as on a dedicated server. When running
-      tests on EC2, run them several times for the same reason (i.e., it is a shared environment
-      and you don't know what else is happening on the server). </para>
-    <para>If you are running on EC2 and post performance questions on the dist-list, please state
-      this fact up-front, because EC2 issues are practically a separate class of performance
-      issues. </para>
-  </section>
-
-  <section
-    xml:id="perf.hbase.mr.cluster">
-    <title>Collocating HBase and MapReduce</title>
-    <para>It is often recommended to have different clusters for HBase and MapReduce. A better
-      qualification of this is: don't collocate an HBase cluster that serves live requests with a
-      heavy MR workload. OLTP- and OLAP-optimized systems have conflicting requirements and one will lose to
-      the other, usually the former. For example, short latency-sensitive disk reads will have to
-      wait in line behind longer reads that are trying to squeeze out as much throughput as
-      possible. MR jobs that write to HBase will also generate flushes and compactions, which will
-      in turn invalidate blocks in the <xref
-        linkend="block.cache" />. </para>
-    <para>If you need to process the data from your live HBase cluster in MR, you can ship the
-      deltas with <xref
-        linkend="copy.table" /> or use replication to get the new data in real time on the OLAP
-      cluster. In the worst case, if you really need to collocate both, set MR to use fewer Map and
-      Reduce slots than you'd normally configure, possibly just one. </para>
-    <para>When HBase is used for OLAP operations, it is preferable to set it up in a hardened way,
-      such as configuring a higher ZooKeeper session timeout and giving more memory to the
-      MemStores (the argument being that the Block Cache won't be used much, since the workloads
-      are usually long scans). </para>
-  </section>
-
-  <section
-    xml:id="perf.casestudy">
-    <title>Case Studies</title>
-    <para>For Performance and Troubleshooting Case Studies, see <xref
-        linkend="casestudies" />. </para>
-  </section>
-</chapter>

http://git-wip-us.apache.org/repos/asf/hbase/blob/0acbff24/src/main/docbkx/preface.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/preface.xml b/src/main/docbkx/preface.xml
deleted file mode 100644
index a8f6895..0000000
--- a/src/main/docbkx/preface.xml
+++ /dev/null
@@ -1,83 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<preface
-    version="5.0"
-    xml:id="preface"
-    xmlns="http://docbook.org/ns/docbook"
-    xmlns:xlink="http://www.w3.org/1999/xlink"
-    xmlns:xi="http://www.w3.org/2001/XInclude"
-    xmlns:svg="http://www.w3.org/2000/svg"
-    xmlns:m="http://www.w3.org/1998/Math/MathML"
-    xmlns:html="http://www.w3.org/1999/xhtml"
-    xmlns:db="http://docbook.org/ns/docbook">
-    <!--
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
--->
-    <title>Preface</title>
-
-    <para>This is the official reference guide for the <link
-            xlink:href="http://hbase.apache.org/">HBase</link> version it ships with. Herein you
-        will find either the definitive documentation on an HBase topic as of its standing when the
-        referenced HBase version shipped, or it will point to the location in <link
-            xlink:href="http://hbase.apache.org/apidocs/index.html">javadoc</link>, <link
-            xlink:href="https://issues.apache.org/jira/browse/HBASE">JIRA</link> or <link
-            xlink:href="http://wiki.apache.org/hadoop/Hbase">wiki</link> where the pertinent
-        information can be found.</para>
-
-    <formalpara>
-        <title>About This Guide</title>
-        <para>This reference guide is a work in progress. The source for this guide can be found in
-            the <filename>src/main/docbkx</filename> directory of the HBase source. This reference
-            guide is marked up using <link xlink:href="http://www.docbook.org/">DocBook</link> from
-            which the finished guide is generated as part of the 'site' build target. Run
-            <programlisting language="bourne">mvn site</programlisting> to generate this documentation. Amendments and
-            improvements to the documentation are welcomed. Click <link
-                xlink:href="https://issues.apache.org/jira/secure/CreateIssueDetails!init.jspa?pid=12310753&amp;issuetype=1&amp;components=12312132&amp;summary=SHORT+DESCRIPTION"
-                >this link</link> to file a new documentation bug against Apache HBase with some
-            values pre-selected.</para>
-    </formalpara>
-    <formalpara>
-        <title>Contributing to the Documentation</title>
-        <para>For an overview of Docbook and suggestions to get started contributing to the documentation, see <xref linkend="appendix_contributing_to_documentation" />.</para>
-    </formalpara>
-    <formalpara>
-        <title>Providing Feedback</title>
-        <para>This guide allows you to leave comments or questions on any page, using Disqus. Look
-            for the Comments area at the bottom of the page. Answering these questions is a
-            volunteer effort, and may be delayed.</para>
-    </formalpara>
-    
-    <note
-        xml:id="headsup">
-        <title>Heads-up if this is your first foray into the world of distributed
-            computing...</title>
-        <para> If this is your first foray into the wonderful world of Distributed Computing, then
-            you are in for some interesting times. First off, distributed systems are hard; making a
-            distributed system hum requires a disparate skillset that spans systems (hardware and
-            software) and networking. Your cluster's operation can hiccup for any of a myriad of
-            reasons, from bugs in HBase itself through misconfigurations -- misconfiguration of
-            HBase but also operating system misconfigurations -- through to hardware problems,
-            whether it be a bug in your network card drivers or an underprovisioned RAM bus (to
-            mention two recent examples of hardware issues that manifested as "HBase is slow"). You
-            will also need to do a recalibration if up to now your computing has been bound to a
-            single box. Here is one good starting point: <link
-                xlink:href="http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing">Fallacies
-                of Distributed Computing</link>. That said, you are welcome. It's a fun place to be.
-            Yours, the HBase Community. </para>
-    </note>
-</preface>