Posted to commits@hbase.apache.org by mi...@apache.org on 2014/12/22 06:45:39 UTC
[1/8] hbase git commit: HBASE-12738 Chunk Ref Guide into file-per-chapter
Repository: hbase
Updated Branches:
refs/heads/master d9f25e30a -> a1fe1e096
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/hbase_history.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/hbase_history.xml b/src/main/docbkx/hbase_history.xml
new file mode 100644
index 0000000..f7b9064
--- /dev/null
+++ b/src/main/docbkx/hbase_history.xml
@@ -0,0 +1,41 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<appendix
+ xml:id="hbase.history"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+ <title>HBase History</title>
+ <itemizedlist>
+ <listitem><para>2006: <link xlink:href="http://research.google.com/archive/bigtable.html">BigTable</link> paper published by Google.
+ </para></listitem>
+ <listitem><para>2006 (end of year): HBase development starts.
+ </para></listitem>
+ <listitem><para>2008: HBase becomes a Hadoop sub-project.
+ </para></listitem>
+ <listitem><para>2010: HBase becomes an Apache top-level project.
+ </para></listitem>
+ </itemizedlist>
+</appendix>
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/hbck_in_depth.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/hbck_in_depth.xml b/src/main/docbkx/hbck_in_depth.xml
new file mode 100644
index 0000000..e2ee34f
--- /dev/null
+++ b/src/main/docbkx/hbck_in_depth.xml
@@ -0,0 +1,237 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<appendix
+ xml:id="hbck.in.depth"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+
+ <title>hbck In Depth</title>
+ <para>HBaseFsck (hbck) is a tool for checking for region consistency and table integrity problems
+ and repairing a corrupted HBase. It works in two basic modes -- a read-only inconsistency
+ identifying mode and a multi-phase read-write repair mode.
+ </para>
+ <section>
+ <title>Running hbck to identify inconsistencies</title>
+ <para>To check whether your HBase cluster has corruptions, run hbck against your HBase cluster:</para>
+ <programlisting language="bourne">
+$ ./bin/hbase hbck
+</programlisting>
+ <para>
+ At the end of the command's output it prints OK or tells you the number of INCONSISTENCIES
+ present. You may also want to run hbck a few times because some inconsistencies can be
+ transient (e.g. the cluster is starting up or a region is splitting). Operationally you may want to run
+ hbck regularly and set up an alert (e.g. via Nagios) if it repeatedly reports inconsistencies.
+ A run of hbck will report a list of inconsistencies along with a brief description of the regions and
+ tables affected. Using the <code>-details</code> option will report more details, including a representative
+ listing of all the splits present in all the tables.
+ </para>
+ <programlisting language="bourne">
+$ ./bin/hbase hbck -details
+</programlisting>
+ <para>If you just want to know whether specific tables are corrupted, you can limit hbck to identify
+ inconsistencies in only those tables. For example, the following command would only attempt to check
+ tables TableFoo and TableBar. The benefit is that hbck will run in less time.</para>
+ <programlisting language="bourne">
+$ ./bin/hbase hbck TableFoo TableBar
+</programlisting>
+ </section>
+ <section><title>Inconsistencies</title>
+ <para>
+ If, after several runs, inconsistencies continue to be reported, you may have encountered a
+ corruption. These should be rare, but in the event they occur, newer versions of HBase ship
+ the hbck tool with automatic repair options.
+ </para>
+ <para>
+ There are two invariants that when violated create inconsistencies in HBase:
+ </para>
+ <itemizedlist>
+ <listitem><para>HBase’s region consistency invariant is satisfied if every region is assigned and
+ deployed on exactly one region server, and all places where this state is kept are in
+ accordance.</para>
+ </listitem>
+ <listitem><para>HBase’s table integrity invariant is satisfied if for each table, every possible row key
+ resolves to exactly one region.</para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ Repairs generally work in three phases -- a read-only information gathering phase that identifies
+ inconsistencies, a table integrity repair phase that restores the table integrity invariant, and then
+ finally a region consistency repair phase that restores the region consistency invariant.
+ Starting from version 0.90.0, hbck could detect region consistency problems and report on a subset
+ of possible table integrity problems. It also included the ability to automatically fix the most
+ common class of inconsistency: region assignment and deployment consistency problems. This repair
+ could be done by using the <code>-fix</code> command line option. This repair closes regions if they are
+ open on the wrong server or on multiple region servers, and also assigns regions to region
+ servers if they are not open.
+ </para>
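+ <para>For example, on a 0.90-era cluster the repair would be invoked as follows (this is the
+ historical form of what later became <code>-fixAssignments</code>):</para>
+ <programlisting language="bourne">
+$ ./bin/hbase hbck -fix
+</programlisting>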
+ <para>
+ Starting from HBase versions 0.90.7, 0.92.2 and 0.94.0, several new command line options were
+ introduced to aid repairing a corrupted HBase. This hbck sometimes goes by the nickname
+ “uberhbck”. Each particular version of uberhbck is compatible with HBase installations of the same
+ major version (the 0.90.7 uberhbck can repair a 0.90.4 cluster). However, versions <=0.90.6 and versions
+ <=0.92.1 may require restarting the master or failing over to a backup master.
+ </para>
+ </section>
+ <section><title>Localized repairs</title>
+ <para>
+ When repairing a corrupted HBase, it is best to repair the lowest risk inconsistencies first.
+ These are generally region consistency repairs -- localized single-region repairs that only modify
+ in-memory data, ephemeral ZooKeeper data, or patch holes in the META table.
+ Region consistency requires that the state of the region’s data in HDFS
+ (.regioninfo files), the region’s row in the hbase:meta table, and the region’s deployment/assignments on
+ region servers and the master all be in accordance. Options for repairing region consistency include:
+ <itemizedlist>
+ <listitem><para><code>-fixAssignments</code> (equivalent to the 0.90 <code>-fix</code> option) repairs unassigned, incorrectly
+ assigned or multiply assigned regions.</para>
+ </listitem>
+ <listitem><para><code>-fixMeta</code> which removes meta rows when corresponding regions are not present in
+ HDFS and adds new meta rows if the regions are present in HDFS but not in META.</para>
+ </listitem>
+ </itemizedlist>
+ To fix deployment and assignment problems you can run this command:
+ </para>
+ <programlisting language="bourne">
+$ ./bin/hbase hbck -fixAssignments
+</programlisting>
+ <para>To fix deployment and assignment problems as well as repairing incorrect meta rows you can
+ run this command:</para>
+ <programlisting language="bourne">
+$ ./bin/hbase hbck -fixAssignments -fixMeta
+</programlisting>
+ <para>There are a few classes of table integrity problems that are low risk repairs. The first two are
+ degenerate (startkey == endkey) regions and backwards regions (startkey > endkey). These are
+ automatically handled by sidelining the data to a temporary directory (/hbck/xxxx).
+ The third low-risk class is HDFS region holes, which can be repaired by using the following option:</para>
+ <itemizedlist>
+ <listitem><para><code>-fixHdfsHoles</code> option for fabricating new empty regions on the file system.
+ If holes are detected, you can use <code>-fixHdfsHoles</code>, and should include <code>-fixMeta</code> and <code>-fixAssignments</code> to make the new region consistent.</para>
+ </listitem>
+ </itemizedlist>
+ <programlisting language="bourne">
+$ ./bin/hbase hbck -fixAssignments -fixMeta -fixHdfsHoles
+</programlisting>
+ <para>Since this is a common operation, we’ve added the <code>-repairHoles</code> flag that is equivalent to the
+ previous command:</para>
+ <programlisting language="bourne">
+$ ./bin/hbase hbck -repairHoles
+</programlisting>
+ <para>If inconsistencies still remain after these steps, you most likely have table integrity problems
+ related to orphaned or overlapping regions.</para>
+ </section>
+ <section><title>Region Overlap Repairs</title>
+ <para>Table integrity problems can require repairs that deal with overlaps. This is a riskier operation
+ because it requires modifications to the file system, requires some decision making, and may
+ require some manual steps. For these repairs it is best to analyze the output of a <code>hbck -details</code>
+ run so that you isolate repair attempts to only the problems the checks identify. Because this is
+ riskier, there are safeguards that should be used to limit the scope of the repairs.
+ WARNING: This feature is relatively new and has only been tested on online but idle HBase instances
+ (no reads/writes). Use at your own risk in an active production environment!
+ The options for repairing table integrity violations include:</para>
+ <itemizedlist>
+ <listitem><para><code>-fixHdfsOrphans</code> option for “adopting” a region directory that is missing a region
+ metadata file (the .regioninfo file).</para>
+ </listitem>
+ <listitem><para><code>-fixHdfsOverlaps</code> option for fixing overlapping regions.</para>
+ </listitem>
+ </itemizedlist>
+ <para>When repairing overlapping regions, a region’s data can be modified on the file system in two
+ ways: 1) by merging regions into a larger region or 2) by sidelining regions by moving data to a
+ “sideline” directory where data could be restored later. Merging a large number of regions is
+ technically correct but could result in an extremely large region that requires a series of costly
+ compactions and splitting operations. In these cases, it is probably better to sideline the regions
+ that overlap with the most other regions (likely the largest ranges) so that merges can happen on
+ a more reasonable scale. Since these sidelined regions are already laid out in HBase’s native
+ directory and HFile format, they can be restored by using HBase’s bulk load mechanism.
+ The default safeguard thresholds are conservative. These options let you override the default
+ thresholds and enable the large region sidelining feature.</para>
+ <itemizedlist>
+ <listitem><para><code>-maxMerge <n></code> maximum number of overlapping regions to merge</para>
+ </listitem>
+ <listitem><para><code>-sidelineBigOverlaps</code> if more than maxMerge regions are overlapping, attempt
+ to sideline the regions overlapping with the most other regions.</para>
+ </listitem>
+ <listitem><para><code>-maxOverlapsToSideline <n></code> if sidelining large overlapping regions, sideline at most n
+ regions.</para>
+ </listitem>
+ </itemizedlist>
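+ <para>For example, the following hedged invocation (the table name and threshold values are
+ illustrative, not recommendations) repairs HDFS problems on TableFoo while merging at most
+ 50 overlapping regions and sidelining at most 25 of the biggest overlaps:</para>
+ <programlisting language="bourne">
+$ ./bin/hbase hbck -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -sidelineBigOverlaps -maxMerge 50 -maxOverlapsToSideline 25 TableFoo
+</programlisting>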
+
+ <para>Since you often just want to get the tables repaired, you can use the following option to turn
+ on all repair options:</para>
+ <itemizedlist>
+ <listitem><para><code>-repair</code> includes all the region consistency options and only the hole repairing table
+ integrity options.</para>
+ </listitem>
+ </itemizedlist>
+ <para>Finally, there are safeguards to limit repairs to only specific tables. For example, the following
+ command would only attempt to check and repair tables TableFoo and TableBar.</para>
+ <screen language="bourne">
+$ ./bin/hbase hbck -repair TableFoo TableBar
+</screen>
+ <section><title>Special cases: Meta is not properly assigned</title>
+ <para>There are a few special cases that hbck can handle as well.
+ Sometimes the meta table’s only region is inconsistently assigned or deployed. In this case
+ there is a special <code>-fixMetaOnly</code> option that can try to fix meta assignments.</para>
+ <screen language="bourne">
+$ ./bin/hbase hbck -fixMetaOnly -fixAssignments
+</screen>
+ </section>
+ <section><title>Special cases: HBase version file is missing</title>
+ <para>HBase’s data on the file system requires a version file in order to start. If this file is missing, you
+ can use the <code>-fixVersionFile</code> option to fabricate a new HBase version file. This assumes that
+ the version of hbck you are running is the appropriate version for the HBase cluster.</para>
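+ <para>A minimal invocation might look like the following (shown for illustration; combine
+ with other repair options as needed):</para>
+ <screen language="bourne">
+$ ./bin/hbase hbck -fixVersionFile
+</screen>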
+ </section>
+ <section><title>Special case: Root and META are corrupt.</title>
+ <para>The most drastic corruption scenario is the case where the ROOT or META is corrupted and
+ HBase will not start. In this case you can use the OfflineMetaRepair tool to create new ROOT
+ and META regions and tables.
+ This tool assumes that HBase is offline. It marches through the existing HBase home
+ directory, loading as much information from region metadata files (.regioninfo files) as possible
+ from the file system. If the region metadata has proper table integrity, it sidelines the original root
+ and meta table directories, and builds new ones with pointers to the region directories and their
+ data.</para>
+ <screen language="bourne">
+$ ./bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
+</screen>
+ <para>NOTE: This tool is not as clever as uberhbck but can be used to bootstrap repairs that uberhbck
+ can complete.
+ If the tool succeeds, you should be able to start HBase and run online repairs if necessary.</para>
+ </section>
+ <section><title>Special cases: Offline split parent</title>
+ <para>
+ Once a region is split, the offline parent will be cleaned up automatically. Sometimes, daughter regions
+ are split again before their parents are cleaned up. HBase can clean up parents in the right order; however,
+ sometimes lingering offline split parents remain. They are in META and in HDFS, but not deployed,
+ and HBase can't clean them up. In this case, you can use the <code>-fixSplitParents</code> option to reset
+ them in META to be online and not split. hbck can then merge them with other regions if the option
+ for fixing overlapping regions is used.
+ </para>
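+ <para>A minimal hedged invocation (run the overlap-fixing options afterwards if you want the
+ reset regions merged away):</para>
+ <screen language="bourne">
+$ ./bin/hbase hbck -fixSplitParents
+</screen>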
+ <para>
+ This option should not normally be used, and it is not in <code>-fixAll</code>.
+ </para>
+ </section>
+ </section>
+
+</appendix>
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/mapreduce.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/mapreduce.xml b/src/main/docbkx/mapreduce.xml
new file mode 100644
index 0000000..9e9e474
--- /dev/null
+++ b/src/main/docbkx/mapreduce.xml
@@ -0,0 +1,630 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<chapter
+ xml:id="mapreduce"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+
+ <title>HBase and MapReduce</title>
+ <para>Apache MapReduce is a software framework used to analyze large amounts of data, and is
+ the framework used most often with <link
+ xlink:href="http://hadoop.apache.org/">Apache Hadoop</link>. MapReduce itself is out of the
+ scope of this document. A good place to get started with MapReduce is <link
+ xlink:href="http://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html" />. MapReduce version
+ 2 (MR2) is now part of <link
+ xlink:href="http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/">YARN</link>. </para>
+
+ <para> This chapter discusses specific configuration steps you need to take to use MapReduce on
+ data within HBase. In addition, it discusses other interactions and issues between HBase and
+ MapReduce jobs.
+ <note>
+ <title>mapred and mapreduce</title>
+ <para>There are two mapreduce packages in HBase as in MapReduce itself: <filename>org.apache.hadoop.hbase.mapred</filename>
+ and <filename>org.apache.hadoop.hbase.mapreduce</filename>. The former uses the old-style API and the latter
+ the new style. The latter has more facilities, though you can usually find an equivalent in the older
+ package. Pick the package that goes with your MapReduce deployment. When in doubt or starting over, pick
+ <filename>org.apache.hadoop.hbase.mapreduce</filename>. In the notes below, we refer to
+ o.a.h.h.mapreduce, but replace it with o.a.h.h.mapred if that is what you are using.
+ </para>
+ </note>
+ </para>
+
+ <section
+ xml:id="hbase.mapreduce.classpath">
+ <title>HBase, MapReduce, and the CLASSPATH</title>
+ <para>By default, MapReduce jobs deployed to a MapReduce cluster do not have access to either
+ the HBase configuration under <envar>$HBASE_CONF_DIR</envar> or the HBase classes.</para>
+ <para>To give the MapReduce jobs the access they need, you could add
+ <filename>hbase-site.xml</filename> to the
+ <filename><replaceable>$HADOOP_HOME</replaceable>/conf/</filename> directory and add the
+ HBase JARs to the <filename><replaceable>$HADOOP_HOME</replaceable>/lib/</filename>
+ directory, then copy these changes across your cluster, or you could edit
+ <filename><replaceable>$HADOOP_HOME</replaceable>/conf/hadoop-env.sh</filename> and add
+ them to the <envar>HADOOP_CLASSPATH</envar> variable. However, this approach is not
+ recommended because it will pollute your Hadoop install with HBase references. It also
+ requires you to restart the Hadoop cluster before Hadoop can use the HBase data.</para>
+ <para> Since HBase 0.90.x, HBase adds its dependency JARs to the job configuration itself. The
+ dependencies only need to be available on the local CLASSPATH. The following example runs
+ the bundled HBase <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html">RowCounter</link>
+ MapReduce job against a table named <systemitem>usertable</systemitem>. If you have not set
+ the environment variables expected in the command (the parts prefixed by a
+ <literal>$</literal> sign and curly braces), you can use the actual system paths instead.
+ Be sure to use the correct version of the HBase JAR for your system. The backticks
+ (<literal>`</literal> symbols) cause the shell to execute the sub-commands, setting the
+ CLASSPATH as part of the command. This example assumes you use a BASH-compatible shell. </para>
+ <screen language="bourne">$ <userinput>HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar rowcounter usertable</userinput></screen>
+ <para>When the command runs, internally, the HBase JAR finds the dependencies it needs for
+ zookeeper, guava, and its other dependencies on the passed <envar>HADOOP_CLASSPATH</envar>
+ and adds the JARs to the MapReduce job configuration. See the source at
+ TableMapReduceUtil#addDependencyJars(org.apache.hadoop.mapreduce.Job) for how this is done. </para>
+ <note>
+ <para> The example may not work if you are running HBase from its build directory rather
+ than an installed location. You may see an error like the following:</para>
+ <screen>java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper</screen>
+ <para>If this occurs, try modifying the command as follows, so that it uses the HBase JARs
+ from the <filename>target/</filename> directory within the build environment.</para>
+ <screen language="bourne">$ <userinput>HADOOP_CLASSPATH=${HBASE_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar:`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar rowcounter usertable</userinput></screen>
+ </note>
+ <caution>
+ <title>Notice to MapReduce users of HBase 0.96.1 and above</title>
+ <para>Some mapreduce jobs that use HBase fail to launch. The symptom is an exception similar
+ to the following:</para>
+ <screen>
+Exception in thread "main" java.lang.IllegalAccessError: class
+ com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass
+ com.google.protobuf.LiteralByteString
+ at java.lang.ClassLoader.defineClass1(Native Method)
+ at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
+ at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
+ at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
+ at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
+ at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
+ at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
+ at java.security.AccessController.doPrivileged(Native Method)
+ at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
+ at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
+ at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
+ at org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:818)
+ at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:433)
+ at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:186)
+ at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:147)
+ at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:270)
+ at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:100)
+...
+</screen>
+ <para>This is caused by an optimization introduced in <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-9867">HBASE-9867</link> that
+ inadvertently introduced a classloader dependency. </para>
+ <para>This affects both jobs using the <code>-libjars</code> option and "fat jar" jobs, those
+ which package their runtime dependencies in a nested <code>lib</code> folder.</para>
+ <para>In order to satisfy the new classloader requirements, hbase-protocol.jar must be
+ included in Hadoop's classpath. See <xref
+ linkend="hbase.mapreduce.classpath" /> for current recommendations for resolving
+ classpath errors. The following is included for historical purposes.</para>
+ <para>This can be resolved system-wide by including a reference to the hbase-protocol.jar in
+ hadoop's lib directory, via a symlink or by copying the jar into the new location.</para>
+ <para>This can also be achieved on a per-job launch basis by including it in the
+ <code>HADOOP_CLASSPATH</code> environment variable at job submission time. When
+ launching jobs that package their dependencies, all three of the following job launching
+ commands satisfy this requirement:</para>
+ <screen language="bourne">
+$ <userinput>HADOOP_CLASSPATH=/path/to/hbase-protocol.jar:/path/to/hbase/conf hadoop jar MyJob.jar MyJobMainClass</userinput>
+$ <userinput>HADOOP_CLASSPATH=$(hbase mapredcp):/path/to/hbase/conf hadoop jar MyJob.jar MyJobMainClass</userinput>
+$ <userinput>HADOOP_CLASSPATH=$(hbase classpath) hadoop jar MyJob.jar MyJobMainClass</userinput>
+ </screen>
+ <para>For jars that do not package their dependencies, the following command structure is
+ necessary:</para>
+ <screen language="bourne">
+$ <userinput>HADOOP_CLASSPATH=$(hbase mapredcp):/etc/hbase/conf hadoop jar MyApp.jar MyJobMainClass -libjars $(hbase mapredcp | tr ':' ',')</userinput> ...
+ </screen>
+ <para>See also <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-10304">HBASE-10304</link> for
+ further discussion of this issue.</para>
+ </caution>
+ </section>
+
+ <section>
+ <title>MapReduce Scan Caching</title>
+ <para>TableMapReduceUtil now restores the option to set scanner caching (the number of rows
+ which are cached before returning the result to the client) on the Scan object that is
+ passed in. This functionality was lost due to a bug in HBase 0.95 (<link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-11558">HBASE-11558</link>), which
+ is fixed for HBase 0.98.5 and 0.96.3. The priority order for choosing the scanner caching is
+ as follows:</para>
+ <orderedlist>
+ <listitem>
+ <para>Caching settings which are set on the scan object.</para>
+ </listitem>
+ <listitem>
+ <para>Caching settings which are specified via the configuration option
+ <option>hbase.client.scanner.caching</option>, which can either be set manually in
+ <filename>hbase-site.xml</filename> or via the helper method
+ <code>TableMapReduceUtil.setScannerCaching()</code>.</para>
+ </listitem>
+ <listitem>
+ <para>The default value <code>HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING</code>, which is set to
+ <literal>100</literal>.</para>
+ </listitem>
+ </orderedlist>
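+ <para>As a short, hedged sketch (the value 500 is illustrative, and a Job named
+ <code>job</code> is assumed to exist), the first two priorities can be set like this:</para>
+ <programlisting language="java">
+Scan scan = new Scan();
+scan.setCaching(500); // priority 1: set directly on the Scan object
+
+// priority 2: set in the configuration; used only if the Scan does not set caching
+TableMapReduceUtil.setScannerCaching(job, 500);
+</programlisting>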
+ <para>Optimizing the caching settings is a balance between the time the client waits for a
+ result and the number of sets of results the client needs to receive. If the caching setting
+ is too large, the client could end up waiting for a long time or the request could even time
+ out. If the setting is too small, the scan needs to return results in several pieces.
+ If you think of the scan as a shovel, a bigger cache setting is analogous to a bigger
+ shovel, and a smaller cache setting is equivalent to more shoveling in order to fill the
+ bucket.</para>
+ <para>The list of priorities mentioned above allows you to set a reasonable default, and
+ override it for specific operations.</para>
+ <para>See the API documentation for <link
+ xlink:href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html"
+ >Scan</link> for more details.</para>
+ </section>
+
+ <section>
+ <title>Bundled HBase MapReduce Jobs</title>
+ <para>The HBase JAR also serves as a Driver for some bundled mapreduce jobs. To learn about
+ the bundled MapReduce jobs, run the following command.</para>
+
+ <screen language="bourne">$ <userinput>${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar</userinput>
+<computeroutput>An example program must be given as the first argument.
+Valid program names are:
+ copytable: Export a table from local cluster to peer cluster
+ completebulkload: Complete a bulk data load.
+ export: Write table data to HDFS.
+ import: Import data written by Export.
+ importtsv: Import data in TSV format.
+ rowcounter: Count rows in HBase table</computeroutput>
+ </screen>
+ <para>Each of the valid program names are bundled MapReduce jobs. To run one of the jobs,
+ model your command after the following example.</para>
+ <screen language="bourne">$ <userinput>${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar rowcounter myTable</userinput></screen>
+ </section>
+
+ <section>
+ <title>HBase as a MapReduce Job Data Source and Data Sink</title>
+ <para>HBase can be used as a data source, <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormat.html">TableInputFormat</link>,
+ and data sink, <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html">TableOutputFormat</link>
+ or <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/MultiTableOutputFormat.html">MultiTableOutputFormat</link>,
+ for MapReduce jobs. When writing MapReduce jobs that read or write HBase, it is advisable to
+ subclass <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapper.html">TableMapper</link>
+ and/or <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableReducer.html">TableReducer</link>.
+ See the do-nothing pass-through classes <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/IdentityTableMapper.html">IdentityTableMapper</link>
+ and <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/IdentityTableReducer.html">IdentityTableReducer</link>
+ for basic usage. For a more involved example, see <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html">RowCounter</link>
+ or review the <code>org.apache.hadoop.hbase.mapreduce.TestTableMapReduce</code> unit test. </para>
+ <para>If you run MapReduce jobs that use HBase as a source or sink, you need to specify the
+ source and sink table and column names in your configuration.</para>
+
+ <para>When you read from HBase, the <code>TableInputFormat</code> requests the list of regions
+ from HBase and makes a map, which is either a <code>map-per-region</code> or
+ <code>mapreduce.job.maps</code> map, whichever is smaller. If your job only has two maps,
+ raise <code>mapreduce.job.maps</code> to a number greater than the number of regions. Maps
+ will run on the adjacent TaskTracker if you are running a TaskTracker and RegionServer per
+ node. When writing to HBase, it may make sense to avoid the Reduce step and write back into
+ HBase from within your map. This approach works when your job does not need the sort and
+ collation that MapReduce does on the map-emitted data. On insert, HBase 'sorts' so there is
+ no point double-sorting (and shuffling data around your MapReduce cluster) unless you need
+ to. If you do not need the Reduce, your map might emit counts of records processed for
+ reporting at the end of the job, or set the number of Reduces to zero and use
+ TableOutputFormat. If running the Reduce step makes sense in your case, you should typically
+ use multiple reducers so that load is spread across the HBase cluster.</para>
+
+ <para>A new HBase partitioner, the <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/HRegionPartitioner.html">HRegionPartitioner</link>,
+ can run as many reducers as there are existing regions. The HRegionPartitioner is suitable
+ when your table is large and your upload will not greatly alter the number of existing
+ regions upon completion. Otherwise use the default partitioner. </para>
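+ <para>As a hedged sketch (table and reducer names are illustrative), the partitioner can be
+ passed to the <code>initTableReducerJob</code> overload that accepts a partitioner class:</para>
+ <programlisting language="java">
+TableMapReduceUtil.initTableReducerJob(
+ targetTable, // output table
+ MyTableReducer.class, // reducer class
+ job,
+ HRegionPartitioner.class); // partition reducer output by target region
+</programlisting>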
+ </section>
+
+ <section>
+ <title>Writing HFiles Directly During Bulk Import</title>
+ <para>If you are importing into a new table, you can bypass the HBase API and write your
+ content directly to the filesystem, formatted into HBase data files (HFiles). Your import
+ will run faster, perhaps an order of magnitude faster. For more on how this mechanism works,
+ see <xref
+ linkend="arch.bulk.load" />.</para>
+ </section>
+
+ <section>
+ <title>RowCounter Example</title>
+ <para>The included <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html">RowCounter</link>
+ MapReduce job uses <code>TableInputFormat</code> and does a count of all rows in the specified
+ table. To run it, use the following command: </para>
+ <screen language="bourne">$ <userinput>./bin/hadoop jar hbase-X.X.X.jar</userinput></screen>
+ <para>This will
+ invoke the HBase MapReduce Driver class. Select <literal>rowcounter</literal> from the choice of jobs
+ offered. This will print rowcounter usage advice to standard output. Specify the tablename,
+ column to count, and output
+ directory. If you have classpath errors, see <xref linkend="hbase.mapreduce.classpath" />.</para>
+ </section>
+
+ <section
+ xml:id="splitter">
+ <title>Map-Task Splitting</title>
+ <section
+ xml:id="splitter.default">
+ <title>The Default HBase MapReduce Splitter</title>
+ <para>When <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormat.html">TableInputFormat</link>
+ is used to source an HBase table in a MapReduce job, its splitter will make a map task for
+ each region of the table. Thus, if there are 100 regions in the table, there will be 100
+ map-tasks for the job - regardless of how many column families are selected in the
+ Scan.</para>
+ </section>
+ <section
+ xml:id="splitter.custom">
+ <title>Custom Splitters</title>
+ <para>For those interested in implementing custom splitters, see the method
+ <code>getSplits</code> in <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.html">TableInputFormatBase</link>.
+ That is where the logic for map-task assignment resides. </para>
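+ <para>As a hedged sketch (the class name and filtering logic are illustrative), a custom
+ splitter can subclass <classname>TableInputFormat</classname> and post-process the default
+ per-region splits:</para>
+ <programlisting language="java">
+public class MyTableInputFormat extends TableInputFormat {
+
+ @Override
+ public List<InputSplit> getSplits(JobContext context) throws IOException {
+ List<InputSplit> splits = super.getSplits(context); // default: one split per region
+ // Inspect, filter, or further subdivide the splits here before returning them.
+ return splits;
+ }
+}
+</programlisting>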
+ </section>
+ </section>
+ <section
+ xml:id="mapreduce.example">
+ <title>HBase MapReduce Examples</title>
+ <section
+ xml:id="mapreduce.example.read">
+ <title>HBase MapReduce Read Example</title>
+ <para>The following is an example of using HBase as a MapReduce source in a read-only manner.
+ Specifically, there is a Mapper instance but no Reducer, and nothing is being emitted from
+ the Mapper. The job would be defined as follows...</para>
+ <programlisting language="java">
+Configuration config = HBaseConfiguration.create();
+Job job = new Job(config, "ExampleRead");
+job.setJarByClass(MyReadJob.class); // class that contains mapper
+
+Scan scan = new Scan();
+scan.setCaching(500); // 1 is the default in Scan, which will be bad for MapReduce jobs
+scan.setCacheBlocks(false); // don't set to true for MR jobs
+// set other scan attrs
+...
+
+TableMapReduceUtil.initTableMapperJob(
+ tableName, // input HBase table name
+ scan, // Scan instance to control CF and attribute selection
+ MyMapper.class, // mapper
+ null, // mapper output key
+ null, // mapper output value
+ job);
+job.setOutputFormatClass(NullOutputFormat.class); // because we aren't emitting anything from mapper
+
+boolean b = job.waitForCompletion(true);
+if (!b) {
+ throw new IOException("error with job!");
+}
+ </programlisting>
+ <para>...and the mapper instance would extend <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapper.html">TableMapper</link>...</para>
+ <programlisting language="java">
+public static class MyMapper extends TableMapper<Text, Text> {
+
+ public void map(ImmutableBytesWritable row, Result value, Context context) throws InterruptedException, IOException {
+ // process data for the row from the Result instance.
+ }
+}
+ </programlisting>
+ </section>
+ <section
+ xml:id="mapreduce.example.readwrite">
+ <title>HBase MapReduce Read/Write Example</title>
+ <para>The following is an example of using HBase both as a source and as a sink with
+ MapReduce. This example will simply copy data from one table to another.</para>
+ <programlisting language="java">
+Configuration config = HBaseConfiguration.create();
+Job job = new Job(config,"ExampleReadWrite");
+job.setJarByClass(MyReadWriteJob.class); // class that contains mapper
+
+Scan scan = new Scan();
+scan.setCaching(500); // 1 is the default in Scan, which will be bad for MapReduce jobs
+scan.setCacheBlocks(false); // don't set to true for MR jobs
+// set other scan attrs
+
+TableMapReduceUtil.initTableMapperJob(
+ sourceTable, // input table
+ scan, // Scan instance to control CF and attribute selection
+ MyMapper.class, // mapper class
+ null, // mapper output key
+ null, // mapper output value
+ job);
+TableMapReduceUtil.initTableReducerJob(
+ targetTable, // output table
+ null, // reducer class
+ job);
+job.setNumReduceTasks(0);
+
+boolean b = job.waitForCompletion(true);
+if (!b) {
+ throw new IOException("error with job!");
+}
+ </programlisting>
+ <para>An explanation is required of what <classname>TableMapReduceUtil</classname> is doing,
+ especially with the reducer. <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html">TableOutputFormat</link>
+ is being used as the outputFormat class, and several parameters are being set on the
+ config (e.g., TableOutputFormat.OUTPUT_TABLE), as well as setting the reducer output key
+ to <classname>ImmutableBytesWritable</classname> and reducer value to
+ <classname>Writable</classname>. These could be set by the programmer on the job and
+ conf, but <classname>TableMapReduceUtil</classname> tries to make things easier.</para>
+ <para>The following is the example mapper, which will create a <classname>Put</classname>
+ matching the input <classname>Result</classname> and emit it. Note: this is what the
+ CopyTable utility does. </para>
+ <programlisting language="java">
+public static class MyMapper extends TableMapper<ImmutableBytesWritable, Put> {
+
+ public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
+ // this example is just copying the data from the source table...
+ context.write(row, resultToPut(row,value));
+ }
+
+ private static Put resultToPut(ImmutableBytesWritable key, Result result) throws IOException {
+ Put put = new Put(key.get());
+ for (KeyValue kv : result.raw()) {
+ put.add(kv);
+ }
+ return put;
+ }
+}
+ </programlisting>
+ <para>There isn't actually a reducer step, so <classname>TableOutputFormat</classname> takes
+ care of sending the <classname>Put</classname> to the target table. </para>
+ <para>This is just an example; developers could choose not to use
+ <classname>TableOutputFormat</classname> and connect to the target table themselves.
+ </para>
+ </section>
+ <section
+ xml:id="mapreduce.example.readwrite.multi">
+ <title>HBase MapReduce Read/Write Example With Multi-Table Output</title>
+ <para>TODO: example for <classname>MultiTableOutputFormat</classname>. </para>
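+ <para>Until a full example is written, here is a minimal hedged sketch (the table name and
+ reducer logic are illustrative). With <classname>MultiTableOutputFormat</classname>, the key
+ written to the context names the destination table for each mutation:</para>
+ <programlisting language="java">
+// In the job setup, use MultiTableOutputFormat instead of initTableReducerJob:
+job.setOutputFormatClass(MultiTableOutputFormat.class);
+
+public static class MyMultiTableReducer extends Reducer<Text, IntWritable, ImmutableBytesWritable, Mutation> {
+
+ public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
+ Put put = new Put(Bytes.toBytes(key.toString()));
+ // ... add columns to the Put ...
+ // The write key selects the target table for this mutation.
+ context.write(new ImmutableBytesWritable(Bytes.toBytes("targetTable1")), put);
+ }
+}
+</programlisting>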
+ </section>
+ <section
+ xml:id="mapreduce.example.summary">
+ <title>HBase MapReduce Summary to HBase Example</title>
+ <para>The following example uses HBase as a MapReduce source and sink with a summarization
+ step. This example will count the number of distinct instances of a value in a table and
+ write those summarized counts to another table.
+ <programlisting language="java">
+Configuration config = HBaseConfiguration.create();
+Job job = new Job(config,"ExampleSummary");
+job.setJarByClass(MySummaryJob.class); // class that contains mapper and reducer
+
+Scan scan = new Scan();
+scan.setCaching(500); // 1 is the default in Scan, which will be bad for MapReduce jobs
+scan.setCacheBlocks(false); // don't set to true for MR jobs
+// set other scan attrs
+
+TableMapReduceUtil.initTableMapperJob(
+ sourceTable, // input table
+ scan, // Scan instance to control CF and attribute selection
+ MyMapper.class, // mapper class
+ Text.class, // mapper output key
+ IntWritable.class, // mapper output value
+ job);
+TableMapReduceUtil.initTableReducerJob(
+ targetTable, // output table
+ MyTableReducer.class, // reducer class
+ job);
+job.setNumReduceTasks(1); // at least one, adjust as required
+
+boolean b = job.waitForCompletion(true);
+if (!b) {
+ throw new IOException("error with job!");
+}
+ </programlisting>
+ In this example mapper, a column with a String value is chosen as the value to summarize
+ upon. This value is used as the key to emit from the mapper, and an
+ <classname>IntWritable</classname> represents an instance counter.
+ <programlisting language="java">
+public static class MyMapper extends TableMapper<Text, IntWritable> {
+ public static final byte[] CF = "cf".getBytes();
+ public static final byte[] ATTR1 = "attr1".getBytes();
+
+ private final IntWritable ONE = new IntWritable(1);
+ private Text text = new Text();
+
+ public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
+ String val = new String(value.getValue(CF, ATTR1));
+ text.set(val); // we can only emit Writables...
+
+ context.write(text, ONE);
+ }
+}
+ </programlisting>
+ In the reducer, the "ones" are counted (just like any other MR example that does this),
+ and then a <classname>Put</classname> is emitted.
+ <programlisting language="java">
+public static class MyTableReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {
+ public static final byte[] CF = "cf".getBytes();
+ public static final byte[] COUNT = "count".getBytes();
+
+ public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
+ int i = 0;
+ for (IntWritable val : values) {
+ i += val.get();
+ }
+ Put put = new Put(Bytes.toBytes(key.toString()));
+ put.add(CF, COUNT, Bytes.toBytes(i));
+
+ context.write(null, put);
+ }
+}
+ </programlisting>
+ </para>
+ </section>
+ <section
+ xml:id="mapreduce.example.summary.file">
+ <title>HBase MapReduce Summary to File Example</title>
+ <para>This is very similar to the summary example above, with the exception that this example uses
+ HBase as a MapReduce source but HDFS as the sink. The differences are in the job setup and
+ in the reducer. The mapper remains the same. </para>
+ <programlisting language="java">
+Configuration config = HBaseConfiguration.create();
+Job job = new Job(config,"ExampleSummaryToFile");
+job.setJarByClass(MySummaryFileJob.class); // class that contains mapper and reducer
+
+Scan scan = new Scan();
+scan.setCaching(500); // 1 is the default in Scan, which will be bad for MapReduce jobs
+scan.setCacheBlocks(false); // don't set to true for MR jobs
+// set other scan attrs
+
+TableMapReduceUtil.initTableMapperJob(
+ sourceTable, // input table
+ scan, // Scan instance to control CF and attribute selection
+ MyMapper.class, // mapper class
+ Text.class, // mapper output key
+ IntWritable.class, // mapper output value
+ job);
+job.setReducerClass(MyReducer.class); // reducer class
+job.setNumReduceTasks(1); // at least one, adjust as required
+FileOutputFormat.setOutputPath(job, new Path("/tmp/mr/mySummaryFile")); // adjust directories as required
+
+boolean b = job.waitForCompletion(true);
+if (!b) {
+ throw new IOException("error with job!");
+}
+ </programlisting>
+ <para>As stated above, the previous Mapper can run unchanged with this example. As for the
+ Reducer, it is a "generic" Reducer instead of extending TableReducer and emitting
+ Puts.</para>
+ <programlisting language="java">
+ public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
+
+ public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
+ int i = 0;
+ for (IntWritable val : values) {
+ i += val.get();
+ }
+ context.write(key, new IntWritable(i));
+ }
+}
+ </programlisting>
+ </section>
+ <section
+ xml:id="mapreduce.example.summary.noreducer">
+ <title>HBase MapReduce Summary to HBase Without Reducer</title>
+ <para>It is also possible to perform summaries without a reducer - if you use HBase as the
+ reducer. </para>
+ <para>An HBase target table would need to exist for the job summary. The Table method
+ <code>incrementColumnValue</code> would be used to atomically increment values. From a
+ performance perspective, it might make sense to keep a Map of values with their counts to
+ be incremented for each map-task, and make one update per key during the <code>
+ cleanup</code> method of the mapper. However, your mileage may vary depending on the
+ number of rows to be processed and the number of unique keys. </para>
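+ <para>A hedged sketch of this pattern (the table, family, and qualifier names are
+ illustrative, and error handling is omitted):</para>
+ <programlisting language="java">
+public static class MyIncrementingMapper extends TableMapper<ImmutableBytesWritable, Put> {
+ public static final byte[] CF = "cf".getBytes();
+ public static final byte[] ATTR1 = "attr1".getBytes();
+ public static final byte[] COUNT = "count".getBytes();
+
+ private Connection connection;
+ private Table summaryTable;
+ private Map<String, Long> counts = new HashMap<String, Long>();
+
+ public void setup(Context context) throws IOException {
+ connection = ConnectionFactory.createConnection(context.getConfiguration());
+ summaryTable = connection.getTable(TableName.valueOf("summaryTable"));
+ }
+
+ public void map(ImmutableBytesWritable row, Result value, Context context) {
+ // accumulate counts locally; one increment per distinct key happens in cleanup
+ String val = new String(value.getValue(CF, ATTR1));
+ Long count = counts.get(val);
+ counts.put(val, count == null ? 1L : count + 1);
+ }
+
+ public void cleanup(Context context) throws IOException {
+ for (Map.Entry<String, Long> entry : counts.entrySet()) {
+ summaryTable.incrementColumnValue(Bytes.toBytes(entry.getKey()), CF, COUNT, entry.getValue());
+ }
+ summaryTable.close();
+ connection.close();
+ }
+}
+</programlisting>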
+ <para>In the end, the summary results are in HBase. </para>
+ </section>
+ <section
+ xml:id="mapreduce.example.summary.rdbms">
+ <title>HBase MapReduce Summary to RDBMS</title>
+ <para>Sometimes it is more appropriate to generate summaries to an RDBMS. For these cases,
+ it is possible to generate summaries directly to an RDBMS via a custom reducer. The
+ <code>setup</code> method can connect to an RDBMS (the connection information can be
+ passed via custom parameters in the context) and the cleanup method can close the
+ connection. </para>
+ <para>It is critical to understand that the number of reducers for the job affects the
+ summarization implementation, and you'll have to design this into your reducer;
+ specifically, whether it is designed to run as a singleton (one reducer) or as multiple
+ reducers. Neither is right or wrong, it depends on your use-case. Recognize that the more
+ reducers that are assigned to the job, the more simultaneous connections to the RDBMS will
+ be created - this will scale, but only to a point. </para>
+ <programlisting language="java">
+ public static class MyRdbmsReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
+
+ private Connection c = null;
+
+ public void setup(Context context) {
+ // create DB connection...
+ }
+
+ public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
+ // do summarization
+ // in this example the keys are Text, but this is just an example
+ }
+
+ public void cleanup(Context context) {
+ // close db connection
+ }
+
+}
+ </programlisting>
+ <para>In the end, the summary results are written to your RDBMS table(s). </para>
+ </section>
+
+ </section>
+ <!-- mr examples -->
+ <section
+ xml:id="mapreduce.htable.access">
+ <title>Accessing Other HBase Tables in a MapReduce Job</title>
+ <para>Although the framework currently allows one HBase table as input to a MapReduce job,
+ other HBase tables can be accessed as lookup tables, etc., in a MapReduce job by creating
+ a Table instance in the setup method of the Mapper.
+ <programlisting language="java">public class MyMapper extends TableMapper<Text, LongWritable> {
+ private Connection connection;
+ private Table myOtherTable;
+
+ public void setup(Context context) throws IOException {
+ // Create a Connection to the cluster and keep it around for cleanup.
+ connection = ConnectionFactory.createConnection(context.getConfiguration());
+ myOtherTable = connection.getTable(TableName.valueOf("myOtherTable"));
+ }
+
+ public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
+ // process Result...
+ // use 'myOtherTable' for lookups
+ }
+
+ public void cleanup(Context context) throws IOException {
+ myOtherTable.close();
+ connection.close();
+ }
+}
+ </programlisting>
+ </para>
+ </section>
+ <section
+ xml:id="mapreduce.specex">
+ <title>Speculative Execution</title>
+ <para>It is generally advisable to turn off speculative execution for MapReduce jobs that use
+ HBase as a source. This can either be done on a per-Job basis through properties, or on the
+ entire cluster. Especially for longer running jobs, speculative execution will create
+ duplicate map-tasks which will double-write your data to HBase; this is probably not what
+ you want. </para>
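+ <para>As a hedged example, with MR2 the per-job properties can be set as follows (MR1 used
+ the older <code>mapred.map.tasks.speculative.execution</code> property names):</para>
+ <programlisting language="java">
+Configuration conf = HBaseConfiguration.create();
+conf.setBoolean("mapreduce.map.speculative", false); // avoid duplicate map-tasks writing to HBase
+conf.setBoolean("mapreduce.reduce.speculative", false);
+</programlisting>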
+ <para>See <xref
+ linkend="spec.ex" /> for more information. </para>
+ </section>
+
+</chapter>
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/orca.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/orca.xml b/src/main/docbkx/orca.xml
new file mode 100644
index 0000000..29d8727
--- /dev/null
+++ b/src/main/docbkx/orca.xml
@@ -0,0 +1,47 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<appendix
+ xml:id="orca"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+ <title>Apache HBase Orca</title>
+ <figure>
+ <title>Apache HBase Orca</title>
+ <mediaobject>
+ <imageobject>
+ <imagedata align="center" valign="right"
+ fileref="jumping-orca_rotated_25percent.png"/>
+ </imageobject>
+ </mediaobject>
+ </figure>
+ <para><link xlink:href="https://issues.apache.org/jira/browse/HBASE-4920">An Orca is the Apache
+ HBase mascot.</link>
+ See NOTICES.txt. Our Orca logo came from http://www.vectorfree.com/jumping-orca
+ and is licensed Creative Commons Attribution 3.0; see https://creativecommons.org/licenses/by/3.0/us/.
+ We changed the logo by stripping the colored background, inverting
+ it, and then rotating it slightly.
+ </para>
+</appendix>
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/other_info.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/other_info.xml b/src/main/docbkx/other_info.xml
new file mode 100644
index 0000000..72ff274
--- /dev/null
+++ b/src/main/docbkx/other_info.xml
@@ -0,0 +1,83 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<appendix
+ xml:id="other.info"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+ <title>Other Information About HBase</title>
+ <section xml:id="other.info.videos"><title>HBase Videos</title>
+ <para>Introduction to HBase
+ <itemizedlist>
+ <listitem><para><link xlink:href="http://www.cloudera.com/content/cloudera/en/resources/library/presentation/chicago_data_summit_apache_hbase_an_introduction_todd_lipcon.html">Introduction to HBase</link> by Todd Lipcon (Chicago Data Summit 2011).
+ </para></listitem>
+ <listitem><para><link xlink:href="http://www.cloudera.com/videos/intorduction-hbase-todd-lipcon">Introduction to HBase</link> by Todd Lipcon (2010).
+ </para></listitem>
+ </itemizedlist>
+ </para>
+ <para><link xlink:href="http://www.cloudera.com/videos/hadoop-world-2011-presentation-video-building-realtime-big-data-services-at-facebook-with-hadoop-and-hbase">Building Real Time Services at Facebook with HBase</link> by Jonathan Gray (Hadoop World 2011).
+ </para>
+ <para><link xlink:href="http://www.cloudera.com/videos/hw10_video_how_stumbleupon_built_and_advertising_platform_using_hbase_and_hadoop">HBase and Hadoop, Mixing Real-Time and Batch Processing at StumbleUpon</link> by JD Cryans (Hadoop World 2010).
+ </para>
+ </section>
+ <section xml:id="other.info.pres"><title>HBase Presentations (Slides)</title>
+ <para><link xlink:href="http://www.cloudera.com/content/cloudera/en/resources/library/hadoopworld/hadoop-world-2011-presentation-video-advanced-hbase-schema-design.html">Advanced HBase Schema Design</link> by Lars George (Hadoop World 2011).
+ </para>
+ <para><link xlink:href="http://www.slideshare.net/cloudera/chicago-data-summit-apache-hbase-an-introduction">Introduction to HBase</link> by Todd Lipcon (Chicago Data Summit 2011).
+ </para>
+ <para><link xlink:href="http://www.slideshare.net/cloudera/hw09-practical-h-base-getting-the-most-from-your-h-base-install">Getting The Most From Your HBase Install</link> by Ryan Rawson, Jonathan Gray (Hadoop World 2009).
+ </para>
+ </section>
+ <section xml:id="other.info.papers"><title>HBase Papers</title>
+ <para><link xlink:href="http://research.google.com/archive/bigtable.html">BigTable</link> by Google (2006).
+ </para>
+ <para><link xlink:href="http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html">HBase and HDFS Locality</link> by Lars George (2010).
+ </para>
+ <para><link xlink:href="http://ianvarley.com/UT/MR/Varley_MastersReport_Full_2009-08-07.pdf">No Relation: The Mixed Blessings of Non-Relational Databases</link> by Ian Varley (2009).
+ </para>
+ </section>
+ <section xml:id="other.info.sites"><title>HBase Sites</title>
+ <para><link xlink:href="http://www.cloudera.com/blog/category/hbase/">Cloudera's HBase Blog</link> has a lot of links to useful HBase information.
+ <itemizedlist>
+ <listitem><para><link xlink:href="http://www.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/">CAP Confusion</link> is a relevant entry for background information on
+ distributed storage systems.</para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <para><link xlink:href="http://wiki.apache.org/hadoop/HBase/HBasePresentations">HBase Wiki</link> has a page with a number of presentations.
+ </para>
+ <para><link xlink:href="http://refcardz.dzone.com/refcardz/hbase">HBase RefCard</link> from DZone.
+ </para>
+ </section>
+ <section xml:id="other.info.books"><title>HBase Books</title>
+ <para><link xlink:href="http://shop.oreilly.com/product/0636920014348.do">HBase: The Definitive Guide</link> by Lars George.
+ </para>
+ </section>
+ <section xml:id="other.info.books.hadoop"><title>Hadoop Books</title>
+ <para><link xlink:href="http://shop.oreilly.com/product/9780596521981.do">Hadoop: The Definitive Guide</link> by Tom White.
+ </para>
+ </section>
+
+</appendix>
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/performance.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/performance.xml b/src/main/docbkx/performance.xml
index 1757d3f..42ed79b 100644
--- a/src/main/docbkx/performance.xml
+++ b/src/main/docbkx/performance.xml
@@ -273,7 +273,7 @@ tableDesc.addFamily(cfDesc);
If there is enough RAM, increasing this can help.
</para>
</section>
- <section xml:id="hbase.regionserver.checksum.verify">
+ <section xml:id="hbase.regionserver.checksum.verify.performance">
<title><varname>hbase.regionserver.checksum.verify</varname></title>
<para>Have HBase write the checksum into the datablock and save
having to do the checksum seek whenever you read.</para>
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/sql.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/sql.xml b/src/main/docbkx/sql.xml
new file mode 100644
index 0000000..40f43d6
--- /dev/null
+++ b/src/main/docbkx/sql.xml
@@ -0,0 +1,40 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<appendix
+ xml:id="sql"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+ <title>SQL over HBase</title>
+ <section xml:id="phoenix">
+ <title>Apache Phoenix</title>
+ <para><link xlink:href="http://phoenix.apache.org">Apache Phoenix</link></para>
+ </section>
+ <section xml:id="trafodion">
+ <title>Trafodion</title>
+ <para><link xlink:href="https://wiki.trafodion.org/">Trafodion: Transactional SQL-on-HBase</link></para>
+ </section>
+
+</appendix>
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/upgrading.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/upgrading.xml b/src/main/docbkx/upgrading.xml
index d5708a4..5d71e0f 100644
--- a/src/main/docbkx/upgrading.xml
+++ b/src/main/docbkx/upgrading.xml
@@ -240,7 +240,7 @@
</table>
</section>
- <section xml:id="hbase.client.api">
+ <section xml:id="hbase.client.api.surface">
<title>HBase API surface</title>
<para> HBase has a lot of API points, but for the compatibility matrix above, we differentiate between Client API, Limited Private API, and Private API. HBase uses a version of
<link xlink:href="https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html">Hadoop's Interface classification</link>. HBase's Interface classification classes can be found <link xlink:href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/classification/package-summary.html"> here</link>.
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/ycsb.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/ycsb.xml b/src/main/docbkx/ycsb.xml
new file mode 100644
index 0000000..695614c
--- /dev/null
+++ b/src/main/docbkx/ycsb.xml
@@ -0,0 +1,36 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<appendix xml:id="ycsb" version="5.0" xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg" xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml" xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+ <title>YCSB</title>
+ <para><link xlink:href="https://github.com/brianfrankcooper/YCSB/">YCSB: The
+ Yahoo! Cloud Serving Benchmark</link> and HBase</para>
+ <para>TODO: Describe how YCSB is poor for putting up a decent cluster load.</para>
+ <para>TODO: Describe setup of YCSB for HBase. In particular, presplit your tables before you
+ start a run. See <link xlink:href="https://issues.apache.org/jira/browse/HBASE-4163"
+ >HBASE-4163 Create Split Strategy for YCSB Benchmark</link> for why and a little shell
+ command for how to do it.</para>
+ <para>Ted Dunning redid YCSB so that it is mavenized, and added a facility for verifying workloads. See
+ <link xlink:href="https://github.com/tdunning/YCSB">Ted Dunning's YCSB</link>.</para>
+
+
+</appendix>
[4/8] hbase git commit: HBASE-12738 Chunk Ref Guide into
file-per-chapter
Posted by mi...@apache.org.
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/compression.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/compression.xml b/src/main/docbkx/compression.xml
new file mode 100644
index 0000000..d1971b1
--- /dev/null
+++ b/src/main/docbkx/compression.xml
@@ -0,0 +1,535 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<appendix
+ xml:id="compression"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+
+ <title>Compression and Data Block Encoding In
+ HBase<indexterm><primary>Compression</primary><secondary>Data Block
+ Encoding</secondary><seealso>codecs</seealso></indexterm></title>
+ <note>
+ <para>Codecs mentioned in this section are for encoding and decoding data blocks or row keys.
+ For information about replication codecs, see <xref
+ linkend="cluster.replication.preserving.tags" />.</para>
+ </note>
+ <para>Some of the information in this section is pulled from a <link
+ xlink:href="http://search-hadoop.com/m/lL12B1PFVhp1/v=threaded">discussion</link> on the
+ HBase Development mailing list.</para>
+ <para>HBase supports several different compression algorithms which can be enabled on a
+ ColumnFamily. Data block encoding attempts to limit duplication of information in keys, taking
+ advantage of some of the fundamental designs and patterns of HBase, such as sorted row keys
+ and the schema of a given table. Compressors reduce the size of large, opaque byte arrays in
+ cells, and can significantly reduce the storage space needed to store uncompressed
+ data.</para>
+ <para>Compressors and data block encoding can be used together on the same ColumnFamily.</para>
+
+ <formalpara>
+ <title>Changes Take Effect Upon Compaction</title>
+ <para>If you change compression or encoding for a ColumnFamily, the changes take effect during
+ compaction.</para>
+ </formalpara>
+
+ <para>Some codecs take advantage of capabilities built into Java, such as GZip compression.
+ Others rely on native libraries. Native libraries may be available as part of Hadoop, such as
+ LZ4. In this case, HBase only needs access to the appropriate shared library. Other codecs,
+ such as Google Snappy, need to be installed first. Some codecs are licensed in ways that
+ conflict with HBase's license and cannot be shipped as part of HBase.</para>
+
+ <para>This section discusses common codecs that are used and tested with HBase. No matter what
+ codec you use, be sure to test that it is installed correctly and is available on all nodes in
+ your cluster. Extra operational steps may be necessary to be sure that codecs are available on
+ newly-deployed nodes. You can use the <xref
+ linkend="compression.test" /> utility to check that a given codec is correctly
+ installed.</para>
+
+ <para>To configure HBase to use a compressor, see <xref
+ linkend="compressor.install" />. To enable a compressor for a ColumnFamily, see <xref
+ linkend="changing.compression" />. To enable data block encoding for a ColumnFamily, see
+ <xref linkend="data.block.encoding.enable" />.</para>
+ <itemizedlist>
+ <title>Block Compressors</title>
+ <listitem>
+ <para>none</para>
+ </listitem>
+ <listitem>
+ <para>Snappy</para>
+ </listitem>
+ <listitem>
+ <para>LZO</para>
+ </listitem>
+ <listitem>
+ <para>LZ4</para>
+ </listitem>
+ <listitem>
+ <para>GZ</para>
+ </listitem>
+ </itemizedlist>
+
+
+ <itemizedlist xml:id="data.block.encoding.types">
+ <title>Data Block Encoding Types</title>
+ <listitem>
+ <para>Prefix - Often, keys are very similar. Specifically, keys often share a common prefix
+ and only differ near the end. For instance, one key might be
+ <literal>RowKey:Family:Qualifier0</literal> and the next key might be
+ <literal>RowKey:Family:Qualifier1</literal>. In Prefix encoding, an extra column is
+ added which holds the length of the prefix shared between the current key and the previous
+ key. Assuming the first key here is totally different from the key before, its prefix
+ length is 0. The second key's prefix length is <literal>23</literal>, since they have the
+ first 23 characters in common.</para>
+ <para>Obviously, if the keys tend to have nothing in common, Prefix encoding will not provide
+ much benefit. A minimal sketch of the prefix-length computation appears after this list.</para>
+ <para>The following image shows a hypothetical ColumnFamily with no data block encoding.</para>
+ <figure>
+ <title>ColumnFamily with No Encoding</title>
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="data_block_no_encoding.png" width="800"/>
+ </imageobject>
+ <caption><para>A ColumnFamily with no encoding</para></caption>
+ </mediaobject>
+ </figure>
+ <para>Here is the same data with prefix data encoding.</para>
+ <figure>
+ <title>ColumnFamily with Prefix Encoding</title>
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="data_block_prefix_encoding.png" width="800"/>
+ </imageobject>
+ <caption><para>A ColumnFamily with prefix encoding</para></caption>
+ </mediaobject>
+ </figure>
+ </listitem>
+ <listitem>
+ <para>Diff - Diff encoding expands upon Prefix encoding. Instead of considering the key
+ sequentially as a monolithic series of bytes, each key field is split so that each part of
+ the key can be compressed more efficiently. Two new fields are added: timestamp and type.
+ If the ColumnFamily is the same as the previous row, it is omitted from the current row.
+ If the key length, value length or type are the same as the previous row, the field is
+ omitted. In addition, for increased compression, the timestamp is stored as a Diff from
+ the previous row's timestamp, rather than being stored in full. Given the two row keys in
+ the Prefix example, and given an exact match on timestamp and the same type, neither the
+ value length, or type needs to be stored for the second row, and the timestamp value for
+ the second row is just 0, rather than a full timestamp.</para>
+ <para>Diff encoding is disabled by default because writing and scanning are slower, although
+ more data is cached.</para>
+ <para>This image shows the same ColumnFamily from the previous images, with Diff encoding.</para>
+ <figure>
+ <title>ColumnFamily with Diff Encoding</title>
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="data_block_diff_encoding.png" width="800"/>
+ </imageobject>
+ <caption><para>A ColumnFamily with diff encoding</para></caption>
+ </mediaobject>
+ </figure>
+ </listitem>
+ <listitem>
+ <para>Fast Diff - Fast Diff works similarly to Diff, but uses a faster implementation. It also
+ adds another field which stores a single bit to track whether the data itself is the same
+ as the previous row. If it is, the data is not stored again. Fast Diff is the recommended
+ codec to use if you have long keys or many columns. The data format is nearly identical to
+ Diff encoding, so there is not an image to illustrate it.</para>
+ </listitem>
+ <listitem>
+ <para>Prefix Tree encoding was introduced as an experimental feature in HBase 0.96. It
+ provides memory savings similar to the Prefix, Diff, and Fast Diff encoders, but offers
+ faster random access at the cost of slower encoding speed. Prefix Tree may be appropriate
+ for applications that have high block cache hit ratios. It introduces new 'tree' fields
+ for the row and column. The row tree field contains a list of offsets/references
+ corresponding to the cells in that row. This allows for a good deal of compression. For
+ more details about Prefix Tree encoding, see <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-4676">HBASE-4676</link>. It is
+ difficult to graphically illustrate a prefix tree, so no image is included. See the
+ Wikipedia article for <link
+ xlink:href="http://en.wikipedia.org/wiki/Trie">Trie</link> for more general information
+ about this data structure.</para>
+ </listitem>
+ </itemizedlist>
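+ <para>To make the Prefix idea concrete, the following is a minimal sketch, not HBase's actual
+ encoder, of the shared-prefix-length computation that Prefix-style encoders rely on. The key
+ strings are the hypothetical examples from the list above.</para>
+ <programlisting language="java"><![CDATA[
+// Illustrative only; HBase's real encoder operates on serialized KeyValue bytes.
+public class PrefixLength {
+  static int commonPrefixLength(byte[] previous, byte[] current) {
+    int max = Math.min(previous.length, current.length);
+    int i = 0;
+    // Walk forward until the two keys diverge.
+    while (i < max && previous[i] == current[i]) {
+      i++;
+    }
+    return i;
+  }
+
+  public static void main(String[] args) {
+    byte[] k1 = "RowKey:Family:Qualifier0".getBytes();
+    byte[] k2 = "RowKey:Family:Qualifier1".getBytes();
+    // Prints 23: only the trailing '0' vs '1' differs, so 23 bytes need not be repeated.
+    System.out.println(commonPrefixLength(k1, k2));
+  }
+}
+]]></programlisting>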
+
+ <section>
+ <title>Which Compressor or Data Block Encoder To Use</title>
+ <para>The compression or codec type to use depends on the characteristics of your data.
+ Choosing the wrong type could cause your data to take more space rather than less, and can
+ have performance implications. In general, you need to weigh your options between smaller
+ size and faster compression/decompression. Following are some general guidelines, expanded from a discussion at <link xlink:href="http://search-hadoop.com/m/lL12B1PFVhp1">Documenting Guidance on compression and codecs</link>. </para>
+ <itemizedlist>
+ <listitem>
+ <para>If you have long keys (compared to the values) or many columns, use a prefix
+ encoder. FAST_DIFF is recommended, as more testing is needed for Prefix Tree
+ encoding.</para>
+ </listitem>
+ <listitem>
+ <para>If the values are large (and not precompressed, such as images), use a data block
+ compressor.</para>
+ </listitem>
+ <listitem>
+ <para>Use GZIP for <firstterm>cold data</firstterm>, which is accessed infrequently. GZIP
+ compression uses more CPU resources than Snappy or LZO, but provides a higher
+ compression ratio.</para>
+ </listitem>
+ <listitem>
+ <para>Use Snappy or LZO for <firstterm>hot data</firstterm>, which is accessed
+ frequently. Snappy and LZO use fewer CPU resources than GZIP, but do not provide as high
+ of a compression ratio.</para>
+ </listitem>
+ <listitem>
+ <para>In most cases, enabling Snappy or LZO by default is a good choice, because they have
+ a low performance overhead and provide space savings.</para>
+ </listitem>
+ <listitem>
+ <para>Before Snappy was released by Google in 2011, LZO was the default. Snappy has
+ similar qualities to LZO but has been shown to perform better.</para>
+ </listitem>
+ </itemizedlist>
+ </section>
+ <section xml:id="hadoop.native.lib">
+ <title>Making use of Hadoop Native Libraries in HBase</title>
+ <para>The Hadoop shared library provides a number of facilities, including
+ compression libraries and fast CRC checksumming. To make these facilities available
+ to HBase, do the following. HBase/Hadoop will fall back to
+ alternatives if it cannot find the native library versions, or
+ fail outright if you ask for an explicit compressor and there is
+ no alternative available.</para>
+ <para>If you see the following in your HBase logs, you know that HBase was unable
+ to locate the Hadoop native libraries:
+ <programlisting>2014-08-07 09:26:20,139 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable</programlisting>
+ If the libraries loaded successfully, the WARN message does not show.
+ </para>
+ <para>Let's presume your Hadoop shipped with a native library that
+ suits the platform you are running HBase on. To check if the Hadoop
+ native library is available to HBase, run the following tool (available in
+ Hadoop 2.1 and greater):
+ <programlisting>$ ./bin/hbase --config ~/conf_hbase org.apache.hadoop.util.NativeLibraryChecker
+2014-08-26 13:15:38,717 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
+Native library checking:
+hadoop: false
+zlib: false
+snappy: false
+lz4: false
+bzip2: false
+2014-08-26 13:15:38,863 INFO [main] util.ExitUtil: Exiting with status 1</programlisting>
The above shows that the native Hadoop library is not available in the HBase context.
+ </para>
+ <para>To fix the above, either copy the Hadoop native libraries locally, or symlink to
+ them if the Hadoop and HBase installs are adjacent in the filesystem.
+ You could also point at their location by setting the <varname>LD_LIBRARY_PATH</varname> environment
+ variable.</para>
+ <para>Where the JVM looks to find native libraries is "system dependent"
+ (see <classname>java.lang.System#loadLibrary(name)</classname>). On Linux, by default,
+ it looks in <filename>lib/native/PLATFORM</filename>, where <varname>PLATFORM</varname>
+ is the label for the platform your HBase is installed on.
+ On a local Linux machine, it seems to be the concatenation of the java properties
+ <varname>os.name</varname> and <varname>os.arch</varname>, followed by whether the JVM is 32 or 64 bit.
+ HBase prints out all of the java system properties on startup, so find os.name and os.arch
+ in the log. For example:
+ <programlisting>....
+ 2014-08-06 15:27:22,853 INFO [main] zookeeper.ZooKeeper: Client environment:os.name=Linux
+ 2014-08-06 15:27:22,853 INFO [main] zookeeper.ZooKeeper: Client environment:os.arch=amd64
+ ...
+ </programlisting>
+ So in this case, the PLATFORM string is <varname>Linux-amd64-64</varname>.
+ Copying the Hadoop native libraries or symlinking at <filename>lib/native/Linux-amd64-64</filename>
+ will ensure they are found. Check with the Hadoop <filename>NativeLibraryChecker</filename>.
+ </para>
+
+ <para>Here is example of how to point at the Hadoop libs with <varname>LD_LIBRARY_PATH</varname>
+ environment variable:
+ <programlisting>$ LD_LIBRARY_PATH=~/hadoop-2.5.0-SNAPSHOT/lib/native ./bin/hbase --config ~/conf_hbase org.apache.hadoop.util.NativeLibraryChecker
+2014-08-26 13:42:49,332 INFO [main] bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
+2014-08-26 13:42:49,337 INFO [main] zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
+Native library checking:
+hadoop: true /home/stack/hadoop-2.5.0-SNAPSHOT/lib/native/libhadoop.so.1.0.0
+zlib: true /lib64/libz.so.1
+snappy: true /usr/lib64/libsnappy.so.1
+lz4: true revision:99
+bzip2: true /lib64/libbz2.so.1</programlisting>
Set the <varname>LD_LIBRARY_PATH</varname> environment variable in <filename>hbase-env.sh</filename> when starting your HBase.
+ </para>
+ </section>
+
+ <section>
+ <title>Compressor Configuration, Installation, and Use</title>
+ <section
+ xml:id="compressor.install">
+ <title>Configure HBase For Compressors</title>
+ <para>Before HBase can use a given compressor, its libraries need to be available. Due to
+ licensing issues, only GZ compression is available to HBase (via native Java libraries) in
+ a default installation. Other compression libraries are available via the shared library
+ bundled with your Hadoop. The Hadoop native library needs to be findable when HBase
+ starts. See <xref linkend="hadoop.native.lib" />.</para>
+ <section>
+ <title>Compressor Support On the Master</title>
+ <para>A new configuration setting was introduced in HBase 0.95 which checks the Master to
+ determine which data block encoders are installed and configured on it, and assumes that
+ the entire cluster is configured the same way. This option,
+ <code>hbase.master.check.compression</code>, defaults to <literal>true</literal>. This
+ prevents the situation described in <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-6370">HBASE-6370</link>, where
+ a table is created or modified to support a codec that a region server does not support,
+ leading to failures that take a long time to occur and are difficult to debug. </para>
+ <para>If <code>hbase.master.check.compression</code> is enabled, libraries for all desired
+ compressors need to be installed and configured on the Master, even if the Master does
+ not run a region server.</para>
+ </section>
+ <section>
+ <title>Install GZ Support Via Native Libraries</title>
+ <para>HBase uses Java's built-in GZip support unless the native Hadoop libraries are
+ available on the CLASSPATH. The recommended way to add libraries to the CLASSPATH is to
+ set the environment variable <envar>HBASE_LIBRARY_PATH</envar> for the user running
+ HBase. If native libraries are not available and Java's GZIP is used, <literal>Got
+ brand-new compressor</literal> reports will be present in the logs. See <xref
+ linkend="brand.new.compressor" />).</para>
+ </section>
+ <section
+ xml:id="lzo.compression">
+ <title>Install LZO Support</title>
+ <para>HBase cannot ship with LZO because of incompatibility between HBase, which uses an
+ Apache Software License (ASL), and LZO, which uses a GPL license. See the <link
+ xlink:href="http://wiki.apache.org/hadoop/UsingLzoCompression">Using LZO
+ Compression</link> wiki page for information on configuring LZO support for HBase. </para>
+ <para>If you depend upon LZO compression, consider configuring your RegionServers to fail
+ to start if LZO is not available. See <xref
+ linkend="hbase.regionserver.codecs" />.</para>
+ </section>
+ <section
+ xml:id="lz4.compression">
+ <title>Configure LZ4 Support</title>
+ <para>LZ4 support is bundled with Hadoop. Make sure the hadoop shared library
+ (libhadoop.so) is accessible when you start
+ HBase. After configuring your platform (see <xref
+ linkend="hbase.native.platform" />), you can make a symbolic link from HBase to the native Hadoop
+ libraries. This assumes the two software installs are colocated. For example, if your
+ 'platform' is Linux-amd64-64:
+ <programlisting language="bourne">$ cd $HBASE_HOME
+$ mkdir lib/native
+$ ln -s $HADOOP_HOME/lib/native lib/native/Linux-amd64-64</programlisting>
+ Use the compression tool to check that LZ4 is installed on all nodes. Start up (or restart)
+ HBase. Afterward, you can create and alter tables to enable LZ4 as a
+ compression codec:
+ <screen>
+hbase(main):003:0> <userinput>alter 'TestTable', {NAME => 'info', COMPRESSION => 'LZ4'}</userinput>
+ </screen>
+ </para>
+ </section>
+ <section
+ xml:id="snappy.compression.installation">
+ <title>Install Snappy Support</title>
+ <para>HBase does not ship with Snappy support because of licensing issues. You can install
+ Snappy binaries (for instance, by using <command>yum install snappy</command> on CentOS)
+ or build Snappy from source. After installing Snappy, search for the shared library,
+ which will be called <filename>libsnappy.so.X</filename> where X is a number. If you
+ built from source, copy the shared library to a known location on your system, such as
+ <filename>/opt/snappy/lib/</filename>.</para>
+ <para>In addition to the Snappy library, HBase also needs access to the Hadoop shared
+ library, which will be called something like <filename>libhadoop.so.X.Y</filename>,
+ where X and Y are both numbers. Make note of the location of the Hadoop library, or copy
+ it to the same location as the Snappy library.</para>
+ <note>
+ <para>The Snappy and Hadoop libraries need to be available on each node of your cluster.
+ See <xref
+ linkend="compression.test" /> to find out how to test that this is the case.</para>
+ <para>See <xref
+ linkend="hbase.regionserver.codecs" /> to configure your RegionServers to fail to
+ start if a given compressor is not available.</para>
+ </note>
+ <para>Each of these library locations needs to be added to the environment variable
+ <envar>HBASE_LIBRARY_PATH</envar> for the operating system user that runs HBase. You
+ need to restart the RegionServer for the changes to take effect.</para>
+ </section>
+
+
+ <section
+ xml:id="compression.test">
+ <title>CompressionTest</title>
+ <para>You can use the CompressionTest tool to verify that your compressor is available to
+ HBase:</para>
+ <screen language="bourne">
+ $ hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://<replaceable>host/path/to/hbase</replaceable> snappy
+ </screen>
+ </section>
+
+
+ <section
+ xml:id="hbase.regionserver.codecs">
+ <title>Enforce Compression Settings On a RegionServer</title>
+ <para>You can configure a RegionServer so that it will fail to start if compression is
+ configured incorrectly, by adding the option <code>hbase.regionserver.codecs</code> to
+ <filename>hbase-site.xml</filename> and setting its value to a comma-separated list
+ of codecs that need to be available. For example, if you set this property to
+ <literal>lzo,gz</literal>, the RegionServer would fail to start unless both compressors
+ were available. This would prevent a new server from being added to the cluster
+ without having codecs configured properly.</para>
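+ <para>For example, the following <filename>hbase-site.xml</filename> snippet enforces the
+ <literal>lzo,gz</literal> setting described above (the value is illustrative; list the codecs
+ your cluster actually requires):</para>
+ <programlisting language="xml"><![CDATA[
+<property>
+  <name>hbase.regionserver.codecs</name>
+  <value>lzo,gz</value>
+</property>
+]]></programlisting>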
+ </section>
+ </section>
+
+ <section
+ xml:id="changing.compression">
+ <title>Enable Compression On a ColumnFamily</title>
+ <para>To enable compression for a ColumnFamily, use an <code>alter</code> command. You do
+ not need to re-create the table or copy data. If you are changing codecs, be sure the old
+ codec is still available until all the old StoreFiles have been compacted.</para>
+ <example>
+ <title>Enabling Compression on a ColumnFamily of an Existing Table using HBase
+ Shell</title>
+ <screen><![CDATA[
+hbase> disable 'test'
+hbase> alter 'test', {NAME => 'cf', COMPRESSION => 'GZ'}
+hbase> enable 'test']]>
+ </screen>
+ </example>
+ <example>
+ <title>Creating a New Table with Compression On a ColumnFamily</title>
+ <screen><![CDATA[
+hbase> create 'test2', { NAME => 'cf2', COMPRESSION => 'SNAPPY' }
+ ]]></screen>
+ </example>
+ <example>
+ <title>Verifying a ColumnFamily's Compression Settings</title>
+ <screen><![CDATA[
+hbase> describe 'test'
+DESCRIPTION ENABLED
+ 'test', {NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE false
+ ', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0',
+ VERSIONS => '1', COMPRESSION => 'GZ', MIN_VERSIONS
+ => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'fa
+ lse', BLOCKSIZE => '65536', IN_MEMORY => 'false', B
+ LOCKCACHE => 'true'}
+1 row(s) in 0.1070 seconds
+ ]]></screen>
+ </example>
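+ <para>The same change can be made through the Java client API. The following is a minimal
+ sketch, assuming a running cluster and the same <literal>test</literal> table and
+ <literal>cf</literal> ColumnFamily used in the shell examples above.</para>
+ <programlisting language="java"><![CDATA[
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HColumnDescriptor;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.ConnectionFactory;
+import org.apache.hadoop.hbase.io.compress.Compression;
+
+public class EnableCompression {
+  public static void main(String[] args) throws Exception {
+    Configuration conf = HBaseConfiguration.create();
+    try (Connection connection = ConnectionFactory.createConnection(conf);
+         Admin admin = connection.getAdmin()) {
+      HColumnDescriptor cfDesc = new HColumnDescriptor("cf");
+      cfDesc.setCompressionType(Compression.Algorithm.GZ); // same effect as COMPRESSION => 'GZ'
+      // The change takes effect as StoreFiles are rewritten during compaction.
+      admin.modifyColumn(TableName.valueOf("test"), cfDesc);
+    }
+  }
+}
+]]></programlisting>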
+ </section>
+
+ <section>
+ <title>Testing Compression Performance</title>
+ <para>HBase includes a tool called LoadTestTool which provides mechanisms to test your
+ compression performance. You must specify either <literal>-write</literal> or
+ <literal>-update-read</literal> as your first parameter, and if you do not specify another
+ parameter, usage advice is printed for each option.</para>
+ <example>
+ <title><command>LoadTestTool</command> Usage</title>
+ <screen language="bourne"><![CDATA[
+$ bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -h
+usage: bin/hbase org.apache.hadoop.hbase.util.LoadTestTool <options>
+Options:
+ -batchupdate Whether to use batch as opposed to separate
+ updates for every column in a row
+ -bloom <arg> Bloom filter type, one of [NONE, ROW, ROWCOL]
+ -compression <arg> Compression type, one of [LZO, GZ, NONE, SNAPPY,
+ LZ4]
+ -data_block_encoding <arg> Encoding algorithm (e.g. prefix compression) to
+ use for data blocks in the test column family, one
+ of [NONE, PREFIX, DIFF, FAST_DIFF, PREFIX_TREE].
+ -encryption <arg> Enables transparent encryption on the test table,
+ one of [AES]
+ -generator <arg> The class which generates load for the tool. Any
+ args for this class can be passed as colon
+ separated after class name
+ -h,--help Show usage
+ -in_memory Tries to keep the HFiles of the CF inmemory as far
+ as possible. Not guaranteed that reads are always
+ served from inmemory
+ -init_only Initialize the test table only, don't do any
+ loading
+ -key_window <arg> The 'key window' to maintain between reads and
+ writes for concurrent write/read workload. The
+ default is 0.
+ -max_read_errors <arg> The maximum number of read errors to tolerate
+ before terminating all reader threads. The default
+ is 10.
+ -multiput Whether to use multi-puts as opposed to separate
+ puts for every column in a row
+ -num_keys <arg> The number of keys to read/write
+ -num_tables <arg> A positive integer number. When a number n is
+ speicfied, load test tool will load n table
+ parallely. -tn parameter value becomes table name
+ prefix. Each table name is in format
+ <tn>_1...<tn>_n
+ -read <arg> <verify_percent>[:<#threads=20>]
+ -regions_per_server <arg> A positive integer number. When a number n is
+ specified, load test tool will create the test
+ table with n regions per server
+ -skip_init Skip the initialization; assume test table already
+ exists
+ -start_key <arg> The first key to read/write (a 0-based index). The
+ default value is 0.
+ -tn <arg> The name of the table to read or write
+ -update <arg> <update_percent>[:<#threads=20>][:<#whether to
+ ignore nonce collisions=0>]
+ -write <arg> <avg_cols_per_key>:<avg_data_size>[:<#threads=20>]
+ -zk <arg> ZK quorum as comma-separated host names without
+ port numbers
+ -zk_root <arg> name of parent znode in zookeeper
+ ]]></screen>
+ </example>
+ <example>
+ <title>Example Usage of LoadTestTool</title>
+ <screen language="bourne">
+$ hbase org.apache.hadoop.hbase.util.LoadTestTool -write 1:10:100 -num_keys 1000000
+ -read 100:30 -num_tables 1 -data_block_encoding NONE -tn load_test_tool_NONE
+ </screen>
+ </example>
+ </section>
+ </section>
+
+ <section xml:id="data.block.encoding.enable">
+ <title>Enable Data Block Encoding</title>
+ <para>Codecs are built into HBase so no extra configuration is needed. Codecs are enabled on a
+ table by setting the <code>DATA_BLOCK_ENCODING</code> property. Disable the table before
+ altering its DATA_BLOCK_ENCODING setting. Following is an example using HBase Shell:</para>
+ <example>
+ <title>Enable Data Block Encoding On a Table</title>
+ <screen><![CDATA[
+hbase> disable 'test'
+hbase> alter 'test', { NAME => 'cf', DATA_BLOCK_ENCODING => 'FAST_DIFF' }
+Updating all regions with the new schema...
+0/1 regions updated.
+1/1 regions updated.
+Done.
+0 row(s) in 2.2820 seconds
+hbase> enable 'test'
+0 row(s) in 0.1580 seconds
+ ]]></screen>
+ </example>
+ <example>
+ <title>Verifying a ColumnFamily's Data Block Encoding</title>
+ <screen><![CDATA[
+hbase> describe 'test'
+DESCRIPTION ENABLED
+ 'test', {NAME => 'cf', DATA_BLOCK_ENCODING => 'FAST true
+ _DIFF', BLOOMFILTER => 'ROW', REPLICATION_SCOPE =>
+ '0', VERSIONS => '1', COMPRESSION => 'GZ', MIN_VERS
+ IONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS =
+ > 'false', BLOCKSIZE => '65536', IN_MEMORY => 'fals
+ e', BLOCKCACHE => 'true'}
+1 row(s) in 0.0650 seconds
+ ]]></screen>
+ </example>
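+ <para>As with compression, data block encoding can also be set through the Java client API.
+ This is a minimal sketch, assuming the same <literal>test</literal> table and
+ <literal>cf</literal> ColumnFamily used in the shell example above.</para>
+ <programlisting language="java"><![CDATA[
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HColumnDescriptor;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.ConnectionFactory;
+import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;
+
+public class EnableDataBlockEncoding {
+  public static void main(String[] args) throws Exception {
+    Configuration conf = HBaseConfiguration.create();
+    try (Connection connection = ConnectionFactory.createConnection(conf);
+         Admin admin = connection.getAdmin()) {
+      HColumnDescriptor cfDesc = new HColumnDescriptor("cf");
+      cfDesc.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF); // DATA_BLOCK_ENCODING => 'FAST_DIFF'
+      admin.modifyColumn(TableName.valueOf("test"), cfDesc);
+    }
+  }
+}
+]]></programlisting>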
+ </section>
+
+
+</appendix>
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/configuration.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/configuration.xml b/src/main/docbkx/configuration.xml
index 74b8e52..a0b7d11 100644
--- a/src/main/docbkx/configuration.xml
+++ b/src/main/docbkx/configuration.xml
@@ -925,8 +925,8 @@ stopping hbase...............</screen>
<!--presumes the pre-site target has put the hbase-default.xml at this location-->
<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
- href="../../../target/docbkx/hbase-default.xml">
- <xi:fallback>
+ href="hbase-default.xml">
+ <!--<xi:fallback>
<section
xml:id="hbase_default_configurations">
<title />
@@ -1007,7 +1007,7 @@ stopping hbase...............</screen>
</section>
</section>
</section>
- </xi:fallback>
+ </xi:fallback>-->
</xi:include>
</section>
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/customization-pdf.xsl
----------------------------------------------------------------------
diff --git a/src/main/docbkx/customization-pdf.xsl b/src/main/docbkx/customization-pdf.xsl
new file mode 100644
index 0000000..b21236f
--- /dev/null
+++ b/src/main/docbkx/customization-pdf.xsl
@@ -0,0 +1,129 @@
+<?xml version="1.0"?>
+<xsl:stylesheet
+ xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+ version="1.0">
+<!--
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+ <xsl:import href="urn:docbkx:stylesheet/docbook.xsl"/>
+ <xsl:import href="urn:docbkx:stylesheet/highlight.xsl"/>
+
+
+ <!--###################################################
+ Paper & Page Size
+ ################################################### -->
+
+ <!-- Paper type, no headers on blank pages, no double sided printing -->
+ <xsl:param name="paper.type" select="'USletter'"/>
+ <xsl:param name="double.sided">0</xsl:param>
+ <xsl:param name="headers.on.blank.pages">0</xsl:param>
+ <xsl:param name="footers.on.blank.pages">0</xsl:param>
+
+ <!-- Space between paper border and content (chaotic stuff, don't touch) -->
+ <xsl:param name="page.margin.top">5mm</xsl:param>
+ <xsl:param name="region.before.extent">10mm</xsl:param>
+ <xsl:param name="body.margin.top">10mm</xsl:param>
+
+ <xsl:param name="body.margin.bottom">15mm</xsl:param>
+ <xsl:param name="region.after.extent">10mm</xsl:param>
+ <xsl:param name="page.margin.bottom">0mm</xsl:param>
+
+ <xsl:param name="page.margin.outer">18mm</xsl:param>
+ <xsl:param name="page.margin.inner">18mm</xsl:param>
+
+ <!-- No intendation of Titles -->
+ <xsl:param name="title.margin.left">0pc</xsl:param>
+
+ <!--###################################################
+ Fonts & Styles
+ ################################################### -->
+
+ <!-- Left aligned text and no hyphenation -->
+ <xsl:param name="alignment">justify</xsl:param>
+ <xsl:param name="hyphenate">true</xsl:param>
+
+ <!-- Default Font size -->
+ <xsl:param name="body.font.master">11</xsl:param>
+ <xsl:param name="body.font.small">8</xsl:param>
+
+ <!-- Line height in body text -->
+ <xsl:param name="line-height">1.4</xsl:param>
+
+ <!-- Force line break in long URLs -->
+ <xsl:param name="ulink.hyphenate.chars">/&?</xsl:param>
+ <xsl:param name="ulink.hyphenate">​</xsl:param>
+
+ <!-- Monospaced fonts are smaller than regular text -->
+ <xsl:attribute-set name="monospace.properties">
+ <xsl:attribute name="font-family">
+ <xsl:value-of select="$monospace.font.family"/>
+ </xsl:attribute>
+ <xsl:attribute name="font-size">0.8em</xsl:attribute>
+ <xsl:attribute name="wrap-option">wrap</xsl:attribute>
+ <xsl:attribute name="hyphenate">true</xsl:attribute>
+ </xsl:attribute-set>
+
+
+ <!-- add page break after abstract block -->
+ <xsl:attribute-set name="abstract.properties">
+ <xsl:attribute name="break-after">page</xsl:attribute>
+ </xsl:attribute-set>
+
+ <!-- add page break after toc -->
+ <xsl:attribute-set name="toc.margin.properties">
+ <xsl:attribute name="break-after">page</xsl:attribute>
+ </xsl:attribute-set>
+
+ <!-- add page break after first level sections -->
+ <xsl:attribute-set name="section.level1.properties">
+ <xsl:attribute name="break-after">page</xsl:attribute>
+ </xsl:attribute-set>
+
+ <!-- Show only Sections up to level 3 in the TOCs -->
+ <xsl:param name="toc.section.depth">2</xsl:param>
+
+ <!-- Dot and Whitespace as separator in TOC between Label and Title-->
+ <xsl:param name="autotoc.label.separator" select="'. '"/>
+
+ <!-- program listings / examples formatting -->
+ <xsl:attribute-set name="monospace.verbatim.properties">
+ <xsl:attribute name="font-family">Courier</xsl:attribute>
+ <xsl:attribute name="font-size">8pt</xsl:attribute>
+ <xsl:attribute name="keep-together.within-column">always</xsl:attribute>
+ </xsl:attribute-set>
+
+ <xsl:param name="shade.verbatim" select="1" />
+
+ <xsl:attribute-set name="shade.verbatim.style">
+ <xsl:attribute name="background-color">#E8E8E8</xsl:attribute>
+ <xsl:attribute name="border-width">0.5pt</xsl:attribute>
+ <xsl:attribute name="border-style">solid</xsl:attribute>
+ <xsl:attribute name="border-color">#575757</xsl:attribute>
+ <xsl:attribute name="padding">3pt</xsl:attribute>
+ </xsl:attribute-set>
+
+ <!-- callouts customization -->
+ <xsl:param name="callout.unicode" select="1" />
+ <xsl:param name="callout.graphics" select="0" />
+ <xsl:param name="callout.defaultcolumn">90</xsl:param>
+
+ <!-- Syntax Highlighting -->
+
+
+</xsl:stylesheet>
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/datamodel.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/datamodel.xml b/src/main/docbkx/datamodel.xml
new file mode 100644
index 0000000..bdf697d
--- /dev/null
+++ b/src/main/docbkx/datamodel.xml
@@ -0,0 +1,865 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<chapter
+ xml:id="datamodel"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+
+ <title>Data Model</title>
+ <para>In HBase, data is stored in tables, which have rows and columns. This is a terminology
+ overlap with relational databases (RDBMSs), but this is not a helpful analogy. Instead, it can
+ be helpful to think of an HBase table as a multi-dimensional map.</para>
+ <variablelist>
+ <title>HBase Data Model Terminology</title>
+ <varlistentry>
+ <term>Table</term>
+ <listitem>
+ <para>An HBase table consists of multiple rows.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Row</term>
+ <listitem>
+ <para>A row in HBase consists of a row key and one or more columns with values associated
+ with them. Rows are sorted alphabetically by the row key as they are stored. For this
+ reason, the design of the row key is very important. The goal is to store data in such a
+ way that related rows are near each other. A common row key pattern is a website domain.
+ If your row keys are domains, you should probably store them in reverse (org.apache.www,
+ org.apache.mail, org.apache.jira). This way, all of the Apache domains are near each
+ other in the table, rather than being spread out based on the first letter of the
+ subdomain.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Column</term>
+ <listitem>
+ <para>A column in HBase consists of a column family and a column qualifier, which are
+ delimited by a <literal>:</literal> (colon) character.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Column Family</term>
+ <listitem>
+ <para>Column families physically colocate a set of columns and their values, often for
+ performance reasons. Each column family has a set of storage properties, such as whether
+ its values should be cached in memory, how its data is compressed or its row keys are
+ encoded, and others. Each row in a table has the same column
+ families, though a given row might not store anything in a given column family.</para>
+ <para>Column families are specified when you create your table, and influence the way your
+ data is stored in the underlying filesystem. Therefore, the column families should be
+ considered carefully during schema design.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Column Qualifier</term>
+ <listitem>
+ <para>A column qualifier is added to a column family to provide the index for a given
+ piece of data. Given a column family <literal>content</literal>, a column qualifier
+ might be <literal>content:html</literal>, and another might be
+ <literal>content:pdf</literal>. Though column families are fixed at table creation,
+ column qualifiers are mutable and may differ greatly between rows.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Cell</term>
+ <listitem>
+ <para>A cell is a combination of row, column family, and column qualifier, and contains a
+ value and a timestamp, which represents the value's version.</para>
+ <para>A cell's value is an uninterpreted array of bytes.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Timestamp</term>
+ <listitem>
+ <para>A timestamp is written alongside each value, and is the identifier for a given
+ version of a value. By default, the timestamp represents the time on the RegionServer
+ when the data was written, but you can specify a different timestamp value when you put
+ data into the cell (see the sketch following this list).</para>
+ <caution>
+ <para>Direct manipulation of timestamps is an advanced feature which is only exposed for
+ special cases that are deeply integrated with HBase, and is discouraged in general.
+ Encoding a timestamp at the application level is the preferred pattern.</para>
+ </caution>
+ <para>You can specify the maximum number of versions of a value that HBase retains, per column
+ family. When the maximum number of versions is reached, the oldest versions are
+ eventually deleted. By default, only the newest version is kept.</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
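+ <para>The following is a minimal sketch of supplying an explicit timestamp on a write, as
+ mentioned in the Timestamp entry above. The table, family, qualifier, and value names are
+ hypothetical, and an open <classname>Connection</classname> is assumed.</para>
+ <programlisting language="java"><![CDATA[
+// Writing a cell with an application-supplied timestamp (use with caution; see above).
+Table table = connection.getTable(TableName.valueOf("mytable"));
+long explicitTimestamp = 1419206400000L; // an example epoch-millisecond value
+Put put = new Put(Bytes.toBytes("row1"));
+put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("attr"), explicitTimestamp, Bytes.toBytes("value"));
+table.put(put);
+]]></programlisting>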
+
+ <section
+ xml:id="conceptual.view">
+ <title>Conceptual View</title>
+ <para>You can read a clear and accessible explanation of the HBase data model in the blog post <link
+ xlink:href="http://jimbojw.com/wiki/index.php?title=Understanding_Hbase_and_BigTable">Understanding
+ HBase and BigTable</link> by Jim R. Wilson. Another good explanation is available in the
+ PDF <link
+ xlink:href="http://0b4af6cdc2f0c5998459-c0245c5c937c5dedcca3f1764ecc9b2f.r43.cf2.rackcdn.com/9353-login1210_khurana.pdf">Introduction
+ to Basic Schema Design</link> by Amandeep Khurana. It may help to read different
+ perspectives to get a solid understanding of HBase schema design. The linked articles cover
+ the same ground as the information in this section.</para>
+ <para> The following example is a slightly modified form of the one on page 2 of the <link
+ xlink:href="http://research.google.com/archive/bigtable.html">BigTable</link> paper. There
+ is a table called <varname>webtable</varname> that contains two rows
+ (<literal>com.cnn.www</literal>
+ and <literal>com.example.www</literal>) and three column families named
+ <varname>contents</varname>, <varname>anchor</varname>, and <varname>people</varname>. In
+ this example, for the first row (<literal>com.cnn.www</literal>),
+ <varname>anchor</varname> contains two columns (<varname>anchor:cnnsi.com</varname>,
+ <varname>anchor:my.look.ca</varname>) and <varname>contents</varname> contains one column
+ (<varname>contents:html</varname>). This example contains 5 versions of the row with the
+ row key <literal>com.cnn.www</literal>, and one version of the row with the row key
+ <literal>com.example.www</literal>. The <varname>contents:html</varname> column qualifier contains the entire
+ HTML of a given website. Qualifiers of the <varname>anchor</varname> column family each
+ contain the external site which links to the site represented by the row, along with the
+ text it used in the anchor of its link. The <varname>people</varname> column family represents
+ people associated with the site.
+ </para>
+ <note>
+ <title>Column Names</title>
+ <para> By convention, a column name is made of its column family prefix and a
+ <emphasis>qualifier</emphasis>. For example, the column
+ <emphasis>contents:html</emphasis> is made up of the column family
+ <varname>contents</varname> and the <varname>html</varname> qualifier. The colon
+ character (<literal>:</literal>) delimits the column family from the column family
+ <emphasis>qualifier</emphasis>. </para>
+ </note>
+ <table
+ frame="all">
+ <title>Table <varname>webtable</varname></title>
+ <tgroup
+ cols="5"
+ align="left"
+ colsep="1"
+ rowsep="1">
+ <colspec
+ colname="c1" />
+ <colspec
+ colname="c2" />
+ <colspec
+ colname="c3" />
+ <colspec
+ colname="c4" />
+ <colspec
+ colname="c5" />
+ <thead>
+ <row>
+ <entry>Row Key</entry>
+ <entry>Time Stamp</entry>
+ <entry>ColumnFamily <varname>contents</varname></entry>
+ <entry>ColumnFamily <varname>anchor</varname></entry>
+ <entry>ColumnFamily <varname>people</varname></entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>"com.cnn.www"</entry>
+ <entry>t9</entry>
+ <entry />
+ <entry><varname>anchor:cnnsi.com</varname> = "CNN"</entry>
+ <entry />
+ </row>
+ <row>
+ <entry>"com.cnn.www"</entry>
+ <entry>t8</entry>
+ <entry />
+ <entry><varname>anchor:my.look.ca</varname> = "CNN.com"</entry>
+ <entry />
+ </row>
+ <row>
+ <entry>"com.cnn.www"</entry>
+ <entry>t6</entry>
+ <entry><varname>contents:html</varname> = "<html>..."</entry>
+ <entry />
+ <entry />
+ </row>
+ <row>
+ <entry>"com.cnn.www"</entry>
+ <entry>t5</entry>
+ <entry><varname>contents:html</varname> = "<html>..."</entry>
+ <entry />
+ <entry />
+ </row>
+ <row>
+ <entry>"com.cnn.www"</entry>
+ <entry>t3</entry>
+ <entry><varname>contents:html</varname> = "<html>..."</entry>
+ <entry />
+ <entry />
+ </row>
+ <row>
+ <entry>"com.example.www"</entry>
+ <entry>t5</entry>
+ <entry><varname>contents:html</varname> = "<html>..."</entry>
+ <entry></entry>
+ <entry>people:author = "John Doe"</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ <para>Cells in this table that appear to be empty do not take space, or even exist, in
+ HBase. This is what makes HBase "sparse." A tabular view is not the only possible way to
+ look at data in HBase, or even the most accurate. The following represents the same
+ information as a multi-dimensional map. This is only a mock-up for illustrative
+ purposes and may not be strictly accurate.</para>
+ <programlisting><![CDATA[
+{
+ "com.cnn.www": {
+ contents: {
+ t6: contents:html: "<html>..."
+ t5: contents:html: "<html>..."
+ t3: contents:html: "<html>..."
+ }
+ anchor: {
+ t9: anchor:cnnsi.com = "CNN"
+ t8: anchor:my.look.ca = "CNN.com"
+ }
+ people: {}
+ }
+ "com.example.www": {
+ contents: {
+ t5: contents:html: "<html>..."
+ }
+ anchor: {}
+ people: {
+ t5: people:author: "John Doe"
+ }
+ }
+}
+ ]]></programlisting>
+
+ </section>
+ <section
+ xml:id="physical.view">
+ <title>Physical View</title>
+ <para> Although at a conceptual level tables may be viewed as a sparse set of rows, they are
+ physically stored by column family. A new column qualifier (column_family:column_qualifier)
+ can be added to an existing column family at any time.</para>
+ <table
+ frame="all">
+ <title>ColumnFamily <varname>anchor</varname></title>
+ <tgroup
+ cols="3"
+ align="left"
+ colsep="1"
+ rowsep="1">
+ <colspec
+ colname="c1" />
+ <colspec
+ colname="c2" />
+ <colspec
+ colname="c3" />
+ <thead>
+ <row>
+ <entry>Row Key</entry>
+ <entry>Time Stamp</entry>
+ <entry>Column Family <varname>anchor</varname></entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>"com.cnn.www"</entry>
+ <entry>t9</entry>
+ <entry><varname>anchor:cnnsi.com</varname> = "CNN"</entry>
+ </row>
+ <row>
+ <entry>"com.cnn.www"</entry>
+ <entry>t8</entry>
+ <entry><varname>anchor:my.look.ca</varname> = "CNN.com"</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ <table
+ frame="all">
+ <title>ColumnFamily <varname>contents</varname></title>
+ <tgroup
+ cols="3"
+ align="left"
+ colsep="1"
+ rowsep="1">
+ <colspec
+ colname="c1" />
+ <colspec
+ colname="c2" />
+ <colspec
+ colname="c3" />
+ <thead>
+ <row>
+ <entry>Row Key</entry>
+ <entry>Time Stamp</entry>
+ <entry>ColumnFamily "contents:"</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>"com.cnn.www"</entry>
+ <entry>t6</entry>
+ <entry><varname>contents:html</varname> = "<html>..."</entry>
+ </row>
+ <row>
+ <entry>"com.cnn.www"</entry>
+ <entry>t5</entry>
+ <entry><varname>contents:html</varname> = "<html>..."</entry>
+ </row>
+ <row>
+ <entry>"com.cnn.www"</entry>
+ <entry>t3</entry>
+ <entry><varname>contents:html</varname> = "<html>..."</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ <para>The empty cells shown in the
+ conceptual view are not stored at all.
+ Thus a request for the value of the <varname>contents:html</varname> column at time stamp
+ <literal>t8</literal> would return no value. Similarly, a request for an
+ <varname>anchor:my.look.ca</varname> value at time stamp <literal>t9</literal> would
+ return no value. However, if no timestamp is supplied, the most recent value for a
+ particular column would be returned. Given multiple versions, the most recent is also the
+ first one found, since timestamps
+ are stored in descending order. Thus a request for the values of all columns in the row
+ <varname>com.cnn.www</varname> if no timestamp is specified would be: the value of
+ <varname>contents:html</varname> from timestamp <literal>t6</literal>, the value of
+ <varname>anchor:cnnsi.com</varname> from timestamp <literal>t9</literal>, the value of
+ <varname>anchor:my.look.ca</varname> from timestamp <literal>t8</literal>. </para>
+ <para>For more information about the internals of how Apache HBase stores data, see <xref
+ linkend="regions.arch" />. </para>
+ </section>
+
+ <section
+ xml:id="namespace">
+ <title>Namespace</title>
+ <para> A namespace is a logical grouping of tables analogous to a database in relational
+ database systems. This abstraction lays the groundwork for upcoming multi-tenancy related
+ features: <itemizedlist>
+ <listitem>
+ <para>Quota Management (HBASE-8410) - Restrict the amount of resources (i.e., regions,
+ tables) a namespace can consume.</para>
+ </listitem>
+ <listitem>
+ <para>Namespace Security Administration (HBASE-9206) - provide another level of security
+ administration for tenants.</para>
+ </listitem>
+ <listitem>
+ <para>Region server groups (HBASE-6721) - A namespace/table can be pinned onto a subset
+ of RegionServers, thus guaranteeing a coarse level of isolation.</para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <section
+ xml:id="namespace_creation">
+ <title>Namespace management</title>
+ <para> A namespace can be created, removed or altered. Namespace membership is determined
+ during table creation by specifying a fully-qualified table name of the form:</para>
+
+ <programlisting language="xml"><![CDATA[<table namespace>:<table qualifier>]]></programlisting>
+
+
+ <example>
+ <title>Examples</title>
+
+ <programlisting language="bourne">
+#Create a namespace
+create_namespace 'my_ns'
+ </programlisting>
+ <programlisting language="bourne">
+#create my_table in my_ns namespace
+create 'my_ns:my_table', 'fam'
+ </programlisting>
+ <programlisting language="bourne">
+#drop namespace
+drop_namespace 'my_ns'
+ </programlisting>
+ <programlisting language="bourne">
+#alter namespace
+alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
+ </programlisting>
+ </example>
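+ <para>Namespaces can also be managed programmatically. The following is a minimal sketch
+ using the Java <classname>Admin</classname> API, assuming an open
+ <classname>Connection</classname>; the namespace and table names mirror the shell examples
+ above.</para>
+ <programlisting language="java"><![CDATA[
+Admin admin = connection.getAdmin();
+// Create a namespace, then create a table inside it.
+admin.createNamespace(NamespaceDescriptor.create("my_ns").build());
+HTableDescriptor tableDesc = new HTableDescriptor(TableName.valueOf("my_ns:my_table"));
+tableDesc.addFamily(new HColumnDescriptor("fam"));
+admin.createTable(tableDesc);
+]]></programlisting>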
+ </section>
+ <section
+ xml:id="namespace_special">
+ <title>Predefined namespaces</title>
+ <para> There are two predefined special namespaces: </para>
+ <itemizedlist>
+ <listitem>
+ <para>hbase - system namespace, used to contain hbase internal tables</para>
+ </listitem>
+ <listitem>
+ <para>default - tables with no explicit specified namespace will automatically fall into
+ this namespace.</para>
+ </listitem>
+ </itemizedlist>
+ <example>
+ <title>Examples</title>
+
+ <programlisting language="bourne">
+#namespace=foo and table qualifier=bar
+create 'foo:bar', 'fam'
+
+#namespace=default and table qualifier=bar
+create 'bar', 'fam'
+</programlisting>
+ </example>
+ </section>
+ </section>
+
+ <section
+ xml:id="table">
+ <title>Table</title>
+ <para> Tables are declared up front at schema definition time. </para>
+ </section>
+
+ <section
+ xml:id="row">
+ <title>Row</title>
+ <para>Row keys are uninterpreted bytes. Rows are lexicographically sorted, with the lowest
+ order appearing first in a table. The empty byte array is used to denote both the start and
+ end of a table's namespace.</para>
+ </section>
+
+ <section
+ xml:id="columnfamily">
+ <title>Column Family<indexterm><primary>Column Family</primary></indexterm></title>
+ <para> Columns in Apache HBase are grouped into <emphasis>column families</emphasis>. All
+ column members of a column family have the same prefix. For example, the columns
+ <emphasis>courses:history</emphasis> and <emphasis>courses:math</emphasis> are both
+ members of the <emphasis>courses</emphasis> column family. The colon character
+ (<literal>:</literal>) delimits the column family from the <indexterm><primary>column
+ family qualifier</primary><secondary>Column Family Qualifier</secondary></indexterm>.
+ The column family prefix must be composed of <emphasis>printable</emphasis> characters. The
+ qualifying tail, the column family <emphasis>qualifier</emphasis>, can be made of any
+ arbitrary bytes. Column families must be declared up front at schema definition time, whereas
+ columns do not need to be defined at schema time but can be conjured on the fly while the
+ table is up and running.</para>
+ <para>Physically, all column family members are stored together on the filesystem. Because
+ tunings and storage specifications are done at the column family level, it is advised that
+ all column family members have the same general access pattern and size
+ characteristics.</para>
+
+ </section>
+ <section
+ xml:id="cells">
+ <title>Cells<indexterm><primary>Cells</primary></indexterm></title>
+ <para>A <emphasis>{row, column, version} </emphasis>tuple exactly specifies a
+ <literal>cell</literal> in HBase. Cell content is uninterpreted bytes.</para>
+ </section>
+ <section
+ xml:id="data_model_operations">
+ <title>Data Model Operations</title>
+ <para>The four primary data model operations are Get, Put, Scan, and Delete. Operations are
+ applied via <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html">Table</link>
+ instances.
+ </para>
+ <section
+ xml:id="get">
+ <title>Get</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</link>
+ returns attributes for a specified row. Gets are executed via <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#get(org.apache.hadoop.hbase.client.Get)">
+ Table.get</link>. </para>
+ </section>
+ <section
+ xml:id="put">
+ <title>Put</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html">Put</link>
+ either adds new rows to a table (if the key is new) or can update existing rows (if the
+ key already exists). Puts are executed via <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#put(org.apache.hadoop.hbase.client.Put)">
+ Table.put</link> (writeBuffer) or <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch(java.util.List, java.lang.Object[])">
+ Table.batch</link> (non-writeBuffer). </para>
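+ <para>As a minimal sketch of the batch path (the row keys and values below are illustrative
+ assumptions, and <code>table</code> is an existing Table instance):</para>
+<programlisting language="java">
+public static final byte[] CF = "cf".getBytes();
+public static final byte[] ATTR = "attr".getBytes();
+...
+List<Row> actions = new ArrayList<Row>(); // Put extends Row
+for (int i = 1; i <= 3; i++) {
+  Put put = new Put(Bytes.toBytes("row" + i));
+  put.add(CF, ATTR, Bytes.toBytes("value" + i));
+  actions.add(put);
+}
+Object[] results = new Object[actions.size()]; // one result slot per action
+table.batch(actions, results);
+</programlisting>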
+ </section>
+ <section
+ xml:id="scan">
+ <title>Scans</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link>
+ allows iteration over multiple rows for specified attributes. </para>
+ <para>The following is an example of a Scan on a Table instance. Assume that a table is
+ populated with rows with keys "row1", "row2", "row3", and then another set of rows with
+ the keys "abc1", "abc2", and "abc3". The following example shows how to set a Scan
+ instance to return the rows beginning with "row".</para>
+<programlisting language="java">
+public static final byte[] CF = "cf".getBytes();
+public static final byte[] ATTR = "attr".getBytes();
+...
+
+Table table = ... // instantiate a Table instance
+
+Scan scan = new Scan();
+scan.addColumn(CF, ATTR);
+scan.setRowPrefixFilter(Bytes.toBytes("row"));
+ResultScanner rs = table.getScanner(scan);
+try {
+  for (Result r = rs.next(); r != null; r = rs.next()) {
+    // process result...
+  }
+} finally {
+  rs.close(); // always close the ResultScanner!
+}
+</programlisting>
+ <para>Note that generally the easiest way to specify a stop point for a scan is by
+ using the <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/InclusiveStopFilter.html">InclusiveStopFilter</link>
+ class. </para>
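+ <para>As a minimal sketch (reusing the assumed row keys from the example above), a scan
+ that stops at, and includes, the row "abc2" might look like the following:</para>
+<programlisting language="java">
+Scan scan = new Scan();
+scan.setStartRow(Bytes.toBytes("abc1"));
+scan.setFilter(new InclusiveStopFilter(Bytes.toBytes("abc2")));
+ResultScanner rs = table.getScanner(scan); // close the scanner when done, as above
+</programlisting>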
+ </section>
+ <section
+ xml:id="delete">
+ <title>Delete</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Delete.html">Delete</link>
+ removes a row from a table. Deletes are executed via <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete(org.apache.hadoop.hbase.client.Delete)">
+ Table.delete</link>. </para>
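+ <para>As a minimal sketch (the row key is an illustrative assumption, and
+ <code>table</code> is an existing Table instance):</para>
+<programlisting language="java">
+Delete delete = new Delete(Bytes.toBytes("row1"));
+table.delete(delete); // marks the whole row as deleted
+</programlisting>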
+ <para>HBase does not modify data in place, and so deletes are handled by creating new
+ markers called <emphasis>tombstones</emphasis>. These tombstones, along with the dead
+ values, are cleaned up on major compactions. </para>
+ <para>See <xref
+ linkend="version.delete" /> for more information on deleting versions of columns, and
+ see <xref
+ linkend="compaction" /> for more information on compactions. </para>
+
+ </section>
+
+ </section>
+
+
+ <section
+ xml:id="versions">
+ <title>Versions<indexterm><primary>Versions</primary></indexterm></title>
+
+ <para>A <emphasis>{row, column, version}</emphasis> tuple exactly specifies a
+ <literal>cell</literal> in HBase. It's possible to have an unbounded number of cells where
+ the row and column are the same but the cell address differs only in its version
+ dimension.</para>
+
+ <para>While rows and column keys are expressed as bytes, the version is specified using a long
+ integer. Typically this long contains time instances such as those returned by
+ <code>java.util.Date.getTime()</code> or <code>System.currentTimeMillis()</code>, that is:
+ <quote>the difference, measured in milliseconds, between the current time and midnight,
+ January 1, 1970 UTC</quote>.</para>
+
+ <para>The HBase version dimension is stored in decreasing order, so that when reading from a
+ store file, the most recent values are found first.</para>
+
+ <para>There is a lot of confusion over the semantics of <literal>cell</literal> versions in
+ HBase. In particular:</para>
+ <itemizedlist>
+ <listitem>
+ <para>If multiple writes to a cell have the same version, only the last written is
+ fetchable.</para>
+ </listitem>
+
+ <listitem>
+ <para>It is OK to write cells in a non-increasing version order.</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>Below we describe how the version dimension in HBase currently works. See <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-2406">HBASE-2406</link> for
+ discussion of HBase versions. <link
+ xlink:href="http://outerthought.org/blog/417-ot.html">Bending time in HBase</link>
+ makes for a good read on the version, or time, dimension in HBase. It has more detail on
+ versioning than is provided here. As of this writing, the limitation
+ <emphasis>Overwriting values at existing timestamps</emphasis> mentioned in the
+ article no longer holds in HBase. This section is basically a synopsis of this article
+ by Bruno Dumon.</para>
+
+ <section xml:id="specify.number.of.versions">
+ <title>Specifying the Number of Versions to Store</title>
+ <para>The maximum number of versions to store for a given column is part of the column
+ schema, and is specified at table creation or via an <command>alter</command> command. The
+ default is given by <code>HColumnDescriptor.DEFAULT_VERSIONS</code>: prior to HBase 0.96 it
+ was <literal>3</literal>, but in 0.96 and newer it has been changed to
+ <literal>1</literal>.</para>
+ <example>
+ <title>Modify the Maximum Number of Versions for a Column</title>
+ <para>This example uses HBase Shell to keep a maximum of 5 versions of column
+ <code>f1</code>. You could also use <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html"
+ >HColumnDescriptor</link>.</para>
+ <screen><![CDATA[hbase> alter 't1', NAME => 'f1', VERSIONS => 5]]></screen>
+ </example>
+ <example>
+ <title>Modify the Minimum Number of Versions for a Column</title>
+ <para>You can also specify the minimum number of versions to store. By default, this is
+ set to 0, which means the feature is disabled. The following example sets the minimum
+ number of versions on field <code>f1</code> to <literal>2</literal>, via HBase Shell.
+ You could also use <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html"
+ >HColumnDescriptor</link>.</para>
+ <screen><![CDATA[hbase> alter 't1', NAME => 'f1', MIN_VERSIONS => 2]]></screen>
+ </example>
+ <para>Starting with HBase 0.98.2, you can specify a global default for the maximum number of
+ versions kept for all newly-created columns, by setting
+ <option>hbase.column.max.version</option> in <filename>hbase-site.xml</filename>. See
+ <xref linkend="hbase.column.max.version"/>.</para>
+ </section>
+
+ <section
+ xml:id="versions.ops">
+ <title>Versions and HBase Operations</title>
+
+ <para>In this section we look at the behavior of the version dimension for each of the core
+ HBase operations.</para>
+
+ <section>
+ <title>Get/Scan</title>
+
+ <para>Gets are implemented on top of Scans. The below discussion of <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</link>
+ applies equally to <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scans</link>.</para>
+
+ <para>By default, i.e. if you specify no explicit version, when doing a
+ <literal>get</literal>, the cell whose version has the largest value is returned
+ (which may or may not be the latest one written, see later). The default behavior can be
+ modified in the following ways:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>to return more than one version, see <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setMaxVersions()">Get.setMaxVersions()</link></para>
+ </listitem>
+
+ <listitem>
+ <para>to return versions other than the latest, see <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setTimeRange(long, long)">Get.setTimeRange()</link></para>
+
+ <para>To retrieve the latest version that is less than or equal to a given value, thus
+ giving the 'latest' state of the record at a certain point in time, use a range
+ from 0 to the desired version and set the max versions to 1, as shown in the sketch
+ after this list.</para>
+ </listitem>
+ </itemizedlist>
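+ <para>As a minimal sketch of the point-in-time technique just described (the row key and
+ timestamp are illustrative assumptions):</para>
+<programlisting language="java">
+long t = 1400000000000L; // illustrative point in time, in milliseconds
+Get get = new Get(Bytes.toBytes("row1"));
+get.setTimeRange(0, t + 1); // the upper bound is exclusive, so add 1 to include t
+get.setMaxVersions(1); // return only the newest version within the range
+Result r = table.get(get);
+</programlisting>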
+
+ </section>
+ <section
+ xml:id="default_get_example">
+ <title>Default Get Example</title>
+ <para>The following Get will retrieve only the current version of the row.</para>
+ <programlisting language="java">
+public static final byte[] CF = "cf".getBytes();
+public static final byte[] ATTR = "attr".getBytes();
+...
+Get get = new Get(Bytes.toBytes("row1"));
+Result r = table.get(get);
+byte[] b = r.getValue(CF, ATTR); // returns current version of value
+</programlisting>
+ </section>
+ <section
+ xml:id="versioned_get_example">
+ <title>Versioned Get Example</title>
+ <para>The following Get will return the last 3 versions of the row.</para>
+ <programlisting language="java">
+public static final byte[] CF = "cf".getBytes();
+public static final byte[] ATTR = "attr".getBytes();
+...
+Get get = new Get(Bytes.toBytes("row1"));
+get.setMaxVersions(3); // will return last 3 versions of row
+Result r = table.get(get);
+byte[] b = r.getValue(CF, ATTR); // returns current version of value
+List<KeyValue> kv = r.getColumn(CF, ATTR); // returns all versions of this column
+</programlisting>
+ </section>
+
+ <section>
+ <title>Put</title>
+
+ <para>Doing a put always creates a new version of a <literal>cell</literal>, at a certain
+ timestamp. By default the system uses the server's <literal>currentTimeMillis</literal>,
+ but you can specify the version (= the long integer) yourself, on a per-column level.
+ This means you could assign a time in the past or the future, or use the long value for
+ non-time purposes.</para>
+
+ <para>To overwrite an existing value, do a put at exactly the same row, column, and
+ version as that of the cell you would overshadow.</para>
+ <section
+ xml:id="implicit_version_example">
+ <title>Implicit Version Example</title>
+ <para>The following Put will be implicitly versioned by HBase with the current
+ time.</para>
+ <programlisting language="java">
+public static final byte[] CF = "cf".getBytes();
+public static final byte[] ATTR = "attr".getBytes();
+...
+Put put = new Put(Bytes.toBytes(row));
+put.add(CF, ATTR, Bytes.toBytes(data));
+table.put(put);
+</programlisting>
+ </section>
+ <section
+ xml:id="explicit_version_example">
+ <title>Explicit Version Example</title>
+ <para>The following Put has the version timestamp explicitly set.</para>
+ <programlisting language="java">
+public static final byte[] CF = "cf".getBytes();
+public static final byte[] ATTR = "attr".getBytes();
+...
+Put put = new Put(Bytes.toBytes(row));
+long explicitTimeInMs = 555; // just an example
+put.add(CF, ATTR, explicitTimeInMs, Bytes.toBytes(data));
+table.put(put);
+</programlisting>
+ <para>Caution: the version timestamp is used internally by HBase for things like time-to-live
+ calculations. It's usually best to avoid setting this timestamp yourself. Prefer using
+ a separate timestamp attribute of the row, or having the timestamp as part of the rowkey,
+ or both. </para>
+ </section>
+
+ </section>
+
+ <section
+ xml:id="version.delete">
+ <title>Delete</title>
+
+ <para>There are three different types of internal delete markers, listed below; a sketch
+ mapping them onto the client API follows the list. See Lars Hofhansl's blog
+ for discussion of his attempt at adding another, <link
+ xlink:href="http://hadoop-hbase.blogspot.com/2012/01/scanning-in-hbase.html">Scanning
+ in HBase: Prefix Delete Marker</link>. </para>
+ <itemizedlist>
+ <listitem>
+ <para>Delete: for a specific version of a column.</para>
+ </listitem>
+ <listitem>
+ <para>Delete column: for all versions of a column.</para>
+ </listitem>
+ <listitem>
+ <para>Delete family: for all columns of a particular ColumnFamily.</para>
+ </listitem>
+ </itemizedlist>
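+ <para>As a minimal sketch, these marker types map onto the client API roughly as follows
+ (method names assume the 1.0-era Delete API; the row, column, and timestamp are
+ illustrative):</para>
+<programlisting language="java">
+public static final byte[] CF = "cf".getBytes();
+public static final byte[] ATTR = "attr".getBytes();
+...
+long ts = 555L; // illustrative version timestamp
+Delete delete = new Delete(Bytes.toBytes("row1"));
+delete.addColumn(CF, ATTR, ts); // Delete: one specific version of a column
+delete.addColumns(CF, ATTR); // Delete column: all versions of a column
+delete.addFamily(CF); // Delete family: all columns of the family
+table.delete(delete);
+</programlisting>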
+ <para>When deleting an entire row, HBase will internally create a tombstone for each
+ ColumnFamily (i.e., not each individual column). </para>
+ <para>Deletes work by creating <emphasis>tombstone</emphasis> markers. For example, let's
+ suppose we want to delete a row. For this you can specify a version, or else by default
+ the <literal>currentTimeMillis</literal> is used. What this means is <quote>delete all
+ cells where the version is less than or equal to this version</quote>. HBase never
+ modifies data in place, so for example a delete will not immediately delete (or mark as
+ deleted) the entries in the storage file that correspond to the delete condition.
+ Rather, a so-called <emphasis>tombstone</emphasis> is written, which will mask the
+ deleted values. When HBase does a major compaction, the tombstones are processed to
+ actually remove the dead values, together with the tombstones themselves. If the version
+ you specified when deleting a row is larger than the version of any value in the row,
+ then you can consider the complete row to be deleted.</para>
+ <para>For an informative discussion on how deletes and versioning interact, see the thread <link
+ xlink:href="http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/28421">Put w/
+ timestamp -> Deleteall -> Put w/ timestamp fails</link> up on the user mailing
+ list.</para>
+ <para>Also see <xref
+ linkend="keyvalue" /> for more information on the internal KeyValue format. </para>
+ <para>Delete markers are purged during the next major compaction of the store, unless the
+ <option>KEEP_DELETED_CELLS</option> option is set in the column family. To keep the
+ deletes for a configurable amount of time, you can set the delete TTL via the
+ <option>hbase.hstore.time.to.purge.deletes</option> property in
+ <filename>hbase-site.xml</filename>. If
+ <option>hbase.hstore.time.to.purge.deletes</option> is not set, or set to 0, all
+ delete markers, including those with timestamps in the future, are purged during the
+ next major compaction. Otherwise, a delete marker with a timestamp in the future is kept
+ until the major compaction which occurs after the time represented by the marker's
+ timestamp plus the value of <option>hbase.hstore.time.to.purge.deletes</option>, in
+ milliseconds. </para>
+ <note>
+ <para>This behavior fixes an unexpected change that was introduced in
+ HBase 0.94. See <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-10118">HBASE-10118</link>.
+ The fix has been backported to HBase 0.94 and newer branches.</para>
+ </note>
+ </section>
+ </section>
+
+ <section>
+ <title>Current Limitations</title>
+
+ <section>
+ <title>Deletes mask Puts</title>
+
+ <para>Deletes mask puts, even puts that happened after the delete
+ was entered. See <link xlink:href="https://issues.apache.org/jira/browse/HBASE-2256"
+ >HBASE-2256</link>. Remember that a delete writes a tombstone, which only
+ disappears after the next major compaction has run. Suppose you do
+ a delete of everything <= T. After this you do a new put with a
+ timestamp <= T. This put, even if it happened after the delete,
+ will be masked by the delete tombstone. Performing the put will not
+ fail, but when you do a get you will notice the put had no
+ effect. It will start working again after the major compaction has
+ run. These issues should not be a problem if you use
+ always-increasing versions for new puts to a row. But they can occur
+ even if you do not care about time: just do delete and put
+ immediately after each other, and there is some chance they happen
+ within the same millisecond.</para>
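+ <para>A minimal sketch of this pitfall, with an illustrative timestamp <code>t</code>
+ (<code>CF</code> and <code>ATTR</code> declared as in the earlier examples):</para>
+<programlisting language="java">
+long t = 1400000000000L; // illustrative point in time, in milliseconds
+Delete delete = new Delete(Bytes.toBytes("row1"), t); // tombstone for everything <= t
+table.delete(delete);
+
+Put put = new Put(Bytes.toBytes("row1"));
+put.add(CF, ATTR, t - 1, Bytes.toBytes("data")); // timestamp below the tombstone
+table.put(put); // succeeds, but the value is masked by the tombstone
+
+Result r = table.get(new Get(Bytes.toBytes("row1")));
+byte[] b = r.getValue(CF, ATTR); // null, even though the put came after the delete
+</programlisting>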
+ </section>
+
+ <section
+ xml:id="major.compactions.change.query.results">
+ <title>Major compactions change query results</title>
+
+ <para><quote>...create three cell versions at t1, t2 and t3, with a maximum-versions
+ setting of 2. So when getting all versions, only the values at t2 and t3 will be
+ returned. But if you delete the version at t2 or t3, the one at t1 will appear again.
+ Obviously, once a major compaction has run, such behavior will not be the case
+ anymore...</quote> (See <emphasis>Garbage Collection</emphasis> in <link
+ xlink:href="http://outerthought.org/blog/417-ot.html">Bending time in
+ HBase</link>.)</para>
+ </section>
+ </section>
+ </section>
+ <section xml:id="dm.sort">
+ <title>Sort Order</title>
+ <para>All data model operations in HBase return data in sorted order: first by row,
+ then by ColumnFamily, followed by column qualifier, and finally by timestamp (sorted
+ in reverse, so the newest records are returned first).
+ </para>
+ </section>
+ <section xml:id="dm.column.metadata">
+ <title>Column Metadata</title>
+ <para>There is no store of column metadata outside of the internal KeyValue instances for a ColumnFamily.
+ Thus, while HBase can support not only a large number of columns per row, but a heterogeneous set of columns
+ between rows as well, it is your responsibility to keep track of the column names.
+ </para>
+ <para>The only way to get a complete set of columns that exist for a ColumnFamily is to process all the rows.
+ For more information about how HBase stores data internally, see <xref linkend="keyvalue" />.
+ </para>
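+ <para>A minimal sketch of enumerating the qualifiers in one ColumnFamily by scanning every
+ row follows; the family name is an illustrative assumption, and for a large table you would
+ likely do this in MapReduce instead.</para>
+<programlisting language="java">
+public static final byte[] CF = "cf".getBytes();
+...
+Scan scan = new Scan();
+scan.addFamily(CF); // fetch only the one family
+Set<String> qualifiers = new TreeSet<String>();
+ResultScanner rs = table.getScanner(scan);
+try {
+  for (Result r = rs.next(); r != null; r = rs.next()) {
+    for (byte[] q : r.getFamilyMap(CF).keySet()) { // qualifiers present in this row
+      qualifiers.add(Bytes.toString(q));
+    }
+  }
+} finally {
+  rs.close();
+}
+</programlisting>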
+ </section>
+ <section xml:id="joins"><title>Joins</title>
+ <para>Whether HBase supports joins is a common question on the dist-list, and there is a simple answer: it doesn't,
+ at least not in the way that RDBMSs support them (e.g., with equi-joins or outer joins in SQL). As has been illustrated
+ in this chapter, the read data model operations in HBase are Get and Scan.
+ </para>
+ <para>However, that doesn't mean that equivalent join functionality can't be supported in your application; you just
+ have to do it yourself. The two primary strategies are either denormalizing the data upon writing to HBase,
+ or having lookup tables and doing the join between HBase tables in your application or MapReduce code (and as RDBMSs
+ demonstrate, there are several strategies for this depending on the size of the tables, e.g., nested loops vs.
+ hash joins). So which is the best approach? It depends on what you are trying to do, and as such there isn't a single
+ answer that works for every use case.
+ </para>
+ </section>
+ <section xml:id="acid"><title>ACID</title>
+ <para>See <link xlink:href="http://hbase.apache.org/acid-semantics.html">ACID Semantics</link>.
+ Lars Hofhansl has also written a note on
+ <link xlink:href="http://hadoop-hbase.blogspot.com/2012/03/acid-in-hbase.html">ACID in HBase</link>.</para>
+ </section>
+ </chapter>
[2/8] hbase git commit: HBASE-12738 Chunk Ref Guide into
file-per-chapter
Posted by mi...@apache.org.
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/hbase-default.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/hbase-default.xml b/src/main/docbkx/hbase-default.xml
new file mode 100644
index 0000000..125e3d2
--- /dev/null
+++ b/src/main/docbkx/hbase-default.xml
@@ -0,0 +1,538 @@
+<?xml version="1.0" encoding="UTF-8"?><glossary xml:id="hbase_default_configurations" version="5.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:db="http://docbook.org/ns/docbook" xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:svg="http://www.w3.org/2000/svg" xmlns:html="http://www.w3.org/1999/xhtml" xmlns="http://docbook.org/ns/docbook"><title>HBase Default Configuration</title><para>
+The documentation below is generated using the default hbase configuration file,
+<filename>hbase-default.xml</filename>, as source.
+</para><glossentry xml:id="hbase.tmp.dir"><glossterm><varname>hbase.tmp.dir</varname></glossterm><glossdef><para>Temporary directory on the local filesystem.
+ Change this setting to point to a location more permanent
+ than '/tmp', the usual resolution of 'java.io.tmpdir', as the
+ '/tmp' directory is cleared on machine restart.</para><formalpara><title>Default</title><para><varname>${java.io.tmpdir}/hbase-${user.name}</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.rootdir"><glossterm><varname>hbase.rootdir</varname></glossterm><glossdef><para>The directory shared by region servers and into
+ which HBase persists. The URL should be 'fully-qualified'
+ to include the filesystem scheme. For example, to specify the
+ HDFS directory '/hbase' where the HDFS instance's namenode is
+ running at namenode.example.org on port 9000, set this value to:
+ hdfs://namenode.example.org:9000/hbase. By default, we write
+ to whatever ${hbase.tmp.dir} is set to -- usually /tmp --
+ so change this configuration or else all data will be lost on
+ machine restart.</para><formalpara><title>Default</title><para><varname>${hbase.tmp.dir}/hbase</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.cluster.distributed"><glossterm><varname>hbase.cluster.distributed</varname></glossterm><glossdef><para>The mode the cluster will be in. Possible values are
+ false for standalone mode and true for distributed mode. If
+ false, startup will run all HBase and ZooKeeper daemons together
+ in the one JVM.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.quorum"><glossterm><varname>hbase.zookeeper.quorum</varname></glossterm><glossdef><para>Comma separated list of servers in the ZooKeeper ensemble
+ (This config. should have been named hbase.zookeeper.ensemble).
+ For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
+ By default this is set to localhost for local and pseudo-distributed modes
+ of operation. For a fully-distributed setup, this should be set to a full
+ list of ZooKeeper ensemble servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
+ this is the list of servers which hbase will start/stop ZooKeeper on as
+ part of cluster start/stop. Client-side, we will take this list of
+ ensemble members and put it together with the hbase.zookeeper.clientPort
+ config. and pass it into zookeeper constructor as the connectString
+ parameter.</para><formalpara><title>Default</title><para><varname>localhost</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.local.dir"><glossterm><varname>hbase.local.dir</varname></glossterm><glossdef><para>Directory on the local filesystem to be used
+ as a local storage.</para><formalpara><title>Default</title><para><varname>${hbase.tmp.dir}/local/</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.info.port"><glossterm><varname>hbase.master.info.port</varname></glossterm><glossdef><para>The port for the HBase Master web UI.
+ Set to -1 if you do not want a UI instance run.</para><formalpara><title>Default</title><para><varname>16010</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.info.bindAddress"><glossterm><varname>hbase.master.info.bindAddress</varname></glossterm><glossdef><para>The bind address for the HBase Master web UI
+ </para><formalpara><title>Default</title><para><varname>0.0.0.0</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.logcleaner.plugins"><glossterm><varname>hbase.master.logcleaner.plugins</varname></glossterm><glossdef><para>A comma-separated list of BaseLogCleanerDelegate invoked by
+ the LogsCleaner service. These WAL cleaners are called in order,
+ so put the cleaner that prunes the most files in front. To
+ implement your own BaseLogCleanerDelegate, just put it in HBase's classpath
+ and add the fully qualified class name here. Always add the above
+ default log cleaners in the list.</para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.logcleaner.ttl"><glossterm><varname>hbase.master.logcleaner.ttl</varname></glossterm><glossdef><para>Maximum time a WAL can stay in the .oldlogdir directory,
+ after which it will be cleaned by a Master thread.</para><formalpara><title>Default</title><para><varname>600000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.hfilecleaner.plugins"><glossterm><varname>hbase.master.hfilecleaner.plugins</varname></glossterm><glossdef><para>A comma-separated list of BaseHFileCleanerDelegate invoked by
+ the HFileCleaner service. These HFiles cleaners are called in order,
+ so put the cleaner that prunes the most files in front. To
+ implement your own BaseHFileCleanerDelegate, just put it in HBase's classpath
+ and add the fully qualified class name here. Always add the above
+ default log cleaners in the list as they will be overwritten in
+ hbase-site.xml.</para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.catalog.timeout"><glossterm><varname>hbase.master.catalog.timeout</varname></glossterm><glossdef><para>Timeout value for the Catalog Janitor from the master to
+ META.</para><formalpara><title>Default</title><para><varname>600000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.infoserver.redirect"><glossterm><varname>hbase.master.infoserver.redirect</varname></glossterm><glossdef><para>Whether or not the Master listens to the Master web
+ UI port (hbase.master.info.port) and redirects requests to the web
+ UI server shared by the Master and RegionServer.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.port"><glossterm><varname>hbase.regionserver.port</varname></glossterm><glossdef><para>The port the HBase RegionServer binds to.</para><formalpara><title>Default</title><para><varname>16020</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.info.port"><glossterm><varname>hbase.regionserver.info.port</varname></glossterm><glossdef><para>The port for the HBase RegionServer web UI
+ Set to -1 if you do not want the RegionServer UI to run.</para><formalpara><title>Default</title><para><varname>16030</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.info.bindAddress"><glossterm><varname>hbase.regionserver.info.bindAddress</varname></glossterm><glossdef><para>The address for the HBase RegionServer web UI</para><formalpara><title>Default</title><para><varname>0.0.0.0</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.info.port.auto"><glossterm><varname>hbase.regionserver.info.port.auto</varname></glossterm><glossdef><para>Whether or not the Master or RegionServer
+ UI should search for a port to bind to. Enables automatic port
+ search if hbase.regionserver.info.port is already in use.
+ Useful for testing, turned off by default.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.handler.count"><glossterm><varname>hbase.regionserver.handler.count</varname></glossterm><glossdef><para>Count of RPC Listener instances spun up on RegionServers.
+ Same property is used by the Master for count of master handlers.</para><formalpara><title>Default</title><para><varname>30</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.ipc.server.callqueue.handler.factor"><glossterm><varname>hbase.ipc.server.callqueue.handler.factor</varname></glossterm><glossdef><para>Factor to determine the number of call queues.
+ A value of 0 means a single queue shared between all the handlers.
+ A value of 1 means that each handler has its own queue.</para><formalpara><title>Default</title><para><varname>0.1</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.ipc.server.callqueue.read.ratio"><glossterm><varname>hbase.ipc.server.callqueue.read.ratio</varname></glossterm><glossdef><para>Split the call queues into read and write queues.
+ The specified interval (which should be between 0.0 and 1.0)
+ will be multiplied by the number of call queues.
+ A value of 0 indicates not to split the call queues, meaning that both read and write
+ requests will be pushed to the same set of queues.
+ A value lower than 0.5 means that there will be fewer read queues than write queues.
+ A value of 0.5 means there will be the same number of read and write queues.
+ A value greater than 0.5 means that there will be more read queues than write queues.
+ A value of 1.0 means that all the queues except one are used to dispatch read requests.
+
+ Example: Given the total number of call queues being 10
+ a read.ratio of 0 means that: the 10 queues will contain both read/write requests.
+ a read.ratio of 0.3 means that: 3 queues will contain only read requests
+ and 7 queues will contain only write requests.
+ a read.ratio of 0.5 means that: 5 queues will contain only read requests
+ and 5 queues will contain only write requests.
+ a read.ratio of 0.8 means that: 8 queues will contain only read requests
+ and 2 queues will contain only write requests.
+ a read.ratio of 1 means that: 9 queues will contain only read requests
+ and 1 queue will contain only write requests.
+ </para><formalpara><title>Default</title><para><varname>0</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.ipc.server.callqueue.scan.ratio"><glossterm><varname>hbase.ipc.server.callqueue.scan.ratio</varname></glossterm><glossdef><para>Given the number of read call queues, calculated from the total number
+ of call queues multiplied by the callqueue.read.ratio, the scan.ratio property
+ will split the read call queues into small-read and long-read queues.
+ A value lower than 0.5 means that there will be fewer long-read queues than short-read queues.
+ A value of 0.5 means that there will be the same number of short-read and long-read queues.
+ A value greater than 0.5 means that there will be more long-read queues than short-read queues.
+ A value of 0 or 1 indicates using the same set of queues for gets and scans.
+
+ Example: Given the total number of read call queues being 8
+ a scan.ratio of 0 or 1 means that: 8 queues will contain both long and short read requests.
+ a scan.ratio of 0.3 means that: 2 queues will contain only long-read requests
+ and 6 queues will contain only short-read requests.
+ a scan.ratio of 0.5 means that: 4 queues will contain only long-read requests
+ and 4 queues will contain only short-read requests.
+ a scan.ratio of 0.8 means that: 6 queues will contain only long-read requests
+ and 2 queues will contain only short-read requests.
+ </para><formalpara><title>Default</title><para><varname>0</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.msginterval"><glossterm><varname>hbase.regionserver.msginterval</varname></glossterm><glossdef><para>Interval between messages from the RegionServer to Master
+ in milliseconds.</para><formalpara><title>Default</title><para><varname>3000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.regionSplitLimit"><glossterm><varname>hbase.regionserver.regionSplitLimit</varname></glossterm><glossdef><para>Limit for the number of regions after which no more region
+ splitting should take place. This is not a hard limit for the number of
+ regions but acts as a guideline for the regionserver to stop splitting after
+ a certain limit. Default is MAX_INT; i.e. do not block splitting.</para><formalpara><title>Default</title><para><varname>2147483647</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.logroll.period"><glossterm><varname>hbase.regionserver.logroll.period</varname></glossterm><glossdef><para>Period at which we will roll the commit log regardless
+ of how many edits it has.</para><formalpara><title>Default</title><para><varname>3600000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.logroll.errors.tolerated"><glossterm><varname>hbase.regionserver.logroll.errors.tolerated</varname></glossterm><glossdef><para>The number of consecutive WAL close errors we will allow
+ before triggering a server abort. A setting of 0 will cause the
+ region server to abort if closing the current WAL writer fails during
+ log rolling. Even a small value (2 or 3) will allow a region server
+ to ride over transient HDFS errors.</para><formalpara><title>Default</title><para><varname>2</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.hlog.reader.impl"><glossterm><varname>hbase.regionserver.hlog.reader.impl</varname></glossterm><glossdef><para>The WAL file reader implementation.</para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.hlog.writer.impl"><glossterm><varname>hbase.regionserver.hlog.writer.impl</varname></glossterm><glossdef><para>The WAL file writer implementation.</para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.distributed.log.replay"><glossterm><varname>hbase.master.distributed.log.replay</varname></glossterm><glossd
ef><para>Enable 'distributed log replay' as default engine splitting
+ WAL files on server crash. This default is new in hbase 1.0. To fall
+ back to the old mode 'distributed log splitter', set the value to
+ 'false'. 'Distributed log replay' improves MTTR because it does not
+ write intermediate files. 'DLR' requires that 'hfile.format.version'
+ be set to version 3 or higher.
+ </para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.global.memstore.size"><glossterm><varname>hbase.regionserver.global.memstore.size</varname></glossterm><glossdef><para>Maximum size of all memstores in a region server before new
+ updates are blocked and flushes are forced. Defaults to 40% of heap.
+ Updates are blocked and flushes are forced until size of all memstores
+ in a region server hits hbase.regionserver.global.memstore.size.lower.limit.</para><formalpara><title>Default</title><para><varname>0.4</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.global.memstore.size.lower.limit"><glossterm><varname>hbase.regionserver.global.memstore.size.lower.limit</varname></glossterm><glossdef><para>Maximum size of all memstores in a region server before flushes are forced.
+ Defaults to 95% of hbase.regionserver.global.memstore.size.
+ A value of 100% causes the minimum possible flushing to occur when updates are
+ blocked due to memstore limiting.</para><formalpara><title>Default</title><para><varname>0.95</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.optionalcacheflushinterval"><glossterm><varname>hbase.regionserver.optionalcacheflushinterval</varname></glossterm><glossdef><para>
+ Maximum amount of time an edit lives in memory before being automatically flushed.
+ Default 1 hour. Set it to 0 to disable automatic flushing.</para><formalpara><title>Default</title><para><varname>3600000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.catalog.timeout"><glossterm><varname>hbase.regionserver.catalog.timeout</varname></glossterm><glossdef><para>Timeout value for the Catalog Janitor from the regionserver to META.</para><formalpara><title>Default</title><para><varname>600000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.dns.interface"><glossterm><varname>hbase.regionserver.dns.interface</varname></glossterm><glossdef><para>The name of the Network Interface from which a region server
+ should report its IP address.</para><formalpara><title>Default</title><para><varname>default</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.dns.nameserver"><glossterm><varname>hbase.regionserver.dns.nameserver</varname></glossterm><glossdef><para>The host name or IP address of the name server (DNS)
+ which a region server should use to determine the host name used by the
+ master for communication and display purposes.</para><formalpara><title>Default</title><para><varname>default</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.region.split.policy"><glossterm><varname>hbase.regionserver.region.split.policy</varname></glossterm><glossdef><para>
+ A split policy determines when a region should be split. The various other split policies that
+ are available currently are ConstantSizeRegionSplitPolicy, DisabledRegionSplitPolicy,
+ DelimitedKeyPrefixRegionSplitPolicy, KeyPrefixRegionSplitPolicy etc.
+ </para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="zookeeper.session.timeout"><glossterm><varname>zookeeper.session.timeout</varname></glossterm><glossdef><para>ZooKeeper session timeout in milliseconds. It is used in two different ways.
+ First, this value is used in the ZK client that HBase uses to connect to the ensemble.
+ It is also used by HBase when it starts a ZK server and it is passed as the 'maxSessionTimeout'. See
+ http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions.
+ For example, if a HBase region server connects to a ZK ensemble that's also managed by HBase, then the
+ session timeout will be the one specified by this configuration. But, a region server that connects
+ to an ensemble managed with a different configuration will be subject to that ensemble's maxSessionTimeout. So,
+ even though HBase might propose using 90 seconds, the ensemble can have a max timeout lower than this and
+ it will take precedence. The current default that ZK ships with is 40 seconds, which is lower than HBase's.
+ </para><formalpara><title>Default</title><para><varname>90000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="zookeeper.znode.parent"><glossterm><varname>zookeeper.znode.parent</varname></glossterm><glossdef><para>Root ZNode for HBase in ZooKeeper. All of HBase's ZooKeeper
+ files that are configured with a relative path will go under this node.
+ By default, all of HBase's ZooKeeper file path are configured with a
+ relative path, so they will all go under this directory unless changed.</para><formalpara><title>Default</title><para><varname>/hbase</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="zookeeper.znode.rootserver"><glossterm><varname>zookeeper.znode.rootserver</varname></glossterm><glossdef><para>Path to ZNode holding root region location. This is written by
+ the master and read by clients and region servers. If a relative path is
+ given, the parent folder will be ${zookeeper.znode.parent}. By default,
+ this means the root location is stored at /hbase/root-region-server.</para><formalpara><title>Default</title><para><varname>root-region-server</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="zookeeper.znode.acl.parent"><glossterm><varname>zookeeper.znode.acl.parent</varname></glossterm><glossdef><para>Root ZNode for access control lists.</para><formalpara><title>Default</title><para><varname>acl</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.dns.interface"><glossterm><varname>hbase.zookeeper.dns.interface</varname></glossterm><glossdef><para>The name of the Network Interface from which a ZooKeeper server
+ should report its IP address.</para><formalpara><title>Default</title><para><varname>default</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.dns.nameserver"><glossterm><varname>hbase.zookeeper.dns.nameserver</varname></glossterm><glossdef><para>The host name or IP address of the name server (DNS)
+ which a ZooKeeper server should use to determine the host name used by the
+ master for communication and display purposes.</para><formalpara><title>Default</title><para><varname>default</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.peerport"><glossterm><varname>hbase.zookeeper.peerport</varname></glossterm><glossdef><para>Port used by ZooKeeper peers to talk to each other.
+ See http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperStarted.html#sc_RunningReplicatedZooKeeper
+ for more information.</para><formalpara><title>Default</title><para><varname>2888</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.leaderport"><glossterm><varname>hbase.zookeeper.leaderport</varname></glossterm><glossdef><para>Port used by ZooKeeper for leader election.
+ See http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperStarted.html#sc_RunningReplicatedZooKeeper
+ for more information.</para><formalpara><title>Default</title><para><varname>3888</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.useMulti"><glossterm><varname>hbase.zookeeper.useMulti</varname></glossterm><glossdef><para>Instructs HBase to make use of ZooKeeper's multi-update functionality.
+ This allows certain ZooKeeper operations to complete more quickly and prevents some issues
+ with rare Replication failure scenarios (see the release note of HBASE-2611 for an example).
+ IMPORTANT: only set this to true if all ZooKeeper servers in the cluster are on version 3.4+
+ and will not be downgraded. ZooKeeper versions before 3.4 do not support multi-update and
+ will not fail gracefully if multi-update is invoked (see ZOOKEEPER-1495).</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.config.read.zookeeper.config"><glossterm><varname>hbase.config.read.zookeeper.config</varname></glossterm><glossdef><para>
+ Set to true to allow HBaseConfiguration to read the
+ zoo.cfg file for ZooKeeper properties. Switching this to true
+ is not recommended, since the functionality of reading ZK
+ properties from a zoo.cfg file has been deprecated.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.property.initLimit"><glossterm><varname>hbase.zookeeper.property.initLimit</varname></glossterm><glossdef><para>Property from ZooKeeper's config zoo.cfg.
+ The number of ticks that the initial synchronization phase can take.</para><formalpara><title>Default</title><para><varname>10</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.property.syncLimit"><glossterm><varname>hbase.zookeeper.property.syncLimit</varname></glossterm><glossdef><para>Property from ZooKeeper's config zoo.cfg.
+ The number of ticks that can pass between sending a request and getting an
+ acknowledgment.</para><formalpara><title>Default</title><para><varname>5</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.property.dataDir"><glossterm><varname>hbase.zookeeper.property.dataDir</varname></glossterm><glossdef><para>Property from ZooKeeper's config zoo.cfg.
+ The directory where the snapshot is stored.</para><formalpara><title>Default</title><para><varname>${hbase.tmp.dir}/zookeeper</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.property.clientPort"><glossterm><varname>hbase.zookeeper.property.clientPort</varname></glossterm><glossdef><para>Property from ZooKeeper's config zoo.cfg.
+ The port at which the clients will connect.</para><formalpara><title>Default</title><para><varname>2181</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.zookeeper.property.maxClientCnxns"><glossterm><varname>hbase.zookeeper.property.maxClientCnxns</varname></glossterm><glossdef><para>Property from ZooKeeper's config zoo.cfg.
+ Limit on number of concurrent connections (at the socket level) that a
+ single client, identified by IP address, may make to a single member of
+ the ZooKeeper ensemble. Set high to avoid zk connection issues running
+ standalone and pseudo-distributed.</para><formalpara><title>Default</title><para><varname>300</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.client.write.buffer"><glossterm><varname>hbase.client.write.buffer</varname></glossterm><glossdef><para>Default size of the HTable client write buffer in bytes.
+ A bigger buffer takes more memory -- on both the client and server
+ side since server instantiates the passed write buffer to process
+ it -- but a larger buffer size reduces the number of RPCs made.
+ For an estimate of server-side memory-used, evaluate
+ hbase.client.write.buffer * hbase.regionserver.handler.count</para><formalpara><title>Default</title><para><varname>2097152</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.client.pause"><glossterm><varname>hbase.client.pause</varname></glossterm><glossdef><para>General client pause value. Used mostly as value to wait
+ before running a retry of a failed get, region lookup, etc.
+ See hbase.client.retries.number for description of how we backoff from
+ this initial pause amount and how this pause works w/ retries.</para><formalpara><title>Default</title><para><varname>100</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.client.retries.number"><glossterm><varname>hbase.client.retries.number</varname></glossterm><glossdef><para>Maximum retries. Used as maximum for all retryable
+ operations such as the getting of a cell's value, starting a row update,
+ etc. Retry interval is a rough function based on hbase.client.pause. At
+ first we retry at this interval but then with backoff, we pretty quickly reach
+ retrying every ten seconds. See HConstants#RETRY_BACKOFF for how the backup
+ ramps up. Change this setting and hbase.client.pause to suit your workload.</para><formalpara><title>Default</title><para><varname>35</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.client.max.total.tasks"><glossterm><varname>hbase.client.max.total.tasks</varname></glossterm><glossdef><para>The maximum number of concurrent tasks a single HTable instance will
+ send to the cluster.</para><formalpara><title>Default</title><para><varname>100</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.client.max.perserver.tasks"><glossterm><varname>hbase.client.max.perserver.tasks</varname></glossterm><glossdef><para>The maximum number of concurrent tasks a single HTable instance will
+ send to a single region server.</para><formalpara><title>Default</title><para><varname>5</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.client.max.perregion.tasks"><glossterm><varname>hbase.client.max.perregion.tasks</varname></glossterm><glossdef><para>The maximum number of concurrent connections the client will
+ maintain to a single Region. That is, if there are already
+ hbase.client.max.perregion.tasks writes in progress for this region, new puts
+ won't be sent to this region until some writes finish.
+ on a scanner if it is not served from (local, client) memory. Higher
+ caching values will enable faster scanners but will eat up more memory
+ and some calls of next may take longer and longer times when the cache is empty.
+ Do not set this value such that the time between invocations is greater
+ than the scanner timeout; i.e. hbase.client.scanner.timeout.period</para><formalpara><title>Default</title><para><varname>100</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.client.keyvalue.maxsize"><glossterm><varname>hbase.client.keyvalue.maxsize</varname></glossterm><glossdef><para>Specifies the combined maximum allowed size of a KeyValue
+ instance. This is to set an upper boundary for a single entry saved in a
+ storage file. Since such entries cannot be split, this helps avoid a situation where a region
+ cannot be split any further because the data is too large. It seems wise
+ to set this to a fraction of the maximum region size. Setting it to zero
+ or less disables the check.</para><formalpara><title>Default</title><para><varname>10485760</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.client.scanner.timeout.period"><glossterm><varname>hbase.client.scanner.timeout.period</varname></glossterm><glossdef><para>Client scanner lease period in milliseconds.</para><formalpara><title>Default</title><para><varname>60000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.client.localityCheck.threadPoolSize"><glossterm><varname>hbase.client.localityCheck.threadPoolSize</varname></glossterm><glossdef><para/><formalpara><title>Default</title><para><varname>2</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.bulkload.retries.number"><glossterm><varname>hbase.bulkload.retries.number</varname></glossterm><glossdef><para>Maximum retries. This is maximum number of iterations
+ that atomic bulk loads are attempted in the face of splitting operations.
+ 0 means never give up.</para><formalpara><title>Default</title><para><varname>10</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.balancer.period "><glossterm><varname>hbase.balancer.period
+ </varname></glossterm><glossdef><para>Period at which the region balancer runs in the Master.</para><formalpara><title>Default</title><para><varname>300000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regions.slop"><glossterm><varname>hbase.regions.slop</varname></glossterm><glossdef><para>Rebalance if any regionserver has average + (average * slop) regions.</para><formalpara><title>Default</title><para><varname>0.2</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.server.thread.wakefrequency"><glossterm><varname>hbase.server.thread.wakefrequency</varname></glossterm><glossdef><para>Time to sleep in between searches for work (in milliseconds).
+ Used as sleep interval by service threads such as log roller.</para><formalpara><title>Default</title><para><varname>10000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.server.versionfile.writeattempts"><glossterm><varname>hbase.server.versionfile.writeattempts</varname></glossterm><glossdef><para>
+ How many times to retry attempting to write a version file
+ before just aborting. Each attempt is separated by the
+ hbase.server.thread.wakefrequency milliseconds.</para><formalpara><title>Default</title><para><varname>3</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hregion.memstore.flush.size"><glossterm><varname>hbase.hregion.memstore.flush.size</varname></glossterm><glossdef><para>
+ Memstore will be flushed to disk if size of the memstore
+ exceeds this number of bytes. Value is checked by a thread that runs
+ every hbase.server.thread.wakefrequency.</para><formalpara><title>Default</title><para><varname>134217728</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hregion.percolumnfamilyflush.size.lower.bound"><glossterm><varname>hbase.hregion.percolumnfamilyflush.size.lower.bound</varname></glossterm><glossdef><para>
+ If FlushLargeStoresPolicy is used, then every time that we hit the
+ total memstore limit, we find out all the column families whose memstores
+ exceed this value, and only flush them, while retaining the others whose
+ memstores are lower than this limit. If none of the families have their
+ memstore size more than this, all the memstores will be flushed
+ (just as usual). This value should be less than half of the total memstore
+ threshold (hbase.hregion.memstore.flush.size).
+ </para><formalpara><title>Default</title><para><varname>16777216</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hregion.preclose.flush.size"><glossterm><varname>hbase.hregion.preclose.flush.size</varname></glossterm><glossdef><para>
+ If the memstores in a region are this size or larger when we go
+ to close, run a "pre-flush" to clear out memstores before we put up
+ the region closed flag and take the region offline. On close,
+ a flush is run under the close flag to empty memory. During
+ this time the region is offline and we are not taking on any writes.
+ If the memstore content is large, this flush could take a long time to
+ complete. The preflush is meant to clean out the bulk of the memstore
+ before putting up the close flag and taking the region offline so the
+ flush that runs under the close flag has little to do.</para><formalpara><title>Default</title><para><varname>5242880</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hregion.memstore.block.multiplier"><glossterm><varname>hbase.hregion.memstore.block.multiplier</varname></glossterm><glossdef><para>
+ Block updates if memstore has hbase.hregion.memstore.block.multiplier
+ times hbase.hregion.memstore.flush.size bytes. Useful for preventing
+ runaway memstore during spikes in update traffic. Without an
+ upper-bound, memstore fills such that when it flushes the
+ resultant flush files take a long time to compact or split, or
+ worse, we OOME.</para><formalpara><title>Default</title><para><varname>4</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hregion.memstore.mslab.enabled"><glossterm><varname>hbase.hregion.memstore.mslab.enabled</varname></glossterm><glossdef><para>
+ Enables the MemStore-Local Allocation Buffer,
+ a feature which works to prevent heap fragmentation under
+ heavy write loads. This can reduce the frequency of stop-the-world
+ GC pauses on large heaps.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hregion.max.filesize"><glossterm><varname>hbase.hregion.max.filesize</varname></glossterm><glossdef><para>
+ Maximum HFile size. If the sum of the sizes of a region's HFiles has grown to exceed this
+ value, the region is split in two.</para><formalpara><title>Default</title><para><varname>10737418240</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hregion.majorcompaction"><glossterm><varname>hbase.hregion.majorcompaction</varname></glossterm><glossdef><para>Time between major compactions, expressed in milliseconds. Set to 0 to disable
+ time-based automatic major compactions. User-requested and size-based major compactions will
+ still run. This value is multiplied by hbase.hregion.majorcompaction.jitter to cause
+ compaction to start at a somewhat-random time during a given window of time. The default value
+ is 7 days, expressed in milliseconds. If major compactions are causing disruption in your
+ environment, you can configure them to run at off-peak times for your deployment, or disable
+ time-based major compactions by setting this parameter to 0, and run major compactions in a
+ cron job or by another external mechanism.</para><formalpara><title>Default</title><para><varname>604800000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hregion.majorcompaction.jitter"><glossterm><varname>hbase.hregion.majorcompaction.jitter</varname></glossterm><glossdef><para>A multiplier applied to hbase.hregion.majorcompaction to cause compaction to occur
+ a given amount of time either side of hbase.hregion.majorcompaction. The smaller the number,
+ the closer the compactions will happen to the hbase.hregion.majorcompaction
+ interval.</para><formalpara><title>Default</title><para><varname>0.50</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.compactionThreshold"><glossterm><varname>hbase.hstore.compactionThreshold</varname></glossterm><glossdef><para> If more than this number of StoreFiles exist in any one Store
+ (one StoreFile is written per flush of MemStore), a compaction is run to rewrite all
+ StoreFiles into a single StoreFile. Larger values delay compaction, but when compaction does
+ occur, it takes longer to complete.</para><formalpara><title>Default</title><para><varname>3</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.flusher.count"><glossterm><varname>hbase.hstore.flusher.count</varname></glossterm><glossdef><para> The number of flush threads. With fewer threads, the MemStore flushes will be
+ queued. With more threads, the flushes will be executed in parallel, increasing the load on
+ HDFS, and potentially causing more compactions. </para><formalpara><title>Default</title><para><varname>2</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.blockingStoreFiles"><glossterm><varname>hbase.hstore.blockingStoreFiles</varname></glossterm><glossdef><para> If more than this number of StoreFiles exist in any one Store (one StoreFile
+ is written per flush of MemStore), updates are blocked for this region until a compaction is
+ completed, or until hbase.hstore.blockingWaitTime has been exceeded.</para><formalpara><title>Default</title><para><varname>10</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.blockingWaitTime"><glossterm><varname>hbase.hstore.blockingWaitTime</varname></glossterm><glossdef><para> The time for which a region will block updates after reaching the StoreFile limit
+ defined by hbase.hstore.blockingStoreFiles. After this time has elapsed, the region will stop
+ blocking updates even if a compaction has not been completed.</para><formalpara><title>Default</title><para><varname>90000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.compaction.min"><glossterm><varname>hbase.hstore.compaction.min</varname></glossterm><glossdef><para>The minimum number of StoreFiles which must be eligible for compaction before
+ compaction can run. The goal of tuning hbase.hstore.compaction.min is to avoid ending up with
+ too many tiny StoreFiles to compact. Setting this value to 2 would cause a minor compaction
+ each time you have two StoreFiles in a Store, and this is probably not appropriate. If you
+ set this value too high, all the other values will need to be adjusted accordingly. For most
+ cases, the default value is appropriate. In previous versions of HBase, the parameter
+ hbase.hstore.compaction.min was named hbase.hstore.compactionThreshold.</para><formalpara><title>Default</title><para><varname>3</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.compaction.max"><glossterm><varname>hbase.hstore.compaction.max</varname></glossterm><glossdef><para>The maximum number of StoreFiles which will be selected for a single minor
+ compaction, regardless of the number of eligible StoreFiles. Effectively, the value of
+ hbase.hstore.compaction.max controls the length of time it takes a single compaction to
+ complete. Setting it larger means that more StoreFiles are included in a compaction. For most
+ cases, the default value is appropriate.</para><formalpara><title>Default</title><para><varname>10</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.compaction.min.size"><glossterm><varname>hbase.hstore.compaction.min.size</varname></glossterm><glossdef><para>A StoreFile smaller than this size will always be eligible for minor compaction.
+ HFiles this size or larger are evaluated by hbase.hstore.compaction.ratio to determine if
+ they are eligible. Because this limit represents the "automatic include" limit for all
+ StoreFiles smaller than this value, this value may need to be reduced in write-heavy
+ environments where many StoreFiles in the 1-2 MB range are being flushed, because every
+ StoreFile will be targeted for compaction and the resulting StoreFiles may still be under the
+ minimum size and require further compaction. If this parameter is lowered, the ratio check is
+ triggered more quickly. This addressed some issues seen in earlier versions of HBase but
+ changing this parameter is no longer necessary in most situations. Default: 128 MB expressed
+ in bytes.</para><formalpara><title>Default</title><para><varname>134217728</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.compaction.max.size"><glossterm><varname>hbase.hstore.compaction.max.size</varname></glossterm><glossdef><para>A StoreFile larger than this size will be excluded from compaction. The effect of
+ raising hbase.hstore.compaction.max.size is fewer, larger StoreFiles that do not get
+ compacted often. If you feel that compaction is happening too often without much benefit, you
+ can try raising this value. Default: the value of LONG.MAX_VALUE, expressed in bytes.</para><formalpara><title>Default</title><para><varname>9223372036854775807</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.compaction.ratio"><glossterm><varname>hbase.hstore.compaction.ratio</varname></glossterm><glossdef><para>For minor compaction, this ratio is used to determine whether a given StoreFile
+ which is larger than hbase.hstore.compaction.min.size is eligible for compaction. Its
+ effect is to limit compaction of large StoreFiles. The value of hbase.hstore.compaction.ratio
+ is expressed as a floating-point decimal. A large ratio, such as 10, will produce a single
+ giant StoreFile. Conversely, a low value, such as .25, will produce behavior similar to the
+ BigTable compaction algorithm, producing four StoreFiles. A moderate value of between 1.0 and
+ 1.4 is recommended. When tuning this value, you are balancing write costs with read costs.
+ Raising the value (to something like 1.4) will have more write costs, because you will
+ compact larger StoreFiles. However, during reads, HBase will need to seek through fewer
+ StoreFiles to accomplish the read. Consider this approach if you cannot take advantage of
+ Bloom filters. Otherwise, you can lower this value to something like 1.0 to reduce the
+ background cost of writes, and use Bloom filters to control the number of StoreFiles touched
+ during reads. For most cases, the default value is appropriate.</para><formalpara><title>Default</title><para><varname>1.2F</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.compaction.ratio.offpeak"><glossterm><varname>hbase.hstore.compaction.ratio.offpeak</varname></glossterm><glossdef><para>Allows you to set a different (by default, more aggressive) ratio for determining
+ whether larger StoreFiles are included in compactions during off-peak hours. Works in the
+ same way as hbase.hstore.compaction.ratio. Only applies if hbase.offpeak.start.hour and
+ hbase.offpeak.end.hour are also enabled.</para><formalpara><title>Default</title><para><varname>5.0F</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.time.to.purge.deletes"><glossterm><varname>hbase.hstore.time.to.purge.deletes</varname></glossterm><glossdef><para>The amount of time to delay purging of delete markers with future timestamps. If
+ unset, or set to 0, all delete markers, including those with future timestamps, are purged
+ during the next major compaction. Otherwise, a delete marker is kept until the major compaction
+ which occurs after the marker's timestamp plus the value of this setting, in milliseconds.
+ </para><formalpara><title>Default</title><para><varname>0</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.offpeak.start.hour"><glossterm><varname>hbase.offpeak.start.hour</varname></glossterm><glossdef><para>The start of off-peak hours, expressed as an integer between 0 and 23, inclusive.
+ Set to -1 to disable off-peak.</para><formalpara><title>Default</title><para><varname>-1</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.offpeak.end.hour"><glossterm><varname>hbase.offpeak.end.hour</varname></glossterm><glossdef><para>The end of off-peak hours, expressed as an integer between 0 and 23, inclusive. Set
+ to -1 to disable off-peak.</para><formalpara><title>Default</title><para><varname>-1</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.thread.compaction.throttle"><glossterm><varname>hbase.regionserver.thread.compaction.throttle</varname></glossterm><glossdef><para>There are two different thread pools for compactions, one for large compactions and
+ the other for small compactions. This helps to keep compaction of lean tables (such as
+ hbase:meta) fast. If a compaction is larger than this threshold, it
+ goes into the large compaction pool. In most cases, the default value is appropriate. Default:
+ 2 x hbase.hstore.compaction.max x hbase.hregion.memstore.flush.size (which defaults to 128MB).
+ The value field assumes that the value of hbase.hregion.memstore.flush.size is unchanged from
+ the default.</para><formalpara><title>Default</title><para><varname>2684354560</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.compaction.kv.max"><glossterm><varname>hbase.hstore.compaction.kv.max</varname></glossterm><glossdef><para>The maximum number of KeyValues to read and then write in a batch when flushing or
+ compacting. Set this lower if you have big KeyValues and problems with Out Of Memory
+ Exceptions. Set this higher if you have wide, small rows. </para><formalpara><title>Default</title><para><varname>10</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.storescanner.parallel.seek.enable"><glossterm><varname>hbase.storescanner.parallel.seek.enable</varname></glossterm><glossdef><para>
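+ <!-- Editor's note, a rough and hypothetical walkthrough of the
+ hbase.hstore.compaction.ratio entry above (the real selection algorithm has
+ more steps): a StoreFile of size S is excluded from a proposed minor compaction
+ when S > ratio x (sum of the smaller files in the proposal). With the default
+ ratio of 1.2 and candidate files of 1000, 200, and 180 MB (all above
+ hbase.hstore.compaction.min.size), the 1000 MB file fails the check because
+ 1000 > 1.2 x (200 + 180) = 456, so only the two smaller files compact together.
+ -->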
+ Enables StoreFileScanner parallel-seeking in StoreScanner,
+ a feature which can reduce response latency under special conditions.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.storescanner.parallel.seek.threads"><glossterm><varname>hbase.storescanner.parallel.seek.threads</varname></glossterm><glossdef><para>
+ The default thread pool size if the parallel-seeking feature is enabled.</para><formalpara><title>Default</title><para><varname>10</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hfile.block.cache.size"><glossterm><varname>hfile.block.cache.size</varname></glossterm><glossdef><para>Percentage of maximum heap (-Xmx setting) to allocate to block cache
+ used by a StoreFile. The default of 0.4 means allocate 40%.
+ Set to 0 to disable, but this is not recommended; you need at least
+ enough cache to hold the storefile indices.</para><formalpara><title>Default</title><para><varname>0.4</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hfile.block.index.cacheonwrite"><glossterm><varname>hfile.block.index.cacheonwrite</varname></glossterm><glossdef><para>This allows non-root multi-level index blocks to be put into the block
+ cache at the time the index is being written.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hfile.index.block.max.size"><glossterm><varname>hfile.index.block.max.size</varname></glossterm><glossdef><para>When the size of a leaf-level, intermediate-level, or root-level
+ index block in a multi-level block index grows to this size, the
+ block is written out and a new block is started.</para><formalpara><title>Default</title><para><varname>131072</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.bucketcache.ioengine"><glossterm><varname>hbase.bucketcache.ioengine</varname></glossterm><glossdef><para>Where to store the contents of the bucketcache. One of: onheap,
+ offheap, or file. If a file, set it to file:PATH_TO_FILE. See https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html for more information.
+ </para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.bucketcache.combinedcache.enabled"><glossterm><varname>hbase.bucketcache.combinedcache.enabled</varname></glossterm><glossdef><para>Whether or not the bucketcache is used in league with the LRU
+ on-heap block cache. In this mode, indices and blooms are kept in the LRU
+ blockcache and the data blocks are kept in the bucketcache.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.bucketcache.size"><glossterm><varname>hbase.bucketcache.size</varname></glossterm><glossdef><para>The size of the buckets for the bucketcache if you only use a single size.
+ Defaults to the default blocksize, which is 64 * 1024.</para><formalpara><title>Default</title><para><varname>65536</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.bucketcache.sizes"><glossterm><varname>hbase.bucketcache.sizes</varname></glossterm><glossdef><para>A comma-separated list of sizes for buckets for the bucketcache
+ if you use multiple sizes. Should be a list of block sizes in order from smallest
+ to largest. The sizes you use will depend on your data access patterns.</para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hfile.format.version"><glossterm><varname>hfile.format.version</varname></glossterm><glossdef><para>The HFile format version to use for new files.
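+ <!-- Editor's note on hbase.bucketcache.ioengine above: a minimal hbase-site.xml
+ sketch for the file-backed bucketcache, following the file:PATH_TO_FILE pattern
+ from the description; the path is hypothetical and would typically point at fast
+ local storage such as an SSD. Capacity tuning is deployment-specific and omitted:
+ <property>
+ <name>hbase.bucketcache.ioengine</name>
+ <value>file:/mnt/ssd1/bucketcache.data</value>
+ </property>
+ -->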
+ Version 3 adds support for tags in hfiles (See http://hbase.apache.org/book.html#hbase.tags).
+ Distributed Log Replay requires that tags are enabled. Also see the configuration
+ 'hbase.replication.rpc.codec'.
+ </para><formalpara><title>Default</title><para><varname>3</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hfile.block.bloom.cacheonwrite"><glossterm><varname>hfile.block.bloom.cacheonwrite</varname></glossterm><glossdef><para>Enables cache-on-write for inline blocks of a compound Bloom filter.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="io.storefile.bloom.block.size"><glossterm><varname>io.storefile.bloom.block.size</varname></glossterm><glossdef><para>The size in bytes of a single block ("chunk") of a compound Bloom
+ filter. This size is approximate, because Bloom blocks can only be
+ inserted at data block boundaries, and the number of keys per data
+ block varies.</para><formalpara><title>Default</title><para><varname>131072</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.rs.cacheblocksonwrite"><glossterm><varname>hbase.rs.cacheblocksonwrite</varname></glossterm><glossdef><para>Whether an HFile block should be added to the block cache when the
+ block is finished.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.rpc.timeout"><glossterm><varname>hbase.rpc.timeout</varname></glossterm><glossdef><para>This is for the RPC layer to define how long HBase client applications
+ wait for a remote call to time out. It uses pings to check connections
+ but will eventually throw a TimeoutException.</para><formalpara><title>Default</title><para><varname>60000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.rpc.shortoperation.timeout"><glossterm><varname>hbase.rpc.shortoperation.timeout</varname></glossterm><glossdef><para>This is another version of "hbase.rpc.timeout". For RPC operations
+ within the cluster, we rely on this configuration to set a short timeout
+ for short operations. For example, a short RPC timeout for a region server's attempt
+ to report to the active master can make for a quicker master failover process.</para><formalpara><title>Default</title><para><varname>10000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.ipc.client.tcpnodelay"><glossterm><varname>hbase.ipc.client.tcpnodelay</varname></glossterm><glossdef><para>Set no delay on rpc socket connections. See
+ http://docs.oracle.com/javase/1.5.0/docs/api/java/net/Socket.html#getTcpNoDelay()</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.keytab.file"><glossterm><varname>hbase.master.keytab.file</varname></glossterm><glossdef><para>Full path to the kerberos keytab file to use for logging in
+ the configured HMaster server principal.</para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.kerberos.principal"><glossterm><varname>hbase.master.kerberos.principal</varname></glossterm><glossdef><para>Ex. "hbase/_HOST@EXAMPLE.COM". The kerberos principal name
+ that should be used to run the HMaster process. The principal name should
+ be in the form: user/hostname@DOMAIN. If "_HOST" is used as the hostname
+ portion, it will be replaced with the actual hostname of the running
+ instance.</para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.keytab.file"><glossterm><varname>hbase.regionserver.keytab.file</varname></glossterm><glossdef><para>Full path to the kerberos keytab file to use for logging in
+ the configured HRegionServer server principal.</para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.kerberos.principal"><glossterm><varname>hbase.regionserver.kerberos.principal</varname></glossterm><glossdef><para>Ex. "hbase/_HOST@EXAMPLE.COM". The kerberos principal name
+ that should be used to run the HRegionServer process. The principal name
+ should be in the form: user/hostname@DOMAIN. If "_HOST" is used as the
+ hostname portion, it will be replaced with the actual hostname of the
+ running instance. An entry for this principal must exist in the file
+ specified in hbase.regionserver.keytab.file</para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hadoop.policy.file"><glossterm><varname>hadoop.policy.file</varname></glossterm><glossdef><para>The policy configuration file used by RPC servers to make
+ authorization decisions on client requests. Only used when HBase
+ security is enabled.</para><formalpara><title>Default</title><para><varname>hbase-policy.xml</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.superuser"><glossterm><varname>hbase.superuser</varname></glossterm><glossdef><para>List of users or groups (comma-separated), who are allowed
+ full privileges, regardless of stored ACLs, across the cluster.
+ Only used when HBase security is enabled.</para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.auth.key.update.interval"><glossterm><varname>hbase.auth.key.update.interval</varname></glossterm><glossdef><para>The update interval for master key for authentication tokens
+ in servers in milliseconds. Only used when HBase security is enabled.</para><formalpara><title>Default</title><para><varname>86400000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.auth.token.max.lifetime"><glossterm><varname>hbase.auth.token.max.lifetime</varname></glossterm><glossdef><para>The maximum lifetime in milliseconds after which an
+ authentication token expires. Only used when HBase security is enabled.</para><formalpara><title>Default</title><para><varname>604800000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.ipc.client.fallback-to-simple-auth-allowed"><glossterm><varname>hbase.ipc.client.fallback-to-simple-auth-allowed</varname></glossterm><glossdef><para>When a client is configured to attempt a secure connection, but attempts to
+ connect to an insecure server, that server may instruct the client to
+ switch to SASL SIMPLE (unsecure) authentication. This setting controls
+ whether or not the client will accept this instruction from the server.
+ When false (the default), the client will not allow the fallback to SIMPLE
+ authentication, and will abort the connection.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.display.keys"><glossterm><varname>hbase.display.keys</varname></glossterm><glossdef><para>When this is set to true, the web UI and such will display all start/end keys
+ as part of the table details, region names, etc. When this is set to false,
+ the keys are hidden.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.coprocessor.region.classes"><glossterm><varname>hbase.coprocessor.region.classes</varname></glossterm><glossdef><para>A comma-separated list of Coprocessors that are loaded by
+ default on all tables. For any override coprocessor method, these classes
+ will be called in order. After implementing your own Coprocessor, just put
+ it in HBase's classpath and add the fully qualified class name here.
+ A coprocessor can also be loaded on demand by specifying it in the HTableDescriptor.</para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.rest.port"><glossterm><varname>hbase.rest.port</varname></glossterm><glossdef><para>The port for the HBase REST server.</para><formalpara><title>Default</title><para><varname>8080</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.rest.readonly"><glossterm><varname>hbase.rest.readonly</varname></glossterm><glossdef><para>Defines the mode the REST server will be started in. Possible values are:
+ false: All HTTP methods are permitted - GET/PUT/POST/DELETE.
+ true: Only the GET method is permitted.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.rest.threads.max"><glossterm><varname>hbase.rest.threads.max</varname></glossterm><glossdef><para>The maximum number of threads of the REST server thread pool.
+ Threads in the pool are reused to process REST requests. This
+ controls the maximum number of requests processed concurrently.
+ It may help to control the memory used by the REST server to
+ avoid OOM issues. If the thread pool is full, incoming requests
+ will be queued up and wait for some free threads.</para><formalpara><title>Default</title><para><varname>100</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.rest.threads.min"><glossterm><varname>hbase.rest.threads.min</varname></glossterm><glossdef><para>The minimum number of threads of the REST server thread pool.
+ The thread pool always has at least this number of threads so
+ the REST server is ready to serve incoming requests.</para><formalpara><title>Default</title><para><varname>2</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.rest.support.proxyuser"><glossterm><varname>hbase.rest.support.proxyuser</varname></glossterm><glossdef><para>Enables running the REST server to support proxy-user mode.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.defaults.for.version.skip"><glossterm><varname>hbase.defaults.for.version.skip</varname></glossterm><glossdef><para>Set to true to skip the 'hbase.defaults.for.version' check.
+ Setting this to true can be useful in contexts other than
+ the other side of a Maven generation; i.e., running in an
+ IDE. You'll want to set this boolean to true to avoid
+ seeing the RuntimeException complaint: "hbase-default.xml file
+ seems to be for an old version of HBase (\${hbase.version}), this
+ version is X.X.X-SNAPSHOT"</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.coprocessor.master.classes"><glossterm><varname>hbase.coprocessor.master.classes</varname></glossterm><glossdef><para>A comma-separated list of
+ org.apache.hadoop.hbase.coprocessor.MasterObserver coprocessors that are
+ loaded by default on the active HMaster process. For any implemented
+ coprocessor methods, the listed classes will be called in order. After
+ implementing your own MasterObserver, just put it in HBase's classpath
+ and add the fully qualified class name here.</para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.coprocessor.abortonerror"><glossterm><varname>hbase.coprocessor.abortonerror</varname></glossterm><glossdef><para>Set to true to cause the hosting server (master or regionserver)
+ to abort if a coprocessor fails to load, fails to initialize, or throws an
+ unexpected Throwable object. Setting this to false will allow the server to
+ continue execution, but the system-wide state of the coprocessor in question
+ will become inconsistent, as it will be properly executing in only a subset
+ of servers, so this is most useful for debugging only.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.online.schema.update.enable"><glossterm><varname>hbase.online.schema.update.enable</varname></glossterm><glossdef><para>Set true to enable online schema changes.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.table.lock.enable"><glossterm><varname>hbase.table.lock.enable</varname></glossterm><glossdef><para>Set to true to enable locking the table in zookeeper for schema change operations.
+ Table locking from the master prevents concurrent schema modifications from corrupting table
+ state.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.table.max.rowsize"><glossterm><varname>hbase.table.max.rowsize</varname></glossterm><glossdef><para>
+ Maximum size of a single row, in bytes (default is 1 GB), for Gets
+ or Scans without the in-row scan flag set. If the row size exceeds this limit,
+ a RowTooBigException is thrown to the client.
+ </para><formalpara><title>Default</title><para><varname>1073741824</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.thrift.minWorkerThreads"><glossterm><varname>hbase.thrift.minWorkerThreads</varname></glossterm><glossdef><para>The "core size" of the thread pool. New threads are created on every
+ connection until this many threads are created.</para><formalpara><title>Default</title><para><varname>16</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.thrift.maxWorkerThreads"><glossterm><varname>hbase.thrift.maxWorkerThreads</varname></glossterm><glossdef><para>The maximum size of the thread pool. When the pending request queue
+ overflows, new threads are created until the thread count reaches this maximum.
+ After that, the server starts dropping connections.</para><formalpara><title>Default</title><para><varname>1000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.thrift.maxQueuedRequests"><glossterm><varname>hbase.thrift.maxQueuedRequests</varname></glossterm><glossdef><para>The maximum number of pending Thrift connections waiting in the queue. If
+ there are no idle threads in the pool, the server queues requests. Only
+ when the queue overflows, new threads are added, up to
+ hbase.thrift.maxQueuedRequests threads.</para><formalpara><title>Default</title><para><varname>1000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.thrift.htablepool.size.max"><glossterm><varname>hbase.thrift.htablepool.size.max</varname></glossterm><glossdef><para>The upper bound for the table pool used in the Thrift gateway server.
+ Since this is per table name, we assume a single table, and so with the default
+ maximum of 1000 worker threads this is set to a matching number. For other workloads this number
+ can be adjusted as needed.
+ </para><formalpara><title>Default</title><para><varname>1000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.thrift.framed"><glossterm><varname>hbase.regionserver.thrift.framed</varname></glossterm><glossdef><para>Use Thrift TFramedTransport on the server side.
+ This is the recommended transport for thrift servers and requires a similar setting
+ on the client side. Changing this to false will select the default transport,
+ which is vulnerable to DoS when malformed requests are issued, due to THRIFT-601.
+ </para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.thrift.framed.max_frame_size_in_mb"><glossterm><varname>hbase.regionserver.thrift.framed.max_frame_size_in_mb</varname></glossterm><glossdef><para>Default frame size when using framed transport.</para><formalpara><title>Default</title><para><varname>2</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.thrift.compact"><glossterm><varname>hbase.regionserver.thrift.compact</varname></glossterm><glossdef><para>Use Thrift TCompactProtocol binary serialization protocol.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.data.umask.enable"><glossterm><varname>hbase.data.umask.enable</varname></glossterm><glossdef><para>If true, file permissions will be assigned
+ to the files written by the regionserver.</para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.data.umask"><glossterm><varname>hbase.data.umask</varname></glossterm><glossdef><para>File permissions that should be used to write data
+ files when hbase.data.umask.enable is true.</para><formalpara><title>Default</title><para><varname>000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.metrics.showTableName"><glossterm><varname>hbase.metrics.showTableName</varname></glossterm><glossdef><para>Whether to include the prefix "tbl.tablename" in per-column family metrics.
+ If true, for each metric M, per-cf metrics will be reported for tbl.T.cf.CF.M, if false,
+ per-cf metrics will be aggregated by column-family across tables, and reported for cf.CF.M.
+ In both cases, the aggregated metric M across tables and cfs will be reported.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.metrics.exposeOperationTimes"><glossterm><varname>hbase.metrics.exposeOperationTimes</varname></glossterm><glossdef><para>Whether to report metrics about time taken performing an
+ operation on the region server. Get, Put, Delete, Increment, and Append can all
+ have their times exposed through Hadoop metrics per CF and per region.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.snapshot.enabled"><glossterm><varname>hbase.snapshot.enabled</varname></glossterm><glossdef><para>Set to true to allow snapshots to be taken / restored / cloned.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.snapshot.restore.take.failsafe.snapshot"><glossterm><varname>hbase.snapshot.restore.take.failsafe.snapshot</varname></glossterm><glossdef><para>Set to true to take a snapshot before the restore operation.
+ The snapshot taken will be used in case of failure, to restore the previous state.
+ At the end of the restore operation, this snapshot will be deleted.</para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.snapshot.restore.failsafe.name"><glossterm><varname>hbase.snapshot.restore.failsafe.name</varname></glossterm><glossdef><para>Name of the failsafe snapshot taken by the restore operation.
+ You can use the {snapshot.name}, {table.name} and {restore.timestamp} variables
+ to create a name based on what you are restoring.</para><formalpara><title>Default</title><para><varname>hbase-failsafe-{snapshot.name}-{restore.timestamp}</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.server.compactchecker.interval.multiplier"><glossterm><varname>hbase.server.compactchecker.interval.multiplier</varname></glossterm><glossdef><para>The number that determines how often we scan to see if compaction is necessary.
+ Normally, compactions are done after some events (such as a memstore flush), but if
+ a region did not receive many writes for some time, or due to different compaction
+ policies, it may be necessary to check it periodically. The interval between checks is
+ hbase.server.compactchecker.interval.multiplier multiplied by
+ hbase.server.thread.wakefrequency.</para><formalpara><title>Default</title><para><varname>1000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.lease.recovery.timeout"><glossterm><varname>hbase.lease.recovery.timeout</varname></glossterm><glossdef><para>How long we wait on dfs lease recovery in total before giving up.</para><formalpara><title>Default</title><para><varname>900000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.lease.recovery.dfs.timeout"><glossterm><varname>hbase.lease.recovery.dfs.timeout</varname></glossterm><glossdef><para>How long between dfs recover lease invocations. Should be larger than the sum of
+ the time it takes for the namenode to issue a block recovery command as part of a
+ datanode heartbeat (dfs.heartbeat.interval), and the time it takes for the primary
+ datanode, performing the block recovery, to time out on a dead datanode (usually
+ dfs.client.socket-timeout). See the end of HBASE-8389 for more.</para><formalpara><title>Default</title><para><varname>64000</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.column.max.version"><glossterm><varname>hbase.column.max.version</varname></glossterm><glossdef><para>New column family descriptors will use this value as the default number of versions
+ to keep.</para><formalpara><title>Default</title><para><varname>1</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.dfs.client.read.shortcircuit.buffer.size"><glossterm><varname>hbase.dfs.client.read.shortcircuit.buffer.size</varname></glossterm><glossdef><para>If the DFSClient configuration
+ dfs.client.read.shortcircuit.buffer.size is unset, we will
+ use what is configured here as the short circuit read default
+ direct byte buffer size. The DFSClient native default is 1MB; HBase
+ keeps its HDFS files open, so the number of file blocks * 1MB soon
+ starts to add up and threatens OOME because of a shortage of
+ direct memory. So, we set it down from the default. Make
+ it greater than the default hbase block size set in the HColumnDescriptor,
+ which is usually 64k.
+ </para><formalpara><title>Default</title><para><varname>131072</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.checksum.verify"><glossterm><varname>hbase.regionserver.checksum.verify</varname></glossterm><glossdef><para>
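+ <!-- Editor's note on hbase.dfs.client.read.shortcircuit.buffer.size above: this
+ buffer size only matters when HDFS short-circuit reads are in use. A hypothetical
+ sketch, assuming the stock HDFS property name for enabling short-circuit reads:
+ <property>
+ <name>dfs.client.read.shortcircuit</name>
+ <value>true</value>
+ </property>
+ <property>
+ <name>hbase.dfs.client.read.shortcircuit.buffer.size</name>
+ <value>131072</value>
+ </property>
+ -->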
+ If set to true (the default), HBase verifies the checksums for hfile
+ blocks. HBase writes checksums inline with the data when it writes out
+ hfiles. HDFS (as of this writing) writes checksums to a separate file
+ from the data file, necessitating extra seeks. Setting this flag saves
+ some i/o. Checksum verification by HDFS will be internally disabled
+ on hfile streams when this flag is set. If the hbase-checksum verification
+ fails, we will switch back to using HDFS checksums (so do not disable HDFS
+ checksums! Besides, this feature applies to hfiles only, not to WALs).
+ If this parameter is set to false, then hbase will not verify any checksums;
+ instead, it will depend on checksum verification being done in the HDFS client.
+ </para><formalpara><title>Default</title><para><varname>true</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.bytes.per.checksum"><glossterm><varname>hbase.hstore.bytes.per.checksum</varname></glossterm><glossdef><para>
+ Number of bytes in a newly created checksum chunk for HBase-level
+ checksums in hfile blocks.
+ </para><formalpara><title>Default</title><para><varname>16384</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.hstore.checksum.algorithm"><glossterm><varname>hbase.hstore.checksum.algorithm</varname></glossterm><glossdef><para>
+ Name of an algorithm that is used to compute checksums. Possible values
+ are NULL, CRC32, CRC32C.
+ </para><formalpara><title>Default</title><para><varname>CRC32</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.status.published"><glossterm><varname>hbase.status.published</varname></glossterm><glossdef><para>
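+ <!-- Editor's note: a hypothetical hbase-site.xml sketch combining the checksum
+ settings above. CRC32C is listed as a possible value; whether it is faster than
+ CRC32 depends on hardware support, so this is illustrative, not a recommendation:
+ <property>
+ <name>hbase.regionserver.checksum.verify</name>
+ <value>true</value>
+ </property>
+ <property>
+ <name>hbase.hstore.checksum.algorithm</name>
+ <value>CRC32C</value>
+ </property>
+ -->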
+ This setting activates the publication by the master of the status of the region server.
+ When a region server dies and its recovery starts, the master will push this information
+ to the client application, to let it cut the connection immediately instead of waiting
+ for a timeout.
+ </para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.status.publisher.class"><glossterm><varname>hbase.status.publisher.class</varname></glossterm><glossdef><para>
+ Implementation of the status publication with a multicast message.
+ </para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.master.ClusterStatusPublisher$MulticastPublisher</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.status.listener.class"><glossterm><varname>hbase.status.listener.class</varname></glossterm><glossdef><para>
+ Implementation of the status listener with a multicast message.
+ </para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.client.ClusterStatusListener$MulticastListener</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.status.multicast.address.ip"><glossterm><varname>hbase.status.multicast.address.ip</varname></glossterm><glossdef><para>
+ Multicast address to use for the status publication by multicast.
+ </para><formalpara><title>Default</title><para><varname>226.1.1.3</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.status.multicast.address.port"><glossterm><varname>hbase.status.multicast.address.port</varname></glossterm><glossdef><para>
+ Multicast port to use for the status publication by multicast.
+ </para><formalpara><title>Default</title><para><varname>16100</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.dynamic.jars.dir"><glossterm><varname>hbase.dynamic.jars.dir</varname></glossterm><glossdef><para>
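+ <!-- Editor's note: the four status-publication properties above work together; a
+ minimal sketch of enabling the feature, leaving the multicast address and port at
+ their defaults unless the network requires otherwise:
+ <property>
+ <name>hbase.status.published</name>
+ <value>true</value>
+ </property>
+ -->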
+ The directory from which the custom filter/co-processor jars can be loaded
+ dynamically by the region server without the need to restart. However,
+ an already loaded filter/co-processor class would not be unloaded. See
+ HBASE-1936 for more details.
+ </para><formalpara><title>Default</title><para><varname>${hbase.rootdir}/lib</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.security.authentication"><glossterm><varname>hbase.security.authentication</varname></glossterm><glossdef><para>
+ Controls whether or not secure authentication is enabled for HBase.
+ Possible values are 'simple' (no authentication), and 'kerberos'.
+ </para><formalpara><title>Default</title><para><varname>simple</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.rest.filter.classes"><glossterm><varname>hbase.rest.filter.classes</varname></glossterm><glossdef><para>
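+ <!-- Editor's note: a minimal, hypothetical sketch of switching
+ hbase.security.authentication from its default; a real secure deployment also
+ needs the keytab and principal properties described earlier in this glossary:
+ <property>
+ <name>hbase.security.authentication</name>
+ <value>kerberos</value>
+ </property>
+ -->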
+ Servlet filters for REST service.
+ </para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.rest.filter.GzipFilter</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.master.loadbalancer.class"><glossterm><varname>hbase.master.loadbalancer.class</varname></glossterm><glossdef><para>
+ Class used to execute region balancing when the balancing period occurs.
+ See the class comment for more on how it works:
+ http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.html
+ It replaces the DefaultLoadBalancer as the default (which has since been renamed
+ the SimpleLoadBalancer).
+ </para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.security.exec.permission.checks"><glossterm><varname>hbase.security.exec.permission.checks</varname></glossterm><glossdef><para>
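+ <!-- Editor's note: a hypothetical override reverting to the simple balancer
+ mentioned in the hbase.master.loadbalancer.class entry above (class name assumed
+ from that description):
+ <property>
+ <name>hbase.master.loadbalancer.class</name>
+ <value>org.apache.hadoop.hbase.master.balancer.SimpleLoadBalancer</value>
+ </property>
+ -->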
+ If this setting is enabled and ACL based access control is active (the
+ AccessController coprocessor is installed either as a system coprocessor
+ or on a table as a table coprocessor) then you must grant all relevant
+ users EXEC privilege if they require the ability to execute coprocessor
+ endpoint calls. EXEC privilege, like any other permission, can be
+ granted globally to a user, or to a user on a per table or per namespace
+ basis. For more information on coprocessor endpoints, see the coprocessor
+ section of the HBase online manual. For more information on granting or
+ revoking permissions using the AccessController, see the security
+ section of the HBase online manual.
+ </para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.procedure.regionserver.classes"><glossterm><varname>hbase.procedure.regionserver.classes</varname></glossterm><glossdef><para>A comma-separated list of
+ org.apache.hadoop.hbase.procedure.RegionServerProcedureManager procedure managers that are
+ loaded by default on the active HRegionServer process. The lifecycle methods (init/start/stop)
+ will be called by the active HRegionServer process to perform the specific globally barriered
+ procedure. After implementing your own RegionServerProcedureManager, just put it in
+ HBase's classpath and add the fully qualified class name here.
+ </para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.procedure.master.classes"><glossterm><varname>hbase.procedure.master.classes</varname></glossterm><glossdef><para>A comma-separated list of
+ org.apache.hadoop.hbase.procedure.MasterProcedureManager procedure managers that are
+ loaded by default on the active HMaster process. A procedure is identified by its signature and
+ users can use the signature and an instant name to trigger an execution of a globally barriered
+ procedure. After implementing your own MasterProcedureManager, just put it in HBase's classpath
+ and add the fully qualified class name here.</para><formalpara><title>Default</title><para><varname/></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.coordinated.state.manager.class"><glossterm><varname>hbase.coordinated.state.manager.class</varname></glossterm><glossdef><para>Fully qualified name of class implementing coordinated state manager.</para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.coordination.ZkCoordinatedStateManager</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.regionserver.storefile.refresh.period"><glossterm><varname>hbase.regionserver.storefile.refresh.period</varname></glossterm><glossdef><para>
+ The period (in milliseconds) for refreshing the store files for the secondary regions. 0
+ means this feature is disabled. A secondary region sees new files (from flushes and
+ compactions) from the primary once it refreshes the list of files in the
+ region (there is no notification mechanism). But too-frequent refreshes might cause
+ extra Namenode pressure. If files cannot be refreshed for longer than the HFile TTL
+ (hbase.master.hfilecleaner.ttl), the requests are rejected. Configuring the HFile TTL to a larger
+ value is also recommended with this setting.
+ </para><formalpara><title>Default</title><para><varname>0</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.region.replica.replication.enabled"><glossterm><varname>hbase.region.replica.replication.enabled</varname></glossterm><glossdef><para>
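+ <!-- Editor's note: a hypothetical sketch enabling secondary-region file refresh
+ every 30 seconds; the value is illustrative only and should be weighed against
+ the Namenode-pressure and HFile TTL caveats described above:
+ <property>
+ <name>hbase.regionserver.storefile.refresh.period</name>
+ <value>30000</value>
+ </property>
+ -->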
+ Whether asynchronous WAL replication to the secondary region replicas is enabled or not.
+ If this is enabled, a replication peer named "region_replica_replication" will be created
+ which will tail the logs and replicate the mutations to region replicas for tables that
+ have region replication > 1. Once this is enabled, disabling this replication also
+ requires disabling the replication peer using the shell or the ReplicationAdmin java class.
+ Replication to secondary region replicas works over standard inter-cluster replication.
+ So if replication has been disabled explicitly, it also has to be re-enabled by setting "hbase.replication"
+ to true for this feature to work.
+ </para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.http.filter.initializers"><glossterm><varname>hbase.http.filter.initializers</varname></glossterm><glossdef><para>
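+ <!-- Editor's note: per the description above, asynchronous WAL replication to
+ region replicas also requires cluster replication itself to be on; a minimal
+ sketch combining the two settings:
+ <property>
+ <name>hbase.replication</name>
+ <value>true</value>
+ </property>
+ <property>
+ <name>hbase.region.replica.replication.enabled</name>
+ <value>true</value>
+ </property>
+ -->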
+ A comma separated list of class names. Each class in the list must extend
+ org.apache.hadoop.hbase.http.FilterInitializer. The corresponding Filter will
+ be initialized. Then, the Filter will be applied to all user-facing jsp
+ and servlet web pages.
+ The ordering of the list defines the ordering of the filters.
+ The default StaticUserWebFilter adds a user principal as defined by the
+ hbase.http.staticuser.user property.
+ </para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.http.lib.StaticUserWebFilter</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.security.visibility.mutations.checkauths"><glossterm><varname>hbase.security.visibility.mutations.checkauths</varname></glossterm><glossdef><para>
+ If this property is enabled, it will check whether the labels in the visibility expression are associated
+ with the user issuing the mutation.
+ </para><formalpara><title>Default</title><para><varname>false</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.http.max.threads"><glossterm><varname>hbase.http.max.threads</varname></glossterm><glossdef><para>
+ The maximum number of threads that the HTTP Server will create in its
+ ThreadPool.
+ </para><formalpara><title>Default</title><para><varname>10</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.replication.rpc.codec"><glossterm><varname>hbase.replication.rpc.codec</varname></glossterm><glossdef><para>
+ The codec that is to be used when replication is enabled so that
+ the tags are also replicated. This is used along with HFileV3, which
+ supports tags. If tags are not used, or if the hfile version used
+ is HFileV2, then KeyValueCodec can be used as the replication codec. Note that
+ using KeyValueCodecWithTags for replication when there are no tags causes no harm.
+ </para><formalpara><title>Default</title><para><varname>org.apache.hadoop.hbase.codec.KeyValueCodecWithTags</varname></para></formalpara></glossdef></glossentry><glossentry xml:id="hbase.http.staticuser.user"><glossterm><varname>hbase.http.staticuser.user</varname></glossterm><glossdef><para>
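+ <!-- Editor's note: if tags are not in use (e.g., with HFileV2), the description
+ above says KeyValueCodec may be substituted for the default; a hypothetical
+ override:
+ <property>
+ <name>hbase.replication.rpc.codec</name>
+ <value>org.apache.hadoop.hbase.codec.KeyValueCodec</value>
+ </property>
+ -->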
+ The user name to filter as, on static web filters
+ while rendering content. An example use is the HDFS
+ web UI (user to be used for browsing files).
+ </para><formalpara><title>Default</title><para><varname>dr.stack</varname></para></formalpara></glossdef></glossentry></glossary>
\ No newline at end of file
[6/8] hbase git commit: HBASE-12738 Chunk Ref Guide into
file-per-chapter
Posted by mi...@apache.org.
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/asf.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/asf.xml b/src/main/docbkx/asf.xml
new file mode 100644
index 0000000..1455b4a
--- /dev/null
+++ b/src/main/docbkx/asf.xml
@@ -0,0 +1,44 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<appendix
+ xml:id="asf"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+ <title>HBase and the Apache Software Foundation</title>
+ <para>HBase is a project in the Apache Software Foundation and, as such, the project has responsibilities to the ASF to ensure
+ a healthy project.</para>
+ <section xml:id="asf.devprocess"><title>ASF Development Process</title>
+ <para>See the <link xlink:href="http://www.apache.org/dev/#committers">Apache Development Process page</link>
+ for information on how the ASF is structured (e.g., PMC, committers, contributors), tips on contributing
+ and getting involved, and how open source works at the ASF.
+ </para>
+ </section>
+ <section xml:id="asf.reporting"><title>ASF Board Reporting</title>
+ <para>Once a quarter, each project in the ASF portfolio submits a report to the ASF board. This is done by the HBase project
+ lead and the committers. See <link xlink:href="http://www.apache.org/foundation/board/reporting">ASF board reporting</link> for more information.
+ </para>
+ </section>
+</appendix>
[3/8] hbase git commit: HBASE-12738 Chunk Ref Guide into
file-per-chapter
Posted by mi...@apache.org.
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/faq.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/faq.xml b/src/main/docbkx/faq.xml
new file mode 100644
index 0000000..d7bcb0c
--- /dev/null
+++ b/src/main/docbkx/faq.xml
@@ -0,0 +1,270 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<appendix
+ xml:id="faq"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+ <title >FAQ</title>
+ <qandaset defaultlabel='qanda'>
+ <qandadiv><title>General</title>
+ <qandaentry>
+ <question><para>When should I use HBase?</para></question>
+ <answer>
+ <para>See the <xref linkend="arch.overview" /> in the Architecture chapter.
+ </para>
+ </answer>
+ </qandaentry>
+ <qandaentry>
+ <question><para>Are there other HBase FAQs?</para></question>
+ <answer>
+ <para>
+ See the FAQ that is up on the wiki, <link xlink:href="http://wiki.apache.org/hadoop/Hbase/FAQ">HBase Wiki FAQ</link>.
+ </para>
+ </answer>
+ </qandaentry>
+ <qandaentry xml:id="faq.sql">
+ <question><para>Does HBase support SQL?</para></question>
+ <answer>
+ <para>
+ Not really. SQL-ish support for HBase via <link xlink:href="http://hive.apache.org/">Hive</link> is in development; however, Hive is based on MapReduce, which is not generally suitable for low-latency requests.
+ See the <xref linkend="datamodel" /> section for examples on the HBase client.
+ </para>
+ </answer>
+ </qandaentry>
+ <qandaentry>
+ <question><para>How can I find examples of NoSQL/HBase?</para></question>
+ <answer>
+ <para>See the link to the BigTable paper in <xref linkend="other.info" /> in the appendix, as
+ well as the other papers.
+ </para>
+ </answer>
+ </qandaentry>
+ <qandaentry>
+ <question><para>What is the history of HBase?</para></question>
+ <answer>
+ <para>See <xref linkend="hbase.history"/>.
+ </para>
+ </answer>
+ </qandaentry>
+ </qandadiv>
+ <qandadiv>
+ <title>Upgrading</title>
+ <qandaentry>
+ <question>
+ <para>How do I upgrade Maven-managed projects from HBase 0.94 to HBase 0.96+?</para>
+ </question>
+ <answer>
+ <para>In HBase 0.96, the project moved to a modular structure. Adjust your project's
+ dependencies to rely upon the <filename>hbase-client</filename> module or another
+ module as appropriate, rather than a single JAR. You can model your Maven dependency
+ after one of the following, depending on your targeted version of HBase. See <xref
+ linkend="upgrade0.96"/> or <xref linkend="upgrade0.98"/> for more
+ information.</para>
+ <example>
+ <title>Maven Dependency for HBase 0.98</title>
+ <programlisting language="xml"><![CDATA[
+<dependency>
+ <groupId>org.apache.hbase</groupId>
+ <artifactId>hbase-client</artifactId>
+ <version>0.98.5-hadoop2</version>
+</dependency>
+ ]]></programlisting>
+ </example>
+ <example>
+ <title>Maven Dependency for HBase 0.96</title>
+ <programlisting language="xml"><![CDATA[
+<dependency>
+ <groupId>org.apache.hbase</groupId>
+ <artifactId>hbase-client</artifactId>
+ <version>0.96.2-hadoop2</version>
+</dependency>
+ ]]></programlisting>
+ </example>
+ <example>
+ <title>Maven Dependency for HBase 0.94</title>
+ <programlisting language="xml"><![CDATA[
+<dependency>
+ <groupId>org.apache.hbase</groupId>
+ <artifactId>hbase</artifactId>
+ <version>0.94.3</version>
+</dependency>
+ ]]></programlisting>
+ </example>
+ </answer>
+ </qandaentry>
+ </qandadiv>
+ <qandadiv xml:id="faq.arch"><title>Architecture</title>
+ <qandaentry xml:id="faq.arch.regions">
+ <question><para>How does HBase handle Region-RegionServer assignment and locality?</para></question>
+ <answer>
+ <para>
+ See <xref linkend="regions.arch" />.
+ </para>
+ </answer>
+ </qandaentry>
+ </qandadiv>
+ <qandadiv xml:id="faq.config"><title>Configuration</title>
+ <qandaentry xml:id="faq.config.started">
+ <question><para>How can I get started with my first cluster?</para></question>
+ <answer>
+ <para>
+ See <xref linkend="quickstart" />.
+ </para>
+ </answer>
+ </qandaentry>
+ <qandaentry xml:id="faq.config.options">
+ <question><para>Where can I learn about the rest of the configuration options?</para></question>
+ <answer>
+ <para>
+ See <xref linkend="configuration" />.
+ </para>
+ </answer>
+ </qandaentry>
+ </qandadiv>
+ <qandadiv xml:id="faq.design"><title>Schema Design / Data Access</title>
+ <qandaentry xml:id="faq.design.schema">
+ <question><para>How should I design my schema in HBase?</para></question>
+ <answer>
+ <para>
+ See <xref linkend="datamodel" /> and <xref linkend="schema" />
+ </para>
+ </answer>
+ </qandaentry>
+ <qandaentry>
+ <question><para>
+ How can I store (fill in the blank) in HBase?
+ </para></question>
+ <answer>
+ <para>
+ See <xref linkend="supported.datatypes" />.
+ </para>
+ </answer>
+ </qandaentry>
+ <qandaentry xml:id="secondary.indices">
+ <question><para>
+ How can I handle secondary indexes in HBase?
+ </para></question>
+ <answer>
+ <para>
+ See <xref linkend="secondary.indexes" />
+ </para>
+ </answer>
+ </qandaentry>
+ <qandaentry xml:id="faq.changing.rowkeys">
+ <question><para>Can I change a table's rowkeys?</para></question>
+ <answer>
+ <para> This is a very common question. You can't. See <xref
+ linkend="changing.rowkeys"/>. </para>
+ </answer>
+ </qandaentry>
+ <qandaentry xml:id="faq.apis">
+ <question><para>What APIs does HBase support?</para></question>
+ <answer>
+ <para>
+ See <xref linkend="datamodel" />, <xref linkend="client" /> and <xref linkend="nonjava.jvm"/>.
+ </para>
+ </answer>
+ </qandaentry>
+ </qandadiv>
+ <qandadiv xml:id="faq.mapreduce"><title>MapReduce</title>
+ <qandaentry xml:id="faq.mapreduce.use">
+ <question><para>How can I use MapReduce with HBase?</para></question>
+ <answer>
+ <para>
+ See <xref linkend="mapreduce" />.
+ </para>
+ </answer>
+ </qandaentry>
+ </qandadiv>
+ <qandadiv><title>Performance and Troubleshooting</title>
+ <qandaentry>
+ <question><para>
+ How can I improve HBase cluster performance?
+ </para></question>
+ <answer>
+ <para>
+ See <xref linkend="performance" />.
+ </para>
+ </answer>
+ </qandaentry>
+ <qandaentry>
+ <question><para>
+ How can I troubleshoot my HBase cluster?
+ </para></question>
+ <answer>
+ <para>
+ See <xref linkend="trouble" />.
+ </para>
+ </answer>
+ </qandaentry>
+ </qandadiv>
+ <qandadiv xml:id="ec2"><title>Amazon EC2</title>
+ <qandaentry>
+ <question><para>
+ I am running HBase on Amazon EC2 and...
+ </para></question>
+ <answer>
+ <para>
+ EC2 issues are a special case. See the Troubleshooting (<xref linkend="trouble.ec2" />) and Performance (<xref linkend="perf.ec2" />) sections.
+ </para>
+ </answer>
+ </qandaentry>
+ </qandadiv>
+ <qandadiv xml:id="faq.operations"><title>Operations</title>
+ <qandaentry>
+ <question><para>
+ How do I manage my HBase cluster?
+ </para></question>
+ <answer>
+ <para>
+ See <xref linkend="ops_mgt" />.
+ </para>
+ </answer>
+ </qandaentry>
+ <qandaentry>
+ <question><para>
+ How do I back up my HBase cluster?
+ </para></question>
+ <answer>
+ <para>
+ See <xref linkend="ops.backup" />.
+ </para>
+ </answer>
+ </qandaentry>
+ </qandadiv>
+ <qandadiv><title>HBase in Action</title>
+ <qandaentry>
+ <question><para>Where can I find interesting videos and presentations on HBase?</para></question>
+ <answer>
+ <para>
+ See <xref linkend="other.info" />.
+ </para>
+ </answer>
+ </qandaentry>
+ </qandadiv>
+ </qandaset>
+
+</appendix>
[5/8] hbase git commit: HBASE-12738 Chunk Ref Guide into
file-per-chapter
Posted by mi...@apache.org.
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/book.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/book.xml b/src/main/docbkx/book.xml
index ee2d7fb..3010055 100644
--- a/src/main/docbkx/book.xml
+++ b/src/main/docbkx/book.xml
@@ -1,4 +1,5 @@
<?xml version="1.0" encoding="UTF-8"?>
+
<!--
/**
*
@@ -80,4926 +81,16 @@
</info>
<!--XInclude some chapters-->
- <xi:include
- xmlns:xi="http://www.w3.org/2001/XInclude"
- href="preface.xml" />
- <xi:include
- xmlns:xi="http://www.w3.org/2001/XInclude"
- href="getting_started.xml" />
- <xi:include
- xmlns:xi="http://www.w3.org/2001/XInclude"
- href="configuration.xml" />
- <xi:include
- xmlns:xi="http://www.w3.org/2001/XInclude"
- href="upgrading.xml" />
- <xi:include
- xmlns:xi="http://www.w3.org/2001/XInclude"
- href="shell.xml" />
-
- <chapter
- xml:id="datamodel">
- <title>Data Model</title>
- <para>In HBase, data is stored in tables, which have rows and columns. This terminology
- overlaps with that of relational databases (RDBMSs), but the analogy is not helpful. Instead, it can
- be helpful to think of an HBase table as a multi-dimensional map.</para>
- <variablelist>
- <title>HBase Data Model Terminology</title>
- <varlistentry>
- <term>Table</term>
- <listitem>
- <para>An HBase table consists of multiple rows.</para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Row</term>
- <listitem>
- <para>A row in HBase consists of a row key and one or more columns with values associated
- with them. Rows are sorted alphabetically by the row key as they are stored. For this
- reason, the design of the row key is very important. The goal is to store data in such a
- way that related rows are near each other. A common row key pattern is a website domain.
- If your row keys are domains, you should probably store them in reverse (org.apache.www,
- org.apache.mail, org.apache.jira). This way, all of the Apache domains are near each
- other in the table, rather than being spread out based on the first letter of the
- subdomain.</para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Column</term>
- <listitem>
- <para>A column in HBase consists of a column family and a column qualifier, which are
- delimited by a <literal>:</literal> (colon) character.</para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Column Family</term>
- <listitem>
- <para>Column families physically colocate a set of columns and their values, often for
- performance reasons. Each column family has a set of storage properties, such as whether
- its values should be cached in memory, how its data is compressed or its row keys are
- encoded, and others. Each row in a table has the same column
- families, though a given row might not store anything in a given column family.</para>
- <para>Column families are specified when you create your table, and influence the way your
- data is stored in the underlying filesystem. Therefore, the column families should be
- considered carefully during schema design.</para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Column Qualifier</term>
- <listitem>
- <para>A column qualifier is added to a column family to provide the index for a given
- piece of data. Given a column family <literal>content</literal>, a column qualifier
- might be <literal>content:html</literal>, and another might be
- <literal>content:pdf</literal>. Though column families are fixed at table creation,
- column qualifiers are mutable and may differ greatly between rows.</para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Cell</term>
- <listitem>
- <para>A cell is a combination of row, column family, and column qualifier, and contains a
- value and a timestamp, which represents the value's version.</para>
- <para>A cell's value is an uninterpreted array of bytes.</para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Timestamp</term>
- <listitem>
- <para>A timestamp is written alongside each value, and is the identifier for a given
- version of a value. By default, the timestamp represents the time on the RegionServer
- when the data was written, but you can specify a different timestamp value when you put
- data into the cell.</para>
- <caution>
- <para>Direct manipulation of timestamps is an advanced feature which is only exposed for
- special cases that are deeply integrated with HBase, and is discouraged in general.
- Encoding a timestamp at the application level is the preferred pattern.</para>
- </caution>
- <para>You can specify the maximum number of versions of a value that HBase retains, per column
- family. When the maximum number of versions is reached, the oldest versions are
- eventually deleted. By default, only the newest version is kept.</para>
- </listitem>
- </varlistentry>
- </variablelist>
-
- <section
- xml:id="conceptual.view">
- <title>Conceptual View</title>
- <para>You can read a very understandable explanation of the HBase data model in the blog post <link
- xlink:href="http://jimbojw.com/wiki/index.php?title=Understanding_Hbase_and_BigTable">Understanding
- HBase and BigTable</link> by Jim R. Wilson. Another good explanation is available in the
- PDF <link
- xlink:href="http://0b4af6cdc2f0c5998459-c0245c5c937c5dedcca3f1764ecc9b2f.r43.cf2.rackcdn.com/9353-login1210_khurana.pdf">Introduction
- to Basic Schema Design</link> by Amandeep Khurana. It may help to read different
- perspectives to get a solid understanding of HBase schema design. The linked articles cover
- the same ground as the information in this section.</para>
- <para> The following example is a slightly modified form of the one on page 2 of the <link
- xlink:href="http://research.google.com/archive/bigtable.html">BigTable</link> paper. There
- is a table called <varname>webtable</varname> that contains two rows
- (<literal>com.cnn.www</literal>
- and <literal>com.example.www</literal>), three column families named
- <varname>contents</varname>, <varname>anchor</varname>, and <varname>people</varname>. In
- this example, for the first row (<literal>com.cnn.www</literal>),
- <varname>anchor</varname> contains two columns (<varname>anchor:cnnsi.com</varname>,
- <varname>anchor:my.look.ca</varname>) and <varname>contents</varname> contains one column
- (<varname>contents:html</varname>). This example contains 5 versions of the row with the
- row key <literal>com.cnn.www</literal>, and one version of the row with the row key
- <literal>com.example.www</literal>. The <varname>contents:html</varname> column qualifier contains the entire
- HTML of a given website. Qualifiers of the <varname>anchor</varname> column family each
- contain the external site which links to the site represented by the row, along with the
- text it used in the anchor of its link. The <varname>people</varname> column family represents
- people associated with the site.
- </para>
- <note>
- <title>Column Names</title>
- <para> By convention, a column name is made of its column family prefix and a
- <emphasis>qualifier</emphasis>. For example, the column
- <emphasis>contents:html</emphasis> is made up of the column family
- <varname>contents</varname> and the <varname>html</varname> qualifier. The colon
- character (<literal>:</literal>) delimits the column family from the column family
- <emphasis>qualifier</emphasis>. </para>
- </note>
- <table
- frame="all">
- <title>Table <varname>webtable</varname></title>
- <tgroup
- cols="5"
- align="left"
- colsep="1"
- rowsep="1">
- <colspec
- colname="c1" />
- <colspec
- colname="c2" />
- <colspec
- colname="c3" />
- <colspec
- colname="c4" />
- <colspec
- colname="c5" />
- <thead>
- <row>
- <entry>Row Key</entry>
- <entry>Time Stamp</entry>
- <entry>ColumnFamily <varname>contents</varname></entry>
- <entry>ColumnFamily <varname>anchor</varname></entry>
- <entry>ColumnFamily <varname>people</varname></entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>"com.cnn.www"</entry>
- <entry>t9</entry>
- <entry />
- <entry><varname>anchor:cnnsi.com</varname> = "CNN"</entry>
- <entry />
- </row>
- <row>
- <entry>"com.cnn.www"</entry>
- <entry>t8</entry>
- <entry />
- <entry><varname>anchor:my.look.ca</varname> = "CNN.com"</entry>
- <entry />
- </row>
- <row>
- <entry>"com.cnn.www"</entry>
- <entry>t6</entry>
- <entry><varname>contents:html</varname> = "<html>..."</entry>
- <entry />
- <entry />
- </row>
- <row>
- <entry>"com.cnn.www"</entry>
- <entry>t5</entry>
- <entry><varname>contents:html</varname> = "<html>..."</entry>
- <entry />
- <entry />
- </row>
- <row>
- <entry>"com.cnn.www"</entry>
- <entry>t3</entry>
- <entry><varname>contents:html</varname> = "<html>..."</entry>
- <entry />
- <entry />
- </row>
- <row>
- <entry>"com.example.www"</entry>
- <entry>t5</entry>
- <entry><varname>contents:html</varname> = "<html>..."</entry>
- <entry></entry>
- <entry>people:author = "John Doe"</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <para>Cells in this table that appear to be empty take no space in HBase, and in fact do
- not exist at all. This is what makes HBase "sparse." A tabular view is not the only possible way to
- look at data in HBase, or even the most accurate. The following represents the same
- information as a multi-dimensional map. This is only a mock-up for illustrative
- purposes and may not be strictly accurate.</para>
- <programlisting><![CDATA[
-{
- "com.cnn.www": {
- contents: {
- t6: contents:html: "<html>..."
- t5: contents:html: "<html>..."
- t3: contents:html: "<html>..."
- }
- anchor: {
- t9: anchor:cnnsi.com = "CNN"
- t8: anchor:my.look.ca = "CNN.com"
- }
- people: {}
- }
- "com.example.www": {
- contents: {
- t5: contents:html: "<html>..."
- }
- anchor: {}
- people: {
- t5: people:author: "John Doe"
- }
- }
-}
- ]]></programlisting>
-
- </section>
- <section
- xml:id="physical.view">
- <title>Physical View</title>
- <para> Although at a conceptual level tables may be viewed as a sparse set of rows, they are
- physically stored by column family. A new column qualifier (column_family:column_qualifier)
- can be added to an existing column family at any time.</para>
- <table
- frame="all">
- <title>ColumnFamily <varname>anchor</varname></title>
- <tgroup
- cols="3"
- align="left"
- colsep="1"
- rowsep="1">
- <colspec
- colname="c1" />
- <colspec
- colname="c2" />
- <colspec
- colname="c3" />
- <thead>
- <row>
- <entry>Row Key</entry>
- <entry>Time Stamp</entry>
- <entry>Column Family <varname>anchor</varname></entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>"com.cnn.www"</entry>
- <entry>t9</entry>
- <entry><varname>anchor:cnnsi.com</varname> = "CNN"</entry>
- </row>
- <row>
- <entry>"com.cnn.www"</entry>
- <entry>t8</entry>
- <entry><varname>anchor:my.look.ca</varname> = "CNN.com"</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <table
- frame="all">
- <title>ColumnFamily <varname>contents</varname></title>
- <tgroup
- cols="3"
- align="left"
- colsep="1"
- rowsep="1">
- <colspec
- colname="c1" />
- <colspec
- colname="c2" />
- <colspec
- colname="c3" />
- <thead>
- <row>
- <entry>Row Key</entry>
- <entry>Time Stamp</entry>
- <entry>ColumnFamily <varname>contents</varname></entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>"com.cnn.www"</entry>
- <entry>t6</entry>
- <entry><varname>contents:html</varname> = "<html>..."</entry>
- </row>
- <row>
- <entry>"com.cnn.www"</entry>
- <entry>t5</entry>
- <entry><varname>contents:html</varname> = "<html>..."</entry>
- </row>
- <row>
- <entry>"com.cnn.www"</entry>
- <entry>t3</entry>
- <entry><varname>contents:html</varname> = "<html>..."</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <para>The empty cells shown in the
- conceptual view are not stored at all.
- Thus a request for the value of the <varname>contents:html</varname> column at time stamp
- <literal>t8</literal> would return no value. Similarly, a request for an
- <varname>anchor:my.look.ca</varname> value at time stamp <literal>t9</literal> would
- return no value. However, if no timestamp is supplied, the most recent value for a
- particular column would be returned. Given multiple versions, the most recent is also the
- first one found, since timestamps
- are stored in descending order. Thus a request for the values of all columns in the row
- <varname>com.cnn.www</varname> if no timestamp is specified would be: the value of
- <varname>contents:html</varname> from timestamp <literal>t6</literal>, the value of
- <varname>anchor:cnnsi.com</varname> from timestamp <literal>t9</literal>, the value of
- <varname>anchor:my.look.ca</varname> from timestamp <literal>t8</literal>. </para>
- <para>For more information about the internals of how Apache HBase stores data, see <xref
- linkend="regions.arch" />. </para>
- </section>
-
- <section
- xml:id="namespace">
- <title>Namespace</title>
- <para> A namespace is a logical grouping of tables, analogous to a database in relational
- database systems. This abstraction lays the groundwork for upcoming multi-tenancy-related
- features: <itemizedlist>
- <listitem>
- <para>Quota Management (HBASE-8410) - Restrict the amount of resources (i.e., regions,
- tables) a namespace can consume.</para>
- </listitem>
- <listitem>
- <para>Namespace Security Administration (HBASE-9206) - Provide another level of security
- administration for tenants.</para>
- </listitem>
- <listitem>
- <para>Region server groups (HBASE-6721) - A namespace/table can be pinned onto a subset
- of RegionServers, thus guaranteeing a coarse level of isolation.</para>
- </listitem>
- </itemizedlist>
- </para>
- <section
- xml:id="namespace_creation">
- <title>Namespace management</title>
- <para> A namespace can be created, removed or altered. Namespace membership is determined
- during table creation by specifying a fully-qualified table name of the form:</para>
-
- <programlisting language="xml"><![CDATA[<table namespace>:<table qualifier>]]></programlisting>
-
-
- <example>
- <title>Examples</title>
-
- <programlisting language="bourne">
-#Create a namespace
-create_namespace 'my_ns'
- </programlisting>
- <programlisting language="bourne">
-#create my_table in my_ns namespace
-create 'my_ns:my_table', 'fam'
- </programlisting>
- <programlisting language="bourne">
-#drop namespace
-drop_namespace 'my_ns'
- </programlisting>
- <programlisting language="bourne">
-#alter namespace
-alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
- </programlisting>
- </example>
- </section>
- <section
- xml:id="namespace_special">
- <title>Predefined namespaces</title>
- <para> There are two predefined special namespaces: </para>
- <itemizedlist>
- <listitem>
- <para>hbase - system namespace, used to contain HBase internal tables</para>
- </listitem>
- <listitem>
- <para>default - tables with no explicitly specified namespace will automatically fall into
- this namespace.</para>
- </listitem>
- </itemizedlist>
- <example>
- <title>Examples</title>
-
- <programlisting language="bourne">
-#namespace=foo and table qualifier=bar
-create 'foo:bar', 'fam'
-
-#namespace=default and table qualifier=bar
-create 'bar', 'fam'
-</programlisting>
- </example>
- </section>
- </section>
-
- <section
- xml:id="table">
- <title>Table</title>
- <para> Tables are declared up front at schema definition time. </para>
- </section>
-
- <section
- xml:id="row">
- <title>Row</title>
- <para>Row keys are uninterpreted bytes. Rows are lexicographically sorted, with the lowest
- order appearing first in a table. The empty byte array is used to denote both the start and
- end of a table's namespace.</para>
- </section>
-
- <section
- xml:id="columnfamily">
- <title>Column Family<indexterm><primary>Column Family</primary></indexterm></title>
- <para> Columns in Apache HBase are grouped into <emphasis>column families</emphasis>. All
- column members of a column family have the same prefix. For example, the columns
- <emphasis>courses:history</emphasis> and <emphasis>courses:math</emphasis> are both
- members of the <emphasis>courses</emphasis> column family. The colon character
- (<literal>:</literal>) delimits the column family from the <indexterm><primary>column
- family qualifier</primary><secondary>Column Family Qualifier</secondary></indexterm>.
- The column family prefix must be composed of <emphasis>printable</emphasis> characters. The
- qualifying tail, the column family <emphasis>qualifier</emphasis>, can be made of any
- arbitrary bytes. Column families must be declared up front at schema definition time whereas
- columns do not need to be defined at schema time but can be conjured on the fly while the
- table is up and running.</para>
- <para>Physically, all column family members are stored together on the filesystem. Because
- tunings and storage specifications are done at the column family level, it is advised that
- all column family members have the same general access pattern and size
- characteristics.</para>
-
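- <para>For illustration, a minimal sketch of declaring a column family at schema
- definition time using the Java admin API; the table and family names below are
- placeholders.</para>
-<programlisting language="java">
-Configuration conf = HBaseConfiguration.create();
-HBaseAdmin admin = new HBaseAdmin(conf);
-HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("courses_table"));
-desc.addFamily(new HColumnDescriptor("courses"));  // column family declared up front
-admin.createTable(desc);
-admin.close();
-</programlisting>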
- </section>
- <section
- xml:id="cells">
- <title>Cells<indexterm><primary>Cells</primary></indexterm></title>
- <para>A <emphasis>{row, column, version} </emphasis>tuple exactly specifies a
- <literal>cell</literal> in HBase. Cell content is uninterpreted bytes.</para>
- </section>
- <section
- xml:id="data_model_operations">
- <title>Data Model Operations</title>
- <para>The four primary data model operations are Get, Put, Scan, and Delete. Operations are
- applied via <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html">Table</link>
- instances.
- </para>
- <section
- xml:id="get">
- <title>Get</title>
- <para><link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</link>
- returns attributes for a specified row. Gets are executed via <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#get(org.apache.hadoop.hbase.client.Get)">
- Table.get</link>. </para>
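- <para>A minimal Get sketch, assuming a <code>table</code> reference and the
- <code>CF</code> and <code>ATTR</code> byte arrays used in the other examples in this
- chapter:</para>
-<programlisting language="java">
-Get get = new Get(Bytes.toBytes("row1"));
-get.addColumn(CF, ATTR);               // restrict the result to a single column
-Result r = table.get(get);
-byte[] value = r.getValue(CF, ATTR);   // most recent version of the cell
-</programlisting>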
- </section>
- <section
- xml:id="put">
- <title>Put</title>
- <para><link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html">Put</link>
- either adds new rows to a table (if the key is new) or can update existing rows (if the
- key already exists). Puts are executed via <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#put(org.apache.hadoop.hbase.client.Put)">
- Table.put</link> (writeBuffer) or <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch(java.util.List, java.lang.Object[])">
- Table.batch</link> (non-writeBuffer). </para>
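- <para>A minimal Put sketch, under the same assumptions as the Get sketch above:</para>
-<programlisting language="java">
-Put put = new Put(Bytes.toBytes("row1"));
-put.add(CF, ATTR, Bytes.toBytes("value1"));  // Put.add in the 0.94-0.98 API
-table.put(put);
-</programlisting>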
- </section>
- <section
- xml:id="scan">
- <title>Scans</title>
- <para><link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link>
- allows iteration over multiple rows for specified attributes. </para>
- <para>The following is an example of a Scan on a Table instance. Assume that a table is
- populated with rows with keys "row1", "row2", "row3", and then another set of rows with
- the keys "abc1", "abc2", and "abc3". The following example shows how to set a Scan
- instance to return the rows beginning with "row".</para>
-<programlisting language="java">
-public static final byte[] CF = "cf".getBytes();
-public static final byte[] ATTR = "attr".getBytes();
-...
-
-Table table = ... // instantiate a Table instance
-
-Scan scan = new Scan();
-scan.addColumn(CF, ATTR);
-scan.setRowPrefixFilter(Bytes.toBytes("row"));
-ResultScanner rs = table.getScanner(scan);
-try {
-  for (Result r = rs.next(); r != null; r = rs.next()) {
-    // process each Result here...
-  }
-} finally {
-  rs.close();  // always close the ResultScanner!
-}
-</programlisting>
- <para>Note that generally the easiest way to specify a specific stop point for a scan is by
- using the <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/InclusiveStopFilter.html">InclusiveStopFilter</link>
- class. </para>
- </section>
- <section
- xml:id="delete">
- <title>Delete</title>
- <para><link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Delete.html">Delete</link>
- removes a row from a table. Deletes are executed via <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete(org.apache.hadoop.hbase.client.Delete)">
- Table.delete</link>. </para>
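- <para>A minimal Delete sketch, under the same assumptions as the sketches
- above:</para>
-<programlisting language="java">
-Delete delete = new Delete(Bytes.toBytes("row1"));
-delete.deleteColumns(CF, ATTR);  // all versions of one column; omit to delete the whole row
-table.delete(delete);
-</programlisting>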
- <para>HBase does not modify data in place, and so deletes are handled by creating new
- markers called <emphasis>tombstones</emphasis>. These tombstones, along with the dead
- values, are cleaned up on major compactions. </para>
- <para>See <xref
- linkend="version.delete" /> for more information on deleting versions of columns, and
- see <xref
- linkend="compaction" /> for more information on compactions. </para>
-
- </section>
-
- </section>
-
-
- <section
- xml:id="versions">
- <title>Versions<indexterm><primary>Versions</primary></indexterm></title>
-
- <para>A <emphasis>{row, column, version} </emphasis>tuple exactly specifies a
- <literal>cell</literal> in HBase. It's possible to have an unbounded number of cells where
- the row and column are the same but the cell address differs only in its version
- dimension.</para>
-
- <para>While rows and column keys are expressed as bytes, the version is specified using a long
- integer. Typically this long contains time instances such as those returned by
- <code>java.util.Date.getTime()</code> or <code>System.currentTimeMillis()</code>, that is:
- <quote>the difference, measured in milliseconds, between the current time and midnight,
- January 1, 1970 UTC</quote>.</para>
-
- <para>The HBase version dimension is stored in decreasing order, so that when reading from a
- store file, the most recent values are found first.</para>
-
- <para>There is a lot of confusion over the semantics of <literal>cell</literal> versions in
- HBase. In particular:</para>
- <itemizedlist>
- <listitem>
- <para>If multiple writes to a cell have the same version, only the last written is
- fetchable.</para>
- </listitem>
-
- <listitem>
- <para>It is OK to write cells in a non-increasing version order.</para>
- </listitem>
- </itemizedlist>
-
- <para>Below we describe how the version dimension in HBase currently works. See <link
- xlink:href="https://issues.apache.org/jira/browse/HBASE-2406">HBASE-2406</link> for
- discussion of HBase versions. <link
- xlink:href="http://outerthought.org/blog/417-ot.html">Bending time in HBase</link>
- makes for a good read on the version, or time, dimension in HBase. It has more detail on
- versioning than is provided here. As of this writing, the limitation
- <emphasis>Overwriting values at existing timestamps</emphasis> mentioned in the
- article no longer holds in HBase. This section is basically a synopsis of this article
- by Bruno Dumon.</para>
-
- <section xml:id="specify.number.of.versions">
- <title>Specifying the Number of Versions to Store</title>
- <para>The maximum number of versions to store for a given column is part of the column
- schema and is specified at table creation, or via an <command>alter</command> command, via
- <code>HColumnDescriptor.DEFAULT_VERSIONS</code>. Prior to HBase 0.96, the default number
- of versions kept was <literal>3</literal>, but in 0.96 and newer has been changed to
- <literal>1</literal>.</para>
- <example>
- <title>Modify the Maximum Number of Versions for a Column</title>
- <para>This example uses HBase Shell to keep a maximum of 5 versions of column
- <code>f1</code>. You could also use <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html"
- >HColumnDescriptor</link>.</para>
- <screen><![CDATA[hbase> alter 't1', NAME => 'f1', VERSIONS => 5]]></screen>
- </example>
- <example>
- <title>Modify the Minimum Number of Versions for a Column</title>
- <para>You can also specify the minimum number of versions to store. By default, this is
- set to 0, which means the feature is disabled. The following example sets the minimum
- number of versions on field <code>f1</code> to <literal>2</literal>, via HBase Shell.
- You could also use <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html"
- >HColumnDescriptor</link>.</para>
- <screen><![CDATA[hbase> alter 't1', NAME => 'f1', MIN_VERSIONS => 2]]></screen>
- </example>
- <para>Starting with HBase 0.98.2, you can specify a global default for the maximum number of
- versions kept for all newly-created columns, by setting
- <option>hbase.column.max.version</option> in <filename>hbase-site.xml</filename>. See
- <xref linkend="hbase.column.max.version"/>.</para>
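- <para>For example, the following <filename>hbase-site.xml</filename> entry (the value
- is illustrative) would set a global default of three versions for newly-created
- columns:</para>
-<programlisting language="xml"><![CDATA[
-<property>
-  <name>hbase.column.max.version</name>
-  <value>3</value>
-</property>
-]]></programlisting>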
- </section>
-
- <section
- xml:id="versions.ops">
- <title>Versions and HBase Operations</title>
-
- <para>In this section we look at the behavior of the version dimension for each of the core
- HBase operations.</para>
-
- <section>
- <title>Get/Scan</title>
-
- <para>Gets are implemented on top of Scans. The below discussion of <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</link>
- applies equally to <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scans</link>.</para>
-
- <para>By default, i.e. if you specify no explicit version, when doing a
- <literal>get</literal>, the cell whose version has the largest value is returned
- (which may or may not be the latest one written, see later). The default behavior can be
- modified in the following ways:</para>
-
- <itemizedlist>
- <listitem>
- <para>to return more than one version, see <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setMaxVersions()">Get.setMaxVersions()</link></para>
- </listitem>
-
- <listitem>
- <para>to return versions other than the latest, see <link
- xlink:href="???">Get.setTimeRange()</link></para>
-
- <para>To retrieve the latest version that is less than or equal to a given value, thus
- giving the 'latest' state of the record at a certain point in time, just use a range
- from 0 to the desired version and set the max versions to 1 (see the sketch
- following this list).</para>
- </listitem>
- </itemizedlist>
-
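- <para>A sketch of such a point-in-time read follows; <code>pointInTime</code> is a
- placeholder for the desired timestamp. <code>setTimeRange</code> treats the maximum
- timestamp as exclusive, hence the <code>+ 1</code>.</para>
-<programlisting language="java">
-Get get = new Get(Bytes.toBytes("row1"));
-get.setTimeRange(0, pointInTime + 1);  // versions with timestamp <= pointInTime
-get.setMaxVersions(1);                 // keep only the newest version in the range
-Result r = table.get(get);
-</programlisting>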
- </section>
- <section
- xml:id="default_get_example">
- <title>Default Get Example</title>
- <para>The following Get will only retrieve the current version of the row.</para>
- <programlisting language="java">
-public static final byte[] CF = "cf".getBytes();
-public static final byte[] ATTR = "attr".getBytes();
-...
-Get get = new Get(Bytes.toBytes("row1"));
-Result r = table.get(get);
-byte[] b = r.getValue(CF, ATTR); // returns current version of value
-</programlisting>
- </section>
- <section
- xml:id="versioned_get_example">
- <title>Versioned Get Example</title>
- <para>The following Get will return the last 3 versions of the row.</para>
- <programlisting language="java">
-public static final byte[] CF = "cf".getBytes();
-public static final byte[] ATTR = "attr".getBytes();
-...
-Get get = new Get(Bytes.toBytes("row1"));
-get.setMaxVersions(3); // will return last 3 versions of row
-Result r = table.get(get);
-byte[] b = r.getValue(CF, ATTR); // returns current version of value
-List<KeyValue> kv = r.getColumn(CF, ATTR); // returns all versions of this column
-</programlisting>
- </section>
-
- <section>
- <title>Put</title>
-
- <para>Doing a put always creates a new version of a <literal>cell</literal>, at a certain
- timestamp. By default the system uses the server's <literal>currentTimeMillis</literal>,
- but you can specify the version (= the long integer) yourself, on a per-column level.
- This means you could assign a time in the past or the future, or use the long value for
- non-time purposes.</para>
-
- <para>To overwrite an existing value, do a put at exactly the same row, column, and
- version as that of the cell you would overshadow.</para>
- <section
- xml:id="implicit_version_example">
- <title>Implicit Version Example</title>
- <para>The following Put will be implicitly versioned by HBase with the current
- time.</para>
- <programlisting language="java">
-public static final byte[] CF = "cf".getBytes();
-public static final byte[] ATTR = "attr".getBytes();
-...
-Put put = new Put(Bytes.toBytes(row));
-put.add(CF, ATTR, Bytes.toBytes( data));
-table.put(put);
-</programlisting>
- </section>
- <section
- xml:id="explicit_version_example">
- <title>Explicit Version Example</title>
- <para>The following Put has the version timestamp explicitly set.</para>
- <programlisting language="java">
-public static final byte[] CF = "cf".getBytes();
-public static final byte[] ATTR = "attr".getBytes();
-...
-Put put = new Put( Bytes.toBytes(row));
-long explicitTimeInMs = 555; // just an example
-put.add(CF, ATTR, explicitTimeInMs, Bytes.toBytes(data));
-table.put(put);
-</programlisting>
- <para>Caution: the version timestamp is used internally by HBase for things like time-to-live
- calculations. It's usually best to avoid setting this timestamp yourself. Prefer using
- a separate timestamp attribute of the row, or make the timestamp part of the rowkey,
- or both. </para>
- </section>
-
- </section>
-
- <section
- xml:id="version.delete">
- <title>Delete</title>
-
- <para>There are three different types of internal delete markers. See Lars Hofhansl's blog
- for discussion of his attempt at adding another, <link
- xlink:href="http://hadoop-hbase.blogspot.com/2012/01/scanning-in-hbase.html">Scanning
- in HBase: Prefix Delete Marker</link>. </para>
- <itemizedlist>
- <listitem>
- <para>Delete: for a specific version of a column.</para>
- </listitem>
- <listitem>
- <para>Delete column: for all versions of a column.</para>
- </listitem>
- <listitem>
- <para>Delete family: for all columns of a particular ColumnFamily.</para>
- </listitem>
- </itemizedlist>
- <para>When deleting an entire row, HBase will internally create a tombstone for each
- ColumnFamily (i.e., not each individual column). </para>
- <para>Deletes work by creating <emphasis>tombstone</emphasis> markers. For example, let's
- suppose we want to delete a row. For this you can specify a version, or else by default
- the <literal>currentTimeMillis</literal> is used. What this means is <quote>delete all
- cells where the version is less than or equal to this version</quote>. HBase never
- modifies data in place, so for example a delete will not immediately delete (or mark as
- deleted) the entries in the storage file that correspond to the delete condition.
- Rather, a so-called <emphasis>tombstone</emphasis> is written, which will mask the
- deleted values. When HBase does a major compaction, the tombstones are processed to
- actually remove the dead values, together with the tombstones themselves. If the version
- you specified when deleting a row is larger than the version of any value in the row,
- then you can consider the complete row to be deleted.</para>
- <para>For an informative discussion on how deletes and versioning interact, see the thread <link
- xlink:href="http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/28421">Put w/
- timestamp -> Deleteall -> Put w/ timestamp fails</link> up on the user mailing
- list.</para>
- <para>Also see <xref
- linkend="keyvalue" /> for more information on the internal KeyValue format. </para>
- <para>Delete markers are purged during the next major compaction of the store, unless the
- <option>KEEP_DELETED_CELLS</option> option is set in the column family. To keep the
- deletes for a configurable amount of time, you can set the delete TTL via the
- <option>hbase.hstore.time.to.purge.deletes</option> property in
- <filename>hbase-site.xml</filename>. If
- <option>hbase.hstore.time.to.purge.deletes</option> is not set, or set to 0, all
- delete markers, including those with timestamps in the future, are purged during the
- next major compaction. Otherwise, a delete marker with a timestamp in the future is kept
- until the major compaction which occurs after the time represented by the marker's
- timestamp plus the value of <option>hbase.hstore.time.to.purge.deletes</option>, in
- milliseconds. </para>
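- <para>For example, the following <filename>hbase-site.xml</filename> entry (the value
- is illustrative) would retain delete markers until a major compaction at least ten
- minutes after the time they represent:</para>
-<programlisting language="xml"><![CDATA[
-<property>
-  <name>hbase.hstore.time.to.purge.deletes</name>
-  <value>600000</value>
-</property>
-]]></programlisting>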
- <note>
- <para>This behavior represents a fix for an unexpected change that was introduced in
- HBase 0.94, and was fixed in <link
- xlink:href="https://issues.apache.org/jira/browse/HBASE-10118">HBASE-10118</link>.
- The change has been backported to HBase 0.94 and newer branches.</para>
- </note>
- </section>
- </section>
-
- <section>
- <title>Current Limitations</title>
-
- <section>
- <title>Deletes mask Puts</title>
-
- <para>Deletes mask puts, even puts that happened after the delete
- was entered. See <link xlink:href="https://issues.apache.org/jira/browse/HBASE-2256"
- >HBASE-2256</link>. Remember that a delete writes a tombstone, which only
- disappears after the next major compaction has run. Suppose you do
- a delete of everything <= T. After this you do a new put with a
- timestamp <= T. This put, even if it happened after the delete,
- will be masked by the delete tombstone. Performing the put will not
- fail, but when you do a get you will notice the put had no
- effect. It will become visible again after the next major compaction has
- run. These issues should not be a problem if you use
- always-increasing versions for new puts to a row. But they can occur
- even if you do not care about time: just do delete and put
- immediately after each other, and there is some chance they happen
- within the same millisecond.</para>
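- <para>A sketch of the scenario described above, reusing the <code>table</code>,
- <code>CF</code>, and <code>ATTR</code> references from earlier examples and an
- explicit version for illustration:</para>
-<programlisting language="java">
-long t = 1000L;                               // an explicit version, for illustration
-Delete d = new Delete(Bytes.toBytes("row1"));
-d.setTimestamp(t);                            // tombstone masks all cells with version <= t
-table.delete(d);
-
-Put p = new Put(Bytes.toBytes("row1"));
-p.add(CF, ATTR, t, Bytes.toBytes("data"));    // put at a version <= t
-table.put(p);                                 // succeeds, but the value stays masked
-                                              // until the next major compaction runs
-</programlisting>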
- </section>
-
- <section
- xml:id="major.compactions.change.query.results">
- <title>Major compactions change query results</title>
-
- <para><quote>...create three cell versions at t1, t2 and t3, with a maximum-versions
- setting of 2. So when getting all versions, only the values at t2 and t3 will be
- returned. But if you delete the version at t2 or t3, the one at t1 will appear again.
- Obviously, once a major compaction has run, such behavior will not be the case
- anymore...</quote> (See <emphasis>Garbage Collection</emphasis> in <link
- xlink:href="http://outerthought.org/blog/417-ot.html">Bending time in
- HBase</link>.)</para>
- </section>
- </section>
- </section>
- <section xml:id="dm.sort">
- <title>Sort Order</title>
- <para>All data model operations in HBase return data in sorted order: first by row,
- then by ColumnFamily, followed by column qualifier, and finally timestamp (sorted
- in reverse, so newest records are returned first).
- </para>
- </section>
- <section xml:id="dm.column.metadata">
- <title>Column Metadata</title>
- <para>There is no store of column metadata outside of the internal KeyValue instances for a ColumnFamily.
- Thus, while HBase can support not only a large number of columns per row, but a heterogeneous set of columns
- between rows as well, it is your responsibility to keep track of the column names.
- </para>
- <para>The only way to get a complete set of columns that exist for a ColumnFamily is to process all the rows.
- For more information about how HBase stores data internally, see <xref linkend="keyvalue" />.
- </para>
- </section>
- <section xml:id="joins"><title>Joins</title>
- <para>Whether HBase supports joins is a common question on the dist-list, and there is a simple answer: it doesn't,
- at least not in the way that RDBMSs support them (e.g., with equi-joins or outer-joins in SQL). As has been illustrated
- in this chapter, the read data model operations in HBase are Get and Scan.
- </para>
- <para>However, that doesn't mean equivalent join functionality can't be supported in your application; you
- just have to do it yourself. The two primary strategies are denormalizing the data upon writing to HBase,
- or having lookup tables and doing the join between HBase tables in your application or MapReduce code (and as RDBMSs
- demonstrate, there are several strategies for this depending on the size of the tables, e.g., nested loops vs.
- hash-joins). So which is the best approach? It depends on what you are trying to do, and as such there isn't a single
- answer that works for every use case.
- </para>
- </section>
- <section xml:id="acid"><title>ACID</title>
- <para>See <link xlink:href="http://hbase.apache.org/acid-semantics.html">ACID Semantics</link>.
- Lars Hofhansl has also written a note on
- <link xlink:href="http://hadoop-hbase.blogspot.com/2012/03/acid-in-hbase.html">ACID in HBase</link>.</para>
- </section>
- </chapter> <!-- data model -->
-
- <!-- schema design -->
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="preface.xml"/>
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="getting_started.xml"/>
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="configuration.xml"/>
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="upgrading.xml"/>
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="shell.xml"/>
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="datamodel.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="schema_design.xml"/>
-
- <chapter
- xml:id="mapreduce">
- <title>HBase and MapReduce</title>
- <para>Apache MapReduce is a software framework used to analyze large amounts of data, and is
- the framework used most often with <link
- xlink:href="http://hadoop.apache.org/">Apache Hadoop</link>. MapReduce itself is out of the
- scope of this document. A good place to get started with MapReduce is <link
- xlink:href="http://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html" />. MapReduce version
- 2 (MR2) is now part of <link
- xlink:href="http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/">YARN</link>. </para>
-
- <para> This chapter discusses specific configuration steps you need to take to use MapReduce on
- data within HBase. In addition, it discusses other interactions and issues between HBase and
- MapReduce jobs.
- <note>
- <title>mapred and mapreduce</title>
- <para>There are two MapReduce packages in HBase, as in MapReduce itself: <filename>org.apache.hadoop.hbase.mapred</filename>
- and <filename>org.apache.hadoop.hbase.mapreduce</filename>. The former uses the old-style API and the latter
- the new style. The latter has more facilities, though you can usually find an equivalent in the older
- package. Pick the package that matches your MapReduce deployment. When in doubt or starting over, pick
- <filename>org.apache.hadoop.hbase.mapreduce</filename>. In the notes below, we refer to
- o.a.h.h.mapreduce but replace with the o.a.h.h.mapred if that is what you are using.
- </para>
- </note>
- </para>
-
- <section
- xml:id="hbase.mapreduce.classpath">
- <title>HBase, MapReduce, and the CLASSPATH</title>
- <para>By default, MapReduce jobs deployed to a MapReduce cluster do not have access to either
- the HBase configuration under <envar>$HBASE_CONF_DIR</envar> or the HBase classes.</para>
- <para>To give the MapReduce jobs the access they need, you could add
- <filename>hbase-site.xml</filename> to the
- <filename><replaceable>$HADOOP_HOME</replaceable>/conf/</filename> directory and the
- HBase JARs to the <filename><replaceable>$HADOOP_HOME</replaceable>/lib/</filename>
- directory, then copy these changes across your cluster. Alternatively, you could edit
- <filename><replaceable>$HADOOP_HOME</replaceable>/conf/hadoop-env.sh</filename> and add
- the HBase dependencies to the <envar>HADOOP_CLASSPATH</envar> variable. However, neither approach is
- recommended because it pollutes your Hadoop install with HBase references. It also
- requires you to restart the Hadoop cluster before Hadoop can use the HBase data.</para>
- <para> Since HBase 0.90.x, HBase adds its dependency JARs to the job configuration itself. The
- dependencies only need to be available on the local CLASSPATH. The following example runs
- the bundled HBase <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html">RowCounter</link>
- MapReduce job against a table named <systemitem>usertable</systemitem>. If you have not set
- the environment variables expected in the command (the parts prefixed by a
- <literal>$</literal> sign and curly braces), you can use the actual system paths instead.
- Be sure to use the correct version of the HBase JAR for your system. The backticks
- (<literal>`</literal> symbols) cause the shell to execute the sub-commands, setting the
- CLASSPATH as part of the command. This example assumes you use a BASH-compatible shell. </para>
- <screen language="bourne">$ <userinput>HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar rowcounter usertable</userinput></screen>
- <para>When the command runs, internally, the HBase JAR finds the dependencies it needs for
- ZooKeeper, Guava, and its other dependencies on the passed <envar>HADOOP_CLASSPATH</envar>
- and adds the JARs to the MapReduce job configuration. See the source at
- TableMapReduceUtil#addDependencyJars(org.apache.hadoop.mapreduce.Job) for how this is done. </para>
- <note>
- <para> The example may not work if you are running HBase from its build directory rather
- than an installed location. You may see an error like the following:</para>
- <screen>java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper</screen>
- <para>If this occurs, try modifying the command as follows, so that it uses the HBase JARs
- from the <filename>target/</filename> directory within the build environment.</para>
- <screen language="bourne">$ <userinput>HADOOP_CLASSPATH=${HBASE_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar:`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar rowcounter usertable</userinput></screen>
- </note>
- <caution>
- <title>Notice to MapReduce users of HBase 0.96.1 and above</title>
- <para>Some mapreduce jobs that use HBase fail to launch. The symptom is an exception similar
- to the following:</para>
- <screen>
-Exception in thread "main" java.lang.IllegalAccessError: class
- com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass
- com.google.protobuf.LiteralByteString
- at java.lang.ClassLoader.defineClass1(Native Method)
- at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
- at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
- at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
- at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
- at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
- at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
- at java.security.AccessController.doPrivileged(Native Method)
- at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
- at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
- at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
- at
- org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:818)
- at
- org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:433)
- at
- org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:186)
- at
- org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:147)
- at
- org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:270)
- at
- org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:100)
-...
-</screen>
- <para>This is caused by an optimization introduced in <link
- xlink:href="https://issues.apache.org/jira/browse/HBASE-9867">HBASE-9867</link> that
- inadvertently introduced a classloader dependency. </para>
- <para>This affects both jobs using the <code>-libjars</code> option and "fat jar" jobs, those
- which package their runtime dependencies in a nested <code>lib</code> folder.</para>
- <para>In order to satisfy the new classloader requirements, hbase-protocol.jar must be
- included in Hadoop's classpath. See <xref
- linkend="hbase.mapreduce.classpath" /> for current recommendations for resolving
- classpath errors. The following is included for historical purposes.</para>
- <para>This can be resolved system-wide by including a reference to the hbase-protocol.jar in
- hadoop's lib directory, via a symlink or by copying the jar into the new location.</para>
- <para>This can also be achieved on a per-job launch basis by including it in the
- <code>HADOOP_CLASSPATH</code> environment variable at job submission time. When
- launching jobs that package their dependencies, all three of the following job launching
- commands satisfy this requirement:</para>
- <screen language="bourne">
-$ <userinput>HADOOP_CLASSPATH=/path/to/hbase-protocol.jar:/path/to/hbase/conf hadoop jar MyJob.jar MyJobMainClass</userinput>
-$ <userinput>HADOOP_CLASSPATH=$(hbase mapredcp):/path/to/hbase/conf hadoop jar MyJob.jar MyJobMainClass</userinput>
-$ <userinput>HADOOP_CLASSPATH=$(hbase classpath) hadoop jar MyJob.jar MyJobMainClass</userinput>
- </screen>
- <para>For jars that do not package their dependencies, the following command structure is
- necessary:</para>
- <screen language="bourne">
-$ <userinput>HADOOP_CLASSPATH=$(hbase mapredcp):/etc/hbase/conf hadoop jar MyApp.jar MyJobMainClass -libjars $(hbase mapredcp | tr ':' ',')</userinput> ...
- </screen>
- <para>See also <link
- xlink:href="https://issues.apache.org/jira/browse/HBASE-10304">HBASE-10304</link> for
- further discussion of this issue.</para>
- </caution>
- </section>
-
- <section>
- <title>MapReduce Scan Caching</title>
- <para>TableMapReduceUtil now restores the option to set scanner caching (the number of rows
- which are cached before returning the result to the client) on the Scan object that is
- passed in. This functionality was lost due to a bug in HBase 0.95 (<link
- xlink:href="https://issues.apache.org/jira/browse/HBASE-11558">HBASE-11558</link>), which
- is fixed for HBase 0.98.5 and 0.96.3. The priority order for choosing the scanner caching is
- as follows:</para>
- <orderedlist>
- <listitem>
- <para>Caching settings which are set on the scan object.</para>
- </listitem>
- <listitem>
- <para>Caching settings which are specified via the configuration option
- <option>hbase.client.scanner.caching</option>, which can either be set manually in
- <filename>hbase-site.xml</filename> or via the helper method
- <code>TableMapReduceUtil.setScannerCaching()</code>.</para>
- </listitem>
- <listitem>
- <para>The default value <code>HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING</code>, which is set to
- <literal>100</literal>.</para>
- </listitem>
- </orderedlist>
- <para>Optimizing the caching settings is a balance between the time the client waits for a
- result and the number of sets of results the client needs to receive. If the caching setting
- is too large, the client could end up waiting for a long time or the request could even time
- out. If the setting is too small, the scan needs to return results in several pieces.
- If you think of the scan as a shovel, a bigger cache setting is analogous to a bigger
- shovel, and a smaller cache setting is equivalent to more shoveling in order to fill the
- bucket.</para>
- <para>The list of priorities mentioned above allows you to set a reasonable default, and
- override it for specific operations.</para>
- <para>See the API documentation for <link
- xlink:href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html"
- >Scan</link> for more details.</para>
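- <para>As a sketch, assuming an already-configured <code>job</code>, the two override
- points look like this:</para>
-<programlisting language="java">
-// Job-wide setting (priority 2); overrides hbase.client.scanner.caching.
-TableMapReduceUtil.setScannerCaching(job, 200);
-
-// Per-Scan setting (priority 1); wins over the job-wide value.
-Scan scan = new Scan();
-scan.setCaching(500);
-</programlisting>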
- </section>
-
- <section>
- <title>Bundled HBase MapReduce Jobs</title>
- <para>The HBase JAR also serves as a Driver for some bundled MapReduce jobs. To learn about
- the bundled MapReduce jobs, run the following command.</para>
-
- <screen language="bourne">$ <userinput>${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar</userinput>
-<computeroutput>An example program must be given as the first argument.
-Valid program names are:
- copytable: Export a table from local cluster to peer cluster
- completebulkload: Complete a bulk data load.
- export: Write table data to HDFS.
- import: Import data written by Export.
- importtsv: Import data in TSV format.
- rowcounter: Count rows in HBase table</computeroutput>
- </screen>
- <para>Each of the valid program names are bundled MapReduce jobs. To run one of the jobs,
- model your command after the following example.</para>
- <screen language="bourne">$ <userinput>${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar rowcounter myTable</userinput></screen>
- </section>
-
- <section>
- <title>HBase as a MapReduce Job Data Source and Data Sink</title>
- <para>HBase can be used as a data source, <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormat.html">TableInputFormat</link>,
- and data sink, <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html">TableOutputFormat</link>
- or <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/MultiTableOutputFormat.html">MultiTableOutputFormat</link>,
- for MapReduce jobs. When writing MapReduce jobs that read or write HBase, it is advisable to
- subclass <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapper.html">TableMapper</link>
- and/or <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableReducer.html">TableReducer</link>.
- See the do-nothing pass-through classes <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/IdentityTableMapper.html">IdentityTableMapper</link>
- and <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/IdentityTableReducer.html">IdentityTableReducer</link>
- for basic usage. For a more involved example, see <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html">RowCounter</link>
- or review the <code>org.apache.hadoop.hbase.mapreduce.TestTableMapReduce</code> unit test. </para>
- <para>If you run MapReduce jobs that use HBase as source or sink, you need to specify the source and
- sink table and column names in your configuration.</para>
-
- <para>When you read from HBase, the <code>TableInputFormat</code> requests the list of regions
- from HBase and makes a map task for each region, or <code>mapreduce.job.maps</code> map tasks,
- whichever is smaller. If your job only has two maps,
- raise <code>mapreduce.job.maps</code> to a number greater than the number of regions. Maps
- will run on the adjacent TaskTracker if you are running a TaskTracker and RegionServer per
- node. When writing to HBase, it may make sense to avoid the Reduce step and write back into
- HBase from within your map. This approach works when your job does not need the sort and
- collation that MapReduce does on the map-emitted data. On insert, HBase 'sorts' so there is
- no point double-sorting (and shuffling data around your MapReduce cluster) unless you need
- to. If you do not need the Reduce, your map might emit counts of records processed for
- reporting at the end of the job, or set the number of Reduces to zero and use
- TableOutputFormat. If running the Reduce step makes sense in your case, you should typically
- use multiple reducers so that load is spread across the HBase cluster.</para>
-
- <para>A new HBase partitioner, the <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/HRegionPartitioner.html">HRegionPartitioner</link>,
- can run as many reducers as there are existing regions. The HRegionPartitioner is suitable
- when your table is large and your upload will not greatly alter the number of existing
- regions upon completion. Otherwise use the default partitioner. </para>
- </section>
-
- <section>
- <title>Writing HFiles Directly During Bulk Import</title>
- <para>If you are importing into a new table, you can bypass the HBase API and write your
- content directly to the filesystem, formatted into HBase data files (HFiles). Your import
- will run faster, perhaps an order of magnitude faster. For more on how this mechanism works,
- see <xref
- linkend="arch.bulk.load" />.</para>
- </section>
-
- <section>
- <title>RowCounter Example</title>
- <para>The included <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html">RowCounter</link>
- MapReduce job uses <code>TableInputFormat</code> and does a count of all rows in the specified
- table. To run it, use the following command: </para>
- <screen language="bourne">$ <userinput>./bin/hadoop jar hbase-X.X.X.jar</userinput></screen>
- <para>This will
- invoke the HBase MapReduce Driver class. Select <literal>rowcounter</literal> from the choice of jobs
- offered. This will print rowcounter usage advice to standard output. Specify the table name,
- column to count, and output
- directory. If you have classpath errors, see <xref linkend="hbase.mapreduce.classpath" />.</para>
- </section>
-
- <section
- xml:id="splitter">
- <title>Map-Task Splitting</title>
- <section
- xml:id="splitter.default">
- <title>The Default HBase MapReduce Splitter</title>
- <para>When <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormat.html">TableInputFormat</link>
- is used to source an HBase table in a MapReduce job, its splitter will make a map task for
- each region of the table. Thus, if there are 100 regions in the table, there will be 100
- map-tasks for the job - regardless of how many column families are selected in the
- Scan.</para>
- </section>
- <section
- xml:id="splitter.custom">
- <title>Custom Splitters</title>
- <para>For those interested in implementing custom splitters, see the method
- <code>getSplits</code> in <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.html">TableInputFormatBase</link>.
- That is where the logic for map-task assignment resides. </para>
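- <para>A minimal sketch of a custom splitter that post-processes the default
- per-region splits; the class name is a placeholder:</para>
-<programlisting language="java">
-public class MyTableInputFormat extends TableInputFormat {
-  @Override
-  public List<InputSplit> getSplits(JobContext context) throws IOException {
-    List<InputSplit> splits = super.getSplits(context);
-    // Filter, merge, or re-balance the per-region splits here.
-    return splits;
-  }
-}
-</programlisting>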
- </section>
- </section>
- <section
- xml:id="mapreduce.example">
- <title>HBase MapReduce Examples</title>
- <section
- xml:id="mapreduce.example.read">
- <title>HBase MapReduce Read Example</title>
- <para>The following is an example of using HBase as a MapReduce source in a read-only manner.
- Specifically, there is a Mapper instance but no Reducer, and nothing is being emitted from
- the Mapper. The job would be defined as follows...</para>
- <programlisting language="java">
-Configuration config = HBaseConfiguration.create();
-Job job = new Job(config, "ExampleRead");
-job.setJarByClass(MyReadJob.class); // class that contains mapper
-
-Scan scan = new Scan();
-scan.setCaching(500); // 1 is the default in Scan, which will be bad for MapReduce jobs
-scan.setCacheBlocks(false); // don't set to true for MR jobs
-// set other scan attrs
-...
-
-TableMapReduceUtil.initTableMapperJob(
- tableName, // input HBase table name
- scan, // Scan instance to control CF and attribute selection
- MyMapper.class, // mapper
- null, // mapper output key
- null, // mapper output value
- job);
-job.setOutputFormatClass(NullOutputFormat.class); // because we aren't emitting anything from mapper
-
-boolean b = job.waitForCompletion(true);
-if (!b) {
- throw new IOException("error with job!");
-}
- </programlisting>
- <para>...and the mapper instance would extend <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapper.html">TableMapper</link>...</para>
- <programlisting language="java">
-public static class MyMapper extends TableMapper<Text, Text> {
-
- public void map(ImmutableBytesWritable row, Result value, Context context) throws InterruptedException, IOException {
- // process data for the row from the Result instance.
- }
-}
- </programlisting>
- </section>
- <section
- xml:id="mapreduce.example.readwrite">
- <title>HBase MapReduce Read/Write Example</title>
- <para>The following is an example of using HBase both as a source and as a sink with
- MapReduce. This example will simply copy data from one table to another.</para>
- <programlisting language="java">
-Configuration config = HBaseConfiguration.create();
-Job job = new Job(config,"ExampleReadWrite");
-job.setJarByClass(MyReadWriteJob.class); // class that contains mapper
-
-Scan scan = new Scan();
-scan.setCaching(500); // 1 is the default in Scan, which will be bad for MapReduce jobs
-scan.setCacheBlocks(false); // don't set to true for MR jobs
-// set other scan attrs
-
-TableMapReduceUtil.initTableMapperJob(
- sourceTable, // input table
- scan, // Scan instance to control CF and attribute selection
- MyMapper.class, // mapper class
- null, // mapper output key
- null, // mapper output value
- job);
-TableMapReduceUtil.initTableReducerJob(
- targetTable, // output table
- null, // reducer class
- job);
-job.setNumReduceTasks(0);
-
-boolean b = job.waitForCompletion(true);
-if (!b) {
- throw new IOException("error with job!");
-}
- </programlisting>
- <para>An explanation is required of what <classname>TableMapReduceUtil</classname> is doing,
- especially with the reducer. <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html">TableOutputFormat</link>
- is being used as the outputFormat class, and several parameters are being set on the
- config (e.g., TableOutputFormat.OUTPUT_TABLE), as well as setting the reducer output key
- to <classname>ImmutableBytesWritable</classname> and reducer value to
- <classname>Writable</classname>. These could be set by the programmer on the job and
- conf, but <classname>TableMapReduceUtil</classname> tries to make things easier.</para>
- <para>The following is the example mapper, which will create a <classname>Put</classname>
- matching the input <classname>Result</classname> and emit it. Note: this is what the
- CopyTable utility does. </para>
- <programlisting language="java">
-public static class MyMapper extends TableMapper<ImmutableBytesWritable, Put> {
-
- public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
- // this example is just copying the data from the source table...
- context.write(row, resultToPut(row,value));
- }
-
- private static Put resultToPut(ImmutableBytesWritable key, Result result) throws IOException {
- Put put = new Put(key.get());
- for (KeyValue kv : result.raw()) {
- put.add(kv);
- }
- return put;
- }
-}
- </programlisting>
- <para>There isn't actually a reducer step, so <classname>TableOutputFormat</classname> takes
- care of sending the <classname>Put</classname> to the target table. </para>
- <para>This is just an example, developers could choose not to use
- <classname>TableOutputFormat</classname> and connect to the target table themselves.
- </para>
- </section>
- <section
- xml:id="mapreduce.example.readwrite.multi">
- <title>HBase MapReduce Read/Write Example With Multi-Table Output</title>
- <para>TODO: example for <classname>MultiTableOutputFormat</classname>. </para>
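-        <para>Pending a full example, the general shape is sketched below (table names are
-          illustrative): the job uses <classname>MultiTableOutputFormat</classname>, and each
-          write is routed by using the target table name as the output key.</para>
-        <programlisting language="java">
-job.setOutputFormatClass(MultiTableOutputFormat.class);
-...
-// inside the mapper or reducer, choose the destination table per write
-ImmutableBytesWritable summaryTable = new ImmutableBytesWritable(Bytes.toBytes("summaryTable"));
-ImmutableBytesWritable auditTable = new ImmutableBytesWritable(Bytes.toBytes("auditTable"));
-context.write(summaryTable, put1);  // put1 goes to 'summaryTable'
-context.write(auditTable, put2);    // put2 goes to 'auditTable'
-        </programlisting>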
- </section>
- <section
- xml:id="mapreduce.example.summary">
- <title>HBase MapReduce Summary to HBase Example</title>
- <para>The following example uses HBase as a MapReduce source and sink with a summarization
- step. This example will count the number of distinct instances of a value in a table and
- write those summarized counts in another table.
- <programlisting language="java">
-Configuration config = HBaseConfiguration.create();
-Job job = new Job(config,"ExampleSummary");
-job.setJarByClass(MySummaryJob.class); // class that contains mapper and reducer
-
-Scan scan = new Scan();
-scan.setCaching(500); // 1 is the default in Scan, which will be bad for MapReduce jobs
-scan.setCacheBlocks(false); // don't set to true for MR jobs
-// set other scan attrs
-
-TableMapReduceUtil.initTableMapperJob(
- sourceTable, // input table
- scan, // Scan instance to control CF and attribute selection
- MyMapper.class, // mapper class
- Text.class, // mapper output key
- IntWritable.class, // mapper output value
- job);
-TableMapReduceUtil.initTableReducerJob(
- targetTable, // output table
- MyTableReducer.class, // reducer class
- job);
-job.setNumReduceTasks(1); // at least one, adjust as required
-
-boolean b = job.waitForCompletion(true);
-if (!b) {
- throw new IOException("error with job!");
-}
- </programlisting>
- In this example mapper, a column with a String value is chosen as the value to summarize
- upon. This value is used as the key to emit from the mapper, and an
- <classname>IntWritable</classname> represents an instance counter.
- <programlisting language="java">
-public static class MyMapper extends TableMapper<Text, IntWritable> {
- public static final byte[] CF = "cf".getBytes();
- public static final byte[] ATTR1 = "attr1".getBytes();
-
- private final IntWritable ONE = new IntWritable(1);
- private Text text = new Text();
-
- public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
- String val = new String(value.getValue(CF, ATTR1));
- text.set(val); // we can only emit Writables...
-
- context.write(text, ONE);
- }
-}
- </programlisting>
- In the reducer, the "ones" are counted (just like any other MR example that does this),
- and then a <classname>Put</classname> is emitted.
- <programlisting language="java">
-public static class MyTableReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {
- public static final byte[] CF = "cf".getBytes();
- public static final byte[] COUNT = "count".getBytes();
-
- public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
- int i = 0;
- for (IntWritable val : values) {
- i += val.get();
- }
- Put put = new Put(Bytes.toBytes(key.toString()));
- put.add(CF, COUNT, Bytes.toBytes(i));
-
- context.write(null, put);
- }
-}
- </programlisting>
- </para>
- </section>
- <section
- xml:id="mapreduce.example.summary.file">
- <title>HBase MapReduce Summary to File Example</title>
- <para>This is very similar to the summary example above, with the exception that it uses
- HBase as a MapReduce source but HDFS as the sink. The differences are in the job setup and
- in the reducer. The mapper remains the same. </para>
- <programlisting language="java">
-Configuration config = HBaseConfiguration.create();
-Job job = new Job(config,"ExampleSummaryToFile");
-job.setJarByClass(MySummaryFileJob.class); // class that contains mapper and reducer
-
-Scan scan = new Scan();
-scan.setCaching(500); // 1 is the default in Scan, which will be bad for MapReduce jobs
-scan.setCacheBlocks(false); // don't set to true for MR jobs
-// set other scan attrs
-
-TableMapReduceUtil.initTableMapperJob(
- sourceTable, // input table
- scan, // Scan instance to control CF and attribute selection
- MyMapper.class, // mapper class
- Text.class, // mapper output key
- IntWritable.class, // mapper output value
- job);
-job.setReducerClass(MyReducer.class); // reducer class
-job.setNumReduceTasks(1); // at least one, adjust as required
-FileOutputFormat.setOutputPath(job, new Path("/tmp/mr/mySummaryFile")); // adjust directories as required
-
-boolean b = job.waitForCompletion(true);
-if (!b) {
- throw new IOException("error with job!");
-}
- </programlisting>
- <para>As stated above, the previous Mapper can run unchanged with this example. As for the
- Reducer, it is a "generic" Reducer instead of extending TableReducer and emitting
- Puts.</para>
- <programlisting language="java">
- public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
-
- public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
- int i = 0;
- for (IntWritable val : values) {
- i += val.get();
- }
- context.write(key, new IntWritable(i));
- }
-}
- </programlisting>
- </section>
- <section
- xml:id="mapreduce.example.summary.noreducer">
- <title>HBase MapReduce Summary to HBase Without Reducer</title>
- <para>It is also possible to perform summaries without a reducer, by using HBase itself as
- the aggregation mechanism. </para>
- <para>An HBase target table would need to exist for the job summary. The Table method
- <code>incrementColumnValue</code> would be used to atomically increment values. From a
- performance perspective, it might make sense to keep a Map of values with their counts to
- be incremented for each map-task, and make one update per key during the <code>
- cleanup</code> method of the mapper. However, your mileage may vary depending on the
- number of rows to be processed and the number of unique keys. </para>
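-        <para>A minimal sketch of this pattern follows (class, table, and column names are
-          illustrative; connection setup and error handling are omitted):</para>
-        <programlisting language="java">
-public static class MyIncrementingMapper extends TableMapper<Text, Text> {
-  public static final byte[] CF = "cf".getBytes();
-  public static final byte[] COUNT = "count".getBytes();
-
-  private Table summaryTable;  // the pre-existing target table, obtained in setup (omitted)
-  private Map<String, Long> counts = new HashMap<String, Long>();
-
-  public void map(ImmutableBytesWritable row, Result value, Context context) {
-    String key = ...;  // derive the summary key from the Result
-    Long current = counts.get(key);
-    counts.put(key, current == null ? 1 : current + 1);
-  }
-
-  public void cleanup(Context context) throws IOException {
-    // one atomic update per distinct key, instead of one per input row
-    for (Map.Entry<String, Long> e : counts.entrySet()) {
-      summaryTable.incrementColumnValue(Bytes.toBytes(e.getKey()), CF, COUNT, e.getValue());
-    }
-  }
-}
-        </programlisting>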
- <para>In the end, the summary results are in HBase. </para>
- </section>
- <section
- xml:id="mapreduce.example.summary.rdbms">
- <title>HBase MapReduce Summary to RDBMS</title>
- <para>Sometimes it is more appropriate to generate summaries to an RDBMS. For these cases,
- it is possible to generate summaries directly to an RDBMS via a custom reducer. The
- <code>setup</code> method can connect to an RDBMS (the connection information can be
- passed via custom parameters in the context) and the cleanup method can close the
- connection. </para>
- <para>It is critical to understand that the number of reducers for the job affects the
- summarization implementation, and you'll have to design this into your reducer.
- Specifically, decide whether it is designed to run as a singleton (one reducer) or with
- multiple reducers. Neither is right or wrong; it depends on your use case. Recognize that
- the more reducers that are assigned to the job, the more simultaneous connections to the
- RDBMS will be created - this will scale, but only to a point. </para>
- <programlisting language="java">
- public static class MyRdbmsReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
-
- private Connection c = null;
-
- public void setup(Context context) {
- // create DB connection...
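-    // e.g., with plain JDBC (the property name below is hypothetical):
-    // c = DriverManager.getConnection(context.getConfiguration().get("summary.jdbc.url"));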
- }
-
- public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
- // do summarization
- // in this example the keys are Text, but this is just an example
- }
-
- public void cleanup(Context context) {
- // close db connection
- }
-
-}
- </programlisting>
- <para>In the end, the summary results are written to your RDBMS table(s). </para>
- </section>
-
- </section>
- <!-- mr examples -->
- <section
- xml:id="mapreduce.htable.access">
- <title>Accessing Other HBase Tables in a MapReduce Job</title>
- <para>Although the framework currently allows one HBase table as input to a MapReduce job,
- other HBase tables can be accessed as lookup tables, etc., in a MapReduce job by creating
- a Table instance in the setup method of the Mapper.
- <programlisting language="java">public class MyMapper extends TableMapper<Text, LongWritable> {
- private Table myOtherTable;
-
- public void setup(Context context) {
- // In here create a Connection to the cluster and save it or use the Connection
- // from the existing table
- myOtherTable = connection.getTable("myOtherTable");
- }
-
- public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
- // process Result...
- // use 'myOtherTable' for lookups
- }
-}
- </programlisting>
- </para>
- </section>
- <section
- xml:id="mapreduce.specex">
- <title>Speculative Execution</title>
- <para>It is generally advisable to turn off speculative execution for MapReduce jobs that use
- HBase as a source. This can be done on a per-Job basis through properties, or for the
- entire cluster. Especially for longer running jobs, speculative execution will create
- duplicate map-tasks which will double-write your data to HBase; this is probably not what
- you want. </para>
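-    <para>For example, to disable speculative execution on a per-job basis (these are the
-      Hadoop 2 property names; older versions use the <code>mapred.*</code>
-      equivalents):</para>
-    <programlisting language="java">
-job.getConfiguration().setBoolean("mapreduce.map.speculative", false);
-job.getConfiguration().setBoolean("mapreduce.reduce.speculative", false);
-    </programlisting>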
- <para>See <xref
- linkend="spec.ex" /> for more information. </para>
- </section>
- </chapter> <!-- mapreduce -->
-
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="mapreduce.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="security.xml" />
-
- <chapter xml:id="architecture">
- <title>Architecture</title>
- <section xml:id="arch.overview">
- <title>Overview</title>
- <section xml:id="arch.overview.nosql">
- <title>NoSQL?</title>
- <para>HBase is a type of "NoSQL" database. "NoSQL" is a general term meaning that the database isn't an RDBMS which
- supports SQL as its primary access language, but there are many types of NoSQL databases: BerkeleyDB is an
- example of a local NoSQL database, whereas HBase is very much a distributed database. Technically speaking,
- HBase is really more a "Data Store" than "Data Base" because it lacks many of the features you find in an RDBMS,
- such as typed columns, secondary indexes, triggers, and advanced query languages, etc.
- </para>
- <para>However, HBase has many features which support both linear and modular scaling. HBase clusters expand
- by adding RegionServers that are hosted on commodity class servers. If a cluster expands from 10 to 20
- RegionServers, for example, it doubles in terms of both storage and processing capacity.
- RDBMS can scale well, but only up to a point - specifically, the size of a single database server - and for the best
- performance requires specialized hardware and storage devices. HBase features of note are:
- <itemizedlist>
- <listitem><para>Strongly consistent reads/writes: HBase is not an "eventually consistent" DataStore. This
- makes it very suitable for tasks such as high-speed counter aggregation.</para> </listitem>
- <listitem><para>Automatic sharding: HBase tables are distributed on the cluster via regions, and regions are
- automatically split and re-distributed as your data grows.</para></listitem>
- <listitem><para>Automatic RegionServer failover</para></listitem>
- <listitem><para>Hadoop/HDFS Integration: HBase supports HDFS out of the box as its distributed file system.</para></listitem>
- <listitem><para>MapReduce: HBase supports massively parallelized processing via MapReduce for using HBase as both
- source and sink.</para></listitem>
- <listitem><para>Java Client API: HBase supports an easy to use Java API for programmatic access.</para></listitem>
- <listitem><para>Thrift/REST API: HBase also supports Thrift and REST for non-Java front-ends.</para></listitem>
- <listitem><para>Block Cache and Bloom Filters: HBase supports a Block Cache and Bloom Filters for high volume query optimization.</para></listitem>
- <listitem><para>Operational Management: HBase provides built-in web pages for operational insight as well as JMX metrics.</para></listitem>
- </itemizedlist>
- </para>
- </section>
-
- <section xml:id="arch.overview.when">
- <title>When Should I Use HBase?</title>
- <para>HBase isn't suitable for every problem.</para>
- <para>First, make sure you have enough data. If you have hundreds of millions or billions of rows, then
- HBase is a good candidate. If you only have a few thousand/million rows, then using a traditional RDBMS
- might be a better choice due to the fact that all of your data might wind up on a single node (or two) and
- the rest of the cluster may be sitting idle.
- </para>
- <para>Second, make sure you can live without all the extra features that an RDBMS provides (e.g., typed columns,
- secondary indexes, transactions, advanced query languages, etc.) An application built against an RDBMS cannot be
- "ported" to HBase by simply changing a JDBC driver, for example. Consider moving from an RDBMS to HBase as a
- complete redesign as opposed to a port.
- </para>
- <para>Third, make sure you have enough hardware. Even HDFS doesn't do well with anything less than
- 5 DataNodes (due to things such as HDFS block replication which has a default of 3), plus a NameNode.
- </para>
- <para>HBase can run quite well stand-alone on a laptop - but this should be considered a development
- configuration only.
- </para>
- </section>
- <section xml:id="arch.overview.hbasehdfs">
- <title>What Is The Difference Between HBase and Hadoop/HDFS?</title>
- <para><link xlink:href="http://hadoop.apache.org/hdfs/">HDFS</link> is a distributed file system that is well suited for the storage of large files.
- Its documentation states that it is not, however, a general purpose file system, and does not provide fast individual record lookups in files.
- HBase, on the other hand, is built on top of HDFS and provides fast record lookups (and updates) for large tables.
- This can sometimes be a point of conceptual confusion. HBase internally puts your data in indexed "StoreFiles" that exist
- on HDFS for high-speed lookups. See the <xref linkend="datamodel" /> and the rest of this chapter for more information on how HBase achieves its goals.
- </para>
- </section>
- </section>
-
- <section
- xml:id="arch.catalog">
- <title>Catalog Tables</title>
- <para>The catalog table <code>hbase:meta</code> exists as an HBase table and is filtered out of the HBase
- shell's <code>list</code> command, but is in fact a table just like any other. </para>
- <section
- xml:id="arch.catalog.root">
- <title>-ROOT-</title>
- <note>
- <para>The <code>-ROOT-</code> table was removed in HBase 0.96.0. Information here should
- be considered historical.</para>
- </note>
- <para>The <code>-ROOT-</code> table kept track of the location of the
- <code>.META</code> table (the previous name for the table now called <code>hbase:meta</code>) prior to HBase
- 0.96. The <code>-ROOT-</code> table structure was as follows: </para>
- <itemizedlist>
- <title>Key</title>
- <listitem>
- <para>.META. region key (<code>.META.,,1</code>)</para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <title>Values</title>
- <listitem>
- <para><code>info:regioninfo</code> (serialized <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html">HRegionInfo</link>
- instance of hbase:meta)</para>
- </listitem>
- <listitem>
- <para><code>info:server</code> (server:port of the RegionServer holding
- hbase:meta)</para>
- </listitem>
- <listitem>
- <para><code>info:serverstartcode</code> (start-time of the RegionServer process holding
- hbase:meta)</para>
- </listitem>
- </itemizedlist>
- </section>
- <section
- xml:id="arch.catalog.meta">
- <title>hbase:meta</title>
- <para>The <code>hbase:meta</code> table (previously called <code>.META.</code>) keeps a list
- of all regions in the system. The location of <code>hbase:meta</code> was previously
- tracked within the <code>-ROOT-</code> table, but is now stored in Zookeeper.</para>
- <para>The <code>hbase:meta</code> table structure is as follows: </para>
- <itemizedlist>
- <title>Key</title>
- <listitem>
- <para>Region key of the format (<code>[table],[region start key],[region
- id]</code>)</para>
- </listitem>
- </itemizedlist>
- <itemizedlist>
- <title>Values</title>
- <listitem>
- <para><code>info:regioninfo</code> (serialized <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html">
- HRegionInfo</link> instance for this region)</para>
- </listitem>
- <listitem>
- <para><code>info:server</code> (server:port of the RegionServer containing this
- region)</para>
- </listitem>
- <listitem>
- <para><code>info:serverstartcode</code> (start-time of the RegionServer process
- containing this region)</para>
- </listitem>
- </itemizedlist>
- <para>When a table is in the process of splitting, two other columns will be created, called
- <code>info:splitA</code> and <code>info:splitB</code>. These columns represent the two
- daughter regions. The values for these columns are also serialized HRegionInfo instances.
- After the region has been split, eventually this row will be deleted. </para>
- <note>
- <title>Note on HRegionInfo</title>
- <para>The empty key is used to denote table start and table end. A region with an empty
- start key is the first region in a table. If a region has both an empty start and an
- empty end key, it is the only region in the table. </para>
- </note>
- <para>In the (hopefully unlikely) event that programmatic processing of catalog metadata is
- required, see the <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/Writables.html#getHRegionInfo%28byte[]%29">Writables</link>
- utility. </para>
- </section>
- <section
- xml:id="arch.catalog.startup">
- <title>Startup Sequencing</title>
- <para>First, the location of <code>hbase:meta</code> is looked up in Zookeeper. Next,
- <code>hbase:meta</code> is updated with server and startcode values.</para>
- <para>For information on region-RegionServer assignment, see <xref
- linkend="regions.arch.assignment" />. </para>
- </section>
- </section> <!-- catalog -->
-
- <section
- xml:id="client">
- <title>Client</title>
- <para>The HBase client finds the RegionServers that are serving the particular row range of
- interest. It does this by querying the <code>hbase:meta</code> table. See <xref
- linkend="arch.catalog.meta" /> for details. After locating the required region(s), the
- client contacts the RegionServer serving that region, rather than going through the master,
- and issues the read or write request. This information is cached in the client so that
- subsequent requests need not go through the lookup process. Should a region be reassigned
- either by the master load balancer or because a RegionServer has died, the client will
- requery the catalog tables to determine the new location of the user region. </para>
-
- <para>See <xref
- linkend="master.runtime" /> for more information about the impact of the Master on HBase
- Client communication. </para>
- <para>Administrative functions are done via an instance of <link
- xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html">Admin</link>
- </para>
-
- <section
- xml:id="client.connections">
- <title>Cluster Connections</title>
- <para>The API changed in HBase 1.0. It has been cleaned up, and users are returned
- interfaces to work against rather than particular types. In HBase 1.0,
- obtain a cluster Connection from ConnectionFactory and thereafter, get from it
- instances of Table, Admin, and RegionLocator on an as-needed basis. When done, close
- the obtained instances. Finally, be sure to clean up your Connection instance before
- exiting. Connections are heavyweigh
<TRUNCATED>
[7/8] hbase git commit: HBASE-12738 Chunk Ref Guide into
file-per-chapter
Posted by mi...@apache.org.
http://git-wip-us.apache.org/repos/asf/hbase/blob/a1fe1e09/src/main/docbkx/architecture.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/architecture.xml b/src/main/docbkx/architecture.xml
new file mode 100644
index 0000000..16b298a
--- /dev/null
+++ b/src/main/docbkx/architecture.xml
@@ -0,0 +1,3489 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<chapter
+ xml:id="architecture"
+ version="5.0"
+ xmlns="http://docbook.org/ns/docbook"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+ xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:svg="http://www.w3.org/2000/svg"
+ xmlns:m="http://www.w3.org/1998/Math/MathML"
+ xmlns:html="http://www.w3.org/1999/xhtml"
+ xmlns:db="http://docbook.org/ns/docbook">
+ <!--/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+
+ <title>Architecture</title>
+ <section xml:id="arch.overview">
+ <title>Overview</title>
+ <section xml:id="arch.overview.nosql">
+ <title>NoSQL?</title>
+ <para>HBase is a type of "NoSQL" database. "NoSQL" is a general term meaning that the database isn't an RDBMS which
+ supports SQL as its primary access language, but there are many types of NoSQL databases: BerkeleyDB is an
+ example of a local NoSQL database, whereas HBase is very much a distributed database. Technically speaking,
+ HBase is really more a "Data Store" than "Data Base" because it lacks many of the features you find in an RDBMS,
+ such as typed columns, secondary indexes, triggers, and advanced query languages, etc.
+ </para>
+ <para>However, HBase has many features which support both linear and modular scaling. HBase clusters expand
+ by adding RegionServers that are hosted on commodity class servers. If a cluster expands from 10 to 20
+ RegionServers, for example, it doubles in terms of both storage and processing capacity.
+ RDBMS can scale well, but only up to a point - specifically, the size of a single database server - and for the best
+ performance requires specialized hardware and storage devices. HBase features of note are:
+ <itemizedlist>
+ <listitem><para>Strongly consistent reads/writes: HBase is not an "eventually consistent" DataStore. This
+ makes it very suitable for tasks such as high-speed counter aggregation.</para> </listitem>
+ <listitem><para>Automatic sharding: HBase tables are distributed on the cluster via regions, and regions are
+ automatically split and re-distributed as your data grows.</para></listitem>
+ <listitem><para>Automatic RegionServer failover</para></listitem>
+ <listitem><para>Hadoop/HDFS Integration: HBase supports HDFS out of the box as its distributed file system.</para></listitem>
+ <listitem><para>MapReduce: HBase supports massively parallelized processing via MapReduce for using HBase as both
+ source and sink.</para></listitem>
+ <listitem><para>Java Client API: HBase supports an easy to use Java API for programmatic access.</para></listitem>
+ <listitem><para>Thrift/REST API: HBase also supports Thrift and REST for non-Java front-ends.</para></listitem>
+ <listitem><para>Block Cache and Bloom Filters: HBase supports a Block Cache and Bloom Filters for high volume query optimization.</para></listitem>
+ <listitem><para>Operational Management: HBase provides built-in web pages for operational insight as well as JMX metrics.</para></listitem>
+ </itemizedlist>
+ </para>
+ </section>
+
+ <section xml:id="arch.overview.when">
+ <title>When Should I Use HBase?</title>
+ <para>HBase isn't suitable for every problem.</para>
+ <para>First, make sure you have enough data. If you have hundreds of millions or billions of rows, then
+ HBase is a good candidate. If you only have a few thousand/million rows, then using a traditional RDBMS
+ might be a better choice due to the fact that all of your data might wind up on a single node (or two) and
+ the rest of the cluster may be sitting idle.
+ </para>
+ <para>Second, make sure you can live without all the extra features that an RDBMS provides (e.g., typed columns,
+ secondary indexes, transactions, advanced query languages, etc.) An application built against an RDBMS cannot be
+ "ported" to HBase by simply changing a JDBC driver, for example. Consider moving from an RDBMS to HBase as a
+ complete redesign as opposed to a port.
+ </para>
+ <para>Third, make sure you have enough hardware. Even HDFS doesn't do well with anything less than
+ 5 DataNodes (due to things such as HDFS block replication which has a default of 3), plus a NameNode.
+ </para>
+ <para>HBase can run quite well stand-alone on a laptop - but this should be considered a development
+ configuration only.
+ </para>
+ </section>
+ <section xml:id="arch.overview.hbasehdfs">
+ <title>What Is The Difference Between HBase and Hadoop/HDFS?</title>
+ <para><link xlink:href="http://hadoop.apache.org/hdfs/">HDFS</link> is a distributed file system that is well suited for the storage of large files.
+ Its documentation states that it is not, however, a general purpose file system, and does not provide fast individual record lookups in files.
+ HBase, on the other hand, is built on top of HDFS and provides fast record lookups (and updates) for large tables.
+ This can sometimes be a point of conceptual confusion. HBase internally puts your data in indexed "StoreFiles" that exist
+ on HDFS for high-speed lookups. See the <xref linkend="datamodel" /> and the rest of this chapter for more information on how HBase achieves its goals.
+ </para>
+ </section>
+ </section>
+
+ <section
+ xml:id="arch.catalog">
+ <title>Catalog Tables</title>
+ <para>The catalog table <code>hbase:meta</code> exists as an HBase table and is filtered out of the HBase
+ shell's <code>list</code> command, but is in fact a table just like any other. </para>
+ <section
+ xml:id="arch.catalog.root">
+ <title>-ROOT-</title>
+ <note>
+ <para>The <code>-ROOT-</code> table was removed in HBase 0.96.0. Information here should
+ be considered historical.</para>
+ </note>
+ <para>The <code>-ROOT-</code> table kept track of the location of the
+ <code>.META</code> table (the previous name for the table now called <code>hbase:meta</code>) prior to HBase
+ 0.96. The <code>-ROOT-</code> table structure was as follows: </para>
+ <itemizedlist>
+ <title>Key</title>
+ <listitem>
+ <para>.META. region key (<code>.META.,,1</code>)</para>
+ </listitem>
+ </itemizedlist>
+
+ <itemizedlist>
+ <title>Values</title>
+ <listitem>
+ <para><code>info:regioninfo</code> (serialized <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html">HRegionInfo</link>
+ instance of hbase:meta)</para>
+ </listitem>
+ <listitem>
+ <para><code>info:server</code> (server:port of the RegionServer holding
+ hbase:meta)</para>
+ </listitem>
+ <listitem>
+ <para><code>info:serverstartcode</code> (start-time of the RegionServer process holding
+ hbase:meta)</para>
+ </listitem>
+ </itemizedlist>
+ </section>
+ <section
+ xml:id="arch.catalog.meta">
+ <title>hbase:meta</title>
+ <para>The <code>hbase:meta</code> table (previously called <code>.META.</code>) keeps a list
+ of all regions in the system. The location of <code>hbase:meta</code> was previously
+ tracked within the <code>-ROOT-</code> table, but is now stored in Zookeeper.</para>
+ <para>The <code>hbase:meta</code> table structure is as follows: </para>
+ <itemizedlist>
+ <title>Key</title>
+ <listitem>
+ <para>Region key of the format (<code>[table],[region start key],[region
+ id]</code>)</para>
+ </listitem>
+ </itemizedlist>
+ <itemizedlist>
+ <title>Values</title>
+ <listitem>
+ <para><code>info:regioninfo</code> (serialized <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html">
+ HRegionInfo</link> instance for this region)</para>
+ </listitem>
+ <listitem>
+ <para><code>info:server</code> (server:port of the RegionServer containing this
+ region)</para>
+ </listitem>
+ <listitem>
+ <para><code>info:serverstartcode</code> (start-time of the RegionServer process
+ containing this region)</para>
+ </listitem>
+ </itemizedlist>
+ <para>When a table is in the process of splitting, two other columns will be created, called
+ <code>info:splitA</code> and <code>info:splitB</code>. These columns represent the two
+ daughter regions. The values for these columns are also serialized HRegionInfo instances.
+ After the region has been split, eventually this row will be deleted. </para>
+ <note>
+ <title>Note on HRegionInfo</title>
+ <para>The empty key is used to denote table start and table end. A region with an empty
+ start key is the first region in a table. If a region has both an empty start and an
+ empty end key, it is the only region in the table. </para>
+ </note>
+ <para>In the (hopefully unlikely) event that programmatic processing of catalog metadata is
+ required, see the <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/Writables.html#getHRegionInfo%28byte[]%29">Writables</link>
+ utility. </para>
+ </section>
+ <section
+ xml:id="arch.catalog.startup">
+ <title>Startup Sequencing</title>
+ <para>First, the location of <code>hbase:meta</code> is looked up in Zookeeper. Next,
+ <code>hbase:meta</code> is updated with server and startcode values.</para>
+ <para>For information on region-RegionServer assignment, see <xref
+ linkend="regions.arch.assignment" />. </para>
+ </section>
+ </section> <!-- catalog -->
+
+ <section
+ xml:id="client">
+ <title>Client</title>
+ <para>The HBase client finds the RegionServers that are serving the particular row range of
+ interest. It does this by querying the <code>hbase:meta</code> table. See <xref
+ linkend="arch.catalog.meta" /> for details. After locating the required region(s), the
+ client contacts the RegionServer serving that region, rather than going through the master,
+ and issues the read or write request. This information is cached in the client so that
+ subsequent requests need not go through the lookup process. Should a region be reassigned
+ either by the master load balancer or because a RegionServer has died, the client will
+ requery the catalog tables to determine the new location of the user region. </para>
+
+ <para>See <xref
+ linkend="master.runtime" /> for more information about the impact of the Master on HBase
+ Client communication. </para>
+ <para>Administrative functions are done via an instance of <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html">Admin</link>
+ </para>
+
+ <section
+ xml:id="client.connections">
+ <title>Cluster Connections</title>
+ <para>The API changed in HBase 1.0. It has been cleaned up, and users are returned
+ interfaces to work against rather than particular types. In HBase 1.0,
+ obtain a cluster Connection from ConnectionFactory and thereafter, get from it
+ instances of Table, Admin, and RegionLocator on an as-needed basis. When done, close
+ the obtained instances. Finally, be sure to clean up your Connection instance before
+ exiting. Connections are heavyweight objects. Create once and keep an instance around.
+ Table, Admin and RegionLocator instances are lightweight. Create as you go and then
+ let go as soon as you are done by closing them. See the
+ <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/package-summary.html">Client Package Javadoc Description</link> for example usage of the new HBase 1.0 API.</para>
+
+ <para>For connection configuration information, see <xref linkend="client_dependencies" />. </para>
+
+ <para><emphasis><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html">Table</link>
+ instances are not thread-safe</emphasis>. Only one thread can use an instance of Table at
+ any given time. When creating Table instances, it is advisable to use the same <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HBaseConfiguration">HBaseConfiguration</link>
+ instance. This will ensure sharing of ZooKeeper and socket instances to the RegionServers
+ which is usually what you want. For example, this is preferred:</para>
+ <programlisting language="java">HBaseConfiguration conf = HBaseConfiguration.create();
+HTable table1 = new HTable(conf, "myTable");
+HTable table2 = new HTable(conf, "myTable");</programlisting>
+ <para>as opposed to this:</para>
+ <programlisting language="java">HBaseConfiguration conf1 = HBaseConfiguration.create();
+HTable table1 = new HTable(conf1, "myTable");
+HBaseConfiguration conf2 = HBaseConfiguration.create();
+HTable table2 = new HTable(conf2, "myTable");</programlisting>
+
+ <para>For more information about how connections are handled in the HBase client,
+ see <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HConnectionManager.html">HConnectionManager</link>.
+ </para>
+ <section xml:id="client.connection.pooling"><title>Connection Pooling</title>
+ <para>For applications which require high-end multithreaded access (e.g., web-servers or application servers that may serve many application threads
+ in a single JVM), you can pre-create an <classname>HConnection</classname>, as shown in
+ the following example:</para>
+ <example>
+ <title>Pre-Creating a <code>HConnection</code></title>
+ <programlisting language="java">// Create a connection to the cluster.
+HConnection connection = HConnectionManager.createConnection(Configuration);
+HTableInterface table = connection.getTable("myTable");
+// use table as needed, the table returned is lightweight
+table.close();
+// use the connection for other access to the cluster
+connection.close();</programlisting>
+ </example>
+ <para>Constructing an HTableInterface implementation is very lightweight, and resources
+ are controlled.</para>
+ <warning>
+ <title><code>HTablePool</code> is Deprecated</title>
+ <para>Previous versions of this guide discussed <code>HTablePool</code>, which was
+ deprecated in HBase 0.94, 0.95, and 0.96, and removed in 0.98.1, by <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-6580">HBASE-6500</link>.
+ Please use <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HConnection.html"><code>HConnection</code></link> instead.</para>
+ </warning>
+ </section>
+ </section>
+ <section xml:id="client.writebuffer"><title>WriteBuffer and Batch Methods</title>
+ <para>If <xref linkend="perf.hbase.client.autoflush" /> is turned off on
+ <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html">HTable</link>,
+ <classname>Put</classname>s are sent to RegionServers when the writebuffer
+ is filled. The writebuffer is 2MB by default. Before an HTable instance is
+ discarded, either <methodname>close()</methodname> or
+ <methodname>flushCommits()</methodname> should be invoked so Puts
+ will not be lost.
+ </para>
+ <para>Note: <code>htable.delete(Delete);</code> does not go in the writebuffer! This only applies to Puts.
+ </para>
+ <para>For additional information on write durability, review the <link xlink:href="../acid-semantics.html">ACID semantics</link> page.
+ </para>
+ <para>For fine-grained control of batching of
+ <classname>Put</classname>s or <classname>Delete</classname>s,
+ see the <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#batch%28java.util.List%29">batch</link> methods on HTable.
+ </para>
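+      <para>A brief sketch of the write-buffer pattern (table, family, and qualifier names are
+        illustrative):</para>
+      <programlisting language="java">
+HTable table = new HTable(conf, "myTable");
+table.setAutoFlush(false);  // buffer Puts client-side
+Put put = new Put(Bytes.toBytes("row1"));
+put.add(Bytes.toBytes("cf"), Bytes.toBytes("attr"), Bytes.toBytes("value1"));
+table.put(put);             // buffered; sent when the writebuffer fills...
+table.flushCommits();       // ...or when flushed explicitly
+table.close();              // close() also flushes any remaining buffered Puts
+      </programlisting>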
+ </section>
+ <section xml:id="client.external"><title>External Clients</title>
+ <para>Information on non-Java clients and custom protocols is covered in <xref linkend="external_apis" />
+ </para>
+ </section>
+ </section>
+
+ <section xml:id="client.filter"><title>Client Request Filters</title>
+ <para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</link> and <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link> instances can be
+ optionally configured with <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/Filter.html">filters</link> which are applied on the RegionServer.
+ </para>
+ <para>Filters can be confusing because there are many different types, and it is best to approach them by understanding the groups
+ of Filter functionality.
+ </para>
+ <section xml:id="client.filter.structural"><title>Structural</title>
+ <para>Structural Filters contain other Filters.</para>
+ <section xml:id="client.filter.structural.fl"><title>FilterList</title>
+ <para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FilterList.html">FilterList</link>
+ represents a list of Filters with a relationship of <code>FilterList.Operator.MUST_PASS_ALL</code> or
+ <code>FilterList.Operator.MUST_PASS_ONE</code> between the Filters. The following example shows an 'or' between two
+ Filters (checking for either 'my value' or 'my other value' on the same attribute).</para>
+<programlisting language="java">
+FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ONE);
+SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
+ cf,
+ column,
+ CompareOp.EQUAL,
+ Bytes.toBytes("my value")
+ );
+list.add(filter1);
+SingleColumnValueFilter filter2 = new SingleColumnValueFilter(
+ cf,
+ column,
+ CompareOp.EQUAL,
+ Bytes.toBytes("my other value")
+ );
+list.add(filter2);
+scan.setFilter(list);
+</programlisting>
+ </section>
+ </section>
+ <section
+ xml:id="client.filter.cv">
+ <title>Column Value</title>
+ <section
+ xml:id="client.filter.cv.scvf">
+ <title>SingleColumnValueFilter</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html">SingleColumnValueFilter</link>
+ can be used to test column values for equivalence (<code><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/CompareFilter.CompareOp.html">CompareOp.EQUAL</link>
+ </code>), inequality (<code>CompareOp.NOT_EQUAL</code>), or ranges (e.g.,
+ <code>CompareOp.GREATER</code>). The following is an example of testing the equivalence of
+ a column to the String value "my value"...</para>
+ <programlisting language="java">
+SingleColumnValueFilter filter = new SingleColumnValueFilter(
+ cf,
+ column,
+ CompareOp.EQUAL,
+ Bytes.toBytes("my value")
+ );
+scan.setFilter(filter);
+</programlisting>
+ </section>
+ </section>
+ <section
+ xml:id="client.filter.cvp">
+ <title>Column Value Comparators</title>
+ <para>There are several Comparator classes in the Filter package that deserve special
+ mention. These Comparators are used in concert with other Filters, such as <xref
+ linkend="client.filter.cv.scvf" />. </para>
+ <section
+ xml:id="client.filter.cvp.rcs">
+ <title>RegexStringComparator</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RegexStringComparator.html">RegexStringComparator</link>
+ supports regular expressions for value comparisons.</para>
+ <programlisting language="java">
+RegexStringComparator comp = new RegexStringComparator("my."); // any value that starts with 'my'
+SingleColumnValueFilter filter = new SingleColumnValueFilter(
+ cf,
+ column,
+ CompareOp.EQUAL,
+ comp
+ );
+scan.setFilter(filter);
+</programlisting>
+ <para>See the Oracle JavaDoc for <link
+ xlink:href="http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html">supported
+ RegEx patterns in Java</link>. </para>
+ </section>
+ <section
+ xml:id="client.filter.cvp.SubStringComparator">
+ <title>SubstringComparator</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SubstringComparator.html">SubstringComparator</link>
+ can be used to determine if a given substring exists in a value. The comparison is
+ case-insensitive. </para>
+ <programlisting language="java">
+SubstringComparator comp = new SubstringComparator("y val"); // looking for 'my value'
+SingleColumnValueFilter filter = new SingleColumnValueFilter(
+ cf,
+ column,
+ CompareOp.EQUAL,
+ comp
+ );
+scan.setFilter(filter);
+</programlisting>
+ </section>
+ <section
+ xml:id="client.filter.cvp.bfp">
+ <title>BinaryPrefixComparator</title>
+ <para>See <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/BinaryPrefixComparator.html">BinaryPrefixComparator</link>.</para>
+ </section>
+ <section
+ xml:id="client.filter.cvp.bc">
+ <title>BinaryComparator</title>
+ <para>See <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/BinaryComparator.html">BinaryComparator</link>.</para>
+ </section>
+ </section>
+ <section
+ xml:id="client.filter.kvm">
+ <title>KeyValue Metadata</title>
+ <para>As HBase stores data internally as KeyValue pairs, KeyValue Metadata Filters evaluate
+ the existence of keys (i.e., ColumnFamily:Column qualifiers) for a row, as opposed to the
+ values of those keys, which the previous section covered. </para>
+ <section
+ xml:id="client.filter.kvm.ff">
+ <title>FamilyFilter</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FamilyFilter.html">FamilyFilter</link>
+ can be used to filter on the ColumnFamily. It is generally a better idea to select
+ ColumnFamilies in the Scan than to do it with a Filter.</para>
+ </section>
+ <section
+ xml:id="client.filter.kvm.qf">
+ <title>QualifierFilter</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/QualifierFilter.html">QualifierFilter</link>
+ can be used to filter based on Column (aka Qualifier) name. </para>
+ </section>
+ <section
+ xml:id="client.filter.kvm.cpf">
+ <title>ColumnPrefixFilter</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.html">ColumnPrefixFilter</link>
+ can be used to filter based on the lead portion of Column (aka Qualifier) names. </para>
+ <para>A ColumnPrefixFilter seeks ahead to the first column matching the prefix in each row
+ and for each involved column family. It can be used to efficiently get a subset of the
+ columns in very wide rows. </para>
+ <para>Note: The same column qualifier can be used in different column families. This
+ filter returns all matching columns. </para>
+ <para>Example: Find all columns in a row and family that start with "abc"</para>
+ <programlisting language="java">
+HTableInterface t = ...;
+byte[] row = ...;
+byte[] family = ...;
+byte[] prefix = Bytes.toBytes("abc");
+Scan scan = new Scan(row, row); // (optional) limit to one row
+scan.addFamily(family); // (optional) limit to one family
+Filter f = new ColumnPrefixFilter(prefix);
+scan.setFilter(f);
+scan.setBatch(10); // set this if there could be many columns returned
+ResultScanner rs = t.getScanner(scan);
+for (Result r = rs.next(); r != null; r = rs.next()) {
+ for (KeyValue kv : r.raw()) {
+ // each kv represents a column
+ }
+}
+rs.close();
+</programlisting>
+ </section>
+ <section
+ xml:id="client.filter.kvm.mcpf">
+ <title>MultipleColumnPrefixFilter</title>
+ <para><link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/MultipleColumnPrefixFilter.html">MultipleColumnPrefixFilter</link>
+ behaves like ColumnPrefixFilter but allows specifying multiple prefixes. </para>
+ <para>Like ColumnPrefixFilter, MultipleColumnPrefixFilter efficiently seeks ahead to the
+ first column matching the lowest prefix and also seeks past ranges of columns between
+ prefixes. It can be used to efficiently get discontinuous sets of columns from very wide
+ rows. </para>
+ <para>Example: Find all columns in a row and family that start with "abc" or "xyz"</para>
+ <programlisting language="java">
+HTableInterface t = ...;
+byte[] row = ...;
+byte[] family = ...;
+byte[][] prefixes = new byte[][] {Bytes.toBytes("abc"), Bytes.toBytes("xyz")};
+Scan scan = new Scan(row, row); // (optional) limit to one row
+scan.addFamily(family); // (optional) limit to one family
+Filter f = new MultipleColumnPrefixFilter(prefixes);
+scan.setFilter(f);
+scan.setBatch(10); // set this if there could be many columns returned
+ResultScanner rs = t.getScanner(scan);
+for (Result r = rs.next(); r != null; r = rs.next()) {
+ for (KeyValue kv : r.raw()) {
+ // each kv represents a column
+ }
+}
+rs.close();
+</programlisting>
+ </section>
+ <section
+ xml:id="client.filter.kvm.crf ">
+ <title>ColumnRangeFilter</title>
+ <para>A <link
+ xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnRangeFilter.html">ColumnRangeFilter</link>
+ allows efficient intra row scanning. </para>
+ <para>A ColumnRangeFilter can seek ahead to the first matching column for each involved
+ column family. It can be used to efficiently get a 'slice' of the columns of a very wide
+ row. For example, you have a million columns in a row but you only want to look at columns
+ bbbb-bbdd. </para>
+ <para>Note: The same column qualifier can be used in different column families. This
+ filter returns all matching columns. </para>
+ <para>Example: Find all columns in a row and family between "bbbb" (inclusive) and "bbdd"
+ (inclusive)</para>
+ <programlisting language="java">
+HTableInterface t = ...;
+byte[] row = ...;
+byte[] family = ...;
+byte[] startColumn = Bytes.toBytes("bbbb");
+byte[] endColumn = Bytes.toBytes("bbdd");
+Scan scan = new Scan(row, row); // (optional) limit to one row
+scan.addFamily(family); // (optional) limit to one family
+Filter f = new ColumnRangeFilter(startColumn, true, endColumn, true);
+scan.setFilter(f);
+scan.setBatch(10); // set this if there could be many columns returned
+ResultScanner rs = t.getScanner(scan);
+for (Result r = rs.next(); r != null; r = rs.next()) {
+ for (KeyValue kv : r.raw()) {
+ // each kv represents a column
+ }
+}
+rs.close();
+</programlisting>
+ <para>Note: Introduced in HBase 0.92</para>
+ </section>
+ </section>
+ <section xml:id="client.filter.row"><title>RowKey</title>
+ <section xml:id="client.filter.row.rf"><title>RowFilter</title>
+ <para>It is generally a better idea to use the startRow/stopRow methods on Scan for row selection, however
+ <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RowFilter.html">RowFilter</link> can also be used.</para>
+ </section>
+ </section>
+ <section xml:id="client.filter.utility"><title>Utility</title>
+ <section xml:id="client.filter.utility.fkof"><title>FirstKeyOnlyFilter</title>
+ <para>This is primarily used for rowcount jobs.
+ See <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html">FirstKeyOnlyFilter</link>.</para>
+ </section>
+ </section>
+ </section> <!-- client.filter -->
+
+ <section xml:id="master"><title>Master</title>
+ <para><code>HMaster</code> is the implementation of the Master Server. The Master server is
+ responsible for monitoring all RegionServer instances in the cluster, and is the interface
+ for all metadata changes. In a distributed cluster, the Master typically runs on the <xref
+ linkend="arch.hdfs.nn"/>. J Mohamed Zahoor goes into some more detail on the Master
+ Architecture in this blog posting, <link
+ xlink:href="http://blog.zahoor.in/2012/08/hbase-hmaster-architecture/">HBase HMaster
+ Architecture </link>.</para>
+ <section xml:id="master.startup"><title>Startup Behavior</title>
+ <para>If run in a multi-Master environment, all Masters compete to run the cluster. If the active
+ Master loses its lease in ZooKeeper (or the Master shuts down), then the remaining Masters jostle to
+ take over the Master role.
+ </para>
+ </section>
+ <section
+ xml:id="master.runtime">
+ <title>Runtime Impact</title>
+ <para>A common dist-list question involves what happens to an HBase cluster when the Master
+ goes down. Because the HBase client talks directly to the RegionServers, the cluster can
+ still function in a "steady state." Additionally, per <xref
+ linkend="arch.catalog" />, <code>hbase:meta</code> exists as an HBase table and is not
+ resident in the Master. However, the Master controls critical functions such as
+ RegionServer failover and completing region splits. So while the cluster can still run for
+ a short time without the Master, the Master should be restarted as soon as possible.
+ </para>
+ </section>
+ <section xml:id="master.api"><title>Interface</title>
+ <para>The methods exposed by <code>HMasterInterface</code> are primarily metadata-oriented methods:
+ <itemizedlist>
+ <listitem><para>Table (createTable, modifyTable, removeTable, enable, disable)
+ </para></listitem>
+ <listitem><para>ColumnFamily (addColumn, modifyColumn, removeColumn)
+ </para></listitem>
+ <listitem><para>Region (move, assign, unassign)
+ </para></listitem>
+ </itemizedlist>
+ For example, when the <code>HBaseAdmin</code> method <code>disableTable</code> is invoked, it is serviced by the Master server.
+ </para>
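+      <para>For instance, a minimal sketch (the table name is illustrative):</para>
+<programlisting language="java">
+Configuration conf = HBaseConfiguration.create();
+HBaseAdmin admin = new HBaseAdmin(conf);
+admin.disableTable("myTable");  // serviced by the Master server
+admin.close();
+</programlisting>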
+ </section>
+ <section xml:id="master.processes"><title>Processes</title>
+ <para>The Master runs several background threads:
+ </para>
+ <section xml:id="master.processes.loadbalancer"><title>LoadBalancer</title>
+ <para>Periodically, and when there are no regions in transition,
+ a load balancer will run and move regions around to balance the cluster's load.
+ See <xref linkend="balancer_config" /> for configuring this property.</para>
+ <para>See <xref linkend="regions.arch.assignment"/> for more information on region assignment.
+ </para>
+ </section>
+ <section xml:id="master.processes.catalog"><title>CatalogJanitor</title>
+ <para>Periodically checks and cleans up the hbase:meta table. See <xref linkend="arch.catalog.meta" /> for more information on META.</para>
+ </section>
+ </section>
+
+ </section>
+ <section
+ xml:id="regionserver.arch">
+ <title>RegionServer</title>
+ <para><code>HRegionServer</code> is the RegionServer implementation. It is responsible for
+ serving and managing regions. In a distributed cluster, a RegionServer runs on a <xref
+ linkend="arch.hdfs.dn" />. </para>
+ <section
+ xml:id="regionserver.arch.api">
+ <title>Interface</title>
+ <para>The methods exposed by <code>HRegionInterface</code> contain both data-oriented
+ and region-maintenance methods: <itemizedlist>
+ <listitem>
+ <para>Data (get, put, delete, next, etc.)</para>
+ </listitem>
+ <listitem>
+ <para>Region (splitRegion, compactRegion, etc.)</para>
+ </listitem>
+ </itemizedlist> For example, when the <code>HBaseAdmin</code> method
+ <code>majorCompact</code> is invoked on a table, the client is actually iterating
+ through all regions for the specified table and requesting a major compaction directly to
+ each region. </para>
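+      <para>A short illustration of the client-driven iteration described above (the table name
+        is illustrative):</para>
+<programlisting language="java">
+HBaseAdmin admin = new HBaseAdmin(conf);
+admin.majorCompact("myTable");  // requests a major compaction of each region of the table
+admin.close();
+</programlisting>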
+ </section>
+ <section
+ xml:id="regionserver.arch.processes">
+ <title>Processes</title>
+ <para>The RegionServer runs a variety of background threads:</para>
+ <section
+ xml:id="regionserver.arch.processes.compactsplit">
+ <title>CompactSplitThread</title>
+ <para>Checks for splits and handles minor compactions.</para>
+ </section>
+ <section
+ xml:id="regionserver.arch.processes.majorcompact">
+ <title>MajorCompactionChecker</title>
+ <para>Checks for major compactions.</para>
+ </section>
+ <section
+ xml:id="regionserver.arch.processes.memstore">
+ <title>MemStoreFlusher</title>
+ <para>Periodically flushes in-memory writes in the MemStore to StoreFiles.</para>
+ </section>
+ <section
+ xml:id="regionserver.arch.processes.log">
+ <title>LogRoller</title>
+ <para>Periodically checks the RegionServer's WAL.</para>
+ </section>
+ </section>
+
+ <section
+ xml:id="coprocessors">
+ <title>Coprocessors</title>
+ <para>Coprocessors were added in 0.92. There is a thorough <link
+ xlink:href="https://blogs.apache.org/hbase/entry/coprocessor_introduction">Blog Overview
+ of CoProcessors</link> posted. Documentation will eventually move to this reference
+ guide, but the blog is the most current information available at this time. </para>
+ </section>
+
+ <section
+ xml:id="block.cache">
+ <title>Block Cache</title>
+
+ <para>HBase provides two different BlockCache implementations: the default onheap
+ LruBlockCache and BucketCache, which is (usually) offheap. This section
+ discusses benefits and drawbacks of each implementation, how to choose the appropriate
+ option, and configuration options for each.</para>
+
+ <note><title>Block Cache Reporting: UI</title>
+ <para>See the RegionServer UI for detail on the caching deploy. Since HBase 0.98.4, the
+ Block Cache detail has been significantly extended, showing configurations,
+ sizings, current usage, time-in-the-cache, and even detail on block counts and types.</para>
+ </note>
+
+ <section>
+
+ <title>Cache Choices</title>
+ <para><classname>LruBlockCache</classname> is the original implementation, and is
+ entirely within the Java heap. <classname>BucketCache</classname> is mainly
+ intended for keeping blockcache data offheap, although BucketCache can also
+ keep data onheap and serve from a file-backed cache.
+ <note><title>BucketCache is production ready as of hbase-0.98.6</title>
+ <para>To run with BucketCache, you need HBASE-11678. This was included in
+ hbase-0.98.6.
+ </para>
+ </note>
+ </para>
+
+ <para>Fetching from BucketCache is always slower than fetching from the
+ native onheap LruBlockCache. However, latencies tend to be less erratic
+ over time, because there is less garbage collection when you use
+ BucketCache: it manages its BlockCache allocations itself rather than
+ leaving them to the GC, and if it is deployed in offheap mode, that memory is
+ not managed by the GC at all. This is why you would use BucketCache: to keep
+ latencies predictable and to mitigate GC pauses
+ and heap fragmentation. See Nick Dimiduk's <link
+ xlink:href="http://www.n10k.com/blog/blockcache-101/">BlockCache 101</link> for
+ comparisons of onheap versus offheap tests. Also see
+ <link xlink:href="http://people.apache.org/~stack/bc/">Comparing BlockCache Deploys</link>,
+ which concludes that if your dataset fits inside your LruBlockCache deploy, you
+ should use LruBlockCache; otherwise, if you are experiencing cache churn (or you
+ want your cache to exist beyond the vagaries of java GC), use BucketCache.
+ </para>
+
+ <para>When you enable BucketCache, you are enabling a two tier caching
+ system, an L1 cache which is implemented by an instance of LruBlockCache and
+ an offheap L2 cache which is implemented by BucketCache. Management of these
+ two tiers and the policy that dictates how blocks move between them is done by
+ <classname>CombinedBlockCache</classname>. It keeps all DATA blocks in the L2
+ BucketCache and meta blocks -- INDEX and BLOOM blocks --
+ onheap in the L1 <classname>LruBlockCache</classname>.
+ See <xref linkend="offheap.blockcache" /> for more detail on going offheap.</para>
+ </section>
+
+ <section xml:id="cache.configurations">
+ <title>General Cache Configurations</title>
+ <para>Apart from the cache implementation itself, you can set some general configuration
+ options to control how the cache performs. See <link
+ xlink:href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html"
+ />. After setting any of these options, restart or rolling restart your cluster for the
+ configuration to take effect. Check logs for errors or unexpected behavior.</para>
+ <para>See also <xref linkend="blockcache.prefetch"/>, which discusses a new option
+ introduced in <link xlink:href="https://issues.apache.org/jira/browse/HBASE-9857"
+ >HBASE-9857</link>.</para>
+ </section>
+
+ <section
+ xml:id="block.cache.design">
+ <title>LruBlockCache Design</title>
+ <para>The LruBlockCache is an LRU cache that contains three levels of block priority to
+ allow for scan-resistance and in-memory ColumnFamilies: </para>
+ <itemizedlist>
+ <listitem>
+ <para>Single access priority: The first time a block is loaded from HDFS it normally
+ has this priority and it will be part of the first group to be considered during
+ evictions. The advantage is that scanned blocks are more likely to get evicted than
+ blocks that are getting more usage.</para>
+ </listitem>
+ <listitem>
+ <para>Multi access priority: If a block in the previous priority group is accessed
+ again, it upgrades to this priority. It is thus part of the second group considered
+ during evictions.</para>
+ </listitem>
+ <listitem xml:id="hbase.cache.inmemory">
+ <para>In-memory access priority: If the block's family was configured to be
+ "in-memory", it will be part of this priority disregarding the number of times it
+ was accessed. Catalog tables are configured like this. This group is the last one
+ considered during evictions.</para>
+ <para>To mark a column family as in-memory, call
+ <programlisting language="java">HColumnDescriptor.setInMemory(true);</programlisting> if creating a table from java,
+ or set <command>IN_MEMORY => true</command> when creating or altering a table in
+ the shell: e.g. <programlisting>hbase(main):003:0> create 't', {NAME => 'f', IN_MEMORY => 'true'}</programlisting></para>
+ </listitem>
+ </itemizedlist>
+ <para> For more information, see the <link
+ xlink:href="http://hbase.apache.org/xref/org/apache/hadoop/hbase/io/hfile/LruBlockCache.html">LruBlockCache
+ source</link>
+ </para>
+ </section>
+ <section
+ xml:id="block.cache.usage">
+ <title>LruBlockCache Usage</title>
+ <para>Block caching is enabled by default for all user tables, which means that any
+ read operation will load the LRU cache. This is good for a large number of use
+ cases, but further tuning is usually required in order to achieve better performance.
+ An important concept is the <link
+ xlink:href="http://en.wikipedia.org/wiki/Working_set_size">working set size</link>, or
+ WSS, which is "the amount of memory needed to compute the answer to a problem". For a
+ website, this would be the data that is needed to answer the queries over a short amount
+ of time. </para>
+ <para>The way to calculate how much memory is available in HBase for caching is: </para>
+ <programlisting>
+ number of region servers * heap size * hfile.block.cache.size * 0.99
+ </programlisting>
+ <para>The default value for the block cache is 0.25, which represents 25% of the available
+ heap. The last value (99%) is the default acceptable loading factor in the LRU cache,
+ after which eviction is started. The reason it is included in this equation is that it
+ would be unrealistic to expect to use 100% of the available memory, since that
+ would make the process block at the point where it loads new blocks.
+ Here are some examples: </para>
+ <itemizedlist>
+ <listitem>
+ <para>One region server with the default heap size (1 GB) and the default block cache
+ size will have 253 MB of block cache available.</para>
+ </listitem>
+ <listitem>
+ <para>20 region servers with the heap size set to 8 GB and a default block cache size
+ will have 39.6 GB of block cache available.</para>
+ </listitem>
+ <listitem>
+ <para>100 region servers with the heap size set to 24 GB and a block cache size of 0.5
+ will have about 1.16 TB of block cache.</para>
+ </listitem>
+ </itemizedlist>
+ <para>Your data is not the only resident of the block cache. Here are others that you may have to take into account:
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term>Catalog Tables</term>
+ <listitem>
+ <para>The <code>-ROOT-</code> (prior to HBase 0.96; see <xref
+ linkend="arch.catalog.root" />) and <code>hbase:meta</code> tables are forced
+ into the block cache and have the in-memory priority, which means that they are
+ harder to evict. The former never uses more than a few hundred bytes, while the
+ latter can occupy a few MBs (depending on the number of regions).</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>HFiles Indexes</term>
+ <listitem>
+ <para>An <firstterm>hfile</firstterm> is the file format that HBase uses to store
+ data in HDFS. It contains a multi-layered index which allows HBase to seek to the
+ data without having to read the whole file. The size of those indexes is a function
+ of the block size (64 KB by default), the size of your keys, and the amount of data
+ you are storing. For big data sets, it is not unusual to see numbers around 1 GB per
+ region server, although not all of it will be in cache because the LRU will evict
+ indexes that are not used.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Keys</term>
+ <listitem>
+ <para>The values that are stored are only half the picture, since each value is
+ stored along with its keys (row key, family qualifier, and timestamp). See <xref
+ linkend="keysize" />.</para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Bloom Filters</term>
+ <listitem>
+ <para>Just like the HFile indexes, those data structures (when enabled) are stored
+ in the LRU.</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <para>Currently the recommended way to measure HFile index and bloom filter sizes is to
+ look at the RegionServer web UI and check the relevant metrics. For keys, sampling
+ can be done with the HFile command-line tool; look for the average key size
+ metric. Since HBase 0.98.3, you can view detail on BlockCache stats and metrics
+ in a special Block Cache section in the UI.</para>
+ <para>It's generally bad to use block caching when the WSS doesn't fit in memory. This is
+ the case when you have for example 40GB available across all your region servers' block
+ caches but you need to process 1TB of data. One of the reasons is that the churn
+ generated by the evictions will trigger more garbage collections unnecessarily. Here are
+ two use cases: </para>
+ <itemizedlist>
+ <listitem>
+ <para>Fully random reading pattern: This is a case where you almost never access the
+ same row twice within a short amount of time, such that the chance of hitting a
+ cached block is close to 0. Setting block caching on such a table is a waste of
+ memory and CPU cycles, even more so because it will generate more garbage for the
+ JVM to collect. For more information on monitoring GC, see <xref
+ linkend="trouble.log.gc" />.</para>
+ </listitem>
+ <listitem>
+ <para>Mapping a table: In a typical MapReduce job that takes a table as input, every
+ row is read only once, so there is no need to put the rows into the block cache. The
+ Scan object has the option of turning this off via the setCacheBlocks method (set it
+ to false); a sketch follows this list. You can still keep block caching turned on for
+ the table itself if you need fast random read access. An example would be counting the
+ rows of a table that serves live traffic: caching every block of that table would
+ create massive churn and would surely evict data that is currently in use. </para>
+ </listitem>
+ </itemizedlist>
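+ <para>A minimal sketch of the scan case just described (the table family and caching
+ value are hypothetical); such a Scan can then be handed to a MapReduce job, e.g. via
+ <code>TableMapReduceUtil.initTableMapperJob</code>:</para>
+ <programlisting language="java">
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.util.Bytes;
+
+Scan scan = new Scan();
+scan.addFamily(Bytes.toBytes("f"));
+scan.setCacheBlocks(false); // do not pollute the BlockCache with one-time reads
+scan.setCaching(500);       // rows fetched per RPC; unrelated to the BlockCache
+ </programlisting>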
+ <section xml:id="data.blocks.in.fscache">
+ <title>Caching META blocks only (DATA blocks in fscache)</title>
+ <para>An interesting setup is one where we cache META blocks only and we read DATA
+ blocks in on each access. If the DATA blocks fit inside fscache, this alternative
+ may make sense when access is completely random across a very large dataset.
+ To enable this setup, alter your table and for each column family
+ set <varname>BLOCKCACHE => 'false'</varname>. You are 'disabling' the
+ BlockCache for this column family only; you can never disable the caching of
+ META blocks. Since
+ <link xlink:href="https://issues.apache.org/jira/browse/HBASE-4683">HBASE-4683 Always cache index and bloom blocks</link>,
+ META blocks are cached even if the BlockCache is disabled.
+ </para>
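+ <para>As a minimal sketch, the same per-family setting can be made from the Java
+ client (table 't' and family 'f' are hypothetical, and the <code>modifyColumn</code>
+ call assumes the table already exists):</para>
+ <programlisting language="java">
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HColumnDescriptor;
+import org.apache.hadoop.hbase.client.HBaseAdmin;
+
+HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
+HColumnDescriptor family = new HColumnDescriptor("f");
+family.setBlockCacheEnabled(false); // DATA blocks bypass the cache; META blocks are still cached
+admin.modifyColumn("t", family);    // replaces family 'f' of table 't'
+admin.close();
+ </programlisting>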
+ </section>
+ </section>
+ <section
+ xml:id="offheap.blockcache">
+ <title>Offheap Block Cache</title>
+ <section xml:id="enable.bucketcache">
+ <title>How to Enable BucketCache</title>
+ <para>The usual deploy of BucketCache is via a managing class that sets up two caching tiers: an L1 onheap cache
+ implemented by LruBlockCache and a second L2 cache implemented with BucketCache. The managing class is <link
+ xlink:href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.html">CombinedBlockCache</link> by default.
+ The just-previous link describes the caching 'policy' implemented by CombinedBlockCache. In short, it works
+ by keeping meta blocks -- INDEX and BLOOM in the L1, onheap LruBlockCache tier -- and DATA
+ blocks are kept in the L2, BucketCache tier. It is possible to amend this behavior in
+ HBase since version 1.0 and ask that a column family have both its meta and DATA blocks hosted onheap in the L1 tier by
+ setting <varname>cacheDataInL1</varname> via
+ <code>HColumnDescriptor.setCacheDataInL1(true)</code>
+ or in the shell, creating or amending column families setting <varname>CACHE_DATA_IN_L1</varname>
+ to true: e.g. <programlisting>hbase(main):003:0> create 't', {NAME => 't', CONFIGURATION => {CACHE_DATA_IN_L1 => 'true'}}</programlisting></para>
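+ <para>A minimal Java sketch of the same (hypothetical table 't' and family 'f';
+ assumes HBase 1.0 or newer, where <code>setCacheDataInL1</code> is available):</para>
+ <programlisting language="java">
+import org.apache.hadoop.hbase.HColumnDescriptor;
+import org.apache.hadoop.hbase.HTableDescriptor;
+import org.apache.hadoop.hbase.TableName;
+
+HTableDescriptor table = new HTableDescriptor(TableName.valueOf("t"));
+HColumnDescriptor family = new HColumnDescriptor("f");
+family.setCacheDataInL1(true); // keep this family's DATA blocks onheap in L1
+table.addFamily(family);
+// pass 'table' to HBaseAdmin.createTable(...) as usual
+ </programlisting>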
+
+ <para>The BucketCache Block Cache can be deployed onheap, offheap, or file based.
+ You set which via the
+ <varname>hbase.bucketcache.ioengine</varname> setting. Setting it to
+ <varname>heap</varname> will have BucketCache deployed inside the
+ allocated java heap. Setting it to <varname>offheap</varname> will have
+ BucketCache make its allocations offheap,
+ and an ioengine setting of <varname>file:PATH_TO_FILE</varname> will direct
+ BucketCache to use file caching (useful in particular if you have some fast I/O attached to the box, such
+ as SSDs).
+ </para>
+ <para xml:id="raw.l1.l2">It is possible to deploy an L1+L2 setup where we bypass the CombinedBlockCache
+ policy and have BucketCache working as a strict L2 cache to the L1
+ LruBlockCache. For such a setup, set <varname>CacheConfig.BUCKET_CACHE_COMBINED_KEY</varname> to
+ <literal>false</literal>. In this mode, on eviction from L1, blocks go to L2.
+ When a block is cached, it is cached first in L1. When we go to look for a cached block,
+ we look first in L1 and if none found, then search L2. Let us call this deploy format,
+ <emphasis><indexterm><primary>Raw L1+L2</primary></indexterm></emphasis>.</para>
+ <para>Other BucketCache configs include: specifying a location to persist cache to across
+ restarts, how many threads to use writing the cache, etc. See the
+ <link xlink:href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html">CacheConfig.html</link>
+ class for configuration options and descriptions.</para>
+
+ <procedure>
+ <title>BucketCache Example Configuration</title>
+ <para>This sample provides a configuration for a 4 GB offheap BucketCache with a 1 GB
+ onheap cache. Configuration is performed on the RegionServer. Setting
+ <varname>hbase.bucketcache.ioengine</varname> and
+ <varname>hbase.bucketcache.size</varname> > 0 enables CombinedBlockCache.
+ Let us presume that the RegionServer has been set to run with a 5G heap:
+ i.e. HBASE_HEAPSIZE=5g.
+ </para>
+ <step>
+ <para>First, edit the RegionServer's <filename>hbase-env.sh</filename> and set
+ <varname>HBASE_OFFHEAPSIZE</varname> to a value greater than the offheap size wanted, in
+ this case, 4 GB (expressed as 4G). Let's set it to 5G: 4G
+ for our offheap cache and 1G for any other uses of offheap memory (there are
+ users of offheap memory other than the BlockCache; e.g. the DFSClient
+ in the RegionServer can make use of offheap memory). See <xref linkend="direct.memory" />.</para>
+ <programlisting>HBASE_OFFHEAPSIZE=5G</programlisting>
+ </step>
+ <step>
+ <para>Next, add the following configuration to the RegionServer's
+ <filename>hbase-site.xml</filename>.</para>
+ <programlisting language="xml">
+<![CDATA[<property>
+ <name>hbase.bucketcache.ioengine</name>
+ <value>offheap</value>
+</property>
+<property>
+ <name>hfile.block.cache.size</name>
+ <value>0.2</value>
+</property>
+<property>
+ <name>hbase.bucketcache.size</name>
+ <value>4096</value>
+</property>]]>
+ </programlisting>
+ </step>
+ <step>
+ <para>Restart or rolling restart your cluster, and check the logs for any
+ issues.</para>
+ </step>
+ </procedure>
+ <para>In the above, we set the BucketCache to be 4G. We configured the onheap
+ LruBlockCache to have 0.2 of the RegionServer's heap size (0.2 * 5G = 1G).
+ In other words, you configure the L1 LruBlockCache as you normally would,
+ as if there were no L2 BucketCache present.
+ </para>
+ <para><link xlink:href="https://issues.apache.org/jira/browse/HBASE-10641"
+ >HBASE-10641</link> introduced the ability to configure multiple sizes for the
+ buckets of the bucketcache, in HBase 0.98 and newer. To configure multiple bucket
+ sizes, configure the new property <option>hfile.block.cache.sizes</option> (instead of
+ <option>hfile.block.cache.size</option>) to a comma-separated list of block sizes,
+ ordered from smallest to largest, with no spaces. The goal is to optimize the bucket
+ sizes based on your data access patterns. The following example configures buckets of
+ size 4096 and 8192.</para>
+ <screen language="xml"><![CDATA[
+<property>
+ <name>hfile.block.cache.sizes</name>
+ <value>4096,8192</value>
+</property>
+ ]]></screen>
+ <note xml:id="direct.memory">
+ <title>Direct Memory Usage In HBase</title>
+ <para>The default maximum direct memory varies by JVM. Traditionally it is 64M
+ or some relation to allocated heap size (-Xmx) or no limit at all (JDK7 apparently).
+ HBase servers use direct memory; in particular, with short-circuit reading enabled, the hosted DFSClient will
+ allocate direct memory buffers. If you do offheap block caching, you'll
+ be making use of direct memory. When starting your JVM, make sure
+ the <varname>-XX:MaxDirectMemorySize</varname> setting in
+ <filename>conf/hbase-env.sh</filename> is set to some value that is
+ higher than what you have allocated to your offheap blockcache
+ (<varname>hbase.bucketcache.size</varname>). It should be larger than your offheap block
+ cache and then some for DFSClient usage (How much the DFSClient uses is not
+ easy to quantify; it is the number of open hfiles * <varname>hbase.dfs.client.read.shortcircuit.buffer.size</varname>
+ where hbase.dfs.client.read.shortcircuit.buffer.size is set to 128k in HBase -- see <filename>hbase-default.xml</filename>
+ default configurations).
+ Direct memory is part of the Java process's memory footprint, but it is separate from the
+ object heap allocated by -Xmx. The value allocated by MaxDirectMemorySize must not exceed
+ physical RAM, and is likely to be less than the total available RAM due to other
+ memory requirements and system constraints.
+ </para>
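+ <para>One way to pass the flag (a sketch, sized for the 4G offheap cache example
+ above; the 5G value is an assumption you should size for your own deploy) is via
+ <varname>HBASE_REGIONSERVER_OPTS</varname> in <filename>conf/hbase-env.sh</filename>:</para>
+ <programlisting language="bourne">
+# 4G for the offheap BlockCache plus headroom for DFSClient direct buffers
+export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=5G"
+ </programlisting>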
+ <para>You can see how much memory -- onheap and offheap/direct -- a RegionServer is
+ configured to use and how much it is using at any one time by looking at the
+ <emphasis>Server Metrics: Memory</emphasis> tab in the UI. It can also be gotten
+ via JMX. In particular the direct memory currently used by the server can be found
+ on the <varname>java.nio.type=BufferPool,name=direct</varname> bean. Terracotta has
+ a <link
+ xlink:href="http://terracotta.org/documentation/4.0/bigmemorygo/configuration/storage-options"
+ >good write up</link> on using offheap memory in java. It is for their product
+ BigMemory, but a lot of the issues noted apply in general to any attempt at going
+ offheap. Check it out.</para>
+ </note>
+ <note xml:id="hbase.bucketcache.percentage.in.combinedcache"><title>hbase.bucketcache.percentage.in.combinedcache</title>
+ <para>This is a pre-HBase 1.0 configuration removed because it
+ was confusing. It was a float that you would set to some value
+ between 0.0 and 1.0. Its default was 0.9. If the deploy was using
+ CombinedBlockCache, then the LruBlockCache L1 size was calculated to
+ be (1 - <varname>hbase.bucketcache.percentage.in.combinedcache</varname>) * <varname>size-of-bucketcache</varname>,
+ and the BucketCache size was <varname>hbase.bucketcache.percentage.in.combinedcache</varname> * size-of-bucket-cache,
+ where size-of-bucket-cache itself is EITHER the value of the configuration hbase.bucketcache.size,
+ IF it was specified in megabytes, OR <varname>hbase.bucketcache.size</varname> * <varname>-XX:MaxDirectMemorySize</varname>, if
+ <varname>hbase.bucketcache.size</varname> is between 0 and 1.0.
+ </para>
+ <para>In 1.0, it is more straightforward. The L1 LruBlockCache size
+ is set as a fraction of the java heap using the hfile.block.cache.size setting
+ (not the best name), and L2 is set as above: either in absolute
+ megabytes or as a fraction of allocated maximum direct memory.
+ </para>
+ </note>
+ </section>
+ </section>
+ <section>
+ <title>Compressed BlockCache</title>
+ <para><link xlink:href="https://issues.apache.org/jira/browse/HBASE-11331"
+ >HBASE-11331</link> introduced lazy blockcache decompression, more simply referred to
+ as compressed BlockCache. When compressed BlockCache is enabled, data and encoded data
+ blocks are cached in the blockcache in their on-disk format, rather than being
+ decompressed and decrypted before caching.</para>
+ <para>For a RegionServer
+ hosting more data than can fit into cache, enabling this feature with SNAPPY compression
+ has been shown to result in a 50% increase in throughput and a 30% improvement in mean
+ latency, while increasing garbage collection by 80% and overall CPU load by
+ 2%. See HBASE-11331 for more details about how performance was measured and achieved.
+ For a RegionServer hosting data that can comfortably fit into cache, or if your workload
+ is sensitive to extra CPU or garbage-collection load, you may receive less
+ benefit.</para>
+ <para>Compressed blockcache is disabled by default. To enable it, set
+ <code>hbase.block.data.cachecompressed</code> to <code>true</code> in
+ <filename>hbase-site.xml</filename> on all RegionServers.</para>
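+ <para>A sketch of the corresponding <filename>hbase-site.xml</filename> stanza:</para>
+ <programlisting language="xml">
+<![CDATA[<property>
+ <name>hbase.block.data.cachecompressed</name>
+ <value>true</value>
+</property>]]>
+ </programlisting>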
+ </section>
+ </section>
+
+ <section
+ xml:id="wal">
+ <title>Write Ahead Log (WAL)</title>
+
+ <section
+ xml:id="purpose.wal">
+ <title>Purpose</title>
+ <para>The <firstterm>Write Ahead Log (WAL)</firstterm> records all changes to data in
+ HBase, to file-based storage. Under normal operations, the WAL is not needed because
+ data changes move from the MemStore to StoreFiles. However, if a RegionServer crashes or
+ becomes unavailable before the MemStore is flushed, the WAL ensures that the changes to
+ the data can be replayed. If writing to the WAL fails, the entire operation to modify the
+ data fails.</para>
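+ <para>Because the WAL is what makes an edit durable, a client can trade durability
+ for speed on a per-mutation basis. A minimal sketch (row, family, and values here
+ are hypothetical) using the <code>Durability</code> enum:</para>
+ <programlisting language="java">
+import org.apache.hadoop.hbase.client.Durability;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.util.Bytes;
+
+Put put = new Put(Bytes.toBytes("row1"));
+put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v1"));
+// SKIP_WAL skips the WAL append entirely: faster, but this edit is lost
+// if the RegionServer crashes before the MemStore flushes.
+put.setDurability(Durability.SKIP_WAL);
+ </programlisting>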
+ <para>
+ HBase uses an implementation of the <link xlink:href=
+ "http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/wal/WAL.html"
+ >WAL</link> interface. Usually, there is only one instance of a WAL per RegionServer.
+ The RegionServer records Puts and Deletes to it, before recording them to the <xref
+ linkend="store.memstore" /> for the affected <xref
+ linkend="store" />.
+ </para>
+ <note>
+ <title>The HLog</title>
+ <para>
+ Prior to 2.0, the interface for WALs in HBase was named <classname>HLog</classname>.
+ In 0.94, HLog was the name of the implementation of the WAL. You will likely find
+ references to the HLog in documentation tailored to these older versions.
+ </para>
+ </note>
+ <para>The WAL resides in HDFS in the <filename>/hbase/WALs/</filename> directory (prior to
+ HBase 0.94, they were stored in <filename>/hbase/.logs/</filename>), with subdirectories per
+ region.</para>
+ <para> For more general information about the concept of write ahead logs, see the
+ Wikipedia <link
+ xlink:href="http://en.wikipedia.org/wiki/Write-ahead_logging">Write-Ahead Log</link>
+ article. </para>
+ </section>
+ <section
+ xml:id="wal_flush">
+ <title>WAL Flushing</title>
+ <para>TODO (describe). </para>
+ </section>
+
+ <section
+ xml:id="wal_splitting">
+ <title>WAL Splitting</title>
+
+ <para>A RegionServer serves many regions. All of the regions in a region server share the
+ same active WAL file. Each edit in the WAL file includes information about which region
+ it belongs to. When a region is opened, the edits in the WAL file which belong to that
+ region need to be replayed. Therefore, edits in the WAL file must be grouped by region
+ so that particular sets can be replayed to regenerate the data in a particular region.
+ The process of grouping the WAL edits by region is called <firstterm>log
+ splitting</firstterm>. It is a critical process for recovering data if a region server
+ fails.</para>
+ <para>Log splitting is done by the HMaster during cluster start-up or by the ServerShutdownHandler
+ as a region server shuts down. So that consistency is guaranteed, affected regions
+ are unavailable until data is restored. All WAL edits need to be recovered and replayed
+ before a given region can become available again. As a result, regions affected by
+ log splitting are unavailable until the process completes.</para>
+ <procedure xml:id="log.splitting.step.by.step">
+ <title>Log Splitting, Step by Step</title>
+ <step>
+ <title>The <filename>/hbase/WALs/<host>,<port>,<startcode></filename> directory is renamed.</title>
+ <para>Renaming the directory is important because a RegionServer may still be up and
+ accepting requests even if the HMaster thinks it is down. If the RegionServer does
+ not respond immediately and does not heartbeat its ZooKeeper session, the HMaster
+ may interpret this as a RegionServer failure. Renaming the logs directory ensures
+ that existing, valid WAL files which are still in use by an active but busy
+ RegionServer are not written to by accident.</para>
+ <para>The new directory is named according to the following pattern:</para>
+ <screen><![CDATA[/hbase/WALs/<host>,<port>,<startcode>-splitting]]></screen>
+ <para>An example of such a renamed directory might look like the following:</para>
+ <screen>/hbase/WALs/srv.example.com,60020,1254173957298-splitting</screen>
+ </step>
+ <step>
+ <title>Each log file is split, one at a time.</title>
+ <para>The log splitter reads the log file one edit entry at a time and puts each edit
+ entry into the buffer corresponding to the edit’s region. At the same time, the
+ splitter starts several writer threads. Writer threads pick up a corresponding
+ buffer and write the edit entries in the buffer to a temporary recovered edit
+ file. The temporary edit file is stored to disk with the following naming pattern:</para>
+ <screen><![CDATA[/hbase/<table_name>/<region_id>/recovered.edits/.temp]]></screen>
+ <para>This file is used to store all the edits in the WAL log for this region. After
+ log splitting completes, the <filename>.temp</filename> file is renamed to the
+ sequence ID of the first log written to the file.</para>
+ <para>To determine whether all edits have been written, the sequence ID is compared to
+ the sequence of the last edit that was written to the HFile. If the sequence of the
+ last edit is greater than or equal to the sequence ID included in the file name, it
+ is clear that all writes from the edit file have been completed.</para>
+ </step>
+ <step>
+ <title>After log splitting is complete, each affected region is assigned to a
+ RegionServer.</title>
+ <para> When the region is opened, the <filename>recovered.edits</filename> folder is checked for recovered
+ edits files. If any such files are present, they are replayed by reading the edits
+ and saving them to the MemStore. After all edit files are replayed, the contents of
+ the MemStore are written to disk (HFile) and the edit files are deleted.</para>
+ </step>
+ </procedure>
+
+ <section>
+ <title>Handling of Errors During Log Splitting</title>
+
+ <para>If you set the <varname>hbase.hlog.split.skip.errors</varname> option to
+ <constant>true</constant>, errors are treated as follows:</para>
+ <itemizedlist>
+ <listitem>
+ <para>Any error encountered during splitting will be logged.</para>
+ </listitem>
+ <listitem>
+ <para>The problematic WAL log will be moved into the <filename>.corrupt</filename>
+ directory under the hbase <varname>rootdir</varname>.</para>
+ </listitem>
+ <listitem>
+ <para>Processing of the WAL will continue.</para>
+ </listitem>
+ </itemizedlist>
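+ <para>A sketch of enabling this behavior in <filename>hbase-site.xml</filename>:</para>
+ <programlisting language="xml">
+<![CDATA[<property>
+ <name>hbase.hlog.split.skip.errors</name>
+ <value>true</value>
+</property>]]>
+ </programlisting>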
+ <para>If the <varname>hbase.hlog.split.skip.errors</varname> option is set to
+ <literal>false</literal>, the default, the exception will be propagated and the
+ split will be logged as failed. See <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-2958">HBASE-2958 When
+ hbase.hlog.split.skip.errors is set to false, we fail the split but thats
+ it</link>. More needs to be done than simply failing the split when this flag is set.</para>
+
+ <section>
+ <title>How EOFExceptions are treated when splitting a crashed RegionServer's
+ WALs</title>
+
+ <para>If an EOFException occurs while splitting logs, the split proceeds even when
+ <varname>hbase.hlog.split.skip.errors</varname> is set to
+ <literal>false</literal>. An EOFException while reading the last log in the set of
+ files to split is likely, because the RegionServer is likely to be in the process of
+ writing a record at the time of a crash. For background, see <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-2643">HBASE-2643
+ Figure how to deal with eof splitting logs</link>.</para>
+ </section>
+ </section>
+
+ <section>
+ <title>Performance Improvements during Log Splitting</title>
+ <para>
+ WAL log splitting and recovery can be resource intensive and take a long time,
+ depending on the number of RegionServers involved in the crash and the size of the
+ regions. <xref linkend="distributed.log.splitting" /> and <xref
+ linkend="distributed.log.replay" /> were developed to improve
+ performance during log splitting.
+ </para>
+ <section xml:id="distributed.log.splitting">
+ <title>Distributed Log Splitting</title>
+ <para><firstterm>Distributed Log Splitting</firstterm> was added in HBase version 0.92
+ (<link xlink:href="https://issues.apache.org/jira/browse/HBASE-1364">HBASE-1364</link>)
+ by Prakash Khemani from Facebook. It reduces the time to complete log splitting
+ dramatically, improving the availability of regions and tables. For
+ example, recovering a crashed cluster took around 9 hours with single-threaded log
+ splitting, but only about six minutes with distributed log splitting.</para>
+ <para>The information in this section is sourced from Jimmy Xiang's blog post at <link
+ xlink:href="http://blog.cloudera.com/blog/2012/07/hbase-log-splitting/" />.</para>
+
+ <formalpara>
+ <title>Enabling or Disabling Distributed Log Splitting</title>
+ <para>Distributed log processing is enabled by default since HBase 0.92. The setting
+ is controlled by the <property>hbase.master.distributed.log.splitting</property>
+ property, which can be set to <literal>true</literal> or <literal>false</literal>,
+ but defaults to <literal>true</literal>. </para>
+ </formalpara>
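+ <para>For example, to disable it explicitly (a sketch of the
+ <filename>hbase-site.xml</filename> stanza):</para>
+ <programlisting language="xml">
+<![CDATA[<property>
+ <name>hbase.master.distributed.log.splitting</name>
+ <value>false</value>
+</property>]]>
+ </programlisting>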
+ <procedure>
+ <title>Distributed Log Splitting, Step by Step</title>
+ <para>After configuring distributed log splitting, the HMaster controls the process.
+ The HMaster enrolls each RegionServer in the log splitting process, and the actual
+ work of splitting the logs is done by the RegionServers. The general process for
+ log splitting, as described in <xref
+ linkend="log.splitting.step.by.step" /> still applies here.</para>
+ <step>
+ <para>If distributed log processing is enabled, the HMaster creates a
+ <firstterm>split log manager</firstterm> instance when the cluster is started.
+ The split log manager manages all log files which need
+ to be scanned and split. The split log manager places all the logs into the
+ ZooKeeper splitlog node (<filename>/hbase/splitlog</filename>) as tasks. You can
+ view the contents of the splitlog by issuing the following
+ <command>zkcli</command> command. Example output is shown.</para>
+ <screen language="bourne">ls /hbase/splitlog
+[hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost8.sample.com%2C57020%2C1340474893275-splitting%2Fhost8.sample.com%253A57020.1340474893900,
+hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost3.sample.com%2C57020%2C1340474893299-splitting%2Fhost3.sample.com%253A57020.1340474893931,
+hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost4.sample.com%2C57020%2C1340474893287-splitting%2Fhost4.sample.com%253A57020.1340474893946]
+ </screen>
+ <para>The output contains some non-ASCII characters. When decoded, it looks much
+ simpler:</para>
+ <screen>
+[hdfs://host2.sample.com:56020/hbase/.logs
+/host8.sample.com,57020,1340474893275-splitting
+/host8.sample.com%3A57020.1340474893900,
+hdfs://host2.sample.com:56020/hbase/.logs
+/host3.sample.com,57020,1340474893299-splitting
+/host3.sample.com%3A57020.1340474893931,
+hdfs://host2.sample.com:56020/hbase/.logs
+/host4.sample.com,57020,1340474893287-splitting
+/host4.sample.com%3A57020.1340474893946]
+ </screen>
+ <para>The listing represents WAL file names to be scanned and split, which is a
+ list of log splitting tasks.</para>
+ </step>
+ <step>
+ <title>The split log manager monitors the log-splitting tasks and workers.</title>
+ <para>The split log manager is responsible for the following ongoing tasks:</para>
+ <itemizedlist>
+ <listitem>
+ <para>Once the split log manager publishes all the tasks to the splitlog
+ znode, it monitors these task nodes and waits for them to be
+ processed.</para>
+ </listitem>
+ <listitem>
+ <para>Checks to see if there are any dead split log
+ workers queued up. If it finds tasks claimed by unresponsive workers, it
+ will resubmit those tasks. If the resubmit fails due to some ZooKeeper
+ exception, the dead worker is queued up again for retry.</para>
+ </listitem>
+ <listitem>
+ <para>Checks to see if there are any unassigned
+ tasks. If it finds any, it creates an ephemeral rescan node so that each
+ split log worker is notified to re-scan unassigned tasks via the
+ <code>nodeChildrenChanged</code> ZooKeeper event.</para>
+ </listitem>
+ <listitem>
+ <para>Checks for tasks which are assigned but expired. If any are found, they
+ are moved back to <code>TASK_UNASSIGNED</code> state again so that they can
+ be retried. It is possible that these tasks are assigned to slow workers, or
+ they may already be finished. This is not a problem, because log splitting
+ tasks have the property of idempotence. In other words, the same log
+ splitting task can be processed many times without causing any
+ problem.</para>
+ </listitem>
+ <listitem>
+ <para>The split log manager watches the HBase split log znodes constantly. If
+ any split log task node data is changed, the split log manager retrieves the
+ node data. The
+ node data contains the current state of the task. You can use the
+ <command>zkcli</command> <command>get</command> command to retrieve the
+ current state of a task. In the example output below, the first line of the
+ output shows that the task is currently unassigned.</para>
+ <screen>
+<userinput>get /hbase/splitlog/hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost6.sample.com%2C57020%2C1340474893287-splitting%2Fhost6.sample.com%253A57020.1340474893945
+</userinput>
+<computeroutput>unassigned host2.sample.com:57000
+cZxid = 0x7115
+ctime = Sat Jun 23 11:13:40 PDT 2012
+...</computeroutput>
+ </screen>
+ <para>Based on the state of the task whose data is changed, the split log
+ manager does one of the following:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>Resubmit the task if it is unassigned</para>
+ </listitem>
+ <listitem>
+ <para>Heartbeat the task if it is assigned</para>
+ </listitem>
+ <listitem>
+ <para>Resubmit or fail the task if it is resigned (see <xref
+ linkend="distributed.log.replay.failure.reasons" />)</para>
+ </listitem>
+ <listitem>
+ <para>Resubmit or fail the task if it is completed with errors (see <xref
+ linkend="distributed.log.replay.failure.reasons" />)</para>
+ </listitem>
+ <listitem>
+ <para>Resubmit or fail the task if it could not complete due to
+ errors (see <xref
+ linkend="distributed.log.replay.failure.reasons" />)</para>
+ </listitem>
+ <listitem>
+ <para>Delete the task if it is successfully completed or failed</para>
+ </listitem>
+ </itemizedlist>
+ <itemizedlist xml:id="distributed.log.replay.failure.reasons">
+ <title>Reasons a Task Will Fail</title>
+ <listitem><para>The task has been deleted.</para></listitem>
+ <listitem><para>The node no longer exists.</para></listitem>
+ <listitem><para>The log status manager failed to move the state of the task
+ to TASK_UNASSIGNED.</para></listitem>
+ <listitem><para>The number of resubmits is over the resubmit
+ threshold.</para></listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+ </step>
+ <step>
+ <title>Each RegionServer's split log worker performs the log-splitting tasks.</title>
+ <para>Each RegionServer runs a daemon thread called the <firstterm>split log
+ worker</firstterm>, which does the work to split the logs. The daemon thread
+ starts when the RegionServer starts, and registers itself to watch HBase znodes.
+ If any splitlog znode children change, it notifies a sleeping worker thread to
+ wake up and grab more tasks. If a worker's current task’s node data is
+ changed, the worker checks to see if the task has been taken by another worker.
+ If so, the worker thread stops work on the current task.</para>
+ <para>The worker monitors
+ the splitlog znode constantly. When a new task appears, the split log worker
+ retrieves the task paths and checks each one until it finds an unclaimed task,
+ which it attempts to claim. If the claim was successful, it attempts to perform
+ the task and updates the task's <property>state</property> property based on the
+ splitting outcome. At this point, the split log worker scans for another
+ unclaimed task.</para>
+ <itemizedlist>
+ <title>How the Split Log Worker Approaches a Task</title>
+
+ <listitem>
+ <para>It queries the task state and only takes action if the task is in
+ <literal>TASK_UNASSIGNED</literal> state.</para>
+ </listitem>
+ <listitem>
+ <para>If the task is in <literal>TASK_UNASSIGNED</literal> state, the
+ worker attempts to set the state to <literal>TASK_OWNED</literal> by itself.
+ If it fails to set the state, another worker will try to grab it. The split
+ log manager will also ask all workers to rescan later if the task remains
+ unassigned.</para>
+ </listitem>
+ <listitem>
+ <para>If the worker succeeds in taking ownership of the task, it re-reads
+ the task state asynchronously to make sure it really holds the task. In the
+ meantime, it starts a split task executor to do the actual work: </para>
+ <itemizedlist>
+ <listitem>
+ <para>Get the HBase root folder, create a temp folder under the root, and
+ split the log file to the temp folder.</para>
+ </listitem>
+ <listitem>
+ <para>If the split was successful, the task executor sets the task to
+ state <literal>TASK_DONE</literal>.</para>
+ </listitem>
+ <listitem>
+ <para>If the worker catches an unexpected IOException, the task is set to
+ state <literal>TASK_ERR</literal>.</para>
+ </listitem>
+ <listitem>
+ <para>If the worker is shutting down, the task is set to state
+ <literal>TASK_RESIGNED</literal>.</para>
+ </listitem>
+ <listitem>
+ <para>If the task has been taken by another worker, the executor simply logs it.</para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+ </step>
+ <step>
+ <title>The split log manager monitors for uncompleted tasks.</title>
+ <para>The split log manager returns when all tasks are completed successfully. If
+ any tasks completed with failures, the split log manager throws an
+ exception so that the log splitting can be retried. Due to an asynchronous
+ implementation, in very rare cases, the split log manager loses track of some
+ completed tasks. For that reason, it periodically checks for remaining
+ uncompleted tasks in its task map or ZooKeeper. If none are found, it throws an
+ exception so that the log splitting can be retried right away instead of hanging
+ there waiting for something that won’t happen.</para>
+ </step>
+ </procedure>
+ </section>
+ <section xml:id="distributed.log.replay">
+ <title>Distributed Log Replay</title>
+ <para>After a RegionServer fails, its regions are assigned to other
+ RegionServers and are marked as "recovering" in ZooKeeper. A split log worker directly
+ replays edits from the WAL of the failed region server to the regions at their new
+ locations. When a region is in "recovering" state, it can accept writes, but no reads
+ (including Append and Increment), region splits, or merges are allowed. </para>
+ <para>Distributed Log Replay extends the <xref linkend="distributed.log.splitting" /> framework. It works by
+ directly replaying WAL edits to another RegionServer instead of creating
+ <filename>recovered.edits</filename> files. It provides the following advantages
+ over distributed log splitting alone:</para>
+ <itemizedlist>
+ <listitem><para>It eliminates the overhead of writing and reading a large number of
+ <filename>recovered.edits</filename> files. It is not unusual for thousands of
+ <filename>recovered.edits</filename> files to be created and written concurrently
+ during a RegionServer recovery. Many small random writes can degrade overall
+ system performance.</para></listitem>
+ <listitem><para>It allows writes even when a region is in recovering state. It only takes seconds for a recovering region to accept writes again.
+</para></listitem>
+ </itemizedlist>
+ <formalpara>
+ <title>Enabling Distributed Log Replay</title>
+ <para>To enable distributed log replay, set <varname>hbase.master.distributed.log.replay</varname> to
+ true. This will be the default for HBase 0.99 (<link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-10888">HBASE-10888</link>).</para>
+ </formalpara>
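+ <para>A sketch of the corresponding <filename>hbase-site.xml</filename> stanza:</para>
+ <programlisting language="xml">
+<![CDATA[<property>
+ <name>hbase.master.distributed.log.replay</name>
+ <value>true</value>
+</property>]]>
+ </programlisting>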
+ <para>You must also enable HFile version 3 (which is the default HFile format starting
+ in HBase 0.99. See <link
+ xlink:href="https://issues.apache.org/jira/browse/HBASE-10855">HBASE-1085
<TRUNCATED>
[8/8] hbase git commit: HBASE-12738 Chunk Ref Guide into
file-per-chapter
Posted by mi...@apache.org.
HBASE-12738 Chunk Ref Guide into file-per-chapter
Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/a1fe1e09
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/a1fe1e09
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/a1fe1e09
Branch: refs/heads/master
Commit: a1fe1e09642355aa8165c11da3f759d621da1421
Parents: d9f25e3
Author: Misty Stanley-Jones <ms...@cloudera.com>
Authored: Mon Dec 22 15:26:59 2014 +1000
Committer: Misty Stanley-Jones <ms...@cloudera.com>
Committed: Mon Dec 22 15:46:49 2014 +1000
----------------------------------------------------------------------
src/main/docbkx/architecture.xml | 3489 ++++++++++++++++
src/main/docbkx/asf.xml | 44 +
src/main/docbkx/book.xml | 6021 +---------------------------
src/main/docbkx/compression.xml | 535 +++
src/main/docbkx/configuration.xml | 6 +-
src/main/docbkx/customization-pdf.xsl | 129 +
src/main/docbkx/datamodel.xml | 865 ++++
src/main/docbkx/faq.xml | 270 ++
src/main/docbkx/hbase-default.xml | 538 +++
src/main/docbkx/hbase_history.xml | 41 +
src/main/docbkx/hbck_in_depth.xml | 237 ++
src/main/docbkx/mapreduce.xml | 630 +++
src/main/docbkx/orca.xml | 47 +
src/main/docbkx/other_info.xml | 83 +
src/main/docbkx/performance.xml | 2 +-
src/main/docbkx/sql.xml | 40 +
src/main/docbkx/upgrading.xml | 2 +-
src/main/docbkx/ycsb.xml | 36 +
18 files changed, 7008 insertions(+), 6007 deletions(-)
----------------------------------------------------------------------