You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hbase.apache.org by dm...@apache.org on 2012/05/24 01:41:02 UTC
svn commit: r1342094 - in /hbase/trunk/src/docbkx: book.xml ops_mgt.xml
Author: dmeil
Date: Wed May 23 23:41:02 2012
New Revision: 1342094
URL: http://svn.apache.org/viewvc?rev=1342094&view=rev
Log:
hbase-6082. porting hbck document to the RefGuide Appendix
Modified:
hbase/trunk/src/docbkx/book.xml
hbase/trunk/src/docbkx/ops_mgt.xml
Modified: hbase/trunk/src/docbkx/book.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/book.xml?rev=1342094&r1=1342093&r2=1342094&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/book.xml (original)
+++ hbase/trunk/src/docbkx/book.xml Wed May 23 23:41:02 2012
@@ -2711,6 +2711,195 @@ myHtd.setValue(HTableDescriptor.SPLIT_PO
</qandaset>
</appendix>
+ <appendix xml:id="hbck.in.depth">
+ <title>hbck In Depth</title>
+ <para>HBaseFsck (hbck) is a tool for checking for region consistency and table integrity problems
+and repairing a corrupted HBase. It works in two basic modes -- a read-only inconsistency
+identifying mode and a multi-phase read-write repair mode.
+ </para>
+ <section>
+ <title>Running hbck to identify inconsistencies</title>
+To check to see if your HBase cluster has corruptions, run hbck against your HBase cluster:
+<programlisting>
+$ ./bin/hbase hbck
+</programlisting>
+ <para>
+At the end of the commands output it prints OK or tells you the number of INCONSISTENCIES
+present. You may also want to run run hbck a few times because some inconsistencies can be
+transient (e.g. cluster is starting up or a region is splitting). Operationally you may want to run
+hbck regularly and setup alert (e.g. via nagios) if it repeatedly reports inconsistencies .
+A run of hbck will report a list of inconsistencies along with a brief description of the regions and
+tables affected. The using the <code>-details</code> option will report more details including a representative
+listing of all the splits present in all the tables.
+ </para>
+<programlisting>
+$ ./bin/hbase hbck -details
+</programlisting>
+ </section>
+ <section><title>Inconsistencies</title>
+ <para>
+ If after several runs, inconsistencies continue to be reported, you may have encountered a
+corruption. These should be rare, but in the event they occur newer versions of HBase include
+the hbck tool enabled with automatic repair options.
+ </para>
+ <para>
+ There are two invariants that when violated create inconsistencies in HBase:
+ </para>
+ <itemizedlist>
+ <listitem>HBaseâs region consistency invariant is satisfied if every region is assigned and
+deployed on exactly one region server, and all places where this state kept is in
+accordance.
+ </listitem>
+ <listitem>HBaseâs table integrity invariant is satisfied if for each table, every possible row key
+resolves to exactly one region.
+ </listitem>
+ </itemizedlist>
+ <para>
+Repairs generally work in three phases -- a read-only information gathering phase that identifies
+inconsistencies, a table integrity repair phase that restores the table integrity invariant, and then
+finally a region consistency repair phase that restores the region consistency invariant.
+Starting from version 0.90.0, hbck could detect region consistency problems report on a subset
+of possible table integrity problems. It also included the ability to automatically fix the most
+common inconsistency, region assignment and deployment consistency problems. This repair
+could be done by using the <code>-fix</code> command line option. These problems close regions if they are
+open on the wrong server or on multiple region servers and also assigns regions to region
+servers if they are not open.
+</para>
+<para>
+Starting from HBase versions 0.90.7, 0.92.2 and 0.94.0, several new command line options are
+introduced to aid repairing a corrupted HBase. This hbck sometimes goes by the nickname
+âuberhbckâ. Each particular version of uber hbck is compatible with the HBaseâs of the same
+major version (0.90.7 uberhbck can repair a 0.90.4). However, versions <=0.90.6 and versions
+<=0.92.1 may require restarting the master or failing over to a backup master.
+</para>
+ </section>
+ <section><title>Localized repairs</title>
+ <para>
+ When repairing a corrupted HBase, it is best to repair the lowest risk inconsistencies first.
+These are generally region consistency repairs -- localized single region repairs, that only modify
+in-memory data, ephemeral zookeeper data, or patch holes in the META table.
+Region consistency requires that the HBase instance has the state of the regionâs data in HDFS
+(.regioninfo files), the regionâs row in the .META. table., and regionâs deployment/assignments on
+region servers and the master in accordance. Options for repairing region consistency include:
+ <itemizedlist>
+ <listitem><code>-fixAssignments</code> (equivalent to the 0.90 <code>-fix</code> option) repairs unassigned, incorrectly
+assigned or multiply assigned regions.
+ </listitem>
+ <listitem><code>-fixMeta</code> which removes meta rows when corresponding regions are not present in
+HDFS and adds new meta rows if they regions are present in HDFS while not in META.
+ </listitem>
+ </itemizedlist>
+ To fix deployment and assignment problems you can run this command:
+</para>
+<programlisting>
+$ ./bin/hbase hbck -fixAssignments
+</programlisting>
+To fix deployment and assignment problems as well as repairing incorrect meta rows you can
+run this command:.
+<programlisting>
+$ ./bin/hbase hbck -fixAssignments -fixMeta
+</programlisting>
+There are a few classes of table integrity problems that are low risk repairs. The first two are
+degenerate (startkey == endkey) regions and backwards regions (startkey > endkey). These are
+automatically handled by sidelining the data to a temporary directory (/hbck/xxxx).
+The third low-risk class is hdfs region holes. This can be repaired by using the:
+ <itemizedlist>
+ <listitem><code>-fixHdfsHoles</code> option for fabricating new empty regions on the file system.
+If holes are detected you can use -fixHdfsHoles and should include -fixMeta and -fixAssignments to make the new region consistent.
+ </listitem>
+ </itemizedlist>
+<programlisting>
+$ ./bin/hbase hbck -fixAssignments -fixMeta -fixHdfsHoles
+</programlisting>
+Since this is a common operation, weâve added a the <code>-repairHoles</code> flag that is equivalent to the
+previous command:
+<programlisting>
+$ ./bin/hbase hbck -repairHoles
+</programlisting>
+If inconsistencies still remain after these steps, you most likely have table integrity problems
+related to orphaned or overlapping regions.
+ </section>
+ <section><title>Region Overlap Repairs</title>
+Table integrity problems can require repairs that deal with overlaps. This is a riskier operation
+because it requires modifications to the file system, requires some decision making, and may
+require some manual steps. For these repairs it is best to analyze the output of a <code>hbck -details</code>
+run so that you isolate repairs attempts only upon problems the checks identify. Because this is
+riskier, there are safeguard that should be used to limit the scope of the repairs.
+WARNING: This is a relatively new and have only been tested on online but idle HBase instances
+(no reads/writes). Use at your own risk in an active production environment!
+The options for repairing table integrity violations include:
+ <itemizedlist>
+ <listitem><code>-fixHdfsOrphans</code> option for âadoptingâ a region directory that is missing a region
+metadata file (the .regioninfo file).
+ </listitem>
+ <listitem><code>-fixHdfsOverlaps</code> ability for fixing overlapping regions
+ </listitem>
+ </itemizedlist>
+When repairing overlapping regions, a regionâs data can be modified on the file system in two
+ways: 1) by merging regions into a larger region or 2) by sidelining regions by moving data to
+âsidelineâ directory where data could be restored later. Merging a large number of regions is
+technically correct but could result in an extremely large region that requires series of costly
+compactions and splitting operations. In these cases, it is probably better to sideline the regions
+that overlap with the most other regions (likely the largest ranges) so that merges can happen on
+a more reasonable scale. Since these sidelined regions are already laid out in HBaseâs native
+directory and HFile format, they can be restored by using HBaseâs bulk load mechanism.
+The default safeguard thresholds are conservative. These options let you override the default
+thresholds and to enable the large region sidelining feature.
+ <itemizedlist>
+ <listitem><code>-maxMerge <n></code> maximum number of overlapping regions to merge
+ </listitem>
+ <listitem><code>-sidelineBigOverlaps</code> if more than maxMerge regions are overlapping, sideline attempt
+to sideline the regions overlapping with the most other regions.
+ </listitem>
+ <listitem><code>-maxOverlapsToSideline <n></code> if sidelining large overlapping regions, sideline at most n
+regions.
+ </listitem>
+ </itemizedlist>
+
+Since often times you would just want to get the tables repaired, you can use this option to turn
+on all repair options:
+ <itemizedlist>
+ <listitem><code>-repair</code> includes all the region consistency options and only the hole repairing table
+integrity options.
+ </listitem>
+ </itemizedlist>
+Finally, there are safeguards to limit repairs to only specific tables. For example the following
+command would only attempt to repair table TableFoo and TableBar.
+<programlisting>
+$ ./bin/hbase/ hbck -repair TableFoo TableBar
+</programlisting>
+ <section><title>Special cases: Meta is not properly assigned</title>
+There are a few special cases that hbck can handle as well.
+Sometimes the meta tableâs only region is inconsistently assigned or deployed. In this case
+there is a special <code>-fixMetaOnly</code> option that can try to fix meta assignments.
+<programlisting>
+$ ./bin/hbase hbck -fixMetaOnly -fixAssignments
+</programlisting>
+ </section>
+ <section><title>Special cases: HBase version file is missing</title>
+HBaseâs data on the file system requires a version file in order to start. If this flie is missing, you
+can use the <code>-fixVersionFile</code> option to fabricating a new HBase version file. This assumes that
+the version of hbck you are running is the appropriate version for the HBase cluster.
+ </section>
+ <section><title>Special case: Root and META are corrupt.</title>
+The most drastic corruption scenario is the case where the ROOT or META is corrupted and
+HBase will not start. In this case you can use the OfflineMetaRepair tool create new ROOT
+and META regions and tables.
+This tool assumes that HBase is offline. It then marches through the existing HBase home
+directory, loads as much information from region metadata files (.regioninfo files) as possible
+from the file system. If the region metadata has proper table integrity, it sidelines the original root
+and meta table directories, and builds new ones with pointers to the region directories and their
+data.
+<programlisting>
+$ ./bin/hbase org.apache.hadoop.hbase.util.OfflineMetaRepair
+</programlisting>
+NOTE: This tool is not as clever as uberhbck but can be used to bootstrap repairs that uberhbck
+can complete.
+If the tool succeeds you should be able to start hbase and run online repairs if necessary.
+ </section>
+ </section>
+ </appendix>
+
<appendix xml:id="compression">
<title >Compression In HBase<indexterm><primary>Compression</primary></indexterm></title>
Modified: hbase/trunk/src/docbkx/ops_mgt.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/ops_mgt.xml?rev=1342094&r1=1342093&r2=1342094&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/ops_mgt.xml (original)
+++ hbase/trunk/src/docbkx/ops_mgt.xml Wed May 23 23:41:02 2012
@@ -69,6 +69,8 @@ Valid program names are:
Passing <command>-fix</command> may correct the inconsistency (This latter
is an experimental feature).
</para>
+ <para>For more information, see <xref linkend="hbck.in.depth"/>.
+ </para>
</section>
<section xml:id="hfile_tool2"><title>HFile Tool</title>
<para>See <xref linkend="hfile_tool" />.</para>