Posted to commits@hbase.apache.org by nd...@apache.org on 2016/01/17 01:07:18 UTC

[4/4] hbase git commit: updating docs from master

updating docs from master


Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/c07ddc6d
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/c07ddc6d
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/c07ddc6d

Branch: refs/heads/branch-1.1
Commit: c07ddc6dbeb225b7145b9d884a6a5cc5022331d7
Parents: 5a1cfc1
Author: Nick Dimiduk <nd...@apache.org>
Authored: Sat Jan 16 16:00:17 2016 -0800
Committer: Nick Dimiduk <nd...@apache.org>
Committed: Sat Jan 16 16:00:17 2016 -0800

----------------------------------------------------------------------
 .../asciidoc/_chapters/appendix_acl_matrix.adoc |   2 +-
 .../appendix_contributing_to_documentation.adoc |  13 +-
 .../_chapters/appendix_hfile_format.adoc        |  22 +-
 src/main/asciidoc/_chapters/architecture.adoc   | 233 +++---
 src/main/asciidoc/_chapters/asf.adoc            |   4 +-
 src/main/asciidoc/_chapters/case_studies.adoc   |   2 +-
 src/main/asciidoc/_chapters/community.adoc      |  34 +-
 src/main/asciidoc/_chapters/compression.adoc    |  40 +-
 src/main/asciidoc/_chapters/configuration.adoc  |  33 +-
 src/main/asciidoc/_chapters/cp.adoc             | 715 +++++++++----------
 src/main/asciidoc/_chapters/datamodel.adoc      |   8 +-
 src/main/asciidoc/_chapters/developer.adoc      |  49 +-
 src/main/asciidoc/_chapters/external_apis.adoc  |   6 +-
 src/main/asciidoc/_chapters/faq.adoc            |  22 +-
 .../asciidoc/_chapters/getting_started.adoc     |   2 +-
 src/main/asciidoc/_chapters/hbase-default.adoc  | 527 +++++++-------
 src/main/asciidoc/_chapters/hbase_history.adoc  |   8 +-
 src/main/asciidoc/_chapters/hbck_in_depth.adoc  |  24 +-
 src/main/asciidoc/_chapters/mapreduce.adoc      |   8 +-
 src/main/asciidoc/_chapters/ops_mgt.adoc        |  54 +-
 src/main/asciidoc/_chapters/other_info.adoc     |  34 +-
 src/main/asciidoc/_chapters/performance.adoc    |  21 +-
 src/main/asciidoc/_chapters/rpc.adoc            |  22 +-
 src/main/asciidoc/_chapters/schema_design.adoc  | 125 +++-
 src/main/asciidoc/_chapters/security.adoc       |  60 +-
 src/main/asciidoc/_chapters/shell.adoc          |   2 +-
 src/main/asciidoc/_chapters/spark.adoc          | 451 ++++++++++++
 src/main/asciidoc/_chapters/tracing.adoc        |  30 +-
 .../asciidoc/_chapters/troubleshooting.adoc     |  12 +-
 src/main/asciidoc/_chapters/unit_testing.adoc   |  32 +-
 src/main/asciidoc/_chapters/upgrading.adoc      |   8 +-
 src/main/asciidoc/_chapters/zookeeper.adoc      |  30 +-
 src/main/asciidoc/book.adoc                     |   1 +
 33 files changed, 1607 insertions(+), 1027 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hbase/blob/c07ddc6d/src/main/asciidoc/_chapters/appendix_acl_matrix.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/appendix_acl_matrix.adoc b/src/main/asciidoc/_chapters/appendix_acl_matrix.adoc
index cb285f3..698ae82 100644
--- a/src/main/asciidoc/_chapters/appendix_acl_matrix.adoc
+++ b/src/main/asciidoc/_chapters/appendix_acl_matrix.adoc
@@ -65,7 +65,7 @@ Possible permissions include the following:
 For the most part, permissions work in an expected way, with the following caveats:
 
 Having Write permission does not imply Read permission.::
-  It is possible and sometimes desirable for a user to be able to write data that same user cannot read. One such example is a log-writing process. 
+  It is possible and sometimes desirable for a user to be able to write data that same user cannot read. One such example is a log-writing process.
 The [systemitem]+hbase:meta+ table is readable by every user, regardless of the user's other grants or restrictions.::
   This is a requirement for HBase to function correctly.
 `CheckAndPut` and `CheckAndDelete` operations will fail if the user does not have both Write and Read permission.::
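
The permission caveats in the hunk above can be illustrated with a toy model. This is a minimal sketch, not HBase's access controller: the `Permission` class and the `can_*` helpers are hypothetical names introduced here, and only the two rules quoted in the text (Write does not imply Read; check-and-mutate operations need both) are modelled.

```python
from enum import Flag


class Permission(Flag):
    """Toy permission bits (hypothetical; not HBase's own permission type)."""
    READ = 1
    WRITE = 2


def can_put(granted: Permission) -> bool:
    # A plain write needs only Write permission.
    return Permission.WRITE in granted


def can_get(granted: Permission) -> bool:
    # A read needs Read permission; Write alone does not grant it.
    return Permission.READ in granted


def can_check_and_put(granted: Permission) -> bool:
    # Per the text, CheckAndPut/CheckAndDelete fail unless the user
    # holds both Write and Read permission.
    required = Permission.READ | Permission.WRITE
    return (granted & required) == required


# A log-writing process that may write data it cannot read back.
log_writer = Permission.WRITE
print(can_put(log_writer), can_get(log_writer), can_check_and_put(log_writer))
# prints: True False False
```

Under this model a write-only grant is enough for the log-writing case, while any check-and-mutate call from the same user is rejected.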

http://git-wip-us.apache.org/repos/asf/hbase/blob/c07ddc6d/src/main/asciidoc/_chapters/appendix_contributing_to_documentation.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/appendix_contributing_to_documentation.adoc b/src/main/asciidoc/_chapters/appendix_contributing_to_documentation.adoc
index 1b674e7..4588e95 100644
--- a/src/main/asciidoc/_chapters/appendix_contributing_to_documentation.adoc
+++ b/src/main/asciidoc/_chapters/appendix_contributing_to_documentation.adoc
@@ -125,7 +125,7 @@ This directory also stores images used in the HBase Reference Guide.
 
 The website's pages are written in an HTML-like XML dialect called xdoc, which
 has a reference guide at
-link:http://maven.apache.org/archives/maven-1.x/plugins/xdoc/reference/xdocs.html.
+http://maven.apache.org/archives/maven-1.x/plugins/xdoc/reference/xdocs.html.
 You can edit these files in a plain-text editor, an IDE, or an XML editor such
 as XML Mind XML Editor (XXE) or Oxygen XML Author.
 
@@ -159,7 +159,7 @@ artifacts to the 0.94/ directory of the `asf-site` branch.
 
 The HBase Reference Guide is written in Asciidoc and built using link:http://asciidoctor.org[AsciiDoctor].
 The following cheat sheet is included for your reference. More nuanced and comprehensive documentation
-is available at link:http://asciidoctor.org/docs/user-manual/.
+is available at http://asciidoctor.org/docs/user-manual/.
 
 .AsciiDoc Cheat Sheet
 [cols="1,1,a",options="header"]
@@ -186,7 +186,8 @@ is available at link:http://asciidoctor.org/docs/user-manual/.
 include\::path/to/app.rb[]
 ----
 ................
-| Include only part of a separate file | Similar to Javadoc | See link:http://asciidoctor.org/docs/user-manual/#by-tagged-regions
+| Include only part of a separate file | Similar to Javadoc
+| See http://asciidoctor.org/docs/user-manual/#by-tagged-regions
 | Filenames, directory names, new terms | italic | \_hbase-default.xml_
 | External naked URLs | A link with the URL as link text |
 ----
@@ -285,7 +286,11 @@ Title:: content
 Title::
   content
 ----
-| Sidebars, quotes, or other blocks of text | a block of text, formatted differently from the default | Delimited using different delimiters, see link:http://asciidoctor.org/docs/user-manual/#built-in-blocks-summary. Some of the examples above use delimiters like \...., ----,====.
+| Sidebars, quotes, or other blocks of text
+| a block of text, formatted differently from the default
+| Delimited using different delimiters,
+see http://asciidoctor.org/docs/user-manual/#built-in-blocks-summary.
+Some of the examples above use delimiters like \...., ----,====.
 ........
 [example]
 ====

http://git-wip-us.apache.org/repos/asf/hbase/blob/c07ddc6d/src/main/asciidoc/_chapters/appendix_hfile_format.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/appendix_hfile_format.adoc b/src/main/asciidoc/_chapters/appendix_hfile_format.adoc
index d73ddfb..18eafe6 100644
--- a/src/main/asciidoc/_chapters/appendix_hfile_format.adoc
+++ b/src/main/asciidoc/_chapters/appendix_hfile_format.adoc
@@ -192,8 +192,11 @@ This format applies to intermediate-level and leaf index blocks of a version 2 m
 Every non-root index block is structured as follows.
 
 . numEntries: the number of entries (int).
-. entryOffsets: the ``secondary index'' of offsets of entries in the block, to facilitate a quick binary search on the key (numEntries + 1 int values). The last value is the total length of all entries in this index block.
-  For example, in a non-root index block with entry sizes 60, 80, 50 the ``secondary index'' will contain the following int array: {0, 60, 140, 190}.
+. entryOffsets: the "secondary index" of offsets of entries in the block, to facilitate
+  a quick binary search on the key (`numEntries + 1` int values). The last value
+  is the total length of all entries in this index block. For example, in a non-root
+  index block with entry sizes 60, 80, 50 the "secondary index" will contain the
+  following int array: `{0, 60, 140, 190}`.
 . Entries.
   Each entry contains:
 +
@@ -222,7 +225,7 @@ In contrast with version 1, in a version 2 HFile Bloom filter metadata is stored
 
 ==== File Info format in versions 1 and 2
 
-The file info block is a serialized link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/HbaseMapWritable.html[HbaseMapWritable] (essentially a map from byte arrays to byte arrays) with the following keys, among others.
+The file info block is a serialized map from byte arrays to byte arrays, with the following keys, among others.
 StoreFile-level logic adds more keys to this.
 
 [cols="1,1", frame="all"]
@@ -232,9 +235,11 @@ StoreFile-level logic adds more keys to this.
 |hfile.AVG_VALUE_LEN| The average value length in the file (int)
 |===
 
-File info format did not change in version 2.
-However, we moved the file info to the final section of the file, which can be loaded as one block at the time the HFile is being opened.
-Also, we do not store comparator in the version 2 file info anymore.
+In version 2, we did not change the file info format, but we moved the file info to
+the final section of the file, which can be loaded as one block when the HFile
+is being opened.
+
+Also, we do not store the comparator in the version 2 file info anymore.
 Instead, we store it in the fixed file trailer.
 This is because we need to know the comparator at the time of parsing the load-on-open section of the HFile.
 
@@ -249,7 +254,8 @@ However, the version is always stored as the last four-byte integer in the file.
 |===
 | Version 1 | Version 2
 | |File info offset (long)
-| Data index offset (long)| loadOnOpenOffset (long) /The offset of the sectionthat we need toload when opening the file./
+| Data index offset (long)
+| loadOnOpenOffset (long) /The offset of the section that we need to load when opening the file./
 | | Number of data index entries (int)
 | metaIndexOffset (long) /This field is not being used by the version 1 reader, so we removed it from version 2./ | uncompressedDataIndexSize (long) /The total uncompressed size of the whole data block index, including root-level, intermediate-level, and leaf-level blocks./
 | | Number of meta index entries (int)
@@ -257,7 +263,7 @@ However, the version is always stored as the last four-byte integer in the file.
 | numEntries (int) | numEntries (long)
 | Compression codec: 0 = LZO, 1 = GZ, 2 = NONE (int) | Compression codec: 0 = LZO, 1 = GZ, 2 = NONE (int)
 | | The number of levels in the data block index (int)
-| | firstDataBlockOffset (long) /The offset of the first first data block. Used when scanning./
+| | firstDataBlockOffset (long) /The offset of the first data block. Used when scanning./
 | | lastDataBlockEnd (long) /The offset of the first byte after the last key/value data block. We don't need to go beyond this offset when scanning./
 | Version: 1 (int) | Version: 2 (int)
 |===
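
The `entryOffsets` description in the first hunk of this file amounts to a running sum over the entry sizes. The following is a minimal sketch of that arithmetic only, not HBase's index-block writer; the function names `entry_offsets` and `entry_span` are assumptions made for illustration.

```python
from itertools import accumulate


def entry_offsets(entry_sizes):
    """Return numEntries + 1 offsets: the start of each entry,
    followed by the total length of all entries."""
    return [0, *accumulate(entry_sizes)]


def entry_span(offsets, i):
    """Return the (start, end) byte range of entry i within the entries area."""
    return offsets[i], offsets[i + 1]


offsets = entry_offsets([60, 80, 50])
print(offsets)                 # [0, 60, 140, 190]
print(entry_span(offsets, 1))  # (60, 140)
```

Because every entry's start and end are both available from the array, a reader can jump straight to entry `i` during a binary search without scanning the preceding entries.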

http://git-wip-us.apache.org/repos/asf/hbase/blob/c07ddc6d/src/main/asciidoc/_chapters/architecture.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/architecture.adoc b/src/main/asciidoc/_chapters/architecture.adoc
index 8122e11..103f624 100644
--- a/src/main/asciidoc/_chapters/architecture.adoc
+++ b/src/main/asciidoc/_chapters/architecture.adoc
@@ -41,7 +41,8 @@ Technically speaking, HBase is really more a "Data Store" than "Data Base" becau
 However, HBase has many features which supports both linear and modular scaling.
 HBase clusters expand by adding RegionServers that are hosted on commodity class servers.
 If a cluster expands from 10 to 20 RegionServers, for example, it doubles both in terms of storage and as well as processing capacity.
-RDBMS can scale well, but only up to a point - specifically, the size of a single database server - and for the best performance requires specialized hardware and storage devices.
+An RDBMS can scale well, but only up to a point - specifically, the size of a single database
+server - and for the best performance requires specialized hardware and storage devices.
 HBase features of note are:
 
 * Strongly consistent reads/writes:  HBase is not an "eventually consistent" DataStore.
@@ -140,7 +141,7 @@ If a region has both an empty start and an empty end key, it is the only region
 
 In the (hopefully unlikely) event that programmatic processing of catalog metadata
 is required, see the
-link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/Writables.html#getHRegionInfo%28byte[]%29[Writables]
++++<a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/Writables.html#getHRegionInfo%28byte%5B%5D%29">Writables</a>+++
 utility.
 
 [[arch.catalog.startup]]
@@ -172,7 +173,7 @@ The API changed in HBase 1.0. For connection configuration information, see <<cl
 
 ==== API as of HBase 1.0.0
 
-Its been cleaned up and users are returned Interfaces to work against rather than particular types.
+The API has been cleaned up, and users are returned interfaces to work against rather than particular types.
 In HBase 1.0, obtain a `Connection` object from `ConnectionFactory` and thereafter, get from it instances of `Table`, `Admin`, and `RegionLocator` on an as-need basis.
 When done, close the obtained instances.
 Finally, be sure to cleanup your `Connection` instance before exiting.
@@ -295,7 +296,11 @@ scan.setFilter(list);
 [[client.filter.cv.scvf]]
 ==== SingleColumnValueFilter
 
-link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html[SingleColumnValueFilter] can be used to test column values for equivalence (`CompareOp.EQUAL`), inequality (`CompareOp.NOT_EQUAL`), or ranges (e.g., `CompareOp.GREATER`). The following is example of testing equivalence a column to a String value "my value"...
+A SingleColumnValueFilter (see:
+http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html)
+can be used to test column values for equivalence (`CompareOp.EQUAL`),
+inequality (`CompareOp.NOT_EQUAL`), or ranges (e.g., `CompareOp.GREATER`). The following is an
+example of testing equivalence of a column to a String value "my value"...
 
 [source,java]
 ----
@@ -694,7 +699,8 @@ Here are others that you may have to take into account:
 
 Catalog Tables::
   The `-ROOT-` (prior to HBase 0.96, see <<arch.catalog.root,arch.catalog.root>>) and `hbase:meta` tables are forced into the block cache and have the in-memory priority which means that they are harder to evict.
-  The former never uses more than a few hundreds bytes while the latter can occupy a few MBs (depending on the number of regions).
+  The former never uses more than a few hundred bytes while the latter can occupy a few MBs
+  (depending on the number of regions).
 
 HFiles Indexes::
   An _HFile_ is the file format that HBase uses to store data in HDFS.
@@ -878,7 +884,10 @@ image::region_split_process.png[Region Split Process]
 . The Master learns about this znode, since it has a watcher for the parent `region-in-transition` znode.
 . The RegionServer creates a sub-directory named `.splits` under the parent’s `region` directory in HDFS.
 . The RegionServer closes the parent region and marks the region as offline in its local data structures. *THE SPLITTING REGION IS NOW OFFLINE.* At this point, client requests coming to the parent region will throw `NotServingRegionException`. The client will retry with some backoff. The closing region is flushed.
-. The  RegionServer creates region directories under the `.splits` directory, for daughter regions A and B, and creates necessary data structures. Then it splits the store files, in the sense that it creates two link:http://www.google.com/url?q=http%3A%2F%2Fhbase.apache.org%2Fapidocs%2Forg%2Fapache%2Fhadoop%2Fhbase%2Fio%2FReference.html&sa=D&sntz=1&usg=AFQjCNEkCbADZ3CgKHTtGYI8bJVwp663CA[Reference] files per store file in the parent region. Those reference files will point to the parent regions'files.
+. The RegionServer creates region directories under the `.splits` directory, for daughter
+regions A and B, and creates necessary data structures. Then it splits the store files,
+in the sense that it creates two Reference files per store file in the parent region.
+Those reference files will point to the parent region's files.
 . The RegionServer creates the actual region directory in HDFS, and moves the reference files for each daughter.
 . The RegionServer sends a `Put` request to the `.META.` table, to set the parent as offline in the `.META.` table and add information about daughter regions. At this point, there won’t be individual entries in `.META.` for the daughters. Clients will see that the parent region is split if they scan `.META.`, but won’t know about the daughters until they appear in `.META.`. Also, if this `Put` to `.META`. succeeds, the parent will be effectively split. If the RegionServer fails before this RPC succeeds, Master and the next Region Server opening the region will clean dirty state about the region split. After the `.META.` update, though, the region split will be rolled-forward by Master.
 . The RegionServer opens daughters A and B in parallel.
@@ -931,7 +940,7 @@ To configure MultiWAL for a RegionServer, set the value of the property `hbase.w
 </property>
 ----
 
-Restart the RegionServer for the changes to take effect. 
+Restart the RegionServer for the changes to take effect.
 
 To disable MultiWAL for a RegionServer, unset the property and restart the RegionServer.
 
@@ -1008,7 +1017,8 @@ If you set the `hbase.hlog.split.skip.errors` option to `true`, errors are treat
 * Processing of the WAL will continue
 
 If the `hbase.hlog.split.skip.errors` option is set to `false`, the default, the exception will be propagated and the split will be logged as failed.
-See link:https://issues.apache.org/jira/browse/HBASE-2958[HBASE-2958 When hbase.hlog.split.skip.errors is set to false, we fail the split but thats it].
+See link:https://issues.apache.org/jira/browse/HBASE-2958[HBASE-2958 When
+hbase.hlog.split.skip.errors is set to false, we fail the split but that's it].
 We need to do more than just fail split if this flag is set.
 
 ====== How EOFExceptions are treated when splitting a crashed RegionServer's WALs
@@ -1117,7 +1127,8 @@ Based on the state of the task whose data is changed, the split log manager does
 Each RegionServer runs a daemon thread called the _split log worker_, which does the work to split the logs.
 The daemon thread starts when the RegionServer starts, and registers itself to watch HBase znodes.
 If any splitlog znode children change, it notifies a sleeping worker thread to wake up and grab more tasks.
-If if a worker's current task's node data is changed, the worker checks to see if the task has been taken by another worker.
+If a worker's current task's node data is changed,
+the worker checks to see if the task has been taken by another worker.
 If so, the worker thread stops work on the current task.
 +
 The worker monitors the splitlog znode constantly.
@@ -1127,7 +1138,7 @@ At this point, the split log worker scans for another unclaimed task.
 +
 .How the Split Log Worker Approaches a Task
 * It queries the task state and only takes action if the task is in `TASK_UNASSIGNED `state.
-* If the task is is in `TASK_UNASSIGNED` state, the worker attempts to set the state to `TASK_OWNED` by itself.
+* If the task is in `TASK_UNASSIGNED` state, the worker attempts to set the state to `TASK_OWNED` by itself.
   If it fails to set the state, another worker will try to grab it.
   The split log manager will also ask all workers to rescan later if the task remains unassigned.
 * If the worker succeeds in taking ownership of the task, it tries to get the task state again to make sure it really gets it asynchronously.
@@ -1135,7 +1146,7 @@ At this point, the split log worker scans for another unclaimed task.
 ** Get the HBase root folder, create a temp folder under the root, and split the log file to the temp folder.
 ** If the split was successful, the task executor sets the task to state `TASK_DONE`.
 ** If the worker catches an unexpected IOException, the task is set to state `TASK_ERR`.
-** If the worker is shutting down, set the the task to state `TASK_RESIGNED`.
+** If the worker is shutting down, set the task to state `TASK_RESIGNED`.
 ** If the task is taken by another worker, just log it.
 
 
@@ -1326,7 +1337,7 @@ image::region_states.png[]
 . Before assigning a region, the master moves the region to `OFFLINE` state automatically if it is in `CLOSED` state.
 . When a RegionServer is about to split a region, it notifies the master.
   The master moves the region to be split from `OPEN` to `SPLITTING` state and add the two new regions to be created to the RegionServer.
-  These two regions are in `SPLITING_NEW` state initially.
+  These two regions are in `SPLITTING_NEW` state initially.
 . After notifying the master, the RegionServer starts to split the region.
   Once past the point of no return, the RegionServer notifies the master again so the master can update the `hbase:meta` table.
   However, the master does not update the region states until it is notified by the server that the split is done.
@@ -1404,8 +1415,8 @@ hbase> create 'test', {METHOD => 'table_att', CONFIG => {'SPLIT_POLICY' => 'org.
 ----
 
 The default split policy can be overwritten using a custom
-link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/RegionSplitPolicy.html
-[RegionSplitPolicy(HBase 0.94+)]. Typically a custom split policy should extend HBase's default split policy:
+link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/RegionSplitPolicy.html[RegionSplitPolicy (HBase 0.94+)].
+Typically a custom split policy should extend HBase's default split policy:
 link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.html[ConstantSizeRegionSplitPolicy].
 
 The policy can be set globally through the HBaseConfiguration used or on a per table basis:
@@ -1806,60 +1817,116 @@ This list is not exhaustive.
 To tune these parameters from the defaults, edit the _hbase-default.xml_ file.
 For a full list of all configuration parameters available, see <<config.files,config.files>>
 
-[cols="1,1a,1", options="header"]
-|===
-| Parameter
-| Description
-| Default
-
-|`hbase.hstore.compaction.min`
-| The minimum number of StoreFiles which must be eligible for compaction before compaction can run. The goal of tuning `hbase.hstore.compaction.min` is to avoid ending up with too many tiny StoreFiles to compact. Setting this value to 2 would cause a minor compaction each time you have two StoreFiles in a Store, and this is probably not appropriate. If you set this value too high, all the other values will need to be adjusted accordingly. For most cases, the default value is appropriate. In previous versions of HBase, the parameter hbase.hstore.compaction.min was called `hbase.hstore.compactionThreshold`.
-|3
-
-|`hbase.hstore.compaction.max`
-| The maximum number of StoreFiles which will be selected for a single minor compaction, regardless of the number of eligible StoreFiles. Effectively, the value of hbase.hstore.compaction.max controls the length of time it takes a single compaction to complete. Setting it larger means that more StoreFiles are included in a compaction. For most cases, the default value is appropriate.
-|10
-
-|`hbase.hstore.compaction.min.size`
-| A StoreFile smaller than this size will always be eligible for minor compaction. StoreFiles this size or larger are evaluated by `hbase.hstore.compaction.ratio` to determine if they are eligible. Because this limit represents the "automatic include" limit for all StoreFiles smaller than this value, this value may need to be reduced in write-heavy environments where many files in the 1-2 MB range are being flushed, because every StoreFile will be targeted for compaction and the resulting StoreFiles may still be under the minimum size and require further compaction. If this parameter is lowered, the ratio check is triggered more quickly. This addressed some issues seen in earlier versions of HBase but changing this parameter is no longer necessary in most situations.
-|128 MB
-
-|`hbase.hstore.compaction.max.size`
-| An StoreFile larger than this size will be excluded from compaction. The effect of raising `hbase.hstore.compaction.max.size` is fewer, larger StoreFiles that do not get compacted often. If you feel that compaction is happening too often without much benefit, you can try raising this value.
-|`Long.MAX_VALUE`
-
-|`hbase.hstore.compaction.ratio`
-| For minor compaction, this ratio is used to determine whether a given StoreFile which is larger than `hbase.hstore.compaction.min.size` is eligible for compaction. Its effect is to limit compaction of large StoreFile. The value of `hbase.hstore.compaction.ratio` is expressed as a floating-point decimal.
-
-* A large ratio, such as 10, will produce a single giant StoreFile. Conversely, a value of .25, will produce behavior similar to the BigTable compaction algorithm, producing four StoreFiles.
-* A moderate value of between 1.0 and 1.4 is recommended. When tuning this value, you are balancing write costs with read costs. Raising the value (to something like 1.4) will have more write costs, because you will compact larger StoreFiles. However, during reads, HBase will need to seek through fewer StoreFiles to accomplish the read. Consider this approach if you cannot take advantage of <<bloom>>.
-* Alternatively, you can lower this value to something like 1.0 to reduce the background cost of writes, and use  to limit the number of StoreFiles touched during reads. For most cases, the default value is appropriate.
-| `1.2F`
-
-|`hbase.hstore.compaction.ratio.offpeak`
-| The compaction ratio used during off-peak compactions, if off-peak hours are also configured (see below). Expressed as a floating-point decimal. This allows for more aggressive (or less aggressive, if you set it lower than `hbase.hstore.compaction.ratio`) compaction during a set time period. Ignored if off-peak is disabled (default). This works the same as hbase.hstore.compaction.ratio.
-| `5.0F`
+`hbase.hstore.compaction.min`::
+  The minimum number of StoreFiles which must be eligible for compaction before compaction can run.
+  The goal of tuning `hbase.hstore.compaction.min` is to avoid ending up with too many tiny StoreFiles
+  to compact. Setting this value to 2 would cause a minor compaction each time you have two StoreFiles
+  in a Store, and this is probably not appropriate. If you set this value too high, all the other
+  values will need to be adjusted accordingly. For most cases, the default value is appropriate.
+  In previous versions of HBase, the parameter `hbase.hstore.compaction.min` was called
+  `hbase.hstore.compactionThreshold`.
++
+*Default*: 3
+
+`hbase.hstore.compaction.max`::
+  The maximum number of StoreFiles which will be selected for a single minor compaction,
+  regardless of the number of eligible StoreFiles. Effectively, the value of
+  `hbase.hstore.compaction.max` controls the length of time it takes a single
+  compaction to complete. Setting it larger means that more StoreFiles are included
+  in a compaction. For most cases, the default value is appropriate.
++
+*Default*: 10
+
+`hbase.hstore.compaction.min.size`::
+  A StoreFile smaller than this size will always be eligible for minor compaction.
+  StoreFiles this size or larger are evaluated by `hbase.hstore.compaction.ratio`
+  to determine if they are eligible. Because this limit represents the "automatic
+  include" limit for all StoreFiles smaller than this value, this value may need
+  to be reduced in write-heavy environments where many files in the 1-2 MB range
+  are being flushed, because every StoreFile will be targeted for compaction and
+  the resulting StoreFiles may still be under the minimum size and require further
+  compaction. If this parameter is lowered, the ratio check is triggered more quickly.
+  This addressed some issues seen in earlier versions of HBase but changing this
+  parameter is no longer necessary in most situations.
++
+*Default*: 128 MB
 
-| `hbase.offpeak.start.hour`
-| The start of off-peak hours, expressed as an integer between 0 and 23, inclusive. Set to -1 to disable off-peak.
-| `-1` (disabled)
+`hbase.hstore.compaction.max.size`::
+  A StoreFile larger than this size will be excluded from compaction. The effect of
+  raising `hbase.hstore.compaction.max.size` is fewer, larger StoreFiles that do not
+  get compacted often. If you feel that compaction is happening too often without
+  much benefit, you can try raising this value.
++
+*Default*: `Long.MAX_VALUE`
 
-| `hbase.offpeak.end.hour`
-| The end of off-peak hours, expressed as an integer between 0 and 23, inclusive. Set to -1 to disable off-peak.
-| `-1` (disabled)
+`hbase.hstore.compaction.ratio`::
+  For minor compaction, this ratio is used to determine whether a given StoreFile
+  which is larger than `hbase.hstore.compaction.min.size` is eligible for compaction.
+  Its effect is to limit compaction of large StoreFiles. The value of
+  `hbase.hstore.compaction.ratio` is expressed as a floating-point decimal.
++
+* A large ratio, such as 10, will produce a single giant StoreFile. Conversely,
+  a value of .25 will produce behavior similar to the BigTable compaction algorithm,
+  producing four StoreFiles.
+* A moderate value of between 1.0 and 1.4 is recommended. When tuning this value,
+  you are balancing write costs with read costs. Raising the value (to something like
+  1.4) will have more write costs, because you will compact larger StoreFiles.
+  However, during reads, HBase will need to seek through fewer StoreFiles to
+  accomplish the read. Consider this approach if you cannot take advantage of <<bloom>>.
+* Alternatively, you can lower this value to something like 1.0 to reduce the
+  background cost of writes, and use <<bloom>> to limit the number of StoreFiles touched
+  during reads. For most cases, the default value is appropriate.
++
+*Default*: `1.2F`
+
+`hbase.hstore.compaction.ratio.offpeak`::
+  The compaction ratio used during off-peak compactions, if off-peak hours are
+  also configured (see below). Expressed as a floating-point decimal. This allows
+  for more aggressive (or less aggressive, if you set it lower than
+  `hbase.hstore.compaction.ratio`) compaction during a set time period. Ignored
+  if off-peak is disabled (default). This works the same as
+  `hbase.hstore.compaction.ratio`.
++
+*Default*: `5.0F`
 
-| `hbase.regionserver.thread.compaction.throttle`
-| There are two different thread pools for compactions, one for large compactions and the other for small compactions. This helps to keep compaction of lean tables (such as `hbase:meta`) fast. If a compaction is larger than this threshold, it goes into the large compaction pool. In most cases, the default value is appropriate.
-| `2 x hbase.hstore.compaction.max x hbase.hregion.memstore.flush.size` (which defaults to `128`)
+`hbase.offpeak.start.hour`::
+  The start of off-peak hours, expressed as an integer between 0 and 23, inclusive.
+  Set to -1 to disable off-peak.
++
+*Default*: `-1` (disabled)
 
-| `hbase.hregion.majorcompaction`
-| Time between major compactions, expressed in milliseconds. Set to 0 to disable time-based automatic major compactions. User-requested and size-based major compactions will still run. This value is multiplied by `hbase.hregion.majorcompaction.jitter` to cause compaction to start at a somewhat-random time during a given window of time.
-| 7 days (`604800000` milliseconds)
+`hbase.offpeak.end.hour`::
+  The end of off-peak hours, expressed as an integer between 0 and 23, inclusive.
+  Set to -1 to disable off-peak.
++
+*Default*: `-1` (disabled)
+
+`hbase.regionserver.thread.compaction.throttle`::
+  There are two different thread pools for compactions, one for large compactions
+  and the other for small compactions. This helps to keep compaction of lean tables
+  (such as `hbase:meta`) fast. If a compaction is larger than this threshold,
+  it goes into the large compaction pool. In most cases, the default value is
+  appropriate.
++
+*Default*: `2 x hbase.hstore.compaction.max x hbase.hregion.memstore.flush.size`
+(which defaults to `128`)
+
+`hbase.hregion.majorcompaction`::
+  Time between major compactions, expressed in milliseconds. Set to 0 to disable
+  time-based automatic major compactions. User-requested and size-based major
+  compactions will still run. This value is multiplied by
+  `hbase.hregion.majorcompaction.jitter` to cause compaction to start at a
+  somewhat-random time during a given window of time.
++
+*Default*: 7 days (`604800000` milliseconds)
 
-| `hbase.hregion.majorcompaction.jitter`
-| A multiplier applied to hbase.hregion.majorcompaction to cause compaction to occur a given amount of time either side of `hbase.hregion.majorcompaction`. The smaller the number, the closer the compactions will happen to the `hbase.hregion.majorcompaction` interval. Expressed as a floating-point decimal.
-| `.50F`
-|===
+`hbase.hregion.majorcompaction.jitter`::
+  A multiplier applied to hbase.hregion.majorcompaction to cause compaction to
+  occur a given amount of time either side of `hbase.hregion.majorcompaction`.
+  The smaller the number, the closer the compactions will happen to the
+  `hbase.hregion.majorcompaction` interval. Expressed as a floating-point decimal.
++
+*Default*: `.50F`
 
 [[compaction.file.selection.old]]
 ===== Compaction File Selection
@@ -1916,8 +1983,8 @@ Why?
 * 100 -> No, because sum(50, 23, 12, 12) * 1.0 = 97.
 * 50 -> No, because sum(23, 12, 12) * 1.0 = 47.
 * 23 -> Yes, because sum(12, 12) * 1.0 = 24.
-* 12 -> Yes, because the previous file has been included, and because this does not exceed the the max-file limit of 5
-* 12 -> Yes, because the previous file had been included, and because this does not exceed the the max-file limit of 5.
+* 12 -> Yes, because the previous file has been included, and because this does not exceed the max-file limit of 5
+* 12 -> Yes, because the previous file had been included, and because this does not exceed the max-file limit of 5.
 
 [[compaction.file.selection.example2]]
 ====== Minor Compaction File Selection - Example #2 (Not Enough Files ToCompact)
@@ -2178,7 +2245,7 @@ See link:http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and
 [[arch.bulk.load.adv]]
 === Advanced Usage
 
-Although the `importtsv` tool is useful in many cases, advanced users may want to generate data programatically, or import data from other formats.
+Although the `importtsv` tool is useful in many cases, advanced users may want to generate data programmatically, or import data from other formats.
 To get started doing so, dig into `ImportTsv.java` and check the JavaDoc for HFileOutputFormat.
 
 The import step of the bulk load can also be done programmatically.
@@ -2274,8 +2341,8 @@ In terms of semantics, TIMELINE consistency as implemented by HBase differs from
 .Timeline Consistency
 image::timeline_consistency.png[Timeline Consistency]
 
-To better understand the TIMELINE semantics, lets look at the above diagram.
-Lets say that there are two clients, and the first one writes x=1 at first, then x=2 and x=3 later.
+To better understand the TIMELINE semantics, let's look at the above diagram.
+Let's say that there are two clients, and the first one writes x=1 at first, then x=2 and x=3 later.
 As above, all writes are handled by the primary region replica.
 The writes are saved in the write ahead log (WAL), and replicated to the other replicas asynchronously.
 In the above diagram, notice that replica_id=1 received 2 updates, and its data shows that x=2, while the replica_id=2 only received a single update, and its data shows that x=1.
@@ -2308,18 +2375,18 @@ To serve the region data from multiple replicas, HBase opens the regions in seco
 The regions opened in secondary mode will share the same data files with the primary region replica, however each secondary region replica will have its own MemStore to keep the unflushed data (only primary region can do flushes). Also to serve reads from secondary regions, the blocks of data files may be also cached in the block caches for the secondary regions.
 
 === Where is the code
-This feature is delivered in two phases, Phase 1 and 2. The first phase is done in time for HBase-1.0.0 release. Meaning that using HBase-1.0.x, you can use all the features that are marked for Phase 1. Phase 2 is committed in HBase-1.1.0, meaning all HBase versions after 1.1.0 should contain Phase 2 items. 
+This feature is delivered in two phases, Phase 1 and 2. The first phase is done in time for HBase-1.0.0 release. Meaning that using HBase-1.0.x, you can use all the features that are marked for Phase 1. Phase 2 is committed in HBase-1.1.0, meaning all HBase versions after 1.1.0 should contain Phase 2 items.
 
 === Propagating writes to region replicas
-As discussed above writes only go to the primary region replica. For propagating the writes from the primary region replica to the secondaries, there are two different mechanisms. For read-only tables, you do not need to use any of the following methods. Disabling and enabling the table should make the data available in all region replicas. For mutable tables, you have to use *only* one of the following mechanisms: storefile refresher, or async wal replication. The latter is recommeded. 
+As discussed above writes only go to the primary region replica. For propagating the writes from the primary region replica to the secondaries, there are two different mechanisms. For read-only tables, you do not need to use any of the following methods. Disabling and enabling the table should make the data available in all region replicas. For mutable tables, you have to use *only* one of the following mechanisms: storefile refresher, or async wal replication. The latter is recommended.
 
 ==== StoreFile Refresher
-The first mechanism is store file refresher which is introduced in HBase-1.0+. Store file refresher is a thread per region server, which runs periodically, and does a refresh operation for the store files of the primary region for the secondary region replicas. If enabled, the refresher will ensure that the secondary region replicas see the new flushed, compacted or bulk loaded files from the primary region in a timely manner. However, this means that only flushed data can be read back from the secondary region replicas, and after the refresher is run, making the secondaries lag behind the primary for an a longer time. 
+The first mechanism is store file refresher which is introduced in HBase-1.0+. Store file refresher is a thread per region server, which runs periodically, and does a refresh operation for the store files of the primary region for the secondary region replicas. If enabled, the refresher will ensure that the secondary region replicas see the new flushed, compacted or bulk loaded files from the primary region in a timely manner. However, this means that only flushed data can be read back from the secondary region replicas, and after the refresher is run, making the secondaries lag behind the primary for an a longer time.
 
-For turning this feature on, you should configure `hbase.regionserver.storefile.refresh.period` to a non-zero value. See Configuration section below. 
+For turning this feature on, you should configure `hbase.regionserver.storefile.refresh.period` to a non-zero value. See Configuration section below.
 
 ==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via “Async WAL Replication” feature and is only available in HBase-1.1+. This works similarly to HBase’s multi-datacenter replication, but instead the data from a region is replicated to the secondary regions. Each secondary replica always receives and observes the writes in the same order that the primary region committed them. In some sense, this design can be thought of as “in-cluster replication”, where instead of replicating to a different datacenter, the data goes to secondary regions to keep secondary region’s in-memory state up to date. The data files are shared between the primary region and the other replicas, so that there is no extra storage overhead. However, the secondary regions will have recent non-flushed data in their memstores, which increases the memory overhead. The primary region writes flush, compaction, and bulk load events to its WAL as well, which are also replicated through wal replication to secondaries. When they observe the flush/compaction or bulk load event, the secondary regions replay the event to pick up the new files and drop the old ones.  
+The second mechanism for propagation of writes to secondaries is done via “Async WAL Replication” feature and is only available in HBase-1.1+. This works similarly to HBase’s multi-datacenter replication, but instead the data from a region is replicated to the secondary regions. Each secondary replica always receives and observes the writes in the same order that the primary region committed them. In some sense, this design can be thought of as “in-cluster replication”, where instead of replicating to a different datacenter, the data goes to secondary regions to keep secondary region’s in-memory state up to date. The data files are shared between the primary region and the other replicas, so that there is no extra storage overhead. However, the secondary regions will have recent non-flushed data in their memstores, which increases the memory overhead. The primary region writes flush, compaction, and bulk load events to its WAL as well, which are also replicated through wal replication to secondaries. When they observe the flush/compaction or bulk load event, the secondary regions replay the event to pick up the new files and drop the old ones.
 
 Committing writes in the same order as in primary ensures that the secondaries won’t diverge from the primary regions data, but since the log replication is asynchronous, the data might still be stale in secondary regions. Since this feature works as a replication endpoint, the performance and latency characteristics is expected to be similar to inter-cluster replication.
 
@@ -2332,18 +2399,18 @@ Asyn WAL Replication feature will add a new replication peer named `region_repli
 	hbase> disable_peer 'region_replica_replication'
 ----
 
-=== Store File TTL 
-In both of the write propagation approaches mentioned above, store files of the primary will be opened in secondaries independent of the primary region. So for files that the primary compacted away, the secondaries might still be referring to these files for reading. Both features are using HFileLinks to refer to files, but there is no protection (yet) for guaranteeing that the file will not be deleted prematurely. Thus, as a guard, you should set the configuration property `hbase.master.hfilecleaner.ttl` to a larger value, such as 1 hour to guarantee that you will not receive IOExceptions for requests going to replicas. 
+=== Store File TTL
+In both of the write propagation approaches mentioned above, store files of the primary will be opened in secondaries independent of the primary region. So for files that the primary compacted away, the secondaries might still be referring to these files for reading. Both features are using HFileLinks to refer to files, but there is no protection (yet) for guaranteeing that the file will not be deleted prematurely. Thus, as a guard, you should set the configuration property `hbase.master.hfilecleaner.ttl` to a larger value, such as 1 hour to guarantee that you will not receive IOExceptions for requests going to replicas.
 
 === Region replication for META table’s region
-Currently, Async WAL Replication is not done for the META table’s WAL. The meta table’s secondary replicas still refreshes themselves from the persistent store files. Hence the `hbase.regionserver.meta.storefile.refresh.period` needs to be set to a certain non-zero value for refreshing the meta store files. Note that this configuration is configured differently than 
-`hbase.regionserver.storefile.refresh.period`. 
+Currently, Async WAL Replication is not done for the META table’s WAL. The meta table’s secondary replicas still refreshes themselves from the persistent store files. Hence the `hbase.regionserver.meta.storefile.refresh.period` needs to be set to a certain non-zero value for refreshing the meta store files. Note that this configuration is configured differently than
+`hbase.regionserver.storefile.refresh.period`.
 
 === Memory accounting
 The secondary region replicas refer to the data files of the primary region replica, but they have their own memstores (in HBase-1.1+) and uses block cache as well. However, one distinction is that the secondary region replicas cannot flush the data when there is memory pressure for their memstores. They can only free up memstore memory when the primary region does a flush and this flush is replicated to the secondary. Since in a region server hosting primary replicas for some regions and secondaries for some others, the secondaries might cause extra flushes to the primary regions in the same host. In extreme situations, there can be no memory left for adding new writes coming from the primary via wal replication. For unblocking this situation (and since secondary cannot flush by itself), the secondary is allowed to do a “store file refresh” by doing a file system list operation to pick up new files from primary, and possibly dropping its memstore. This refresh will only be performed if the memstore size of the biggest secondary region replica is at least `hbase.region.replica.storefile.refresh.memstore.multiplier` (default 4) times bigger than the biggest memstore of a primary replica. One caveat is that if this is performed, the secondary can observe partial row updates across column families (since column families are flushed independently). The default should be good to not do this operation frequently. You can set this value to a large number to disable this feature if desired, but be warned that it might cause the replication to block forever.
 
 === Secondary replica failover
-When a secondary region replica first comes online, or fails over, it may have served some edits from it’s memstore. Since the recovery is handled differently for secondary replicas, the secondary has to ensure that it does not go back in time before it starts serving requests after assignment. For doing that, the secondary waits until it observes a full flush cycle (start flush, commit flush) or a “region open event” replicated from the primary. Until this happens, the secondary region replica will reject all read requests by throwing an IOException with message “The region's reads are disabled”. However, the other replicas will probably still be available to read, thus not causing any impact for the rpc with TIMELINE consistency. To facilitate faster recovery, the secondary region will trigger a flush request from the primary when it is opened. The configuration property `hbase.region.replica.wait.for.primary.flush` (enabled by default) can be used to disable this feature if needed. 
+When a secondary region replica first comes online, or fails over, it may have served some edits from its memstore. Since the recovery is handled differently for secondary replicas, the secondary has to ensure that it does not go back in time before it starts serving requests after assignment. For doing that, the secondary waits until it observes a full flush cycle (start flush, commit flush) or a “region open event” replicated from the primary. Until this happens, the secondary region replica will reject all read requests by throwing an IOException with message “The region's reads are disabled”. However, the other replicas will probably still be available to read, thus not causing any impact for the rpc with TIMELINE consistency. To facilitate faster recovery, the secondary region will trigger a flush request from the primary when it is opened. The configuration property `hbase.region.replica.wait.for.primary.flush` (enabled by default) can be used to disable this feature if needed.
 
 
 
@@ -2352,7 +2419,7 @@ When a secondary region replica first comes online, or fails over, it may have s
 
 To use highly available reads, you should set the following properties in `hbase-site.xml` file.
 There is no specific configuration to enable or disable region replicas.
-Instead you can change the number of region replicas per table to increase or decrease at the table creation or with alter table. The following configuration is for using async wal replication and using meta replicas of 3. 
+Instead you can change the number of region replicas per table to increase or decrease at the table creation or with alter table. The following configuration is for using async wal replication and using meta replicas of 3.
 
 
 ==== Server side properties
@@ -2379,7 +2446,7 @@ Instead you can change the number of region replicas per table to increase or de
     <name>hbase.region.replica.replication.enabled</name>
     <value>true</value>
     <description>
-      Whether asynchronous WAL replication to the secondary region replicas is enabled or not. If this is enabled, a replication peer named "region_replica_replication" will be created which will tail the logs and replicate the mutatations to region replicas for tables that have region replication > 1. If this is enabled once, disabling this replication also      requires disabling the replication peer using shell or ReplicationAdmin java class. Replication to secondary region replicas works over standard inter-cluster replication. So replication, if disabled explicitly, also has to be enabled by setting "hbase.replication"· to true for this feature to work.
+      Whether asynchronous WAL replication to the secondary region replicas is enabled or not. If this is enabled, a replication peer named "region_replica_replication" will be created which will tail the logs and replicate the mutations to region replicas for tables that have region replication > 1. If this is enabled once, disabling this replication also      requires disabling the replication peer using shell or ReplicationAdmin java class. Replication to secondary region replicas works over standard inter-cluster replication. So replication, if disabled explicitly, also has to be enabled by setting "hbase.replication"· to true for this feature to work.
     </description>
 </property>
 <property>
@@ -2413,7 +2480,7 @@ Instead you can change the number of region replicas per table to increase or de
 </property>
 
 
-<property> 
+<property>
     <name>hbase.region.replica.storefile.refresh.memstore.multiplier</name>
     <value>4</value>
     <description>
@@ -2476,7 +2543,7 @@ Ensure to set the following for all clients (and servers) that will use region r
 </property>
 ----
 
-Note HBase-1.0.x users should use `hbase.ipc.client.allowsInterrupt` rather than `hbase.ipc.client.specificThreadForWriting`. 
+Note HBase-1.0.x users should use `hbase.ipc.client.allowsInterrupt` rather than `hbase.ipc.client.specificThreadForWriting`.
 
 === User Interface
 
@@ -2547,7 +2614,7 @@ hbase> scan 't1', {CONSISTENCY => 'TIMELINE'}
 
 ==== Java
 
-You can set set the consistency for Gets and Scans and do requests as follows.
+You can set the consistency for Gets and Scans and do requests as follows.
 
 [source,java]
 ----

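Editorial sketch (not part of the commit): the minor compaction file selection walk-through in the architecture.adoc hunk above (StoreFile sizes 100, 50, 23, 12, 12 with a ratio of 1.0 and a max-file limit of 5) can be illustrated roughly as follows. This is a simplified model, not the HBase implementation: the function name, the `min_files=3` default, and the choice to sum all newer files are assumptions made for illustration, and the min/max size thresholds are left out entirely.

```python
def select_for_minor_compaction(sizes, ratio=1.0, min_files=3, max_files=5):
    """Pick StoreFiles for a minor compaction (simplified, hypothetical model).

    `sizes` lists StoreFile sizes ordered from oldest to newest.
    A file qualifies when its size is no larger than the combined size of
    all newer files multiplied by `ratio`. Once one file qualifies, every
    newer file is taken as well, up to `max_files`.
    """
    for start in range(len(sizes)):
        newer_total = sum(sizes[start + 1:])
        if sizes[start] <= newer_total * ratio:
            break
    else:
        # No file satisfied the ratio test (also covers an empty list).
        return []

    selected = sizes[start:start + max_files]
    # Assumed rule: skip the compaction if too few files were selected.
    return selected if len(selected) >= min_files else []


# The walk-through from the hunk: 100 and 50 fail the ratio test, 23 passes.
print(select_for_minor_compaction([100, 50, 23, 12, 12]))  # [23, 12, 12]
```

With these assumptions, 100 is rejected because 100 > 97, 50 is rejected because 50 > 47, and 23 is accepted because 23 <= 24, which pulls in both trailing 12s. Raising `ratio` makes older, larger files qualify sooner, which matches the description of `hbase.hstore.compaction.ratio` producing larger StoreFiles at a higher write cost.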
http://git-wip-us.apache.org/repos/asf/hbase/blob/c07ddc6d/src/main/asciidoc/_chapters/asf.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/asf.adoc b/src/main/asciidoc/_chapters/asf.adoc
index 77eed8f..47c29e5 100644
--- a/src/main/asciidoc/_chapters/asf.adoc
+++ b/src/main/asciidoc/_chapters/asf.adoc
@@ -35,13 +35,13 @@ HBase is a project in the Apache Software Foundation and as such there are respo
 [[asf.devprocess]]
 === ASF Development Process
 
-See the link:http://www.apache.org/dev/#committers[Apache Development Process page]            for all sorts of information on how the ASF is structured (e.g., PMC, committers, contributors), to tips on contributing and getting involved, and how open-source works at ASF. 
+See the link:http://www.apache.org/dev/#committers[Apache Development Process page]            for all sorts of information on how the ASF is structured (e.g., PMC, committers, contributors), to tips on contributing and getting involved, and how open-source works at ASF.
 
 [[asf.reporting]]
 === ASF Board Reporting
 
 Once a quarter, each project in the ASF portfolio submits a report to the ASF board.
 This is done by the HBase project lead and the committers.
-See link:http://www.apache.org/foundation/board/reporting[ASF board reporting] for more information. 
+See link:http://www.apache.org/foundation/board/reporting[ASF board reporting] for more information.
 
 :numbered:

http://git-wip-us.apache.org/repos/asf/hbase/blob/c07ddc6d/src/main/asciidoc/_chapters/case_studies.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/case_studies.adoc b/src/main/asciidoc/_chapters/case_studies.adoc
index 992414c..b021aa2 100644
--- a/src/main/asciidoc/_chapters/case_studies.adoc
+++ b/src/main/asciidoc/_chapters/case_studies.adoc
@@ -55,7 +55,7 @@ These jobs were consistently found to be waiting on map and reduce tasks assigne
 
 .Datanodes:
 * Two 12-core processors
-* Six Enerprise SATA disks
+* Six Enterprise SATA disks
 * 24GB of RAM
 * Two bonded gigabit NICs
 

http://git-wip-us.apache.org/repos/asf/hbase/blob/c07ddc6d/src/main/asciidoc/_chapters/community.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/community.adoc b/src/main/asciidoc/_chapters/community.adoc
index 573fb49..ba07df7 100644
--- a/src/main/asciidoc/_chapters/community.adoc
+++ b/src/main/asciidoc/_chapters/community.adoc
@@ -45,18 +45,18 @@ See link:http://search-hadoop.com/m/asM982C5FkS1[HBase, mail # dev - Thoughts
 
 The below policy is something we put in place 09/2012.
 It is a suggested policy rather than a hard requirement.
-We want to try it first to see if it works before we cast it in stone. 
+We want to try it first to see if it works before we cast it in stone.
 
 Apache HBase is made of link:https://issues.apache.org/jira/browse/HBASE#selectedTab=com.atlassian.jira.plugin.system.project%3Acomponents-panel[components].
 Components have one or more <<owner,OWNER>>s.
-See the 'Description' field on the link:https://issues.apache.org/jira/browse/HBASE#selectedTab=com.atlassian.jira.plugin.system.project%3Acomponents-panel[components]        JIRA page for who the current owners are by component. 
+See the 'Description' field on the link:https://issues.apache.org/jira/browse/HBASE#selectedTab=com.atlassian.jira.plugin.system.project%3Acomponents-panel[components]        JIRA page for who the current owners are by component.
 
 Patches that fit within the scope of a single Apache HBase component require, at least, a +1 by one of the component's owners before commit.
-If owners are absent -- busy or otherwise -- two +1s by non-owners will suffice. 
+If owners are absent -- busy or otherwise -- two +1s by non-owners will suffice.
 
-Patches that span components need at least two +1s before they can be committed, preferably +1s by owners of components touched by the x-component patch (TODO: This needs tightening up but I think fine for first pass). 
+Patches that span components need at least two +1s before they can be committed, preferably +1s by owners of components touched by the x-component patch (TODO: This needs tightening up but I think fine for first pass).
 
-Any -1 on a patch by anyone vetos a patch; it cannot be committed until the justification for the -1 is addressed. 
+Any -1 on a patch by anyone vetoes a patch; it cannot be committed until the justification for the -1 is addressed.
 
 [[hbase.fix.version.in.jira]]
 .How to set fix version in JIRA on issue resolve
@@ -67,13 +67,13 @@ If master is going to be 0.98.0 then:
 * Commit only to master: Mark with 0.98
 * Commit to 0.95 and master: Mark with 0.98, and 0.95.x
 * Commit to 0.94.x and 0.95, and master: Mark with 0.98, 0.95.x, and 0.94.x
-* Commit to 89-fb: Mark with 89-fb. 
-* Commit site fixes: no version 
+* Commit to 89-fb: Mark with 89-fb.
+* Commit site fixes: no version
 
 [[hbase.when.to.close.jira]]
 .Policy on when to set a RESOLVED JIRA as CLOSED
 
-We link:http://search-hadoop.com/m/4cIKs1iwXMS1[agreed] that for issues that list multiple releases in their _Fix Version/s_ field, CLOSE the issue on the release of any of the versions listed; subsequent change to the issue must happen in a new JIRA. 
+We link:http://search-hadoop.com/m/4cIKs1iwXMS1[agreed] that for issues that list multiple releases in their _Fix Version/s_ field, CLOSE the issue on the release of any of the versions listed; subsequent change to the issue must happen in a new JIRA.
 
 [[no.permanent.state.in.zk]]
 .Only transient state in ZooKeeper!
@@ -81,7 +81,7 @@ We link:http://search-hadoop.com/m/4cIKs1iwXMS1[agreed] that for issues that lis
 You should be able to kill the data in zookeeper and hbase should ride over it recreating the zk content as it goes.
 This is an old adage around these parts.
 We just made note of it now.
-We also are currently in violation of this basic tenet -- replication at least keeps permanent state in zk -- but we are working to undo this breaking of a golden rule. 
+We also are currently in violation of this basic tenet -- replication at least keeps permanent state in zk -- but we are working to undo this breaking of a golden rule.
 
 [[community.roles]]
 == Community Roles
@@ -90,22 +90,22 @@ We also are currently in violation of this basic tenet -- replication at least k
 .Component Owner/Lieutenant
 
 Component owners are listed in the description field on this Apache HBase JIRA link:https://issues.apache.org/jira/browse/HBASE#selectedTab=com.atlassian.jira.plugin.system.project%3Acomponents-panel[components]        page.
-The owners are listed in the 'Description' field rather than in the 'Component Lead' field because the latter only allows us list one individual whereas it is encouraged that components have multiple owners. 
+The owners are listed in the 'Description' field rather than in the 'Component Lead' field because the latter only allows us list one individual whereas it is encouraged that components have multiple owners.
 
-Owners or component lieutenants are volunteers who are (usually, but not necessarily) expert in their component domain and may have an agenda on how they think their Apache HBase component should evolve. 
+Owners or component lieutenants are volunteers who are (usually, but not necessarily) expert in their component domain and may have an agenda on how they think their Apache HBase component should evolve.
 
-. Owners will try and review patches that land within their component's scope. 
-. If applicable, if an owner has an agenda, they will publish their goals or the design toward which they are driving their component 
+. Owners will try and review patches that land within their component's scope.
+. If applicable, if an owner has an agenda, they will publish their goals or the design toward which they are driving their component
 
 If you would like to be volunteer as a component owner, just write the dev list and we'll sign you up.
-Owners do not need to be committers. 
+Owners do not need to be committers.
 
 [[hbase.commit.msg.format]]
 == Commit Message format
 
-We link:http://search-hadoop.com/m/Gwxwl10cFHa1[agreed] to the following Git commit message format: 
+We link:http://search-hadoop.com/m/Gwxwl10cFHa1[agreed] to the following Git commit message format:
 [source]
 ----
 HBASE-xxxxx <title>. (<contributor>)
----- 
-If the person making the commit is the contributor, leave off the '(<contributor>)' element. 
+----
+If the person making the commit is the contributor, leave off the '(<contributor>)' element.

http://git-wip-us.apache.org/repos/asf/hbase/blob/c07ddc6d/src/main/asciidoc/_chapters/compression.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/compression.adoc b/src/main/asciidoc/_chapters/compression.adoc
index 42d4de5..462bce3 100644
--- a/src/main/asciidoc/_chapters/compression.adoc
+++ b/src/main/asciidoc/_chapters/compression.adoc
@@ -144,15 +144,15 @@ In general, you need to weigh your options between smaller size and faster compr
 
 The Hadoop shared library has a bunch of facility including compression libraries and fast crc'ing. To make this facility available to HBase, do the following. HBase/Hadoop will fall back to use alternatives if it cannot find the native library versions -- or fail outright if you asking for an explicit compressor and there is no alternative available.
 
-If you see the following in your HBase logs, you know that HBase was unable to locate the Hadoop native libraries: 
+If you see the following in your HBase logs, you know that HBase was unable to locate the Hadoop native libraries:
 [source]
 ----
 2014-08-07 09:26:20,139 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-----      
-If the libraries loaded successfully, the WARN message does not show. 
+----
+If the libraries loaded successfully, the WARN message does not show.
 
-Lets presume your Hadoop shipped with a native library that suits the platform you are running HBase on.
-To check if the Hadoop native library is available to HBase, run the following tool (available in  Hadoop 2.1 and greater): 
+Let's presume your Hadoop shipped with a native library that suits the platform you are running HBase on.
+To check if the Hadoop native library is available to HBase, run the following tool (available in  Hadoop 2.1 and greater):
 [source]
 ----
 $ ./bin/hbase --config ~/conf_hbase org.apache.hadoop.util.NativeLibraryChecker
@@ -165,28 +165,28 @@ lz4:    false
 bzip2:  false
 2014-08-26 13:15:38,863 INFO  [main] util.ExitUtil: Exiting with status 1
 ----
-Above shows that the native hadoop library is not available in HBase context. 
+Above shows that the native hadoop library is not available in HBase context.
 
 To fix the above, either copy the Hadoop native libraries local or symlink to them if the Hadoop and HBase stalls are adjacent in the filesystem.
 You could also point at their location by setting the `LD_LIBRARY_PATH` environment variable.
 
-Where the JVM looks to find native librarys is "system dependent" (See `java.lang.System#loadLibrary(name)`). On linux, by default, is going to look in _lib/native/PLATFORM_ where `PLATFORM`      is the label for the platform your HBase is installed on.
+Where the JVM looks to find native libraries is "system dependent" (See `java.lang.System#loadLibrary(name)`). On linux, by default, is going to look in _lib/native/PLATFORM_ where `PLATFORM`      is the label for the platform your HBase is installed on.
 On a local linux machine, it seems to be the concatenation of the java properties `os.name` and `os.arch` followed by whether 32 or 64 bit.
 HBase on startup prints out all of the java system properties so find the os.name and os.arch in the log.
-For example: 
+For example:
 [source]
 ----
 ...
 2014-08-06 15:27:22,853 INFO  [main] zookeeper.ZooKeeper: Client environment:os.name=Linux
 2014-08-06 15:27:22,853 INFO  [main] zookeeper.ZooKeeper: Client environment:os.arch=amd64
 ...
-----     
+----
 So in this case, the PLATFORM string is `Linux-amd64-64`.
 Copying the Hadoop native libraries or symlinking at _lib/native/Linux-amd64-64_     will ensure they are found.
 Check with the Hadoop _NativeLibraryChecker_.
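The naming rule described above can be sketched in shell. This is only an illustration: `platform_string` is a hypothetical helper, not part of HBase, and it assumes the directory name really is `os.name`, `os.arch` and the bitness joined by hyphens, which the text above only reports as what it "seems to be" on a local Linux machine.

```shell
# Hypothetical helper (not an HBase tool): join os.name, os.arch and the
# bitness with hyphens, as the observation above suggests.
platform_string() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

# Values read from the log excerpt above: os.name=Linux, os.arch=amd64.
PLATFORM=$(platform_string Linux amd64 64)

# Print the directory the native libraries would be copied or symlinked to.
echo "lib/native/$PLATFORM"
```

Take the two property values from the HBase startup log rather than from `uname`, since the JVM's `os.arch` (for example `amd64`) can differ from what the operating system reports.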
- 
 
-Here is example of how to point at the Hadoop libs with `LD_LIBRARY_PATH`      environment variable: 
+
+Here is an example of how to point at the Hadoop libs with the `LD_LIBRARY_PATH` environment variable:
 [source]
 ----
 $ LD_LIBRARY_PATH=~/hadoop-2.5.0-SNAPSHOT/lib/native ./bin/hbase --config ~/conf_hbase org.apache.hadoop.util.NativeLibraryChecker
@@ -199,7 +199,7 @@ snappy: true /usr/lib64/libsnappy.so.1
 lz4:    true revision:99
 bzip2:  true /lib64/libbz2.so.1
 ----
-Set in _hbase-env.sh_ the LD_LIBRARY_PATH environment variable when starting your HBase. 
+Set in _hbase-env.sh_ the LD_LIBRARY_PATH environment variable when starting your HBase.
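A minimal sketch of what such an entry in _hbase-env.sh_ might look like. The fallback value for `HADOOP_HOME` is a placeholder assumption, and the `lib/native` subdirectory follows the layout used in the symlink example elsewhere in this chapter; adjust both to your installation.

```shell
# Assumption: HADOOP_HOME points at the Hadoop install. The fallback path
# below is a placeholder for illustration only.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"

# Prepend the Hadoop native library directory, keeping any entries that
# are already present in LD_LIBRARY_PATH.
export LD_LIBRARY_PATH="$HADOOP_HOME/lib/native${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

After restarting HBase, rerunning the _NativeLibraryChecker_ command shown above should confirm whether the libraries are now found.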
 
 === Compressor Configuration, Installation, and Use
 
@@ -210,13 +210,13 @@ Before HBase can use a given compressor, its libraries need to be available.
 Due to licensing issues, only GZ compression is available to HBase (via native Java libraries) in a default installation.
 Other compression libraries are available via the shared library bundled with your hadoop.
 The hadoop native library needs to be findable when HBase starts.
-See 
+See
 
 .Compressor Support On the Master
 
 A new configuration setting was introduced in HBase 0.95, to check the Master to determine which data block encoders are installed and configured on it, and assume that the entire cluster is configured the same.
 This option, `hbase.master.check.compression`, defaults to `true`.
-This prevents the situation described in link:https://issues.apache.org/jira/browse/HBASE-6370[HBASE-6370], where a table is created or modified to support a codec that a region server does not support, leading to failures that take a long time to occur and are difficult to debug. 
+This prevents the situation described in link:https://issues.apache.org/jira/browse/HBASE-6370[HBASE-6370], where a table is created or modified to support a codec that a region server does not support, leading to failures that take a long time to occur and are difficult to debug.
 
 If `hbase.master.check.compression` is enabled, libraries for all desired compressors need to be installed and configured on the Master, even if the Master does not run a region server.
 
@@ -232,7 +232,7 @@ See <<brand.new.compressor,brand.new.compressor>>).
 
 HBase cannot ship with LZO because of incompatibility between HBase, which uses an Apache Software License (ASL) and LZO, which uses a GPL license.
 See the link:http://wiki.apache.org/hadoop/UsingLzoCompression[Using LZO
-              Compression] wiki page for information on configuring LZO support for HBase. 
+              Compression] wiki page for information on configuring LZO support for HBase.
 
 If you depend upon LZO compression, consider configuring your RegionServers to fail to start if LZO is not available.
 See <<hbase.regionserver.codecs,hbase.regionserver.codecs>>.
@@ -244,19 +244,19 @@ LZ4 support is bundled with Hadoop.
 Make sure the hadoop shared library (libhadoop.so) is accessible when you start HBase.
 After configuring your platform (see <<hbase.native.platform,hbase.native.platform>>), you can make a symbolic link from HBase to the native Hadoop libraries.
 This assumes the two software installs are colocated.
-For example, if my 'platform' is Linux-amd64-64: 
+For example, if my 'platform' is Linux-amd64-64:
 [source,bourne]
 ----
 $ cd $HBASE_HOME
 $ mkdir lib/native
 $ ln -s $HADOOP_HOME/lib/native lib/native/Linux-amd64-64
-----            
+----
 Use the compression tool to check that LZ4 is installed on all nodes.
 Start up (or restart) HBase.
-Afterward, you can create and alter tables to enable LZ4 as a compression codec.: 
+Afterward, you can create and alter tables to enable LZ4 as a compression codec:
 ----
 hbase(main):003:0> alter 'TestTable', {NAME => 'info', COMPRESSION => 'LZ4'}
-----          
+----
 
 [[snappy.compression.installation]]
 .Install Snappy Support
@@ -347,7 +347,7 @@ You must specify either `-write` or `-update-read` as your first parameter, and
 ====
 ----
 
-$ bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -h            
+$ bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -h
 usage: bin/hbase org.apache.hadoop.hbase.util.LoadTestTool <options>
 Options:
  -batchupdate                 Whether to use batch as opposed to separate

http://git-wip-us.apache.org/repos/asf/hbase/blob/c07ddc6d/src/main/asciidoc/_chapters/configuration.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/configuration.adoc b/src/main/asciidoc/_chapters/configuration.adoc
index 5a4a6ec..495232f 100644
--- a/src/main/asciidoc/_chapters/configuration.adoc
+++ b/src/main/asciidoc/_chapters/configuration.adoc
@@ -162,7 +162,7 @@ For example, assuming that a schema had 3 ColumnFamilies per region with an aver
 +
 Another related setting is the number of processes a user is allowed to run at once. In Linux and Unix, the number of processes is set using the `ulimit -u` command. This should not be confused with the `nproc` command, which controls the number of CPUs available to a given user. Under load, a `ulimit -u` that is too low can cause OutOfMemoryError exceptions. See Jack Levin's major HDFS issues thread on the hbase-users mailing list, from 2011.
 +
-Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user's ulimit configuration, look at the first line of the HBase log for that instance. A useful read setting config on you hadoop cluster is Aaron Kimballs' Configuration Parameters: What can you just ignore?
+Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user's ulimit configuration, look at the first line of the HBase log for that instance. A useful read setting config on your hadoop cluster is Aaron Kimball's Configuration Parameters: What can you just ignore?
 +
 .`ulimit` Settings on Ubuntu
 ====
@@ -222,7 +222,8 @@ Use the following legend to interpret this table:
 |Hadoop-2.3.x | NT | S | NT | NT | NT
 |Hadoop-2.4.x | NT | S | S | S | S
 |Hadoop-2.5.x | NT | S | S | S | S
-|Hadoop-2.6.x | X | X | X | X | X
+|Hadoop-2.6.0 | X | X | X | X | X
+|Hadoop-2.6.1+ | NT | NT | NT | NT | S
 |Hadoop-2.7.0 | X | X | X | X | X
 |Hadoop-2.7.1+ | NT | NT | NT | NT | S
 |===
@@ -233,7 +234,7 @@ Use the following legend to interpret this table:
 Hadoop distributions based on the 2.6.x line *must* have
 link:https://issues.apache.org/jira/browse/HADOOP-11710[HADOOP-11710] applied if you plan to run
 HBase on top of an HDFS Encryption Zone. Failure to do so will result in cluster failure and
-data loss.
+data loss. This patch is present in Apache Hadoop releases 2.6.1+.
 ====
 
 .Hadoop 2.7.x
@@ -410,7 +411,7 @@ Zookeeper binds to a well known port so clients may talk to HBase.
 
 === Distributed
 
-Distributed mode can be subdivided into distributed but all daemons run on a single node -- a.k.a _pseudo-distributed_ -- and _fully-distributed_ where the daemons are spread across all nodes in the cluster.
+Distributed mode can be subdivided into distributed but all daemons run on a single node -- a.k.a. _pseudo-distributed_ -- and _fully-distributed_ where the daemons are spread across all nodes in the cluster.
 The _pseudo-distributed_ vs. _fully-distributed_ nomenclature comes from Hadoop.
 
 Pseudo-distributed mode can run against the local filesystem or it can run against an instance of the _Hadoop Distributed File System_ (HDFS). Fully-distributed mode can ONLY run on HDFS.
@@ -540,7 +541,7 @@ HBase logs can be found in the _logs_ subdirectory.
 Check them out especially if HBase had trouble starting.
 
 HBase also puts up a UI listing vital attributes.
-By default it's deployed on the Master host at port 16010 (HBase RegionServers listen on port 16020 by default and put up an informational HTTP server at port 16030). If the Master is running on a host named `master.example.org` on the default port, point your browser at _http://master.example.org:16010_ to see the web interface.
+By default it's deployed on the Master host at port 16010 (HBase RegionServers listen on port 16020 by default and put up an informational HTTP server at port 16030). If the Master is running on a host named `master.example.org` on the default port, point your browser at pass:[http://master.example.org:16010] to see the web interface.
 
 Prior to HBase 0.98 the master UI was deployed on port 60010, and the HBase RegionServers UI on port 60030.
 
@@ -564,7 +565,7 @@ If you are running a distributed operation, be sure to wait until HBase has shut
 === _hbase-site.xml_ and _hbase-default.xml_
 
 Just as in Hadoop where you add site-specific HDFS configuration to the _hdfs-site.xml_ file, for HBase, site specific customizations go into the file _conf/hbase-site.xml_.
-For the list of configurable properties, see <<hbase_default_configurations,hbase default configurations>> below or view the raw _hbase-default.xml_ source file in the HBase source code at _src/main/resources_. 
+For the list of configurable properties, see <<hbase_default_configurations,hbase default configurations>> below or view the raw _hbase-default.xml_ source file in the HBase source code at _src/main/resources_.
 
 Not all configuration options make it out to _hbase-default.xml_.
 Configuration options that are thought unlikely to be changed may exist only in code; the only way to discover them is to read the source code itself.
@@ -572,7 +573,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 Currently, changes here will require a cluster restart for HBase to notice the change.
 // hbase/src/main/asciidoc
 //
-include::../../../../target/asciidoc/hbase-default.adoc[]
+include::{docdir}/../../../target/asciidoc/hbase-default.adoc[]
 
 
 [[hbase.env.sh]]
@@ -604,7 +605,7 @@ ZooKeeper is where all these values are kept.
 Thus clients require the location of the ZooKeeper ensemble before they can do anything else.
 Usually the ensemble location is kept in the _hbase-site.xml_ and is picked up by the client from the `CLASSPATH`.
 
-If you are configuring an IDE to run a HBase client, you should include the _conf/_ directory on your classpath so _hbase-site.xml_ settings can be found (or add _src/test/resources_ to pick up the hbase-site.xml used by tests). 
+If you are configuring an IDE to run an HBase client, you should include the _conf/_ directory on your classpath so _hbase-site.xml_ settings can be found (or add _src/test/resources_ to pick up the hbase-site.xml used by tests).
 
 Minimally, a client of HBase needs several libraries in its `CLASSPATH` when connecting to a cluster, including:
 [source]
@@ -621,7 +622,7 @@ slf4j-log4j (slf4j-log4j12-1.5.8.jar)
 zookeeper (zookeeper-3.4.2.jar)
 ----
 
-An example basic _hbase-site.xml_ for client only might look as follows: 
+An example basic _hbase-site.xml_ for client only might look as follows:
 [source,xml]
 ----
 <?xml version="1.0"?>
@@ -917,7 +918,7 @@ See <<master.processes.loadbalancer,master.processes.loadbalancer>> for more inf
 ==== Disabling Blockcache
 
 Do not turn off block cache (You'd do it by setting `hbase.block.cache.size` to zero). Currently we do not do well if you do this because the RegionServer will spend all its time loading HFile indices over and over again.
-If your working set it such that block cache does you no good, at least size the block cache such that HFile indices will stay up in the cache (you can get a rough idea on the size you need by surveying RegionServer UIs; you'll see index block size accounted near the top of the webpage).
+If your working set is such that block cache does you no good, at least size the block cache such that HFile indices will stay up in the cache (you can get a rough idea on the size you need by surveying RegionServer UIs; you'll see index block size accounted near the top of the webpage).
 
 [[nagles]]
 ==== link:http://en.wikipedia.org/wiki/Nagle's_algorithm[Nagle's] or the small package problem
@@ -930,7 +931,7 @@ You might also see the graphs on the tail of link:https://issues.apache.org/jira
 ==== Better Mean Time to Recover (MTTR)
 
 This section is about configurations that will make servers come back faster after a failure.
-See the Deveraj Das an Nicolas Liochon blog post link:http://hortonworks.com/blog/introduction-to-hbase-mean-time-to-recover-mttr/[Introduction to HBase Mean Time to Recover (MTTR)] for a brief introduction.
+See the Deveraj Das and Nicolas Liochon blog post link:http://hortonworks.com/blog/introduction-to-hbase-mean-time-to-recover-mttr/[Introduction to HBase Mean Time to Recover (MTTR)] for a brief introduction.
 
 The issue link:https://issues.apache.org/jira/browse/HBASE-8389[HBASE-8354 forces Namenode into loop with lease recovery requests] is messy but has a bunch of good discussion toward the end on low timeouts and how to effect faster recovery including citation of fixes added to HDFS. Read the Varun Sharma comments.
 The below suggested configurations are Varun's suggestions distilled and tested.
@@ -1002,7 +1003,7 @@ See the link:http://docs.oracle.com/javase/6/docs/technotes/guides/management/ag
 Historically, besides the port mentioned above, JMX opens two additional random TCP listening ports, which could lead to port conflicts. (See link:https://issues.apache.org/jira/browse/HBASE-10289[HBASE-10289] for details.)
 
 As an alternative, you can use the coprocessor-based JMX implementation provided by HBase.
-To enable it in 0.99 or above, add below property in _hbase-site.xml_: 
+To enable it in 0.99 or above, add below property in _hbase-site.xml_:
 
 [source,xml]
 ----
@@ -1033,7 +1034,7 @@ The registry port can be shared with connector port in most cases, so you only n
 However if you want to use SSL communication, the 2 ports must be configured to different values.
 
 By default, password authentication and SSL communication are disabled.
-To enable password authentication, you need to update _hbase-env.sh_          like below: 
+To enable password authentication, you need to update _hbase-env.sh_          like below:
 [source,bash]
 ----
 export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.authenticate=true                  \
@@ -1060,7 +1061,7 @@ keytool -export -alias jconsole -keystore myKeyStore -file jconsole.cert
 keytool -import -alias jconsole -keystore jconsoleKeyStore -file jconsole.cert
 ----
 
-And then update _hbase-env.sh_ like below: 
+And then update _hbase-env.sh_ like below:
 
 [source,bash]
 ----
@@ -1082,12 +1083,12 @@ Finally start `jconsole` on the client using the key store:
 jconsole -J-Djavax.net.ssl.trustStore=/home/tianq/jconsoleKeyStore
 ----
 
-NOTE: To enable the HBase JMX implementation on Master, you also need to add below property in _hbase-site.xml_: 
+NOTE: To enable the HBase JMX implementation on Master, you also need to add below property in _hbase-site.xml_:
 
 [source,xml]
 ----
 <property>
-  <ame>hbase.coprocessor.master.classes</name>
+  <name>hbase.coprocessor.master.classes</name>
   <value>org.apache.hadoop.hbase.JMXListener</value>
 </property>
 ----