Posted to commits@hbase.apache.org by st...@apache.org on 2011/04/01 07:45:10 UTC

svn commit: r1087588 - in /hbase/trunk: CHANGES.txt src/docbkx/book.xml

Author: stack
Date: Fri Apr  1 05:45:10 2011
New Revision: 1087588

URL: http://svn.apache.org/viewvc?rev=1087588&view=rev
Log:
HBASE-3715 Book.xml - adding architecture section on client, adding section on spec-ex under mapreduce

Modified:
    hbase/trunk/CHANGES.txt
    hbase/trunk/src/docbkx/book.xml

Modified: hbase/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hbase/trunk/CHANGES.txt?rev=1087588&r1=1087587&r2=1087588&view=diff
==============================================================================
--- hbase/trunk/CHANGES.txt (original)
+++ hbase/trunk/CHANGES.txt Fri Apr  1 05:45:10 2011
@@ -117,6 +117,8 @@ Release 0.91.0 - Unreleased
    HBASE-3720  Book.xml - porting conceptual-view / physical-view sections of
                HBaseArchitecture wiki (Doug Meil via Stack)
    HBASE-3705  Allow passing timestamp into importtsv (Andy Sautins via Stack)
+   HBASE-3715  Book.xml - adding architecture section on client, adding section
+               on spec-ex under mapreduce (Doug Meil via Stack)
 
   TASK
    HBASE-3559  Move report of split to master OFF the heartbeat channel

Modified: hbase/trunk/src/docbkx/book.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/book.xml?rev=1087588&r1=1087587&r2=1087588&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/book.xml (original)
+++ hbase/trunk/src/docbkx/book.xml Fri Apr  1 05:45:10 2011
@@ -124,6 +124,16 @@ throws InterruptedException, IOException
   }</programlisting>
    </para>
     </section>
+  <section xml:id="mapreduce.specex">
+  <title>Speculative Execution</title>
+  <para>It is generally advisable to turn off speculative execution for
+      MapReduce jobs that use HBase as a source.  This can be done either on a
+      per-Job basis through properties, or on the entire cluster.  Especially
+      for longer-running jobs, speculative execution will create duplicate
+      map tasks which will double-write your data to HBase; this is probably
+      not what you want.
+  </para>
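+  <para>As a minimal sketch of the cluster-wide option, assuming the
+      Hadoop 0.20-era property names (verify them against your Hadoop
+      version), speculative execution could be disabled in
+      <filename>mapred-site.xml</filename> with the fragment below.  The same
+      two properties can instead be set on an individual job's configuration.
+  <programlisting>
+&lt;property&gt;
+  &lt;name&gt;mapred.map.tasks.speculative.execution&lt;/name&gt;
+  &lt;value&gt;false&lt;/value&gt;
+&lt;/property&gt;
+&lt;property&gt;
+  &lt;name&gt;mapred.reduce.tasks.speculative.execution&lt;/name&gt;
+  &lt;value&gt;false&lt;/value&gt;
+&lt;/property&gt;</programlisting>
+  </para>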
+  </section>
   </chapter>
 
   <chapter xml:id="schema">
@@ -694,6 +704,30 @@ public static byte[][] getHexSplits(Stri
 
   <chapter xml:id="architecture">
     <title>Architecture</title>
+
+	<section xml:id="client">
+	 <title>Client</title>
+     <para>The HBase client
+         <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html">HTable</link>
+         is responsible for finding RegionServers that are serving the
+         particular row range of interest.  It does this by querying
+         the <code>.META.</code> and <code>-ROOT-</code> catalog tables
+         (TODO: Explain).  After locating the required
+         region(s), the client <emphasis>directly</emphasis> contacts
+         the RegionServer serving that region (i.e., it does not go
+         through the master) and issues the read or write request.
+         The region locations are cached in the client so that subsequent requests
+         need not go through the lookup process.  Should a region be reassigned,
+         either by the master load balancer or because a RegionServer has died,
+         the client will requery the catalog tables to determine the new
+         location of the user region.
+    </para>
+    <para>Administrative functions are handled through <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html">HBaseAdmin</link>.
+    </para>
+    <para>For connection configuration information, see the <link linkend="client_dependencies">configuration</link> section. 
+    </para>
+	</section>
+    
     <section xml:id="daemons">
      <title>Daemons</title>
      <section xml:id="master"><title>Master</title>