Posted to commits@hbase.apache.org by gi...@apache.org on 2018/04/05 14:47:39 UTC

[32/40] hbase-site git commit: Published site at e2b0490d18f7cc03aa59475a1b423597ddc481fb.

http://git-wip-us.apache.org/repos/asf/hbase-site/blob/6c67ddd7/book.html
----------------------------------------------------------------------
diff --git a/book.html b/book.html
index 0621ea8..7977239 100644
--- a/book.html
+++ b/book.html
@@ -485,7 +485,7 @@ See <a href="#java">Java</a> for information about supported JDK versions.</p>
 <div class="title">Procedure: Download, Configure, and Start HBase in Standalone Mode</div>
 <ol class="arabic">
 <li>
-<p>Choose a download site from this list of <a href="https://www.apache.org/dyn/closer.cgi/hbase/">Apache Download Mirrors</a>.
+<p>Choose a download site from this list of <a href="https://www.apache.org/dyn/closer.lua/hbase/">Apache Download Mirrors</a>.
 Click on the suggested top link.
 This will take you to a mirror of <em>HBase Releases</em>.
 Click on the folder named <em>stable</em> and then download the binary file that ends in <em>.tar.gz</em> to your local filesystem.
@@ -6703,6 +6703,9 @@ Quitting...</code></pre>
 <li>
 <p>hbase.regionserver.region.split.policy is now SteppingSplitPolicy. Previously it was IncreasingToUpperBoundRegionSplitPolicy.</p>
 </li>
+<li>
+<p>replication.source.ratio is now 0.5. Previously it was 0.1.</p>
+</li>
 </ul>
 </div>
 <div id="upgrade2.0.regions.on.master" class="paragraph">
@@ -6915,13 +6918,81 @@ Quitting...</code></pre>
 </div>
 </div>
 <div class="sect3">
-<h4 id="upgrade2.0.rolling.upgrades"><a class="anchor" href="#upgrade2.0.rolling.upgrades"></a>13.1.2. Rolling Upgrade from 1.x to 2.x</h4>
+<h4 id="upgrade2.0.coprocessors"><a class="anchor" href="#upgrade2.0.coprocessors"></a>13.1.2. Upgrading Coprocessors to 2.0</h4>
+<div class="paragraph">
+<p>Coprocessors have changed substantially in 2.0, ranging from top-level design changes in class
+hierarchies to changed or removed methods, interfaces, etc.
+(Parent jira: <a href="https://issues.apache.org/jira/browse/HBASE-18169">HBASE-18169 Coprocessor fix
+and cleanup before 2.0.0 release</a>). Some of the reasons for such widespread changes:</p>
+</div>
+<div class="olist arabic">
+<ol class="arabic">
+<li>
+<p>Pass Interfaces instead of Implementations; e.g. TableDescriptor instead of HTableDescriptor and
+Region instead of HRegion (<a href="https://issues.apache.org/jira/browse/HBASE-18241">HBASE-18241</a>
+Change client.Table and client.Admin to not use HTableDescriptor).</p>
+</li>
+<li>
+<p>Design refactor so implementers need to fill out less boilerplate and so we can do more
+compile-time checking (<a href="https://issues.apache.org/jira/browse/HBASE-17732">HBASE-17732</a>)</p>
+</li>
+<li>
+<p>Purge Protocol Buffers from Coprocessor API
+(<a href="https://issues.apache.org/jira/browse/HBASE-18859">HBASE-18859</a>,
+<a href="https://issues.apache.org/jira/browse/HBASE-16769">HBASE-16769</a>, etc)</p>
+</li>
+<li>
+<p>Cut back on what we expose to Coprocessors, removing hooks on internals that were too private to
+expose (e.g. <a href="https://issues.apache.org/jira/browse/HBASE-18453">HBASE-18453</a>
+CompactionRequest should not be exposed to user directly;
+<a href="https://issues.apache.org/jira/browse/HBASE-18298">HBASE-18298</a> RegionServerServices Interface
+cleanup for CP expose; etc)</p>
+</li>
+</ol>
+</div>
+<div class="paragraph">
+<p>To use coprocessors in 2.0, rebuild them against the new API; otherwise they will fail to
+load and HBase processes will die.</p>
+</div>
+<div class="paragraph">
+<p>Suggested order of changes to upgrade the coprocessors:</p>
+</div>
+<div class="olist arabic">
+<ol class="arabic">
+<li>
+<p>Directly implement observer interfaces instead of extending Base*Observer classes. Change
+<code>Foo extends BaseXXXObserver</code> to <code>Foo implements XXXObserver</code>.
+(<a href="https://issues.apache.org/jira/browse/HBASE-17312">HBASE-17312</a>).</p>
+</li>
+<li>
+<p>Adapt to the design change from Inheritance to Composition
+(<a href="https://issues.apache.org/jira/browse/HBASE-17732">HBASE-17732</a>) by following
+<a href="https://github.com/apache/hbase/blob/master/dev-support/design-docs/Coprocessor_Design_Improvements-Use_composition_instead_of_inheritance-HBASE-17732.adoc#migrating-existing-cps-to-new-design">this
+example</a>.</p>
+</li>
+<li>
+<p>getTable() has been removed from the CoprocessorEnvironment; coprocessors should self-manage
+Table instances.</p>
+</li>
+</ol>
+</div>
+<div class="paragraph">
+<p>Some examples of writing coprocessors with the new API can be found in the hbase-examples module
+<a href="https://github.com/apache/hbase/tree/branch-2.0/hbase-examples/src/main/java/org/apache/hadoop/hbase/coprocessor/example">here</a>.</p>
+</div>
+<div class="paragraph">
+<p>Lastly, if an API has been changed or removed in a way that breaks you irreparably, and if there&#8217;s a
+good justification to add it back, bring it to our notice (<a href="mailto:dev@hbase.apache.org">dev@hbase.apache.org</a>).</p>
+</div>
+</div>
+<div class="sect3">
+<h4 id="upgrade2.0.rolling.upgrades"><a class="anchor" href="#upgrade2.0.rolling.upgrades"></a>13.1.3. Rolling Upgrade from 1.x to 2.x</h4>
 <div class="paragraph">
 <p>There is no rolling upgrade from HBase 1.x+ to HBase 2.x+. In order to perform a zero downtime upgrade, you will need to run an additional cluster in parallel and handle failover in application logic.</p>
 </div>
 </div>
 <div class="sect3">
-<h4 id="upgrade2.0.process"><a class="anchor" href="#upgrade2.0.process"></a>13.1.3. Upgrade process from 1.x to 2.x</h4>
+<h4 id="upgrade2.0.process"><a class="anchor" href="#upgrade2.0.process"></a>13.1.4. Upgrade process from 1.x to 2.x</h4>
 <div class="paragraph">
 <p>To upgrade an existing HBase 1.x cluster, you should:</p>
 </div>
@@ -6931,6 +7002,9 @@ Quitting...</code></pre>
 <p>Clean shutdown of existing 1.x cluster</p>
 </li>
 <li>
+<p>Update coprocessors</p>
+</li>
+<li>
 <p>Upgrade Master roles first</p>
 </li>
 <li>
@@ -10043,18 +10117,33 @@ If you don&#8217;t have time to build it both ways and compare, my advice would
 </div>
 </div>
 <div class="sect2">
-<h3 id="_optimize_on_the_server_side_for_low_latency"><a class="anchor" href="#_optimize_on_the_server_side_for_low_latency"></a>45.4. Optimize on the Server Side for Low Latency</h3>
+<h3 id="shortcircuit.reads"><a class="anchor" href="#shortcircuit.reads"></a>45.4. Optimize on the Server Side for Low Latency</h3>
+<div class="paragraph">
+<p>Skip the network for local blocks when the RegionServer goes to read from HDFS by exploiting HDFS&#8217;s
+<a href="https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Short-Circuit Local Reads</a> facility.
+Setup must be done at both ends of the connection, the datanode and the dfsclient (i.e. the RegionServer),
+and both ends need to have loaded the hadoop native <code>.so</code> library.
+First, on the hadoop side, set <em>dfs.client.read.shortcircuit</em> to <em>true</em>, configure
+the <em>dfs.domain.socket.path</em> path for the datanode and dfsclient to share, and restart. Next, configure
+the regionserver/dfsclient side.</p>
+</div>
 <div class="ulist">
 <ul>
 <li>
-<p>Skip the network for local blocks. In <code>hbase-site.xml</code>, set the following parameters:</p>
+<p>In <code>hbase-site.xml</code>, set the following parameters:</p>
 <div class="ulist">
 <ul>
 <li>
 <p><code>dfs.client.read.shortcircuit = true</code></p>
 </li>
 <li>
-<p><code>dfs.client.read.shortcircuit.buffer.size = 131072</code> (Important to avoid OOME)</p>
+<p><code>dfs.client.read.shortcircuit.skip.checksum = true</code> so we don&#8217;t double checksum (HBase does its own checksumming to save on i/os). See <a href="#hbase.regionserver.checksum.verify.performance"><code>hbase.regionserver.checksum.verify</code></a> for more on this.</p>
+</li>
+<li>
+<p><code>dfs.domain.socket.path</code> to match what was set for the datanodes.</p>
+</li>
+<li>
+<p><code>dfs.client.read.shortcircuit.buffer.size = 131072</code> (important to avoid OOME). HBase has a default it uses if unset; see <code>hbase.dfs.client.read.shortcircuit.buffer.size</code>, whose default is 131072.</p>
 </li>
 </ul>
 </div>
@@ -10077,6 +10166,24 @@ If you don&#8217;t have time to build it both ways and compare, my advice would
 </li>
 </ul>
 </div>
+<div class="paragraph">
+<p>Check the RegionServer logs after restart. You should only see complaints if there is a misconfiguration.
+Otherwise, short-circuit read operates quietly in the background. It does not provide metrics, so there is
+no visibility into how effective it is, but read latencies should show a marked improvement, especially with
+good data locality, lots of random reads, and a dataset larger than the available cache.</p>
+</div>
+<div class="paragraph">
+<p>Other advanced configurations that you might play with, especially if shortcircuit functionality
+is complaining in the logs, include <code>dfs.client.read.shortcircuit.streams.cache.size</code> and
+<code>dfs.client.socketcache.capacity</code>. Documentation is sparse on these options. You&#8217;ll have to
+read the source code.</p>
+</div>
+<div class="paragraph">
+<p>For more on short-circuit reads, see Colin&#8217;s old blog on rollout,
+<a href="http://blog.cloudera.com/blog/2013/08/how-improved-short-circuit-local-reads-bring-better-performance-and-security-to-hadoop/">How Improved Short-Circuit Local Reads Bring Better Performance and Security to Hadoop</a>.
+The <a href="https://issues.apache.org/jira/browse/HDFS-347">HDFS-347</a> issue also makes for an
+interesting read showing the HDFS community at its best (caveat a few comments).</p>
+</div>
 </div>
 <div class="sect2">
 <h3 id="_jvm_tuning"><a class="anchor" href="#_jvm_tuning"></a>45.5. JVM Tuning</h3>
@@ -37373,7 +37480,7 @@ The server will return cellblocks compressed using this same compressor as long
 <div id="footer">
 <div id="footer-text">
 Version 3.0.0-SNAPSHOT<br>
-Last updated 2018-04-04 14:29:50 UTC
+Last updated 2018-04-05 14:29:11 UTC
 </div>
 </div>
 </body>
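
Reader note (not part of the commit): the RegionServer-side short-circuit settings listed in the section 45.4 hunk above could be collected in hbase-site.xml roughly as sketched below. The property names are the ones named in the diff; the socket path value is only a placeholder assumption and must match whatever path the datanodes were configured with.

```xml
<!-- Sketch only; values other than those quoted in the diff are assumptions. -->
<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <!-- HBase does its own checksumming, so skip the HDFS-level check -->
    <name>dfs.client.read.shortcircuit.skip.checksum</name>
    <value>true</value>
  </property>
  <property>
    <!-- Placeholder path (assumption): use the same value as on the datanodes -->
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>
  <property>
    <!-- 131072 is the value given in the diff; it helps avoid OOME -->
    <name>dfs.client.read.shortcircuit.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>
```

After a restart, the RegionServer log is the place to look for complaints about any of these values.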

http://git-wip-us.apache.org/repos/asf/hbase-site/blob/6c67ddd7/bulk-loads.html
----------------------------------------------------------------------
diff --git a/bulk-loads.html b/bulk-loads.html
index 2f77955..15e7cfa 100644
--- a/bulk-loads.html
+++ b/bulk-loads.html
@@ -7,7 +7,7 @@
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180404" />
+    <meta name="Date-Revision-yyyymmdd" content="20180405" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache HBase &#x2013;  
       Bulk Loads in Apache HBase (TM)
@@ -63,7 +63,7 @@
                       <li>      <a href="license.html"  title="License">License</a>
 </li>
                   
-                      <li>      <a href="http://www.apache.org/dyn/closer.cgi/hbase/"  title="Downloads">Downloads</a>
+                      <li>      <a href="http://www.apache.org/dyn/closer.lua/hbase/"  title="Downloads">Downloads</a>
 </li>
                   
                       <li>      <a href="https://issues.apache.org/jira/browse/HBASE?report=com.atlassian.jira.plugin.system.project:changelog-panel#selectedTab=com.atlassian.jira.plugin.system.project%3Achangelog-panel"  title="Release Notes">Release Notes</a>
@@ -296,7 +296,7 @@ under the License. -->
                         <a href="https://www.apache.org/">The Apache Software Foundation</a>.
             All rights reserved.      
                     
-                  <li id="publishDate" class="pull-right">Last Published: 2018-04-04</li>
+                  <li id="publishDate" class="pull-right">Last Published: 2018-04-05</li>
             </p>
                 </div>