Posted to commits@hbase.apache.org by dm...@apache.org on 2011/08/24 23:05:39 UTC
svn commit: r1161273 - /hbase/trunk/src/docbkx/performance.xml
Author: dmeil
Date: Wed Aug 24 21:05:38 2011
New Revision: 1161273
URL: http://svn.apache.org/viewvc?rev=1161273&view=rev
Log:
HBASE-4249 - performance.xml (adding network section)
Modified:
hbase/trunk/src/docbkx/performance.xml
Modified: hbase/trunk/src/docbkx/performance.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/performance.xml?rev=1161273&r1=1161272&r2=1161273&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/performance.xml (original)
+++ hbase/trunk/src/docbkx/performance.xml Wed Aug 24 21:05:38 2011
@@ -24,7 +24,59 @@
<para>Watch out for swapping. Set swappiness to 0.</para>
</section>
</section>
-
+ <section xml:id="perf.network">
+ <title>Network</title>
+ <para>
+ Perhaps the most important factor in avoiding network issues degrading Hadoop and HBase performance is the switching hardware
+ that is used. Decisions made early in the scope of the project can cause major problems when you double or triple the size of your cluster (or more).
+ </para>
+ <para>
+ Important items to consider:
+ <itemizedlist>
+ <listitem>Switching capacity of the device</listitem>
+ <listitem>Number of systems connected</listitem>
+ <listitem>Uplink capacity</listitem>
+ </itemizedlist>
+ </para>
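+ <para>As a quick sanity check of these three factors, a back-of-the-envelope sketch along the following lines can flag an undersized
+ device early. The figures below (fabric capacity, host count, NIC and uplink speeds) are illustrative assumptions, not recommendations:
+ </para>
+ <programlisting><![CDATA[
+ # Back-of-the-envelope check of the three factors above.
+ # All numbers are illustrative assumptions, not vendor specs.
+ switching_capacity_gbps = 48.0  # assumed fabric capacity of the switch
+ num_systems = 40                # hosts connected to this switch
+ nic_speed_gbps = 1.0            # per-host link speed
+ uplink_gbps = 2.0               # assumed uplink to the rest of the network
+
+ # Worst case the hosts can offer the switch (x2 for full duplex).
+ offered = num_systems * nic_speed_gbps * 2
+ print("offered load: %.0f Gbps vs fabric: %.0f Gbps"
+       % (offered, switching_capacity_gbps))
+
+ # Uplink oversubscription if every host talks off-switch at once.
+ print("uplink oversubscription: %.0f:1"
+       % (num_systems * nic_speed_gbps / uplink_gbps))
+ ]]></programlisting>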
+ <section xml:id="perf.network.1switch">
+ <title>Single Switch</title>
+ <para>The single most important factor in this configuration is whether the switching capacity of the hardware can handle the
+ traffic that can be generated by all systems connected to the switch. Some lower-priced commodity hardware has a switching
+ capacity smaller than what a fully populated switch can utilize.
+ </para>
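+ <para>To make "smaller switching capacity" concrete: a non-blocking switch needs enough fabric capacity to run every port at line
+ rate in both directions at once. The port count and fabric figure below are hypothetical, not taken from any data sheet:
+ </para>
+ <programlisting><![CDATA[
+ # Is this switch non-blocking? Hypothetical figures; check the data sheet.
+ ports = 24
+ line_rate_gbps = 1.0
+ fabric_gbps = 13.0  # assumed fabric capacity of a budget switch
+
+ required = ports * line_rate_gbps * 2  # every port, both directions
+ if fabric_gbps >= required:
+     print("non-blocking: fabric %.0f Gbps >= required %.0f Gbps"
+           % (fabric_gbps, required))
+ else:
+     print("oversubscribed %.1f:1 (fabric %.0f Gbps, required %.0f Gbps)"
+           % (required / fabric_gbps, fabric_gbps, required))
+ ]]></programlisting>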
+ </section>
+ <section xml:id="perf.network.2switch">
+ <title>Multiple Switches</title>
+ <para>Multiple switches are a potential pitfall in the architecture. The most common configuration of lower-priced hardware is a
+ simple 1Gbps uplink from one switch to another. This often-overlooked pinch point can easily become a bottleneck for cluster communication,
+ especially for MapReduce jobs that both read and write a lot of data and can saturate this uplink.
+ </para>
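+ <para>A rough, assumption-laden estimate shows how easily a single 1Gbps uplink saturates during a MapReduce shuffle whose
+ intermediate data must partly cross that link (the data volume and cross-switch fraction below are made-up examples):
+ </para>
+ <programlisting><![CDATA[
+ # Rough time for shuffle traffic crossing a single 1 Gbps uplink.
+ # Data volume and cross-switch fraction are illustrative assumptions.
+ shuffle_bytes = 500 * 10**9  # 500 GB of intermediate data
+ cross_fraction = 0.5         # assume half of it changes switches
+ uplink_gbps = 1.0
+
+ seconds = shuffle_bytes * cross_fraction * 8 / (uplink_gbps * 10**9)
+ print("cross-switch shuffle traffic alone: about %.0f minutes"
+       % (seconds / 60))
+ ]]></programlisting>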
+ <para>Mitigation of this issue is fairly simple and can be accomplished in multiple ways:
+ <itemizedlist>
+ <listitem>Use appropriate hardware for the scale of the cluster you are attempting to build.</listitem>
+ <listitem>Use larger single-switch configurations, e.g., a single 48-port switch as opposed to two 24-port switches.</listitem>
+ <listitem>Configure port trunking for uplinks to utilize multiple interfaces to increase cross-switch bandwidth (see the sketch below).</listitem>
+ </itemizedlist>
+ </para>
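+ <para>For the port-trunking option, a small sketch shows one way to size the trunk; the host count and target ratio are
+ arbitrary assumptions for illustration:
+ </para>
+ <programlisting><![CDATA[
+ # Size a trunk between two switches for a target oversubscription
+ # ratio. Host count and target ratio are assumptions for illustration.
+ import math
+
+ hosts_per_switch = 20  # hosts homed on each switch
+ nic_gbps = 1.0
+ target_ratio = 4.0     # accept at most 4:1 on the uplink
+
+ links = int(math.ceil(hosts_per_switch * nic_gbps / target_ratio))
+ print("trunk %d x 1 Gbps links between the switches" % links)
+ ]]></programlisting>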
+ </section>
+ <section xml:id="perf.network.multirack">
+ <title>Multiple Racks</title>
+ <para>Multiple rack configurations carry the same potential issues as multiple switches, and can suffer performance degradation from two main areas:
+ <itemizedlist>
+ <listitem>Poor switch capacity performance</listitem>
+ <listitem>Insufficient uplink to another rack</listitem>
+ </itemizedlist>
+ If the switches in your rack have appropriate switching capacity to handle all the hosts at full speed, the next most likely issue is the uplink that homes
+ part of your cluster in another rack. The easiest way to avoid issues when spanning multiple racks is to use port trunking to create a bonded uplink to the other racks.
+ The downside of this method, however, is the overhead of ports that could otherwise be used for hosts. For example, creating an 8Gbps port channel from rack
+ A to rack B uses 8 of your 24 ports for inter-rack communication, which is a poor ROI; using too few ports, however, can mean you're not getting the most out of your cluster.
+ </para>
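+ <para>The trade-off in that example can be made concrete; the port counts below simply mirror the 8-of-24 figure from the text:
+ </para>
+ <programlisting><![CDATA[
+ # Ports spent on an 8 Gbps port channel from rack A to rack B,
+ # mirroring the 8-of-24 example above.
+ ports_per_switch = 24
+ uplink_ports = 8  # 8 x 1 Gbps bonded to the next rack
+
+ host_ports = ports_per_switch - uplink_ports
+ print("%d of %d ports (%.0f%%) remain for hosts; %d carry inter-rack traffic"
+       % (host_ports, ports_per_switch,
+          100.0 * host_ports / ports_per_switch, uplink_ports))
+ ]]></programlisting>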
+ <para>Using 10GbE links between racks will greatly increase performance and, assuming your switches support a 10GbE uplink or allow
+ for an expansion card, will let you spend your ports on machines as opposed to uplinks.
+ </para>
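+ <para>Continuing the same made-up numbers, moving the inter-rack link to 10GbE both widens the pipe and reclaims host ports,
+ assuming the 10GbE link uses a dedicated uplink or expansion-card port:
+ </para>
+ <programlisting><![CDATA[
+ # Replace an 8 x 1 Gbps trunk with one 10 GbE uplink (assumed to live
+ # on a dedicated uplink or expansion-card port).
+ trunk_ports = 8
+ trunk_gbps = trunk_ports * 1.0
+ uplink_10g_gbps = 10.0
+
+ print("bandwidth %.0f -> %.0f Gbps; 1 Gbps host ports reclaimed: %d"
+       % (trunk_gbps, uplink_10g_gbps, trunk_ports))
+ ]]></programlisting>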
+
+ </section>
+ </section> <!-- network -->
<section xml:id="jvm">
<title>Java</title>
@@ -56,7 +108,7 @@
</section>
<section xml:id="perf.configurations">
- <title>Configurations</title>
+ <title>HBase Configurations</title>
<para>See <xref linkend="recommended_configurations" />.</para>