Posted to commits@zookeeper.apache.org by ph...@apache.org on 2008/12/10 01:11:48 UTC
svn commit: r724936 - in /hadoop/zookeeper/trunk: CHANGES.txt
docs/zookeeperAdmin.html docs/zookeeperAdmin.pdf
src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml
Author: phunt
Date: Tue Dec 9 16:11:48 2008
New Revision: 724936
URL: http://svn.apache.org/viewvc?rev=724936&view=rev
Log:
ZOOKEEPER-161. Content needed: "Designing a ZooKeeper Deployment"
Modified:
hadoop/zookeeper/trunk/CHANGES.txt
hadoop/zookeeper/trunk/docs/zookeeperAdmin.html
hadoop/zookeeper/trunk/docs/zookeeperAdmin.pdf
hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml
Modified: hadoop/zookeeper/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/CHANGES.txt?rev=724936&r1=724935&r2=724936&view=diff
==============================================================================
--- hadoop/zookeeper/trunk/CHANGES.txt (original)
+++ hadoop/zookeeper/trunk/CHANGES.txt Tue Dec 9 16:11:48 2008
@@ -5,58 +5,62 @@
Backward compatible changes:
BUGFIXES:
- ZOOKEEPER-211 Not all Mock tests are working (ben via phunt)
+ ZOOKEEPER-211. Not all Mock tests are working (ben via phunt)
- ZOOKEEPER-223. change default level in root logger to INFO.
- (pat via mahadev)
+ ZOOKEEPER-223. change default level in root logger to INFO.
+ (pat via mahadev)
- ZOOKEEPER-212. fix the snapshot to be asynchronous. (mahadev and ben)
+ ZOOKEEPER-212. fix the snapshot to be asynchronous. (mahadev and ben)
- ZOOKEEPER-213. fix programmer guide C api docs to be in sync with latest
- zookeeper.h (pat via mahadev)
+ ZOOKEEPER-213. fix programmer guide C api docs to be in sync with latest
+ zookeeper.h (pat via mahadev)
- ZOOKEEPER-219. fix events.poll timeout in watcher test to be longer.
- (pat via mahadev)
+ ZOOKEEPER-219. fix events.poll timeout in watcher test to be longer.
+ (pat via mahadev)
- ZOOKEEPER-217. Fix errors in config to be thrown as Exceptions. (mahadev)
+ ZOOKEEPER-217. Fix errors in config to be thrown as Exceptions. (mahadev)
- ZOOKEEPER-228. fix apache header missing in DBTest. (mahadev)
+ ZOOKEEPER-228. fix apache header missing in DBTest. (mahadev)
- ZOOKEEPER-218. fix the error in the barrier example code. (pat via mahadev)
+ ZOOKEEPER-218. fix the error in the barrier example code. (pat via mahadev)
- ZOOKEEPER-206. documentation tab should contain the version number and
- other small site changes. (pat via mahadev)
+ ZOOKEEPER-206. documentation tab should contain the version number and
+ other small site changes. (pat via mahadev)
- ZOOKEEPER-226. fix exists calls that fail on server if node has null data.
- (mahadev)
+ ZOOKEEPER-226. fix exists calls that fail on server if node has null data.
+ (mahadev)
- ZOOKEEPER-204. SetWatches needs to be the first message after auth messages
-to the server (ben via mahadev)
+ ZOOKEEPER-204. SetWatches needs to be the first message after auth
+ messages to the server (ben via mahadev)
- ZOOKEEPER-208. Zookeeper C client uses API that are not thread safe,
-causing crashes when multiple instances are active. (austin shoemaker, chris
-daroch and ben reed via mahadev)
+ ZOOKEEPER-208. Zookeeper C client uses API that are not thread safe,
+ causing crashes when multiple instances are active.
+ (austin shoemaker, chris daroch and ben reed via mahadev)
- ZOOKEEPER-227. gcc warning from recordio.h (chris darroch via mahadev)
+ ZOOKEEPER-227. gcc warning from recordio.h (chris darroch via mahadev)
- ZOOKEEPER-232. fix apache licence header in TestableZookeeper (mahadev)
+ ZOOKEEPER-232. fix apache licence header in TestableZookeeper (mahadev)
- ZOOKEEPER-249. QuorumPeer.getClientPort() always returns -1. (nitay
-joffe via mahadev)
+ ZOOKEEPER-249. QuorumPeer.getClientPort() always returns -1.
+ (nitay joffe via mahadev)
- ZOOKEEPER-248. QuorumPeer should use Map interface instead of
-HashMap implementation. (nitay joffe via mahadev)
+ ZOOKEEPER-248. QuorumPeer should use Map interface instead of HashMap
+ implementation. (nitay joffe via mahadev)
- ZOOKEEPER-241. Build of a distro fails after clean target is run. (patrick
-hunt via mahadev)
+ ZOOKEEPER-241. Build of a distro fails after clean target is run.
+ (patrick hunt via mahadev)
IMPROVEMENTS:
- ZOOKEEPER-64. Log system env information when initializing server and
-client (pat via mahadev)
+ ZOOKEEPER-161. Content needed: "Designing a ZooKeeper Deployment"
+ (breed via phunt)
+
+ ZOOKEEPER-64. Log system env information when initializing server and
+ client (pat via mahadev)
+
+ ZOOKEEPER-243. add SEQUENCE flag documentation to the programming guide.
+ (patrick hunt via mahadev)
- ZOOKEEPER-243. add SEQUENCE flag documentation to the programming guide.
-(patrick hunt via mahadev)
Release 3.0.0 - 2008-10-21
Modified: hadoop/zookeeper/trunk/docs/zookeeperAdmin.html
URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/docs/zookeeperAdmin.html?rev=724936&r1=724935&r2=724936&view=diff
==============================================================================
--- hadoop/zookeeper/trunk/docs/zookeeperAdmin.html (original)
+++ hadoop/zookeeper/trunk/docs/zookeeperAdmin.html Tue Dec 9 16:11:48 2008
@@ -201,6 +201,14 @@
<ul class="minitoc">
<li>
<a href="#sc_designing">Designing a ZooKeeper Deployment</a>
+<ul class="minitoc">
+<li>
+<a href="#sc_CrossMachineRequirements">Cross Machine Requirements</a>
+</li>
+<li>
+<a href="#Single+Machine+Requirements">Single Machine Requirements</a>
+</li>
+</ul>
</li>
<li>
<a href="#sc_provisioning">Provisioning</a>
@@ -621,20 +629,103 @@
</ul>
<a name="N10160"></a><a name="sc_designing"></a>
<h3 class="h4">Designing a ZooKeeper Deployment</h3>
-<p></p>
-<a name="N10169"></a><a name="sc_provisioning"></a>
+<p>The reliability of ZooKeeper rests on two basic assumptions.</p>
+<ol>
+
+<li>
+<p> Only a minority of servers in a deployment
+ will fail. <em>Failure</em> in this context
+ means a machine crash, or some error in the network that
+ partitions a server off from the majority.</p>
+
+</li>
+
+<li>
+<p> Deployed machines operate correctly. To
+ operate correctly means to execute code correctly, to have
+ clocks that work properly, and to have storage and network
+ components that perform consistently.</p>
+
+</li>
+
+</ol>
+<p>The sections below contain considerations for ZooKeeper
+ administrators to maximize the probability for these assumptions
+ to hold true. Some of these are cross-machine considerations,
+ and others are things you should consider for each and every
+ machine in your deployment.</p>
+<a name="N1017C"></a><a name="sc_CrossMachineRequirements"></a>
+<h4>Cross Machine Requirements</h4>
+<p>For the ZooKeeper service to be active, there must be a
+ majority of non-failing machines that can communicate with
+ each other. To create a deployment that can tolerate the
+ failure of F machines, you should count on deploying 2xF+1
+ machines. Thus, a deployment that consists of three machines
+ can handle one failure, and a deployment of five machines can
+ handle two failures. Note that a deployment of six machines
+ can only handle two failures since three machines is not a
+ majority. For this reason, ZooKeeper deployments are usually
+ made up of an odd number of machines.</p>
+<p>To achieve the highest probability of tolerating a failure
+ you should try to make machine failures independent. For
+ example, if most of the machines share the same switch,
+ failure of that switch could cause a correlated failure and
+ bring down the service. The same holds true of shared power
+ circuits, cooling systems, etc.</p>
+<a name="N10189"></a><a name="Single+Machine+Requirements"></a>
+<h4>Single Machine Requirements</h4>
+<p>If ZooKeeper has to contend with other applications for
+ access to resources like storage media, CPU, network, or
+ memory, its performance will suffer markedly. ZooKeeper has
+ strong durability guarantees, which means it uses storage
+ media to log changes before the operation responsible for the
+ change is allowed to complete. You should be aware of this
+ dependency then, and take great care if you want to ensure
+ that ZooKeeper operations aren’t held up by your media. Here
+ are some things you can do to minimize that sort of
+ degradation:
+ </p>
+<ul>
+
+<li>
+
+<p>ZooKeeper's transaction log must be on a dedicated
+ device. (A dedicated partition is not enough.) ZooKeeper
+ writes the log sequentially, without seeking. Sharing your
+ log device with other processes can cause seeks and
+ contention, which in turn can cause multi-second
+ delays.</p>
+
+</li>
+
+
+<li>
+
+<p>Do not put ZooKeeper in a situation that can cause a
+ swap. In order for ZooKeeper to function with any sort of
+ timeliness, it simply cannot be allowed to swap.
+ Therefore, make certain that the maximum heap size given
+ to ZooKeeper is not bigger than the amount of real memory
+ available to ZooKeeper. For more on this, see
+ <a href="#sc_commonProblems">Things to Avoid</a>
+ below. </p>
+
+</li>
+
+</ul>
+<a name="N101A7"></a><a name="sc_provisioning"></a>
<h3 class="h4">Provisioning</h3>
<p></p>
-<a name="N10172"></a><a name="sc_strengthsAndLimitations"></a>
+<a name="N101B0"></a><a name="sc_strengthsAndLimitations"></a>
<h3 class="h4">Things to Consider: ZooKeeper Strengths and Limitations</h3>
<p></p>
-<a name="N1017B"></a><a name="sc_administering"></a>
+<a name="N101B9"></a><a name="sc_administering"></a>
<h3 class="h4">Administering</h3>
<p></p>
-<a name="N10184"></a><a name="sc_monitoring"></a>
+<a name="N101C2"></a><a name="sc_monitoring"></a>
<h3 class="h4">Monitoring</h3>
<p></p>
-<a name="N1018D"></a><a name="sc_logging"></a>
+<a name="N101CB"></a><a name="sc_logging"></a>
<h3 class="h4">Logging</h3>
<p>ZooKeeper uses <strong>log4j</strong> version 1.2 as
its logging infrastructure. The ZooKeeper default <span class="codefrag filename">log4j.properties</span>
@@ -644,10 +735,10 @@
<p>For more information, see
<a href="http://logging.apache.org/log4j/1.2/manual.html#defaultInit">Log4j Default Initialization Procedure</a>
of the log4j manual.</p>
-<a name="N101AD"></a><a name="sc_troubleshooting"></a>
+<a name="N101EB"></a><a name="sc_troubleshooting"></a>
<h3 class="h4">Troubleshooting</h3>
<p></p>
-<a name="N101B6"></a><a name="sc_configuration"></a>
+<a name="N101F4"></a><a name="sc_configuration"></a>
<h3 class="h4">Configuration Parameters</h3>
<p>ZooKeeper's behavior is governed by the ZooKeeper configuration
file. This file is designed so that the exact same file can be used by
@@ -655,7 +746,7 @@
layouts are the same. If servers use different configuration files, care
must be taken to ensure that the list of servers in all of the different
configuration files match.</p>
-<a name="N101BF"></a><a name="sc_minimumConfiguration"></a>
+<a name="N101FD"></a><a name="sc_minimumConfiguration"></a>
<h4>Minimum Configuration</h4>
<p>Here are the minimum configuration keywords that must be defined
in the configuration file:</p>
@@ -702,7 +793,7 @@
</dd>
</dl>
-<a name="N101E6"></a><a name="sc_advancedConfiguration"></a>
+<a name="N10224"></a><a name="sc_advancedConfiguration"></a>
<h4>Advanced Configuration</h4>
<p>The configuration settings in the section are optional. You can
use them to further fine tune the behaviour of your ZooKeeper servers.
@@ -793,7 +884,7 @@
</dd>
</dl>
-<a name="N10246"></a><a name="sc_clusterOptions"></a>
+<a name="N10284"></a><a name="sc_clusterOptions"></a>
<h4>Cluster Options</h4>
<p>The options in this section are designed for use with an ensemble
of servers -- that is, when deploying clusters of servers.</p>
@@ -883,7 +974,7 @@
</dl>
<p></p>
-<a name="N102A3"></a><a name="Unsafe+Options"></a>
+<a name="N102E1"></a><a name="Unsafe+Options"></a>
<h4>Unsafe Options</h4>
<p>The following options can be useful, but be careful when you use
them. The risk of each is explained along with the explanation of what
@@ -928,7 +1019,7 @@
</dd>
</dl>
-<a name="N102D5"></a><a name="sc_zkCommands"></a>
+<a name="N10313"></a><a name="sc_zkCommands"></a>
<h3 class="h4">ZooKeeper Commands: The Four Letter Words</h3>
<p>ZooKeeper responds to a small set of commands. Each command is
composed of four letters. You issue the commands to ZooKeeper via telnet
@@ -993,7 +1084,7 @@
<pre class="code">$ echo ruok | nc 127.0.0.1 5111
imok
</pre>
-<a name="N10315"></a><a name="sc_dataFileManagement"></a>
+<a name="N10353"></a><a name="sc_dataFileManagement"></a>
<h3 class="h4">Data File Management</h3>
<p>ZooKeeper stores its data in a data directory and its transaction
log in a transaction log directory. By default these two directories are
@@ -1001,7 +1092,7 @@
transaction log files in a separate directory than the data files.
Throughput increases and latency decreases when transaction logs reside
on a dedicated log device.</p>
-<a name="N1031E"></a><a name="The+Data+Directory"></a>
+<a name="N1035C"></a><a name="The+Data+Directory"></a>
<h4>The Data Directory</h4>
<p>This directory has two files in it:</p>
<ul>
@@ -1047,14 +1138,14 @@
idempotent nature of its updates. By replaying the transaction log
against fuzzy snapshots ZooKeeper gets the state of the system at the
end of the log.</p>
-<a name="N1035A"></a><a name="The+Log+Directory"></a>
+<a name="N10398"></a><a name="The+Log+Directory"></a>
<h4>The Log Directory</h4>
<p>The Log Directory contains the ZooKeeper transaction logs.
Before any update takes place, ZooKeeper ensures that the transaction
that represents the update is written to non-volatile storage. A new
log file is started each time a snapshot is begun. The log file's
suffix is the first zxid written to that log.</p>
-<a name="N10364"></a><a name="File+Management"></a>
+<a name="N103A2"></a><a name="File+Management"></a>
<h4>File Management</h4>
<p>The format of snapshot and log files does not change between
standalone ZooKeeper servers and different configurations of
@@ -1071,7 +1162,7 @@
needs the latest complete fuzzy snapshot and the log files from the
start of that snapshot. The PurgeTxnLog utility implements a simple
retention policy that administrators can use.</p>
-<a name="N10375"></a><a name="sc_commonProblems"></a>
+<a name="N103B3"></a><a name="sc_commonProblems"></a>
<h3 class="h4">Things to Avoid</h3>
<p>Here are some common problems you can avoid by configuring
ZooKeeper correctly:</p>
@@ -1125,7 +1216,7 @@
</dd>
</dl>
-<a name="N10399"></a><a name="sc_bestPractices"></a>
+<a name="N103D7"></a><a name="sc_bestPractices"></a>
<h3 class="h4">Best Practices</h3>
<p>For best results, take note of the following list of good
Zookeeper practices. <em>[tbd...]</em>
Modified: hadoop/zookeeper/trunk/docs/zookeeperAdmin.pdf
URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/docs/zookeeperAdmin.pdf?rev=724936&r1=724935&r2=724936&view=diff
==============================================================================
Binary files - no diff available.
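
Editorial note on the sizing rule in the "Cross Machine Requirements" text above: the 2xF+1 arithmetic can be checked with a minimal sketch. This is an illustration only, not part of the commit or of ZooKeeper itself; the function names tolerated_failures and servers_needed are hypothetical, and the only assumption is the strict-majority rule stated in the documentation.

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Failures an ensemble can absorb while a strict majority survives."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    # A strict majority needs more than half of the servers, so at most
    # (ensemble_size - 1) // 2 servers may be lost.
    return (ensemble_size - 1) // 2


def servers_needed(failures: int) -> int:
    """Smallest ensemble that tolerates the given number of failures (2xF+1)."""
    if failures < 0:
        raise ValueError("failures must be non-negative")
    return 2 * failures + 1


if __name__ == "__main__":
    for size in (3, 5, 6):
        print(size, "servers tolerate", tolerated_failures(size), "failure(s)")
```

Three servers tolerate one failure, five tolerate two, and six still tolerate only two, which is why the sixth machine adds no fault tolerance over five.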
Modified: hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml
URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml?rev=724936&r1=724935&r2=724936&view=diff
==============================================================================
--- hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml (original)
+++ hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml Tue Dec 9 16:11:48 2008
@@ -282,7 +282,85 @@
<section id="sc_designing">
<title>Designing a ZooKeeper Deployment</title>
- <para></para>
+ <para>The reliability of ZooKeeper rests on two basic assumptions.</para>
+ <orderedlist>
+ <listitem><para> Only a minority of servers in a deployment
+ will fail. <emphasis>Failure</emphasis> in this context
+ means a machine crash, or some error in the network that
+ partitions a server off from the majority.</para>
+ </listitem>
+ <listitem><para> Deployed machines operate correctly. To
+ operate correctly means to execute code correctly, to have
+ clocks that work properly, and to have storage and network
+ components that perform consistently.</para>
+ </listitem>
+ </orderedlist>
+
+ <para>The sections below contain considerations for ZooKeeper
+ administrators to maximize the probability for these assumptions
+ to hold true. Some of these are cross-machine considerations,
+ and others are things you should consider for each and every
+ machine in your deployment.</para>
+
+ <section id="sc_CrossMachineRequirements">
+ <title>Cross Machine Requirements</title>
+
+ <para>For the ZooKeeper service to be active, there must be a
+ majority of non-failing machines that can communicate with
+ each other. To create a deployment that can tolerate the
+ failure of F machines, you should count on deploying 2xF+1
+ machines. Thus, a deployment that consists of three machines
+ can handle one failure, and a deployment of five machines can
+ handle two failures. Note that a deployment of six machines
+ can only handle two failures since three machines is not a
+ majority. For this reason, ZooKeeper deployments are usually
+ made up of an odd number of machines.</para>
+
+ <para>To achieve the highest probability of tolerating a failure
+ you should try to make machine failures independent. For
+ example, if most of the machines share the same switch,
+ failure of that switch could cause a correlated failure and
+ bring down the service. The same holds true of shared power
+ circuits, cooling systems, etc.</para>
+ </section>
+
+ <section>
+ <title>Single Machine Requirements</title>
+
+ <para>If ZooKeeper has to contend with other applications for
+ access to resources like storage media, CPU, network, or
+ memory, its performance will suffer markedly. ZooKeeper has
+ strong durability guarantees, which means it uses storage
+ media to log changes before the operation responsible for the
+ change is allowed to complete. You should be aware of this
+ dependency then, and take great care if you want to ensure
+ that ZooKeeper operations aren’t held up by your media. Here
+ are some things you can do to minimize that sort of
+ degradation:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>ZooKeeper's transaction log must be on a dedicated
+ device. (A dedicated partition is not enough.) ZooKeeper
+ writes the log sequentially, without seeking. Sharing your
+ log device with other processes can cause seeks and
+ contention, which in turn can cause multi-second
+ delays.</para>
+ </listitem>
+
+ <listitem>
+ <para>Do not put ZooKeeper in a situation that can cause a
+ swap. In order for ZooKeeper to function with any sort of
+ timeliness, it simply cannot be allowed to swap.
+ Therefore, make certain that the maximum heap size given
+ to ZooKeeper is not bigger than the amount of real memory
+ available to ZooKeeper. For more on this, see
+ <xref linkend="sc_commonProblems"/>
+ below. </para>
+ </listitem>
+ </itemizedlist>
+ </section>
</section>
<section id="sc_provisioning">