Posted to commits@zookeeper.apache.org by an...@apache.org on 2018/07/04 13:11:24 UTC

[04/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 - Iteration 1 - docs, it

http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml
new file mode 100644
index 0000000..d88ddbd
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml
@@ -0,0 +1,1861 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Copyright 2002-2004 The Apache Software Foundation
+
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_Admin">
+  <title>ZooKeeper Administrator's Guide</title>
+
+  <subtitle>A Guide to Deployment and Administration</subtitle>
+
+  <articleinfo>
+    <legalnotice>
+      <para>Licensed under the Apache License, Version 2.0 (the "License");
+      you may not use this file except in compliance with the License. You may
+      obtain a copy of the License at <ulink
+      url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+      <para>Unless required by applicable law or agreed to in writing,
+      software distributed under the License is distributed on an "AS IS"
+      BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied. See the License for the specific language governing permissions
+      and limitations under the License.</para>
+    </legalnotice>
+
+    <abstract>
+      <para>This document contains information about deploying, administering
+      and maintaining ZooKeeper. It also discusses best practices and common
+      problems.</para>
+    </abstract>
+  </articleinfo>
+
+  <section id="ch_deployment">
+    <title>Deployment</title>
+
+    <para>This section contains information about deploying ZooKeeper and
+    covers these topics:</para>
+
+    <itemizedlist>
+      <listitem>
+        <para><xref linkend="sc_systemReq" /></para>
+      </listitem>
+
+      <listitem>
+        <para><xref linkend="sc_zkMulitServerSetup" /></para>
+      </listitem>
+
+      <listitem>
+        <para><xref linkend="sc_singleAndDevSetup" /></para>
+      </listitem>
+    </itemizedlist>
+
+    <para>The first two sections assume you are interested in installing
+    ZooKeeper in a production environment such as a datacenter. The final
+    section covers situations in which you are setting up ZooKeeper on a
+    limited basis - for evaluation, testing, or development - but not in a
+    production environment.</para>
+
+    <section id="sc_systemReq">
+      <title>System Requirements</title>
+
+      <section id="sc_supportedPlatforms">
+        <title>Supported Platforms</title>
+
+        <para>ZooKeeper consists of multiple components. Some components are
+        supported broadly, and other components are supported only on a smaller
+        set of platforms.</para>
+
+        <itemizedlist>
+          <listitem>
+            <para><emphasis role="bold">Client</emphasis> is the Java client
+            library, used by applications to connect to a ZooKeeper ensemble.
+            </para>
+          </listitem>
+          <listitem>
+            <para><emphasis role="bold">Server</emphasis> is the Java server
+            that runs on the ZooKeeper ensemble nodes.</para>
+          </listitem>
+          <listitem>
+            <para><emphasis role="bold">Native Client</emphasis> is a client
+            implemented in C, similar to the Java client, used by applications
+            to connect to a ZooKeeper ensemble.</para>
+          </listitem>
+          <listitem>
+            <para><emphasis role="bold">Contrib</emphasis> refers to multiple
+            optional add-on components.</para>
+          </listitem>
+        </itemizedlist>
+
+        <para>The following matrix describes the level of support committed for
+        running each component on different operating system platforms.</para>
+
+        <table>
+          <title>Support Matrix</title>
+          <tgroup cols="5" align="left" colsep="1" rowsep="1">
+            <thead>
+              <row>
+                <entry>Operating System</entry>
+                <entry>Client</entry>
+                <entry>Server</entry>
+                <entry>Native Client</entry>
+                <entry>Contrib</entry>
+              </row>
+            </thead>
+            <tbody>
+              <row>
+                <entry>GNU/Linux</entry>
+                <entry>Development and Production</entry>
+                <entry>Development and Production</entry>
+                <entry>Development and Production</entry>
+                <entry>Development and Production</entry>
+              </row>
+              <row>
+                <entry>Solaris</entry>
+                <entry>Development and Production</entry>
+                <entry>Development and Production</entry>
+                <entry>Not Supported</entry>
+                <entry>Not Supported</entry>
+              </row>
+              <row>
+                <entry>FreeBSD</entry>
+                <entry>Development and Production</entry>
+                <entry>Development and Production</entry>
+                <entry>Not Supported</entry>
+                <entry>Not Supported</entry>
+              </row>
+              <row>
+                <entry>Windows</entry>
+                <entry>Development and Production</entry>
+                <entry>Development and Production</entry>
+                <entry>Not Supported</entry>
+                <entry>Not Supported</entry>
+              </row>
+              <row>
+                <entry>Mac OS X</entry>
+                <entry>Development Only</entry>
+                <entry>Development Only</entry>
+                <entry>Not Supported</entry>
+                <entry>Not Supported</entry>
+              </row>
+            </tbody>
+          </tgroup>
+        </table>
+
+        <para>For any operating system not explicitly mentioned as supported in
+        the matrix, components may or may not work.  The ZooKeeper community
+        will fix obvious bugs that are reported for other platforms, but there
+        is no full support.</para>
+      </section>
+
+      <section id="sc_requiredSoftware">
+        <title>Required Software </title>
+
+        <para>ZooKeeper runs in Java, release 1.6 or greater (JDK 6 or
+          greater).  It runs as an <emphasis>ensemble</emphasis> of
+          ZooKeeper servers. Three ZooKeeper servers is the minimum
+          recommended size for an ensemble, and we also recommend that
+          they run on separate machines. At Yahoo!, ZooKeeper is
+          usually deployed on dedicated RHEL boxes, with dual-core
+          processors, 2GB of RAM, and 80GB IDE hard drives.</para>
+      </section>
+
+    </section>
+
+    <section id="sc_zkMulitServerSetup">
+      <title>Clustered (Multi-Server) Setup</title>
+
+      <para>For reliable ZooKeeper service, you should deploy ZooKeeper in a
+      cluster known as an <emphasis>ensemble</emphasis>. As long as a majority
+      of the ensemble are up, the service will be available. Because ZooKeeper
+      requires a majority, it is best to use an
+      odd number of machines. For example, with four machines ZooKeeper can
+      only handle the failure of a single machine; if two machines fail, the
+      remaining two machines do not constitute a majority. However, with five
+      machines ZooKeeper can handle the failure of two machines. </para>
+      <note>
+         <para>
+            As mentioned in the
+            <ulink url="zookeeperStarted.html">ZooKeeper Getting Started Guide</ulink>,
+            a minimum of three servers is required for a fault-tolerant
+            clustered setup, and it is strongly recommended that you have an
+            odd number of servers.
+         </para>
+         <para>Usually three servers is more than enough for a production
+            install, but for maximum reliability during maintenance, you may
+            wish to install five servers. With three servers, if you perform
+            maintenance on one of them, you are vulnerable to a failure on one
+            of the other two servers during that maintenance. If you have five
+            of them running, you can take one down for maintenance, and know
+            that you're still OK if one of the other four suddenly fails.
+         </para>
+         <para>Your redundancy considerations should include all aspects of
+            your environment. If you have three ZooKeeper servers, but their
+            network cables are all plugged into the same network switch, then
+            the failure of that switch will take down your entire ensemble.
+         </para>
+      </note>
+      <para>Here are the steps for setting up a server that will be part of an
+      ensemble. These steps should be performed on every host in the
+      ensemble:</para>
+
+      <orderedlist>
+        <listitem>
+          <para>Install the Java JDK. You can use the native packaging system
+          for your system, or download the JDK from:</para>
+
+          <para><ulink
+          url="http://java.sun.com/javase/downloads/index.jsp">http://java.sun.com/javase/downloads/index.jsp</ulink></para>
+        </listitem>
+
+        <listitem>
+          <para>Set the Java heap size. This is very important to avoid
+          swapping, which will seriously degrade ZooKeeper performance. To
+          determine the correct value, use load tests, and make sure you are
+          well below the usage limit that would cause you to swap. Be
+          conservative - use a maximum heap size of 3GB for a 4GB
+          machine.</para>
+        </listitem>
+
+        <listitem>
+          <para>Install the ZooKeeper Server Package. It can be downloaded
+            from:
+          </para>
+          <para>
+            <ulink url="http://zookeeper.apache.org/releases.html">
+              http://zookeeper.apache.org/releases.html
+            </ulink>
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>Create a configuration file. This file can be called anything.
+          Use the following settings as a starting point:</para>
+
+          <programlisting>
+tickTime=2000
+dataDir=/var/lib/zookeeper/
+clientPort=2181
+initLimit=5
+syncLimit=2
+server.1=zoo1:2888:3888
+server.2=zoo2:2888:3888
+server.3=zoo3:2888:3888</programlisting>
+
+          <para>You can find the meanings of these and other configuration
+          settings in the section <xref linkend="sc_configuration" />. A few
+          of them deserve a brief note here:</para>
+
+          <para>Every machine that is part of the ZooKeeper ensemble should know
+          about every other machine in the ensemble. You accomplish this with
+          the series of lines of the form <emphasis
+          role="bold">server.id=host:port:port</emphasis>. The parameters <emphasis
+          role="bold">host</emphasis> and <emphasis
+          role="bold">port</emphasis> are straightforward. You assign the
+          server id to each machine by creating a file named
+          <filename>myid</filename>, one for each server, which resides in
+          that server's data directory, as specified by the configuration file
+          parameter <emphasis role="bold">dataDir</emphasis>.</para></listitem>
+
+          <listitem><para>The myid file
+          consists of a single line containing only the text of that machine's
+          id. So <filename>myid</filename> of server 1 would contain the text
+          "1" and nothing else. The id must be unique within the
+          ensemble and should have a value between 1 and 255.</para>
+        </listitem>
+
+        <listitem>
+          <para>Once your configuration file is set up, you can start a
+          ZooKeeper server:</para>
+
+          <para><computeroutput>$ java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf \
+              org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg
+          </computeroutput></para>
+          
+          <para>QuorumPeerMain starts a ZooKeeper server.
+            <ulink url="http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/">JMX</ulink>
+            management beans are also registered, which allows
+            management through a JMX management console.
+            The <ulink url="zookeeperJMX.html">ZooKeeper JMX
+            document</ulink> contains details on managing ZooKeeper with JMX.
+          </para>
+
+          <para>See the script <emphasis>bin/zkServer.sh</emphasis>,
+            which is included in the release, for an example
+            of starting server instances.</para>
+
+        </listitem>
+
+        <listitem>
+          <para>Test your deployment by connecting to the hosts:</para>
+
+          <para>In Java, you can run the following command to execute
+          simple operations:</para>
+
+          <para><computeroutput>$ bin/zkCli.sh -server 127.0.0.1:2181</computeroutput></para>
+        </listitem>
+      </orderedlist>
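+
+      <para>As an illustrative sketch of the <filename>myid</filename> step,
+      assuming the sample configuration above (<emphasis
+      role="bold">dataDir</emphasis> set to
+      <filename>/var/lib/zookeeper/</filename> and three servers named zoo1,
+      zoo2, and zoo3), each host would write its own id into its data
+      directory. Adjust the path and ids to match your own configuration
+      file:</para>
+
+      <programlisting>
+# on zoo1 (server.1)
+$ echo 1 &gt; /var/lib/zookeeper/myid
+
+# on zoo2 (server.2)
+$ echo 2 &gt; /var/lib/zookeeper/myid
+
+# on zoo3 (server.3)
+$ echo 3 &gt; /var/lib/zookeeper/myid</programlisting>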
+    </section>
+
+    <section id="sc_singleAndDevSetup">
+      <title>Single Server and Developer Setup</title>
+
+      <para>If you want to set up ZooKeeper for development purposes, you will
+      probably want to set up a single server instance of ZooKeeper, and then
+      install either the Java or C client-side libraries and bindings on your
+      development machine.</para>
+
+      <para>The steps for setting up a single server instance are similar
+      to those above, except the configuration file is simpler. You can find the
+      complete instructions in the <ulink
+      url="zookeeperStarted.html#sc_InstallingSingleMode">Installing and
+      Running ZooKeeper in Single Server Mode</ulink> section of the <ulink
+      url="zookeeperStarted.html">ZooKeeper Getting Started
+      Guide</ulink>.</para>
+
+      <para>For information on installing the client side libraries, refer to
+      the <ulink url="zookeeperProgrammers.html#Bindings">Bindings</ulink>
+      section of the <ulink url="zookeeperProgrammers.html">ZooKeeper
+      Programmer's Guide</ulink>.</para>
+    </section>
+  </section>
+
+  <section id="ch_administration">
+    <title>Administration</title>
+
+    <para>This section contains information about running and maintaining
+    ZooKeeper and covers these topics: </para>
+    <itemizedlist>
+        <listitem>
+          <para><xref linkend="sc_designing" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_provisioning" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_strengthsAndLimitations" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_administering" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_maintenance" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_supervision" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_monitoring" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_logging" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_troubleshooting" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_configuration" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_zkCommands" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_dataFileManagement" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_commonProblems" /></para>
+        </listitem>
+
+        <listitem>
+          <para><xref linkend="sc_bestPractices" /></para>
+        </listitem>
+      </itemizedlist>
+
+    <section id="sc_designing">
+      <title>Designing a ZooKeeper Deployment</title>
+
+      <para>The reliability of ZooKeeper rests on two basic assumptions.</para>
+      <orderedlist>
+        <listitem><para> Only a minority of servers in a deployment
+            will fail. <emphasis>Failure</emphasis> in this context
+            means a machine crash, or some error in the network that
+            partitions a server off from the majority.</para>
+        </listitem>
+        <listitem><para> Deployed machines operate correctly. To
+            operate correctly means to execute code correctly, to have
+            clocks that work properly, and to have storage and network
+            components that perform consistently.</para>
+        </listitem>
+      </orderedlist>
+    
+    <para>The sections below contain considerations for ZooKeeper
+      administrators to maximize the probability for these assumptions
+      to hold true. Some of these are cross-machine considerations,
+      and others are things you should consider for each and every
+      machine in your deployment.</para>
+
+    <section id="sc_CrossMachineRequirements">
+      <title>Cross Machine Requirements</title>
+    
+      <para>For the ZooKeeper service to be active, there must be a
+        majority of non-failing machines that can communicate with
+        each other. To create a deployment that can tolerate the
+        failure of F machines, you should count on deploying 2xF+1
+        machines.  Thus, a deployment that consists of three machines
+        can handle one failure, and a deployment of five machines can
+        handle two failures. Note that a deployment of six machines
+        can only handle two failures since three machines is not a
+        majority.  For this reason, ZooKeeper deployments are usually
+        made up of an odd number of machines.</para>
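+
+      <para>The arithmetic can be checked with a short shell loop. This is
+      only an illustration: for an ensemble of N servers it prints the
+      majority, N/2+1 using integer division, and the number of failures
+      that leaves room for, N minus the majority:</para>
+
+      <programlisting>
+$ for n in 3 4 5 6 7; do echo "servers=$n majority=$(( n / 2 + 1 )) tolerated=$(( n - (n / 2 + 1) ))"; done
+servers=3 majority=2 tolerated=1
+servers=4 majority=3 tolerated=1
+servers=5 majority=3 tolerated=2
+servers=6 majority=4 tolerated=2
+servers=7 majority=4 tolerated=3</programlisting>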
+
+      <para>To achieve the highest probability of tolerating a failure
+        you should try to make machine failures independent. For
+        example, if most of the machines share the same switch,
+        failure of that switch could cause a correlated failure and
+        bring down the service. The same holds true of shared power
+        circuits, cooling systems, etc.</para>
+    </section>
+
+    <section>
+      <title>Single Machine Requirements</title>
+
+      <para>If ZooKeeper has to contend with other applications for
+        access to resources like storage media, CPU, network, or
+        memory, its performance will suffer markedly.  ZooKeeper has
+        strong durability guarantees, which means it uses storage
+        media to log changes before the operation responsible for the
+        change is allowed to complete. You should be aware of this
+        dependency then, and take great care if you want to ensure
+        that ZooKeeper operations aren’t held up by your media. Here
+        are some things you can do to minimize that sort of
+        degradation:
+      </para>
+
+      <itemizedlist>
+        <listitem>
+          <para>ZooKeeper's transaction log must be on a dedicated
+            device. (A dedicated partition is not enough.) ZooKeeper
+            writes the log sequentially, without seeking. Sharing your
+            log device with other processes can cause seeks and
+            contention, which in turn can cause multi-second
+            delays.</para>
+        </listitem>
+
+        <listitem>
+          <para>Do not put ZooKeeper in a situation that can cause a
+            swap. In order for ZooKeeper to function with any sort of
+            timeliness, it simply cannot be allowed to swap.
+            Therefore, make certain that the maximum heap size given
+            to ZooKeeper is not bigger than the amount of real memory
+            available to ZooKeeper.  For more on this, see
+            <xref linkend="sc_commonProblems"/>
+            below. </para>
+        </listitem>
+      </itemizedlist>
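+
+      <para>As a sketch of the heap guidance, the maximum heap can be capped
+      with the standard JVM <computeroutput>-Xmx</computeroutput> option when
+      launching the server. The 3GB value below assumes the 4GB machine used
+      as an example in <xref linkend="sc_zkMulitServerSetup" />; the
+      classpath and <filename>zoo.cfg</filename> are the ones from the start
+      command shown there, and the right value for your hosts should come
+      from load testing:</para>
+
+      <programlisting>
+$ java -Xmx3g -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf \
+    org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg</programlisting>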
+    </section>
+    </section>
+
+    <section id="sc_provisioning">
+      <title>Provisioning</title>
+
+      <para></para>
+    </section>
+
+    <section id="sc_strengthsAndLimitations">
+      <title>Things to Consider: ZooKeeper Strengths and Limitations</title>
+
+      <para></para>
+    </section>
+
+    <section id="sc_administering">
+      <title>Administering</title>
+
+      <para></para>
+    </section>
+
+    <section id="sc_maintenance">
+      <title>Maintenance</title>
+
+      <para>Little long-term maintenance is required for a ZooKeeper
+        cluster; however, you must be aware of the following:</para>
+
+      <section>
+        <title>Ongoing Data Directory Cleanup</title>
+
+        <para>The ZooKeeper <ulink url="#var_datadir">Data
+          Directory</ulink> contains files which are a persistent copy
+          of the znodes stored by a particular serving ensemble. These
+          are the snapshot and transactional log files. As changes are
+          made to the znodes these changes are appended to a
+          transaction log. Occasionally, when a log grows large, a
+          snapshot of the current state of all znodes will be written
+          to the filesystem and a new transaction log file is created
+          for future transactions. During snapshotting, ZooKeeper may
+          continue appending incoming transactions to the old log file.
+          Therefore, some transactions which are newer than a snapshot
+          may be found in the last transaction log preceding the
+          snapshot.
+        </para>
+
+        <para>A ZooKeeper server <emphasis role="bold">will not remove
+        old snapshots and log files</emphasis> when using the default
+        configuration (see autopurge below); this is the
+        responsibility of the operator. Every serving environment is
+        different and therefore the requirements of managing these
+        files may differ from install to install (backup for example).
+        </para>
+
+        <para>The PurgeTxnLog utility implements a simple retention
+        policy that administrators can use. The <ulink
+        url="ext:api/index">API docs</ulink> contain details on
+        calling conventions (arguments, etc.).
+        </para>
+
+        <para>In the following example, the last &lt;count&gt; snapshots and
+        their corresponding logs are retained and the others are
+        deleted.  The value of &lt;count&gt; should typically be
+        greater than 3 (although not required, this provides 3 backups
+        in the unlikely event a recent log has become corrupted). This
+        can be run as a cron job on the ZooKeeper server machines to
+        clean up the logs daily.</para>
+
+        <programlisting> java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.PurgeTxnLog &lt;dataDir&gt; &lt;snapDir&gt; -n &lt;count&gt;</programlisting>
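+
+        <para>For instance, assuming the sample configuration shown earlier,
+        in which <emphasis role="bold">dataDir</emphasis> is
+        <filename>/var/lib/zookeeper/</filename> and no separate transaction
+        log directory is configured, both directory arguments point at the
+        same location. The retention count of 5 is an arbitrary choice for
+        illustration:</para>
+
+        <programlisting> java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.PurgeTxnLog /var/lib/zookeeper /var/lib/zookeeper -n 5</programlisting>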
+
+        <para>Automatic purging of the snapshots and corresponding
+        transaction logs was introduced in version 3.4.0 and can be
+        enabled via the following configuration parameters: <emphasis
+        role="bold">autopurge.snapRetainCount</emphasis> and <emphasis
+        role="bold">autopurge.purgeInterval</emphasis>. For more on
+        this, see <xref linkend="sc_advancedConfiguration"/>
+        below.</para>
+      </section>
+
+      <section>
+        <title>Debug Log Cleanup (log4j)</title>
+
+        <para>See the section on <ulink
+        url="#sc_logging">logging</ulink> in this document. It is
+        expected that you will set up a rolling file appender using the
+        built-in log4j feature. The sample configuration file in the
+        release tar's conf/log4j.properties provides an example of
+        this.
+        </para>
+      </section>
+
+    </section>
+
+    <section id="sc_supervision">
+      <title>Supervision</title>
+
+      <para>You will want to have a supervisory process that manages
+      each of your ZooKeeper server processes (JVM). The ZK server is
+      designed to be "fail fast", meaning that it will shut down
+      (process exit) if an error occurs that it cannot recover
+      from. As a ZooKeeper serving cluster is highly reliable, this
+      means that while the server may go down, the cluster as a whole
+      is still active and serving requests. Additionally, as the
+      cluster is "self healing", the failed server, once restarted, will
+      automatically rejoin the ensemble without any manual
+      interaction.</para>
+
+      <para>Having a supervisory process such as <ulink
+      url="http://cr.yp.to/daemontools.html">daemontools</ulink> or
+      <ulink
+      url="http://en.wikipedia.org/wiki/Service_Management_Facility">SMF</ulink>
+      (other supervisory processes are also available; which one you
+      use is up to you, and these are just two
+      examples) managing your ZooKeeper server ensures that if the
+      process does exit abnormally it will automatically be restarted
+      and will quickly rejoin the cluster.</para>
+    </section>
+
+    <section id="sc_monitoring">
+      <title>Monitoring</title>
+
+      <para>The ZooKeeper service can be monitored in one of two
+      primary ways: 1) the command port through the use of <ulink
+      url="#sc_zkCommands">4 letter words</ulink> and 2) <ulink
+      url="zookeeperJMX.html">JMX</ulink>. See the appropriate section for
+      your environment/requirements.</para>
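+
+      <para>As a quick illustration of the first approach, a four letter
+      word can be sent to the client port with a tool such as
+      <computeroutput>nc</computeroutput>. The host and port below are
+      assumptions taken from the sample configuration; a server that is
+      running without errors answers <computeroutput>ruok</computeroutput>
+      with <computeroutput>imok</computeroutput>:</para>
+
+      <programlisting>
+$ echo ruok | nc 127.0.0.1 2181
+imok</programlisting>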
+    </section>
+
+    <section id="sc_logging">
+      <title>Logging</title>
+
+      <para>ZooKeeper uses <emphasis role="bold">log4j</emphasis> version 1.2 as 
+      its logging infrastructure. The ZooKeeper default <filename>log4j.properties</filename>
+      file resides in the <filename>conf</filename> directory. Log4j requires that 
+      <filename>log4j.properties</filename> either be in the working directory 
+      (the directory from which ZooKeeper is run) or be accessible from the classpath.</para>
+
+      <para>For more information, see 
+      <ulink url="http://logging.apache.org/log4j/1.2/manual.html#defaultInit">Log4j Default Initialization Procedure</ulink> 
+      of the log4j manual.</para>
+      
+    </section>
+
+    <section id="sc_troubleshooting">
+      <title>Troubleshooting</title>
+	<variablelist>
+		<varlistentry>
+		<term> Server not coming up because of file corruption</term>
+		<listitem>
+		<para>A server might not be able to read its database and fail to come up because of
+		file corruption in the transaction logs of the ZooKeeper server. You will
+		see an IOException when the ZooKeeper database is loaded. In such a case,
+		make sure all the other servers in your ensemble are up and working. Use the "stat"
+		command on the command port to see if they are in good health. After you have verified that
+		all the other servers of the ensemble are up, you can go ahead and clean the database
+		of the corrupt server. Delete all the files in datadir/version-2 and datalogdir/version-2/.
+		Restart the server.
+		</para>
+		</listitem>
+		</varlistentry>
+		</variablelist>
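+
+      <para>A sketch of that recovery procedure follows. It assumes the
+      sample configuration (client port 2181, <emphasis
+      role="bold">dataDir</emphasis> of
+      <filename>/var/lib/zookeeper</filename>, no separate transaction log
+      directory) and that zoo1 is the corrupt server. Make sure the
+      ZooKeeper process on zoo1 is not running before deleting anything, and
+      double-check the path, since the deletion cannot be undone:</para>
+
+      <programlisting>
+# from any host: confirm the other ensemble members are healthy
+$ echo stat | nc zoo2 2181
+$ echo stat | nc zoo3 2181
+
+# on zoo1 only, with the server not running: clear its database
+$ rm -f /var/lib/zookeeper/version-2/*
+
+# then restart the ZooKeeper server on zoo1</programlisting>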
+    </section>
+
+    <section id="sc_configuration">
+      <title>Configuration Parameters</title>
+
+      <para>ZooKeeper's behavior is governed by the ZooKeeper configuration
+      file. This file is designed so that the exact same file can be used by
+      all the servers that make up a ZooKeeper ensemble, assuming the disk
+      layouts are the same. If servers use different configuration files, care
+      must be taken to ensure that the lists of servers in all of the different
+      configuration files match.</para>
+
+      <section id="sc_minimumConfiguration">
+        <title>Minimum Configuration</title>
+
+        <para>Here are the minimum configuration keywords that must be defined
+        in the configuration file:</para>
+
+        <variablelist>
+          <varlistentry>
+            <term>clientPort</term>
+
+            <listitem>
+              <para>the port to listen for client connections; that is, the
+              port that clients attempt to connect to.</para>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry id="var_datadir">
+            <term>dataDir</term>
+
+            <listitem>
+              <para>the location where ZooKeeper will store the in-memory
+              database snapshots and, unless specified otherwise, the
+              transaction log of updates to the database.</para>
+
+              <note>
+                <para>Be careful where you put the transaction log. A
+                dedicated transaction log device is key to consistent good
+                performance. Putting the log on a busy device will adversely
+                affect performance.</para>
+              </note>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry id="id_tickTime">
+            <term>tickTime</term>
+
+            <listitem>
+              <para>the length of a single tick, which is the basic time unit
+              used by ZooKeeper, as measured in milliseconds. It is used to
+              regulate heartbeats and timeouts. For example, the minimum
+              session timeout will be two ticks.</para>
+            </listitem>
+          </varlistentry>
+        </variablelist>
+      </section>
+
+      <section id="sc_advancedConfiguration">
+        <title>Advanced Configuration</title>
+
+        <para>The configuration settings in this section are optional. You can
+        use them to further fine tune the behaviour of your ZooKeeper servers.
+        Some can also be set using Java system properties, generally of the
+        form <emphasis>zookeeper.keyword</emphasis>. The exact system
+        property, when available, is noted below.</para>
+
+        <variablelist>
+          <varlistentry>
+            <term>dataLogDir</term>
+
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para>This option will direct the machine to write the
+              transaction log to the <emphasis
+              role="bold">dataLogDir</emphasis> rather than the <emphasis
+              role="bold">dataDir</emphasis>. This allows a dedicated log
+              device to be used, and helps avoid competition between logging
+              and snapshots.</para>
+
+              <note>
+                <para>Having a dedicated log device has a large impact on
+                throughput and stable latencies. It is highly recommended to
+                dedicate a log device and set <emphasis
+                role="bold">dataLogDir</emphasis> to point to a directory on
+                that device, and then make sure to point <emphasis
+                role="bold">dataDir</emphasis> to a directory
+                <emphasis>not</emphasis> residing on that device.</para>
+              </note>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>globalOutstandingLimit</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+              role="bold">zookeeper.globalOutstandingLimit</emphasis>)</para>
+
+              <para>Clients can submit requests faster than ZooKeeper can
+              process them, especially if there are a lot of clients. To
+              prevent ZooKeeper from running out of memory due to queued
+              requests, ZooKeeper will throttle clients so that there are no
+              more than globalOutstandingLimit outstanding requests in the
+              system. The default limit is 1,000.</para>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>preAllocSize</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+              role="bold">zookeeper.preAllocSize</emphasis>)</para>
+
+              <para>To avoid seeks, ZooKeeper allocates space in the
+              transaction log file in blocks of preAllocSize kilobytes. The
+              default block size is 64M. One reason for changing the size of
+              the blocks is to reduce the block size if snapshots are taken
+              more often. (Also, see <emphasis
+              role="bold">snapCount</emphasis>).</para>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>snapCount</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+              role="bold">zookeeper.snapCount</emphasis>)</para>
+
+              <para>ZooKeeper records its transactions using snapshots and
+              a transaction log (think write-ahead log). The number of
+              transactions recorded in the transaction log before a snapshot
+              can be taken (and the transaction log rolled) is determined
+              by snapCount. In order to prevent all of the machines in the quorum
+              from taking a snapshot at the same time, each ZooKeeper server
+              will take a snapshot when the number of transactions in the transaction log
+              reaches a runtime generated random value in the [snapCount/2+1, snapCount]
+              range. The default snapCount is 100,000.</para>
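+
+              <para>As a worked example of the range above, assuming a
+              hypothetical setting of snapCount=10000, each server would take
+              its snapshot after a randomly chosen number of transactions
+              between 5,001 and 10,000:</para>
+
+              <programlisting>
+# hypothetical value, chosen only to illustrate the arithmetic
+snapCount=10000
+# lower bound: snapCount/2+1 = 10000/2+1 = 5001
+# upper bound: snapCount     = 10000
+              </programlisting>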
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>maxClientCnxns</term>
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para>Limits the number of concurrent connections (at the socket 
+              level) that a single client, identified by IP address, may make
+              to a single member of the ZooKeeper ensemble. This is used to 
+              prevent certain classes of DoS attacks, including file 
+              descriptor exhaustion. The default is 60. Setting this to 0
+              entirely removes the limit on concurrent connections.</para>
+            </listitem>
+           </varlistentry>
+
+           <varlistentry>
+             <term>clientPortAddress</term>
+
+             <listitem>
+               <para><emphasis role="bold">New in 3.3.0:</emphasis> the
+               address (ipv4, ipv6 or hostname) to listen for client
+               connections; that is, the address that clients attempt
+               to connect to. This is optional. By default, we bind in
+               such a way that any connection to the <emphasis
+               role="bold">clientPort</emphasis> for any
+               address/interface/nic on the server will be
+               accepted.</para>
+             </listitem>
+           </varlistentry>
+
+          <varlistentry>
+            <term>minSessionTimeout</term>
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para><emphasis role="bold">New in 3.3.0:</emphasis> the
+              minimum session timeout in milliseconds that the server
+              will allow the client to negotiate. Defaults to 2 times
+              the <emphasis role="bold">tickTime</emphasis>.</para>
+            </listitem>
+           </varlistentry>
+
+          <varlistentry>
+            <term>maxSessionTimeout</term>
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para><emphasis role="bold">New in 3.3.0:</emphasis> the
+              maximum session timeout in milliseconds that the server
+              will allow the client to negotiate. Defaults to 20 times
+              the <emphasis role="bold">tickTime</emphasis>.</para>
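+
+              <para>As a worked example of the minSessionTimeout and
+              maxSessionTimeout defaults, assuming a hypothetical tickTime of
+              2000 milliseconds and neither timeout set explicitly, the
+              negotiated session timeout would fall within these
+              bounds:</para>
+
+              <programlisting>
+# hypothetical value, chosen only to illustrate the arithmetic
+tickTime=2000
+# default minSessionTimeout =  2 * tickTime =  4000 ms
+# default maxSessionTimeout = 20 * tickTime = 40000 ms
+              </programlisting>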
+            </listitem>
+           </varlistentry>
+           
+           <varlistentry>
+             <term>fsync.warningthresholdms</term>
+             <listitem>
+               <para>(Java system property: <emphasis
+               role="bold">zookeeper.fsync.warningthresholdms</emphasis>)</para>
+
+               <para><emphasis role="bold">New in 3.3.4:</emphasis> A
+               warning message will be output to the log whenever an
+               fsync in the Transactional Log (WAL) takes longer than
+               this value. The value is specified in milliseconds and
+               defaults to 1000. This value can only be set as a
+               system property.</para>
+             </listitem>
+           </varlistentry>
+
+          <varlistentry>
+            <term>autopurge.snapRetainCount</term>
+
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para><emphasis role="bold">New in 3.4.0:</emphasis> 
+              When enabled, the ZooKeeper auto purge feature retains
+              the <emphasis role="bold">autopurge.snapRetainCount</emphasis> most
+              recent snapshots and the corresponding transaction logs in the 
+              <emphasis role="bold">dataDir</emphasis> and <emphasis 
+              role="bold">dataLogDir</emphasis> respectively and deletes the rest.
+              Defaults to 3. Minimum value is 3.</para>
+            </listitem>
+          </varlistentry>
+          
+          <varlistentry>
+            <term>autopurge.purgeInterval</term>
+
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para><emphasis role="bold">New in 3.4.0:</emphasis> The
+              interval, in hours, at which the purge task is
+              triggered. Set to a positive integer (1 and above)
+              to enable auto purging. Defaults to 0.</para>
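+
+              <para>A minimal sketch of a configuration that turns on auto
+              purging. The values are illustrative assumptions, not
+              recommendations.</para>
+
+              <programlisting>
+# keep the 3 most recent snapshots and the corresponding transaction logs
+autopurge.snapRetainCount=3
+# trigger the purge task every 24 hours (illustrative value)
+autopurge.purgeInterval=24
+              </programlisting>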
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>syncEnabled</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+              role="bold">zookeeper.observer.syncEnabled</emphasis>)</para>
+
+              <para><emphasis role="bold">New in 3.4.6, 3.5.0:</emphasis>
+              The observers now log transactions and write snapshots to disk
+              by default, like the participants. This reduces the recovery time
+              of the observers on restart. Set to "false" to disable this
+              feature. Default is "true".</para>
+            </listitem>
+          </varlistentry>
+        </variablelist>
+      </section>
+
+      <section id="sc_clusterOptions">
+        <title>Cluster Options</title>
+
+        <para>The options in this section are designed for use with an ensemble
+        of servers -- that is, when deploying clusters of servers.</para>
+
+        <variablelist>
+          <varlistentry>
+            <term>electionAlg</term>
+
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para>Election implementation to use. A value of "0" corresponds
+              to the original UDP-based version, "1" corresponds to the
+              non-authenticated UDP-based version of fast leader election, "2"
+              corresponds to the authenticated UDP-based version of fast
+              leader election, and "3" corresponds to the TCP-based version of
+              fast leader election. Currently, algorithm 3 is the default.</para>
+              
+              <note>
+              <para>The implementations of leader election 0, 1, and 2 are now
+              <emphasis role="bold">deprecated</emphasis>. We intend to remove
+              them in the next release, at which point only the
+              FastLeaderElection will be available.</para>
+              </note>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>initLimit</term>
+
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para>Amount of time, in ticks (see <ulink
+              url="#id_tickTime">tickTime</ulink>), to allow followers to
+              connect and sync to a leader. Increase this value as needed, if
+              the amount of data managed by ZooKeeper is large.</para>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>leaderServes</term>
+
+            <listitem>
+              <para>(Java system property: zookeeper.<emphasis
+              role="bold">leaderServes</emphasis>)</para>
+
+              <para>Leader accepts client connections. The leader machine
+              coordinates updates. For higher update throughput at the slight
+              expense of read throughput, the leader can be configured to not
+              accept clients and focus on coordination. The default value is
+              "yes", which means that a leader will accept client
+              connections.</para>
+
+              <note>
+                <para>Turning on leader selection is highly recommended when
+                you have more than three ZooKeeper servers in an ensemble.</para>
+              </note>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>server.x=[hostname]:nnnnn[:nnnnn], etc</term>
+
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para>Servers making up the ZooKeeper ensemble. When the server
+              starts up, it determines which server it is by looking for the
+              file <filename>myid</filename> in the data directory. That file
+              contains the server number, in ASCII, and it should match
+              <emphasis role="bold">x</emphasis> in <emphasis
+              role="bold">server.x</emphasis> on the left-hand side of this
+              setting.</para>
+
+              <para>The list of ZooKeeper servers used by the clients must
+              match the list of ZooKeeper servers that each ZooKeeper server
+              has.</para>
+
+              <para>There are two port numbers <emphasis role="bold">nnnnn</emphasis>.
+              Followers use the first to connect to the leader, and the second is for
+              leader election. The leader election port is only necessary if electionAlg
+              is 1, 2, or 3 (default). If electionAlg is 0, then the second port is not
+              necessary. If you want to test multiple servers on a single machine, then
+              different ports can be used for each server.</para>
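+
+              <para>As a minimal sketch, a three-server ensemble could be
+              declared as follows. The hostnames and port numbers are
+              hypothetical. In each line, the first port is the one followers
+              use to connect to the leader and the second is the leader
+              election port.</para>
+
+              <programlisting>
+server.1=zoo1.example.com:2888:3888
+server.2=zoo2.example.com:2888:3888
+server.3=zoo3.example.com:2888:3888
+              </programlisting>
+
+              <para>Under these assumptions, the <filename>myid</filename>
+              file in the data directory of the second server would contain
+              the single ASCII value 2.</para>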
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>syncLimit</term>
+
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para>Amount of time, in ticks (see <ulink
+              url="#id_tickTime">tickTime</ulink>), to allow followers to sync
+              with ZooKeeper. If followers fall too far behind a leader, they
+              will be dropped.</para>
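+
+              <para>Because syncLimit and initLimit are both expressed in
+              ticks, the effective time allowed depends on tickTime. A worked
+              example, using hypothetical values chosen only to illustrate
+              the arithmetic:</para>
+
+              <programlisting>
+tickTime=2000
+# initLimit: 10 ticks * 2000 ms = 20000 ms to connect and sync to a leader
+initLimit=10
+# syncLimit:  5 ticks * 2000 ms = 10000 ms to sync with ZooKeeper
+syncLimit=5
+              </programlisting>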
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>group.x=nnnnn[:nnnnn]</term>
+
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para>Enables a hierarchical quorum construction. "x" is a group identifier
+              and the numbers following the "=" sign correspond to server identifiers.
+              The right-hand side of the assignment is a colon-separated list of server
+              identifiers. Note that groups must be disjoint and the union of all groups
+              must be the ZooKeeper ensemble.</para>
+              
+              <para> You will find an example <ulink url="zookeeperHierarchicalQuorums.html">here</ulink>
+              </para>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>weight.x=nnnnn</term>
+
+            <listitem>
+              <para>(No Java system property)</para>
+
+              <para>Used along with "group", it assigns a weight to a server when
+              forming quorums. Such a value corresponds to the weight of a server
+              when voting. There are a few parts of ZooKeeper that require voting
+              such as leader election and the atomic broadcast protocol. By default
+              the weight of a server is 1. If the configuration defines groups, but not
+              weights, then a value of 1 will be assigned to all servers.  
+              </para>
+              
+              <para> You will find an example <ulink url="zookeeperHierarchicalQuorums.html">here</ulink>
+              </para>
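+
+              <para>A minimal sketch of "group" and "weight" used together,
+              assuming a hypothetical ensemble of nine servers with identifiers
+              1 through 9. The three groups are disjoint and their union is the
+              whole ensemble. The weights are spelled out for illustration only,
+              since 1 is already the default.</para>
+
+              <programlisting>
+group.1=1:2:3
+group.2=4:5:6
+group.3=7:8:9
+
+weight.1=1
+weight.2=1
+weight.3=1
+weight.4=1
+weight.5=1
+weight.6=1
+weight.7=1
+weight.8=1
+weight.9=1
+              </programlisting>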
+            </listitem>
+          </varlistentry>
+          
+          <varlistentry>
+            <term>cnxTimeout</term>
+
+            <listitem>
+              <para>(Java system property: zookeeper.<emphasis
+              role="bold">cnxTimeout</emphasis>)</para>
+
+              <para>Sets the timeout value for opening connections for leader election notifications. 
+              Only applicable if you are using electionAlg 3. 
+              </para>
+
+              <note>
+                <para>Default value is 5 seconds.</para>
+              </note>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>4lw.commands.whitelist</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+                      role="bold">zookeeper.4lw.commands.whitelist</emphasis>)</para>
+
+              <para><emphasis role="bold">New in 3.4.10:</emphasis>
+                This property contains a comma-separated list of
+                <ulink url="#sc_zkCommands">Four Letter Words</ulink> commands. It was introduced
+                to provide fine-grained control over the set of commands ZooKeeper can execute,
+                so users can turn off certain commands if necessary.
+                If the property is not specified, the whitelist contains all supported four letter
+                word commands except "wchp" and "wchc". If the property is specified, then only
+                commands listed in the whitelist are enabled.
+              </para>
+
+              <para>Here's an example of a configuration that enables the stat, ruok, conf, and isro
+                commands while disabling the rest of the Four Letter Words commands:</para>
+              <programlisting>
+                4lw.commands.whitelist=stat, ruok, conf, isro
+              </programlisting>
+
+              <para>Users can also use the asterisk option so they don't have to include every command one by one in the list.
+                As an example, this will enable all four letter word commands:
+              </para>
+              <programlisting>
+                4lw.commands.whitelist=*
+              </programlisting>
+
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>ipReachableTimeout</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+                      role="bold">zookeeper.ipReachableTimeout</emphasis>)</para>
+
+              <para><emphasis role="bold">New in 3.4.11:</emphasis>
+                Sets the timeout, measured in milliseconds, used to check whether an IP address is
+                reachable when a hostname is resolved.
+                By default, ZooKeeper will use the first IP address of the hostname (without any reachability check).
+                When zookeeper.ipReachableTimeout is set (to a value larger than 0), ZooKeeper will try to pick the first
+                IP address that is reachable. This is done by calling the Java API function
+                InetAddress.isReachable(int timeout), in which this timeout value is used. If no reachable IP
+                address can be found, the first IP address of the hostname will be used anyway.
+              </para>
+
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>tcpKeepAlive</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+                      role="bold">zookeeper.tcpKeepAlive</emphasis>)</para>
+
+              <para><emphasis role="bold">New in 3.4.11:</emphasis>
+                Setting this to true sets the TCP keepAlive flag on the
+                sockets used by quorum members to perform elections.
+                This will allow for connections between quorum members to
+                remain up when there is network infrastructure that may
+                otherwise break them. Some NATs and firewalls may terminate
+                or lose state for long running or idle connections.</para>
+
+              <para>Enabling this option relies on OS level settings to work
+                properly; check your operating system's options regarding TCP
+                keepalive for more information. Defaults to
+                <emphasis role="bold">false</emphasis>.
+              </para>
+            </listitem>
+          </varlistentry>
+
+        </variablelist>
+        <para></para>
+      </section>
+
+      <section id="sc_authOptions">
+        <title>Authentication &amp; Authorization Options</title>
+
+        <para>The options in this section allow control over
+        authentication/authorization performed by the service.</para>
+
+        <variablelist>
+          <varlistentry>
+            <term>zookeeper.DigestAuthenticationProvider.superDigest</term>
+
+            <listitem>
+              <para>(Java system property only: <emphasis
+              role="bold">zookeeper.DigestAuthenticationProvider.superDigest</emphasis>)</para>
+
+              <para>By default this feature is <emphasis
+              role="bold">disabled</emphasis>.</para>
+
+              <para><emphasis role="bold">New in 3.2:</emphasis>
+              Enables a ZooKeeper ensemble administrator to access the
+              znode hierarchy as a "super" user. In particular no ACL
+              checking occurs for a user authenticated as
+              super.</para>
+
+              <para>org.apache.zookeeper.server.auth.DigestAuthenticationProvider
+              can be used to generate the superDigest; call it with
+              one parameter of "super:&lt;password>". Provide the
+              generated "super:&lt;data>" as the system property value
+              when starting each server of the ensemble.</para>
+
+              <para>When authenticating to a ZooKeeper server (from a
+              ZooKeeper client) pass a scheme of "digest" and authdata
+              of "super:&lt;password>". Note that digest auth passes
+              the authdata in plaintext to the server, so it would be
+              prudent to use this authentication method only on
+              localhost (not over the network) or over an encrypted
+              connection.</para>
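+
+              <para>The steps above can be sketched as follows. This is an
+              illustration only: it assumes that the CLASSPATH environment
+              variable already holds the ZooKeeper jar and its dependencies,
+              and "&lt;password>" and "&lt;data>" remain placeholders to be
+              filled in.</para>
+
+              <programlisting>
+# generate the digest (assumes CLASSPATH holds the ZooKeeper jar and its dependencies)
+$ java -cp "$CLASSPATH" org.apache.zookeeper.server.auth.DigestAuthenticationProvider 'super:&lt;password>'
+
+# pass the generated value as a system property when starting each server
+-Dzookeeper.DigestAuthenticationProvider.superDigest=super:&lt;data>
+              </programlisting>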
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>isro</term>
+
+            <listitem>
+              <para><emphasis role="bold">New in 3.4.0:</emphasis> Tests if
+              server is running in read-only mode.  The server will respond with
+              "ro" if in read-only mode or "rw" if not in read-only mode.</para>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>gtmk</term>
+
+            <listitem>
+              <para>Gets the current trace mask as a 64-bit signed long value in
+              decimal format.  See <command>stmk</command> for an explanation of
+              the possible values.</para>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>stmk</term>
+
+            <listitem>
+              <para>Sets the current trace mask.  The trace mask is 64 bits,
+              where each bit enables or disables a specific category of trace
+              logging on the server.  Log4J must be configured to enable
+              <command>TRACE</command> level first in order to see trace logging
+              messages.  The bits of the trace mask correspond to the following
+              trace logging categories.</para>
+
+              <table>
+                <title>Trace Mask Bit Values</title>
+                <tgroup cols="2" align="left" colsep="1" rowsep="1">
+                  <tbody>
+                    <row>
+                      <entry>0b0000000001</entry>
+                      <entry>Unused, reserved for future use.</entry>
+                    </row>
+                    <row>
+                      <entry>0b0000000010</entry>
+                      <entry>Logs client requests, excluding ping
+                      requests.</entry>
+                    </row>
+                    <row>
+                      <entry>0b0000000100</entry>
+                      <entry>Unused, reserved for future use.</entry>
+                    </row>
+                    <row>
+                      <entry>0b0000001000</entry>
+                      <entry>Logs client ping requests.</entry>
+                    </row>
+                    <row>
+                      <entry>0b0000010000</entry>
+                      <entry>Logs packets received from the quorum peer that is
+                      the current leader, excluding ping requests.</entry>
+                    </row>
+                    <row>
+                      <entry>0b0000100000</entry>
+                      <entry>Logs addition, removal and validation of client
+                      sessions.</entry>
+                    </row>
+                    <row>
+                      <entry>0b0001000000</entry>
+                      <entry>Logs delivery of watch events to client
+                      sessions.</entry>
+                    </row>
+                    <row>
+                      <entry>0b0010000000</entry>
+                      <entry>Logs ping packets received from the quorum peer
+                      that is the current leader.</entry>
+                    </row>
+                    <row>
+                      <entry>0b0100000000</entry>
+                      <entry>Unused, reserved for future use.</entry>
+                    </row>
+                    <row>
+                      <entry>0b1000000000</entry>
+                      <entry>Unused, reserved for future use.</entry>
+                    </row>
+                  </tbody>
+                </tgroup>
+              </table>
+
+              <para>All remaining bits in the 64-bit value are unused and
+              reserved for future use.  Multiple trace logging categories are
+              specified by calculating the bitwise OR of the documented values.
+              The default trace mask is 0b0100110010.  Thus, by default, trace
+              logging includes client requests, packets received from the
+              leader and sessions.</para>
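+
+              <para>As a worked check of the default, the bitwise OR of the
+              four values whose bits are set in 0b0100110010 gives 306 in
+              decimal, the format in which <command>gtmk</command> reports
+              the mask:</para>
+
+              <programlisting>
+  0b0000000010   (2)    client requests, excluding ping requests
+| 0b0000010000   (16)   packets received from the leader, excluding ping requests
+| 0b0000100000   (32)   client sessions
+| 0b0100000000   (256)  listed in the table as unused
+= 0b0100110010   (306)
+              </programlisting>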
+
+              <para>To set a different trace mask, send a request containing the
+              <command>stmk</command> four-letter word followed by the trace
+              mask represented as a 64-bit signed long value.  This example uses
+              the Perl <command>pack</command> function to construct a trace
+              mask that enables all trace logging categories described above and
+              convert it to a 64-bit signed long value with big-endian byte
+              order.  The result is appended to <command>stmk</command> and sent
+              to the server using netcat.  The server responds with the new
+              trace mask in decimal format.</para>
+
+              <programlisting>$ perl -e "print 'stmk', pack('q>', 0b0011111010)" | nc localhost 2181
+250
+              </programlisting>
+            </listitem>
+          </varlistentry>
+        </variablelist>
+      </section>
+
+      <section>
+        <title>Experimental Options/Features</title>
+
+        <para>New features that are currently considered experimental.</para>
+
+        <variablelist>
+          <varlistentry>
+            <term>Read Only Mode Server</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+              role="bold">readonlymode.enabled</emphasis>)</para>
+
+              <para><emphasis role="bold">New in 3.4.0:</emphasis>
+              Setting this value to true enables Read Only Mode server
+              support (disabled by default). ROM allows client
+              sessions which requested ROM support to connect to the
+              server even when the server might be partitioned from
+              the quorum. In this mode ROM clients can still read
+              values from the ZK service, but will be unable to write
+              values or see changes from other clients. See
+              ZOOKEEPER-784 for more details.
+              </para>
+            </listitem>
+          </varlistentry>
+
+        </variablelist>
+      </section>
+
+      <section>
+        <title>Unsafe Options</title>
+
+        <para>The following options can be useful, but be careful when you use
+        them. The risk of each is explained along with the explanation of what
+        the variable does.</para>
+
+        <variablelist>
+          <varlistentry>
+            <term>forceSync</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+              role="bold">zookeeper.forceSync</emphasis>)</para>
+
+              <para>Requires updates to be synced to media of the transaction
+              log before finishing processing the update. If this option is
+              set to no, ZooKeeper will not require updates to be synced to
+              the media.</para>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>jute.maxbuffer</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+              role="bold">jute.maxbuffer</emphasis>)</para>
+
+              <para>This option can only be set as a Java system property.
+              There is no zookeeper prefix on it. It specifies the maximum
+              size of the data that can be stored in a znode. The default is
+              0xfffff, or just under 1M. If this option is changed, the system
+              property must be set on all servers and clients, otherwise
+              problems will arise. This is really a sanity check. ZooKeeper is
+              designed to store data on the order of kilobytes in size.</para>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>skipACL</term>
+
+            <listitem>
+              <para>(Java system property: <emphasis
+              role="bold">zookeeper.skipACL</emphasis>)</para>
+
+              <para>Skips ACL checks. This results in a boost in throughput,
+              but opens up full access to the data tree to everyone.</para>
+            </listitem>
+          </varlistentry>
+
+          <varlistentry>
+            <term>quorumListenOnAllIPs</term>
+
+            <listitem>
+              <para>When set to true the ZooKeeper server will listen  
+              for connections from its peers on all available IP addresses,
+              and not only the address configured in the server list of the
+              configuration file. It affects the connections handling the 
+              ZAB protocol and the Fast Leader Election protocol. Default
+              value is <emphasis role="bold">false</emphasis>.</para>
+            </listitem>
+          </varlistentry>
+
+        </variablelist>
+      </section>
+
+      <section>
+        <title>Communication using the Netty framework</title>
+
+        <para><emphasis role="bold">New in
+            3.4:</emphasis> <ulink url="http://jboss.org/netty">Netty</ulink>
+            is an NIO based client/server communication framework. It
+            simplifies (over NIO being used directly) many of the
+            complexities of network level communication for Java
+            applications. Additionally the Netty framework has built
+            in support for encryption (SSL) and authentication
+            (certificates). These are optional features and can be
+            turned on or off individually.
+        </para>
+        <para>Prior to version 3.4 ZooKeeper always used NIO
+            directly. In versions 3.4 and later, Netty is
+            supported as an alternative to NIO. NIO continues to
+            be the default; however, Netty based communication can be
+            used in place of NIO by setting the Java system property
+            "zookeeper.serverCnxnFactory" to
+            "org.apache.zookeeper.server.NettyServerCnxnFactory". You
+            have the option of setting this on either the client(s) or
+            server(s). Typically you would want to set this on both;
+            however, that is at your discretion.
+        </para>
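+        <para>As a minimal sketch, and assuming the value is read as a
+            Java system property, the setting can be passed to the JVM
+            as a -D flag. How the flag reaches the JVM depends on how
+            the process is launched, so only the flag itself is shown.
+        </para>
+        <programlisting>
+-Dzookeeper.serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory
+        </programlisting>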
+        <para>
+          TBD - tuning options for netty - currently there are none that are netty specific but we should add some. Esp around max bound on the number of reader worker threads netty creates.
+        </para>
+        <para>
+          TBD - how to manage encryption
+        </para>
+        <para>
+          TBD - how to manage certificates
+        </para>
+
+      </section>
+
+    </section>
+
+    <section id="sc_zkCommands">
+      <title>ZooKeeper Commands: The Four Letter Words</title>
+
+      <para>ZooKeeper responds to a small set of commands. Each command is
+      composed of four letters. You issue the commands to ZooKeeper via telnet
+      or nc, at the client port.</para>
+
+      <para>Three of the more interesting commands: "stat" gives some
+      general information about the server and connected clients,
+      while "srvr" and "cons" give extended details on server and
+      connections respectively.</para>
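+
+      <para>A minimal sketch of issuing a command with nc, assuming a
+      server listening on client port 2181 of the local machine. A server
+      running in a non-error state answers "ruok" with "imok".</para>
+
+      <programlisting>$ echo ruok | nc localhost 2181
+imok
+      </programlisting>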
+
+      <variablelist>
+        <varlistentry>
+          <term>conf</term>
+
+          <listitem>
+            <para><emphasis role="bold">New in 3.3.0:</emphasis> Print
+            details about serving configuration.</para>
+          </listitem>
+
+        </varlistentry>
+
+        <varlistentry>
+          <term>cons</term>
+
+          <listitem>
+            <para><emphasis role="bold">New in 3.3.0:</emphasis> List
+            full connection/session details for all clients connected
+            to this server. Includes information on numbers of packets
+            received/sent, session id, operation latencies, last
+            operation performed, etc.</para>
+          </listitem>
+
+        </varlistentry>
+
+        <varlistentry>
+          <term>crst</term>
+
+          <listitem>
+            <para><emphasis role="bold">New in 3.3.0:</emphasis> Reset
+            connection/session statistics for all connections.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>dump</term>
+
+          <listitem>
+            <para>Lists the outstanding sessions and ephemeral nodes. This
+            only works on the leader.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>envi</term>
+
+          <listitem>
+            <para>Print details about serving environment.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>ruok</term>
+
+          <listitem>
+            <para>Tests if server is running in a non-error state. The server
+            will respond with imok if it is running. Otherwise it will not
+            respond at all.</para>
+
+            <para>A response of "imok" does not necessarily indicate that the
+            server has joined the quorum, just that the server process is active
+            and bound to the specified client port. Use "stat" for details on
+            state with respect to quorum and client connection information.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>srst</term>
+
+          <listitem>
+            <para>Reset server statistics.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>srvr</term>
+
+          <listitem>
+            <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
+            full details for the server.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>stat</term>
+
+          <listitem>
+            <para>Lists brief details for the server and connected
+            clients.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>wchs</term>
+
+          <listitem>
+            <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
+            brief information on watches for the server.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>wchc</term>
+
+          <listitem>
+            <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
+            detailed information on watches for the server, by
+            session. This outputs a list of sessions (connections)
+            with associated watches (paths). Note that, depending on the
+            number of watches, this operation may be expensive (i.e.,
+            impact server performance); use it carefully.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>wchp</term>
+
+          <listitem>
+            <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
+            detailed information on watches for the server, by path.
+            This outputs a list of paths (znodes) with associated
+            sessions. Note that, depending on the number of watches, this
+            operation may be expensive (i.e., impact server performance);
+            use it carefully.</para>
+          </listitem>
+        </varlistentry>
+
+
+        <varlistentry>
+          <term>mntr</term>
+
+          <listitem>
+            <para><emphasis role="bold">New in 3.4.0:</emphasis> Outputs a list 
+            of variables that could be used for monitoring the health of the cluster.</para>
+
+            <programlisting>$ echo mntr | nc localhost 2185
+
+zk_version  3.4.0
+zk_avg_latency  0
+zk_max_latency  0
+zk_min_latency  0
+zk_packets_received 70
+zk_packets_sent 69
+zk_outstanding_requests 0
+zk_server_state leader
+zk_znode_count   4
+zk_watch_count  0
+zk_ephemerals_count 0
+zk_approximate_data_size    27
+zk_followers    4                   - only exposed by the Leader
+zk_synced_followers 4               - only exposed by the Leader
+zk_pending_syncs    0               - only exposed by the Leader
+zk_open_file_descriptor_count 23    - only available on Unix platforms
+zk_max_file_descriptor_count 1024   - only available on Unix platforms
+zk_fsync_threshold_exceed_count	0
+</programlisting>
+
+          <para>The output is compatible with the Java properties format, and the content
+        may change over time (new keys added). Your scripts should expect changes.</para>
+
+          <para>ATTENTION: Some of the keys are platform-specific, and some of the keys are only exported by the Leader.</para>
+
+          <para>The output contains multiple lines with the following format:</para>
+          <programlisting>key \t value</programlisting>
+          </listitem>
+        </varlistentry>
+      </variablelist>
+
+      <para>Here's an example of the <emphasis role="bold">ruok</emphasis>
+      command:</para>
+
+      <programlisting>$ echo ruok | nc 127.0.0.1 5111
+imok
+</programlisting>
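+
+      <para>The same exchange can be scripted. The following is a minimal
+      Python sketch, not part of ZooKeeper: the helper names
+      "four_letter_word" and "parse_mntr" are hypothetical, and the sketch
+      assumes that the server closes the connection once it has sent its
+      reply. The first helper sends a command to the client port and returns
+      the raw reply; the second splits the tab-separated mntr lines described
+      above into a dictionary.</para>
+
+      <programlisting>
```python
import socket


def four_letter_word(host: str, port: int, command: str, timeout: float = 5.0) -> str:
    """Send a four letter word to a ZooKeeper client port and return the reply."""
    if len(command) != 4:
        raise ValueError("expected a four letter command such as 'ruok'")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # assumption: the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text: str) -> dict:
    """Parse tab-separated key/value lines from mntr output into a dict of strings."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep and key.strip():  # skip blank lines and lines without a tab
            metrics[key.strip()] = value.strip()
    return metrics
```
+      </programlisting>
+
+      <para>With a server listening as in the example above, calling
+      four_letter_word("127.0.0.1", 5111, "ruok") would be expected to return
+      the same reply that nc prints. Because new mntr keys may be added over
+      time, scripts built on parse_mntr should look up only the keys they
+      need and tolerate missing ones.</para>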
+
+    
+    </section>
+
+    <section id="sc_dataFileManagement">
+      <title>Data File Management</title>
+
+      <para>ZooKeeper stores its data in a data directory and its transaction
+      log in a transaction log directory. By default these two directories are
+      the same. The server can (and should) be configured to store the
+      transaction log files in a directory separate from the data files.
+      Throughput increases and latency decreases when transaction logs reside
+      on a dedicated log device.</para>
+
+      <section>
+        <title>The Data Directory</title>
+
+        <para>This directory has two kinds of files in it:</para>
+
+        <itemizedlist>
+          <listitem>
+            <para><filename>myid</filename> - contains a single integer in
+            human readable ASCII text that represents the server id.</para>
+          </listitem>
+
+          <listitem>
+            <para><filename>snapshot.&lt;zxid&gt;</filename> - holds the fuzzy
+            snapshot of a data tree.</para>
+          </listitem>
+        </itemizedlist>
+
+        <para>Each ZooKeeper server has a unique id. This id is used in two
+        places: the <filename>myid</filename> file and the configuration file.
+        The <filename>myid</filename> file identifies the server that
+        corresponds to the given data directory. The configuration file lists
+        the contact information for each server identified by its server id.
+        When a ZooKeeper server instance starts, it reads its id from the
+        <filename>myid</filename> file and then, using that id, reads from the
+        configuration file, looking up the port on which it should
+        listen.</para>
+
+        <para>The <filename>snapshot</filename> files stored in the data
+        directory are fuzzy snapshots in the sense that during the time the
+        ZooKeeper server is taking the snapshot, updates are occurring to the
+        data tree. The suffix of the <filename>snapshot</filename> file names
+        is the <emphasis>zxid</emphasis>, the ZooKeeper transaction id, of the
+        last committed transaction at the start of the snapshot. Thus, the
+        snapshot includes a subset of the updates to the data tree that
+        occurred while the snapshot was in progress. The snapshot, then, may
+        not correspond to any data tree that actually existed, and for this
+        reason we refer to it as a fuzzy snapshot. Still, ZooKeeper can
+        recover using this snapshot because it takes advantage of the
+        idempotent nature of its updates. By replaying the transaction log
+        against fuzzy snapshots ZooKeeper gets the state of the system at the
+        end of the log.</para>
+      </section>
+
+      <section>
+        <title>The Log Directory</title>
+
+        <para>The Log Directory contains the ZooKeeper transaction logs.
+        Before any update takes place, ZooKeeper ensures that the transaction
+        that represents the update is written to non-volatile storage. A new
+        log file is started when the number of transactions written to the
+        current log file reaches a (variable) threshold. The threshold is
+        computed using the same parameter which influences the frequency of
+        snapshotting (see snapCount above). The log file's suffix is the first
+        zxid written to that log.</para>
+      </section>
+
+      <section id="sc_filemanagement">
+        <title>File Management</title>
+
+        <para>The format of snapshot and log files does not change between
+        standalone ZooKeeper servers and different configurations of
+        replicated ZooKeeper servers. Therefore, you can pull these files from
+        a running replicated ZooKeeper server to a development machine with a
+        stand-alone ZooKeeper server for troubleshooting.</para>
+
+        <para>Using older log and snapshot files, you can look at the previous
+        state of ZooKeeper servers and even restore that state. The
+        LogFormatter class allows an administrator to look at the transactions
+        in a log.</para>
+
+        <para>The ZooKeeper server creates snapshot and log files, but
+        never deletes them. The retention policy of the data and log
+        files is implemented outside of the ZooKeeper server. The
+        server itself only needs the latest complete fuzzy snapshot, all log
+        files following it, and the last log file preceding it.  The latter
+        requirement is necessary to include updates which happened after this
+        snapshot was started but went into the existing log file at that time.
+        This is possible because snapshotting and rolling over of logs
+        proceed somewhat independently in ZooKeeper. See the
+        <ulink url="#sc_maintenance">maintenance</ulink> section in
+        this document for more details on setting a retention policy
+        and maintenance of ZooKeeper storage.
+        </para>
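+
+        <para>The rule above for which files the server still needs can be
+        sketched in a few lines of Python. This is an illustration only and
+        not a replacement for the retention tooling described in the
+        maintenance section: the function name "files_to_keep" is
+        hypothetical, and the sketch assumes that file names have the form
+        snapshot.HEX and log.HEX (the zxid suffix in hexadecimal, as in the
+        examples in this document) and that the newest snapshot is
+        complete.</para>
+
+        <programlisting>
```python
def files_to_keep(names):
    """Return the snapshot and log file names the server still needs.

    Assumes names look like 'snapshot.HEX' and 'log.HEX', where HEX is the
    zxid suffix in hexadecimal, and that the newest snapshot is complete.
    """
    snapshots, logs = [], []
    for name in names:
        prefix, _, suffix = name.partition(".")
        try:
            zxid = int(suffix, 16)
        except ValueError:
            continue  # not a snapshot or log name, e.g. 'myid' or 'log.1.fixed'
        if prefix == "snapshot":
            snapshots.append((zxid, name))
        elif prefix == "log":
            logs.append((zxid, name))
    if not snapshots:
        return [name for _, name in sorted(logs)]  # no snapshot yet: keep every log
    snap_zxid, snap_name = max(snapshots)  # latest snapshot
    keep = [snap_name]
    following = [(z, n) for z, n in logs if z > snap_zxid]
    preceding = [(z, n) for z, n in logs if not z > snap_zxid]
    if preceding:
        keep.append(max(preceding)[1])  # last log started at or before the snapshot
    keep.extend(n for _, n in sorted(following))  # every log after the snapshot
    return keep
```
+        </programlisting>
+
+        <para>Anything in the data and log directories that this function
+        does not return is a candidate for archiving or removal under
+        whatever retention policy you choose.</para>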
+        <note>
+        <para>The data stored in these files is not encrypted. In the case of
+        storing sensitive data in ZooKeeper, necessary measures need to be
+        taken to prevent unauthorized access. Such measures are external to
+        ZooKeeper (e.g., control access to the files) and depend on the
+        individual settings in which it is being deployed. </para>
+        </note>
+      </section>
+
+      <section>
+        <title>Recovery - TxnLogToolkit</title>
+
+        <para>TxnLogToolkit is a command line tool shipped with ZooKeeper which
+          is capable of recovering transaction log entries with broken CRC.</para>
+        <para>When run without any command line parameters or with the "-h,--help"
+          argument, it prints the following help page:</para>
+
+        <programlisting>
+          $ bin/zkTxnLogToolkit.sh
+
+          usage: TxnLogToolkit [-dhrv] txn_log_file_name
+          -d,--dump      Dump mode. Dump all entries of the log file. (this is the default)
+          -h,--help      Print help message
+          -r,--recover   Recovery mode. Re-calculate CRC for broken entries.
+          -v,--verbose   Be verbose in recovery mode: print all entries, not just fixed ones.
+          -y,--yes       Non-interactive mode: repair all CRC errors without asking
+        </programlisting>
+
+        <para>The default behaviour is safe: it dumps the entries of the given
+        transaction log file to the screen (the same as using the '-d,--dump' parameter):</para>
+
+        <programlisting>
+          $ bin/zkTxnLogToolkit.sh log.100000001
+          ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
+          4/5/18 2:15:58 PM CEST session 0x16295bafcc40000 cxid 0x0 zxid 0x100000001 createSession 30000
+          <emphasis role="bold">CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null</emphasis>
+          4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
+          4/5/18 2:16:12 PM CEST session 0x26295bafcc90000 cxid 0x0 zxid 0x100000003 createSession 30000
+          4/5/18 2:17:34 PM CEST session 0x26295bafcc90000 cxid 0x0 zxid 0x200000001 closeSession null
+          4/5/18 2:17:34 PM CEST session 0x16295bd23720000 cxid 0x0 zxid 0x200000002 createSession 30000
+          4/5/18 2:18:02 PM CEST session 0x16295bd23720000 cxid 0x2 zxid 0x200000003 create '/andor,#626262,v{s{31,s{'world,'anyone}}},F,1
+          EOF reached after 6 txns.
+        </programlisting>
+
+        <para>There's a CRC error in the 2nd entry of the above transaction log file. In <emphasis role="bold">dump</emphasis>
+          mode, the toolkit only prints this information to the screen without touching the original file. In
+          <emphasis role="bold">recovery</emphasis> mode (-r,--recover flag) the original file still remains
+          untouched and all transactions are copied over to a new txn log file with the ".fixed" suffix. The tool recalculates
+          the CRC value of each entry and, if it doesn't match the value stored in the original txn entry, writes the recalculated value instead.
+          By default, the tool works interactively: it asks for confirmation whenever a CRC error is encountered.</para>
+
+        <programlisting>
+          $ bin/zkTxnLogToolkit.sh -r log.100000001
+          ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
+          CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
+          Would you like to fix it (Yes/No/Abort) ?
+        </programlisting>
+
+        <para>Answering <emphasis role="bold">Yes</emphasis> means the newly calculated CRC value will be written
+          to the new file. <emphasis role="bold">No</emphasis> means that the original CRC value will be copied over.
+          <emphasis role="bold">Abort</emphasis> aborts the entire operation and exits.
+          (In this case the ".fixed" file is not deleted and is left in a half-complete state: it contains only the entries which
+          have already been processed, or only the header if the operation was aborted at the first entry.)</para>
+
+        <programlisting>
+          $ bin/zkTxnLogToolkit.sh -r log.100000001
+          ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
+          CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
+          Would you like to fix it (Yes/No/Abort) ? y
+          EOF reached after 6 txns.
+          Recovery file log.100000001.fixed has been written with 1 fixed CRC error(s)
+        </programlisting>
+
+        <para>The default behaviour of recovery is to be silent: only entries with a CRC error are printed to the screen.
+          One can turn on verbose mode with the -v,--verbose parameter to see all records.
+          Interactive mode can be turned off with the -y,--yes parameter. In this case all CRC errors will be fixed
+          in the new transaction file.</para>
+      </section>
+    </section>
+
+    <section id="sc_commonProblems">
+      <title>Things to Avoid</title>
+
+      <para>Here are some common problems you can avoid by configuring
+      ZooKeeper correctly:</para>
+
+      <variablelist>
+        <varlistentry>
+          <term>inconsistent lists of servers</term>
+
+          <listitem>
+            <para>The list of ZooKeeper servers used by the clients must match
+            the list of ZooKeeper servers that each ZooKeeper server has.
+            Things work okay if the client list is a subset of the real list,
+            but things will behave strangely if clients have a list of
+            ZooKeeper servers that are in different ZooKeeper clusters. Also,
+            the server lists in each ZooKeeper server configuration file
+            should be consistent with one another.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>incorrect placement of transaction log</term>
+
+          <listitem>
+            <para>The most performance critical part of ZooKeeper is the
+            transaction log. ZooKeeper syncs transactions to media before it
+            returns a response. A dedicated transaction log device is key to
+            consistent good performance. Putting the log on a busy device will
+            adversely affect performance. If you only have one storage device,
+            put trace files on NFS and increase the snapCount; it doesn't
+            eliminate the problem, but it should mitigate it.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>incorrect Java heap size</term>
+
+          <listitem>
+            <para>You should take special care to set your Java max heap size
+            correctly. In particular, you should not create a situation in
+            which ZooKeeper swaps to disk. The disk is death to ZooKeeper.
+            Everything is ordered, so if processing one request swaps to
+            disk, all other queued requests will probably do the same.
+            DON'T SWAP.</para>
+
+            <para>Be conservative in your estimates: if you have 4G of RAM, do
+            not set the Java max heap size to 6G or even 4G. For example, it
+            is more likely you would use a 3G heap for a 4G machine, as the
+            operating system and the cache also need memory. The best and only
+            recommended practice for estimating the heap size your system needs
+            is to run load tests, and then make sure you are well below the
+            usage limit that would cause the system to swap.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>Publicly accessible deployment</term>
+          <listitem>
+            <para>
+              A ZooKeeper ensemble is expected to operate in a trusted computing environment.
+              It is thus recommended to deploy ZooKeeper behind a firewall.
+            </para>
+          </listitem>
+        </varlistentry>
+      </variablelist>
+    </section>
+
+    <section id="sc_bestPractices">
+      <title>Best Practices</title>
+
+      <para>For best results, take note of the following list of good
+      ZooKeeper practices:</para>
+
+
+      <para>For multi-tenant installations see the <ulink
+      url="zookeeperProgrammers.html#ch_zkSessions">section</ulink>
+      detailing ZooKeeper "chroot" support; this can be very useful
+      when deploying many applications/services interfacing to a
+      single ZooKeeper cluster.</para>
+
+    </section>
+  </section>
+</article>

http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml
new file mode 100644
index 0000000..f71c4a8
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml
@@ -0,0 +1,75 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Copyright 2002-2004 The Apache Software Foundation
+
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="zk_HierarchicalQuorums">
+  <title>Introduction to hierarchical quorums</title>
+
+  <articleinfo>
+    <legalnotice>
+      <para>Licensed under the Apache License, Version 2.0 (the "License");
+      you may not use this file except in compliance with the License. You may
+      obtain a copy of the License at <ulink
+      url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+      <para>Unless required by applicable law or agreed to in writing,
+      software distributed under the License is distributed on an "AS IS"
+      BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied. See the License for the specific language governing permissions
+      and limitations under the License.</para>
+    </legalnotice>
+
+    <abstract>
+      <para>This document contains information about hierarchical quorums.</para>
+    </abstract>
+  </articleinfo>
+
+    <para>
+    This document gives an example of how to use hierarchical quorums. The basic idea is
+    very simple. First, we split servers into groups, and add a line for each group listing
+    the servers that form this group. Next we have to assign a weight to each server.  
+    </para>
+    
+    <para>
+    The following example shows how to configure a system with three groups of three servers
+    each, assigning a weight of 1 to each server:
+    </para>
+    
+    <programlisting>
+    group.1=1:2:3
+    group.2=4:5:6
+    group.3=7:8:9
+   
+    weight.1=1
+    weight.2=1
+    weight.3=1
+    weight.4=1
+    weight.5=1
+    weight.6=1
+    weight.7=1
+    weight.8=1
+    weight.9=1
+ 	</programlisting>
+
+	<para>    
+    When running the system, we are able to form a quorum once we have a majority of votes from
+    a majority of non-zero-weight groups. Groups that have zero weight are discarded and not
+    considered when forming quorums. Looking at the example, we are able to form a quorum once
+    we have votes from at least two servers from each of two different groups.
+    </para> 
+ </article>
\ No newline at end of file