Posted to commits@zookeeper.apache.org by an...@apache.org on 2018/07/04 13:11:21 UTC
[01/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Repository: zookeeper
Updated Branches:
refs/heads/branch-3.4 4a8cceb93 -> c1efa954d
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperStarted.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperStarted.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperStarted.xml
new file mode 100644
index 0000000..70c227f
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperStarted.xml
@@ -0,0 +1,418 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_GettStartedGuide">
+ <title>ZooKeeper Getting Started Guide</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This guide contains detailed information about creating
+ distributed applications that use ZooKeeper. It discusses the basic
+ operations ZooKeeper supports, and how these can be used to build
+ higher-level abstractions. It contains solutions to common tasks, a
+ troubleshooting guide, and links to other information.</para>
+ </abstract>
+ </articleinfo>
+
+ <section id="ch_GettingStarted">
+ <title>Getting Started: Coordinating Distributed Applications with
+ ZooKeeper</title>
+
+ <para>This document contains information to get you started quickly with
+ ZooKeeper. It is aimed primarily at developers hoping to try it out, and
+ contains simple installation instructions for a single ZooKeeper server, a
+ few commands to verify that it is running, and a simple programming
+ example. Finally, as a convenience, there are a few sections regarding
+ more complicated installations, for example running replicated
+ deployments, and optimizing the transaction log. However, for complete
+ instructions for commercial deployments, please refer to the <ulink
+ url="zookeeperAdmin.html">ZooKeeper
+ Administrator's Guide</ulink>.</para>
+
+ <section id="sc_Prerequisites">
+ <title>Pre-requisites</title>
+
+ <para>See <ulink url="zookeeperAdmin.html#sc_systemReq">
+ System Requirements</ulink> in the Admin guide.</para>
+ </section>
+
+ <section id="sc_Download">
+ <title>Download</title>
+
+ <para>To get a ZooKeeper distribution, download a recent
+ <ulink url="http://zookeeper.apache.org/releases.html">
+ stable</ulink> release from one of the Apache Download
+ Mirrors.</para>
+ </section>
+
+ <section id="sc_InstallingSingleMode">
+ <title>Standalone Operation</title>
+
+ <para>Setting up a ZooKeeper server in standalone mode is
+ straightforward. The server is contained in a single JAR file,
+ so installation consists of creating a configuration.</para>
+
+ <para>Once you've downloaded a stable ZooKeeper release, unpack
+ it and cd to its root directory.</para>
+
+ <para>To start ZooKeeper you need a configuration file. Here is a sample;
+ create it in <emphasis role="bold">conf/zoo.cfg</emphasis>:</para>
+
+<programlisting>
+tickTime=2000
+dataDir=/var/lib/zookeeper
+clientPort=2181
+</programlisting>
+
+ <para>This file can be called anything, but for the sake of this
+ discussion call
+ it <emphasis role="bold">conf/zoo.cfg</emphasis>. Change the
+ value of <emphasis role="bold">dataDir</emphasis> to specify an
+ existing (empty to start with) directory. Here are the meanings
+ for each of the fields:</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><emphasis role="bold">tickTime</emphasis></term>
+
+ <listitem>
+ <para>the basic time unit in milliseconds used by ZooKeeper. It is
+ used to do heartbeats and the minimum session timeout will be
+ twice the tickTime.</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <variablelist>
+ <varlistentry>
+ <term><emphasis role="bold">dataDir</emphasis></term>
+
+ <listitem>
+ <para>the location to store the in-memory database snapshots and,
+ unless specified otherwise, the transaction log of updates to the
+ database.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><emphasis role="bold">clientPort</emphasis></term>
+
+ <listitem>
+ <para>the port on which to listen for client connections</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
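+ <para>As a rough illustration of how these fields fit together, the
+ following Java sketch reads a configuration in the format shown above
+ with java.util.Properties and derives the minimum session timeout as
+ twice the tickTime. The class and method names are hypothetical and are
+ not part of ZooKeeper; the server does its own configuration
+ parsing.</para>

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the ZooKeeper API.
class ZooCfgSketch {

    // Parses key=value lines such as those in conf/zoo.cfg.
    static Properties parse(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // The minimum session timeout is twice the tickTime (both in milliseconds).
    static int minSessionTimeoutMs(Properties props) {
        int tickTime = Integer.parseInt(props.getProperty("tickTime"));
        return 2 * tickTime;
    }

    public static void main(String[] args) {
        String cfg = "tickTime=2000\n"
                + "dataDir=/var/lib/zookeeper\n"
                + "clientPort=2181\n";
        Properties props = parse(cfg);
        System.out.println("dataDir = " + props.getProperty("dataDir"));
        System.out.println("clientPort = " + props.getProperty("clientPort"));
        System.out.println("minimum session timeout (ms) = " + minSessionTimeoutMs(props));
    }
}
```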
+
+ <para>Now that you have created the configuration file, you can start
+ ZooKeeper:</para>
+
+ <programlisting>bin/zkServer.sh start</programlisting>
+
+ <para>ZooKeeper logs messages using log4j -- more detail is
+ available in the
+ <ulink url="zookeeperProgrammers.html#Logging">Logging</ulink>
+ section of the Programmer's Guide. You will see log messages
+ coming to the console (default) and/or a log file depending on
+ the log4j configuration.</para>
+
+ <para>The steps outlined here run ZooKeeper in standalone mode. There is
+ no replication, so if the ZooKeeper process fails, the service will go down.
+ This is fine for most development situations, but to run ZooKeeper in
+ replicated mode, please see <ulink
+ url="#sc_RunningReplicatedZooKeeper">Running Replicated
+ ZooKeeper</ulink>.</para>
+ </section>
+
+ <section id="sc_FileManagement">
+ <title>Managing ZooKeeper Storage</title>
+ <para>For long-running production systems, ZooKeeper storage must
+ be managed externally (dataDir and logs). See the section on
+ <ulink
+ url="zookeeperAdmin.html#sc_maintenance">maintenance</ulink> for
+ more details.</para>
+ </section>
+
+ <section id="sc_ConnectingToZooKeeper">
+ <title>Connecting to ZooKeeper</title>
+
+ <programlisting>$ bin/zkCli.sh -server 127.0.0.1:2181</programlisting>
+
+ <para>This lets you perform simple, file-like operations.</para>
+
+ <para>Once you have connected, you should see something like:
+ </para>
+ <programlisting>
+<![CDATA[
+Connecting to localhost:2181
+log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
+log4j:WARN Please initialize the log4j system properly.
+Welcome to ZooKeeper!
+JLine support is enabled
+[zkshell: 0]
+]]> </programlisting>
+ <para>
+ From the shell, type <command>help</command> to get a listing of commands that can be executed from the client, as in:
+ </para>
+ <programlisting>
+<![CDATA[
+[zkshell: 0] help
+ZooKeeper host:port cmd args
+ get path [watch]
+ ls path [watch]
+ set path data [version]
+ delquota [-n|-b] path
+ quit
+ printwatches on|off
+ create path data acl
+ stat path [watch]
+ listquota path
+ history
+ setAcl path acl
+ getAcl path
+ sync path
+ redo cmdno
+ addauth scheme auth
+ delete path [version]
+ setquota -n|-b val path
+
+]]> </programlisting>
+ <para>From here, you can try a few simple commands to get a feel for this simple command line interface. First, start by issuing the list command, as
+ in <command>ls /</command>, yielding:
+ </para>
+ <programlisting>
+<![CDATA[
+[zkshell: 8] ls /
+[zookeeper]
+]]> </programlisting>
+ <para>Next, create a new znode by running <command>create /zk_test my_data</command>. This creates a new znode and associates the string "my_data" with the node.
+ You should see:</para>
+ <programlisting>
+<![CDATA[
+[zkshell: 9] create /zk_test my_data
+Created /zk_test
+]]> </programlisting>
+ <para> Issue another <command>ls /</command> command to see what the directory looks like:
+ </para>
+ <programlisting>
+<![CDATA[
+[zkshell: 11] ls /
+[zookeeper, zk_test]
+
+]]> </programlisting><para>
+ Notice that the zk_test directory has now been created.
+ </para>
+ <para>Next, verify that the data was associated with the znode by running the <command>get</command> command, as in:
+ </para>
+ <programlisting>
+<![CDATA[
+[zkshell: 12] get /zk_test
+my_data
+cZxid = 5
+ctime = Fri Jun 05 13:57:06 PDT 2009
+mZxid = 5
+mtime = Fri Jun 05 13:57:06 PDT 2009
+pZxid = 5
+cversion = 0
+dataVersion = 0
+aclVersion = 0
+ephemeralOwner = 0
+dataLength = 7
+numChildren = 0
+]]> </programlisting>
+ <para>We can change the data associated with zk_test by issuing the <command>set</command> command, as in:
+ </para>
+ <programlisting>
+<![CDATA[
+[zkshell: 14] set /zk_test junk
+cZxid = 5
+ctime = Fri Jun 05 13:57:06 PDT 2009
+mZxid = 6
+mtime = Fri Jun 05 14:01:52 PDT 2009
+pZxid = 5
+cversion = 0
+dataVersion = 1
+aclVersion = 0
+ephemeralOwner = 0
+dataLength = 4
+numChildren = 0
+[zkshell: 15] get /zk_test
+junk
+cZxid = 5
+ctime = Fri Jun 05 13:57:06 PDT 2009
+mZxid = 6
+mtime = Fri Jun 05 14:01:52 PDT 2009
+pZxid = 5
+cversion = 0
+dataVersion = 1
+aclVersion = 0
+ephemeralOwner = 0
+dataLength = 4
+numChildren = 0
+]]> </programlisting>
+ <para>
+ Notice that we did a <command>get</command> after setting the data and it did, indeed, change.</para>
+ <para>Finally, let's <command>delete</command> the node by issuing:
+ </para>
+ <programlisting>
+<![CDATA[
+[zkshell: 16] delete /zk_test
+[zkshell: 17] ls /
+[zookeeper]
+[zkshell: 18]
+]]></programlisting>
+ <para>That's it for now. To explore more, continue with the rest of this document and see the <ulink url="zookeeperProgrammers.html">Programmer's Guide</ulink>. </para>
+ </section>
+
+ <section id="sc_ProgrammingToZooKeeper">
+ <title>Programming to ZooKeeper</title>
+
+ <para>ZooKeeper has Java bindings and C bindings. They are
+ functionally equivalent. The C bindings exist in two variants: single
+ threaded and multi-threaded. These differ only in how the messaging loop
+ is done. For sample code using the different APIs, see the <ulink
+ url="zookeeperProgrammers.html#ch_programStructureWithExample">Programming
+ Examples in the ZooKeeper Programmer's Guide</ulink>.</para>
+ </section>
+
+ <section id="sc_RunningReplicatedZooKeeper">
+ <title>Running Replicated ZooKeeper</title>
+
+ <para>Running ZooKeeper in standalone mode is convenient for evaluation,
+ some development, and testing. But in production, you should run
+ ZooKeeper in replicated mode. A replicated group of servers in the same
+ application is called a <emphasis>quorum</emphasis>, and in replicated
+ mode, all servers in the quorum have copies of the same configuration
+ file.</para>
+ <note>
+ <para>
+ For replicated mode, a minimum of three servers are required,
+ and it is strongly recommended that you have an odd number of
+ servers. If you only have two servers, then you are in a
+ situation where if one of them fails, there are not enough
+ machines to form a majority quorum. Two servers is inherently
+ <emphasis role="bold">less</emphasis>
+ stable than a single server, because there are two single
+ points of failure.
+ </para>
+ </note>
+ <para>
+ The required
+ <emphasis role="bold">conf/zoo.cfg</emphasis>
+ file for replicated mode is similar to the one used in standalone
+ mode, but with a few differences. Here is an example:
+ </para>
+
+<programlisting>
+tickTime=2000
+dataDir=/var/lib/zookeeper
+clientPort=2181
+initLimit=5
+syncLimit=2
+server.1=zoo1:2888:3888
+server.2=zoo2:2888:3888
+server.3=zoo3:2888:3888
+</programlisting>
+
+ <para>The new entry, <emphasis role="bold">initLimit</emphasis>, is a
+ timeout ZooKeeper uses to limit the length of time the ZooKeeper
+ servers in the quorum have to connect to a leader. The entry <emphasis
+ role="bold">syncLimit</emphasis> limits how far out of date a server can
+ be from a leader.</para>
+
+ <para>With both of these timeouts, you specify the unit of time using
+ <emphasis role="bold">tickTime</emphasis>. In this example, the timeout
+ for initLimit is 5 ticks at 2000 milliseconds a tick, or 10
+ seconds.</para>
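+ <para>The arithmetic can be sketched in a few lines of Java. The class
+ and method names below are hypothetical; the values come from the
+ example configuration above.</para>

```java
// Hypothetical helper for illustration only; not part of the ZooKeeper API.
class TickTimeoutSketch {

    // Converts a limit expressed in ticks into milliseconds.
    static long ticksToMillis(int ticks, int tickTimeMs) {
        return (long) ticks * tickTimeMs;
    }

    public static void main(String[] args) {
        int tickTime = 2000; // milliseconds, from the example configuration
        System.out.println("initLimit timeout (ms): " + ticksToMillis(5, tickTime));
        System.out.println("syncLimit timeout (ms): " + ticksToMillis(2, tickTime));
    }
}
```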
+
+ <para>The entries of the form <emphasis>server.X</emphasis> list the
+ servers that make up the ZooKeeper service. When the server starts up,
+ it knows which server it is by looking for the file
+ <emphasis>myid</emphasis> in the data directory. That file
+ contains the server number, in ASCII.</para>
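+ <para>One way to create such a file is sketched below in Java. This is
+ only an illustration: the class name, method names, and the scratch
+ directory are assumptions, and in practice the file is often written by
+ hand or by a deployment script into the configured dataDir.</para>

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of the ZooKeeper API.
class MyIdSketch {

    // Writes the server number, in ASCII, to a file named myid inside dataDir.
    static Path writeMyId(Path dataDir, int serverId) {
        try {
            Files.createDirectories(dataDir);
            Path myid = dataDir.resolve("myid");
            Files.write(myid, Integer.toString(serverId).getBytes(StandardCharsets.US_ASCII));
            return myid;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the server number back from dataDir/myid.
    static int readMyId(Path dataDir) {
        try {
            byte[] raw = Files.readAllBytes(dataDir.resolve("myid"));
            return Integer.parseInt(new String(raw, StandardCharsets.US_ASCII).trim());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed scratch location; a real server uses its configured dataDir.
        Path dataDir = Paths.get(System.getProperty("java.io.tmpdir"), "zk-myid-sketch");
        writeMyId(dataDir, 1);
        System.out.println("myid = " + readMyId(dataDir));
    }
}
```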
+
+ <para>Finally, note the two port numbers after each server
+ name: "2888" and "3888". Peers use the former port to connect
+ to other peers. Such a connection is necessary so that peers
+ can communicate, for example, to agree upon the order of
+ updates. More specifically, a ZooKeeper server uses this port
+ to connect followers to the leader. When a new leader arises, a
+ follower opens a TCP connection to the leader using this
+ port. Because the default leader election also uses TCP, we
+ currently require another port for leader election. This is the
+ second port in the server entry.
+ </para>
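+ <para>To make the layout of a server entry concrete, the Java sketch
+ below splits one into its server number, host name, and two ports. The
+ class is hypothetical and handles only the
+ server.X=host:port:port form used in the example above.</para>

```java
// Hypothetical parser for illustration only; not part of the ZooKeeper API.
class ServerEntrySketch {
    final int id;
    final String host;
    final int quorumPort;   // first port: followers connect to the leader
    final int electionPort; // second port: leader election

    ServerEntrySketch(int id, String host, int quorumPort, int electionPort) {
        this.id = id;
        this.host = host;
        this.quorumPort = quorumPort;
        this.electionPort = electionPort;
    }

    // Parses a line such as "server.1=zoo1:2888:3888".
    static ServerEntrySketch parse(String line) {
        String[] keyValue = line.trim().split("=", 2);
        int id = Integer.parseInt(keyValue[0].substring("server.".length()));
        String[] parts = keyValue[1].split(":");
        return new ServerEntrySketch(id, parts[0],
                Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }

    public static void main(String[] args) {
        ServerEntrySketch s = parse("server.1=zoo1:2888:3888");
        System.out.println("server " + s.id + " on " + s.host
                + ", quorum port " + s.quorumPort
                + ", election port " + s.electionPort);
    }
}
```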
+
+ <note>
+ <para>If you want to test multiple servers on a single
+ machine, specify the server name
+ as <emphasis>localhost</emphasis> with unique quorum and
+ leader election ports (i.e. 2888:3888, 2889:3889, 2890:3890 in
+ the example above) for each server.X in that server's config
+ file. Of course separate <emphasis>dataDir</emphasis>s and
+ distinct <emphasis>clientPort</emphasis>s are also necessary
+ (in the above replicated example, running on a
+ single <emphasis>localhost</emphasis>, you would still have
+ three config files).</para>
+ <para>Please be aware that setting up multiple servers on a single
+ machine will not create any redundancy. If something were to
+ happen which caused the machine to die, all of the ZooKeeper
+ servers would be offline. Full redundancy requires that each
+ server have its own machine. It must be a completely separate
+ physical server. Multiple virtual machines on the same physical
+ host are still vulnerable to the complete failure of that host.</para>
+ </note>
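+ <para>As a sketch of what the three single-machine config files might
+ contain, the Java snippet below generates them as text. The class name,
+ the per-server dataDir paths, and the client ports 2181 to 2183 are
+ assumptions chosen for illustration; only the quorum and election port
+ pairs come from the note above.</para>

```java
// Hypothetical generator for illustration only; not part of ZooKeeper.
class LocalEnsembleSketch {

    // Builds the config text for server number id (1-based) of an n-server ensemble on localhost.
    static String configFor(int id, int n) {
        StringBuilder sb = new StringBuilder();
        sb.append("tickTime=2000\n");
        // Assumed layout: one data directory per server.
        sb.append("dataDir=/var/lib/zookeeper/server").append(id).append('\n');
        // Assumed client ports: 2181, 2182, 2183, ...
        sb.append("clientPort=").append(2180 + id).append('\n');
        sb.append("initLimit=5\n");
        sb.append("syncLimit=2\n");
        // Every config file lists all servers, each with its own port pair.
        for (int i = 1; n >= i; i++) {
            sb.append("server.").append(i).append("=localhost:")
              .append(2887 + i).append(':').append(3887 + i).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int id = 1; 3 >= id; id++) {
            System.out.println("# config for server " + id);
            System.out.println(configFor(id, 3));
        }
    }
}
```

+ <para>Each server would additionally need a myid file in its own
+ dataDir containing its server number.</para>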
+ </section>
+
+ <section>
+ <title>Other Optimizations</title>
+
+ <para>There are a couple of other configuration parameters that can
+ greatly increase performance:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>To get low latencies on updates it is important to
+ have a dedicated transaction log directory. By default
+ transaction logs are put in the same directory as the data
+ snapshots and <emphasis>myid</emphasis> file. The dataLogDir
+ parameter indicates a different directory to use for the
+ transaction logs.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis>[tbd: what is the other config param?]</emphasis></para>
+ </listitem>
+ </itemizedlist>
+ </section>
+ </section>
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperTutorial.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperTutorial.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperTutorial.xml
new file mode 100644
index 0000000..77cca8f
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperTutorial.xml
@@ -0,0 +1,712 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="ar_Tutorial">
+ <title>Programming with ZooKeeper - A basic tutorial</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This article contains sample Java code for simple implementations of barriers
+ and producer-consumer queues.</para>
+
+ </abstract>
+ </articleinfo>
+
+ <section id="ch_Introduction">
+ <title>Introduction</title>
+
+ <para>In this tutorial, we show simple implementations of barriers and
+ producer-consumer queues using ZooKeeper. We call the respective classes Barrier and Queue.
+ These examples assume that you have at least one ZooKeeper server running.</para>
+
+ <para>Both primitives use the following common excerpt of code:</para>
+
+ <programlisting>
+    // Shared by all primitives; created the first time one is instantiated.
+    static ZooKeeper zk = null;
+    static Integer mutex;
+
+    String root;
+
+    SyncPrimitive(String address) {
+        if (zk == null) {
+            try {
+                System.out.println("Starting ZK:");
+                zk = new ZooKeeper(address, 3000, this);
+                mutex = new Integer(-1);
+                System.out.println("Finished starting ZK: " + zk);
+            } catch (IOException e) {
+                System.out.println(e.toString());
+                zk = null;
+            }
+        }
+    }
+
+    // Called on watch notifications; wakes up a thread waiting on mutex.
+    synchronized public void process(WatchedEvent event) {
+        synchronized (mutex) {
+            mutex.notify();
+        }
+    }
+</programlisting>
+
+<para>Both classes extend SyncPrimitive. In this way, we execute steps that are
+common to all primitives in the constructor of SyncPrimitive. To keep the examples
+simple, we create a ZooKeeper object the first time we instantiate either a barrier
+object or a queue object, and we declare a static variable that is a reference
+to this object. The subsequent instances of Barrier and Queue check whether a
+ZooKeeper object exists. Alternatively, we could have the application create a
+ZooKeeper object and pass it to the constructors of Barrier and Queue.</para>
+<para>
+We use the process() method to process notifications triggered due to watches.
+In the following discussion, we present code that sets watches. A watch is an internal
+structure that enables ZooKeeper to notify a client of a change to a node. For example,
+if a client is waiting for other clients to leave a barrier, then it can set a watch and
+wait for modifications to a particular node, which can indicate that it is the end of the wait.
+This point becomes clear once we go over the examples.
+</para>
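+<para>
+The wait-and-notify handshake between process() and a waiting thread can be tried
+out without a ZooKeeper server. The following sketch is a hypothetical stand-in, not
+the tutorial's code: it uses a plain Object as the monitor and adds a boolean flag
+(an assumption of this sketch) so that a notification arriving before the wait is
+not lost.
+</para>

```java
// Hypothetical stand-in for the mutex/notify pattern; no ZooKeeper classes are used.
class WatchNotifySketch {
    private final Object mutex = new Object();
    private boolean changed = false; // added here so an early notification is not missed

    // Plays the role of process(): called when a watch would fire.
    void onEvent() {
        synchronized (mutex) {
            changed = true;
            mutex.notify();
        }
    }

    // Blocks until onEvent() has run; returns false if the thread is interrupted.
    boolean awaitChange() {
        synchronized (mutex) {
            try {
                while (!changed) {
                    mutex.wait();
                }
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        WatchNotifySketch sketch = new WatchNotifySketch();
        // A second thread stands in for the ZooKeeper event thread.
        new Thread(sketch::onEvent).start();
        System.out.println("notified: " + sketch.awaitChange());
    }
}
```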
+</section>
+
+ <section id="sc_barriers"><title>Barriers</title>
+
+ <para>
+ A barrier is a primitive that enables a group of processes to synchronize the
+ beginning and the end of a computation. The general idea of this implementation
+ is to have a barrier node that serves the purpose of being a parent for individual
+ process nodes. Suppose that we call the barrier node "/b1". Each process "p" then
+ creates a node "/b1/p". Once enough processes have created their corresponding
+ nodes, joined processes can start the computation.
+ </para>
+
+ <para>In this example, each process instantiates a Barrier object, and its constructor takes as parameters:</para>
+
+ <itemizedlist><listitem><para>the address of a ZooKeeper server (e.g., "zoo1.foo.com:2181")</para></listitem>
+<listitem><para>the path of the barrier node on ZooKeeper (e.g., "/b1")</para></listitem>
+<listitem><para>the size of the group of processes</para></listitem>
+</itemizedlist>
+
+<para>The constructor of Barrier passes the address of the ZooKeeper server to the
+constructor of the parent class. The parent class creates a ZooKeeper instance if
+one does not exist. The constructor of Barrier then creates a
+barrier node on ZooKeeper, which is the parent node of all process nodes and
+which we call root (<emphasis role="bold">Note:</emphasis> This is not the ZooKeeper root "/").</para>
+
+<programlisting>
+    /**
+     * Barrier constructor
+     *
+     * @param address address of a ZooKeeper server
+     * @param root path of the barrier node
+     * @param size size of the group of processes
+     */
+    Barrier(String address, String root, int size) {
+        super(address);
+        this.root = root;
+        this.size = size;
+
+        // Create barrier node
+        if (zk != null) {
+            try {
+                Stat s = zk.exists(root, false);
+                if (s == null) {
+                    zk.create(root, new byte[0], Ids.OPEN_ACL_UNSAFE,
+                            CreateMode.PERSISTENT);
+                }
+            } catch (KeeperException e) {
+                System.out
+                        .println("Keeper exception when instantiating barrier: "
+                                + e.toString());
+            } catch (InterruptedException e) {
+                System.out.println("Interrupted exception");
+            }
+        }
+
+        // My node name
+        try {
+            name = InetAddress.getLocalHost().getCanonicalHostName();
+        } catch (UnknownHostException e) {
+            System.out.println(e.toString());
+        }
+    }
+</programlisting>
+<para>
+To enter the barrier, a process calls enter(). The process creates a node under
+the root to represent it, using its host name to form the node name. It then waits
+until enough processes have entered the barrier. A process does this by checking
+the number of children the root node has with "getChildren()", and waiting for
+notifications in the case it does not have enough. To receive a notification when
+there is a change to the root node, a process has to set a watch, and does it
+through the call to "getChildren()". In the code, we have that "getChildren()"
+has two parameters. The first one states the node to read from, and the second is
+a boolean flag that enables the process to set a watch. In the code the flag is true.
+</para>
+
+<programlisting>
+    /**
+     * Join barrier
+     *
+     * @return
+     * @throws KeeperException
+     * @throws InterruptedException
+     */
+    boolean enter() throws KeeperException, InterruptedException {
+        // Plain EPHEMERAL (no sequence suffix), so that leave() can
+        // delete the node under the same path.
+        zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
+                CreateMode.EPHEMERAL);
+        while (true) {
+            synchronized (mutex) {
+                // The second argument sets a watch on the root node.
+                List<String> list = zk.getChildren(root, true);
+
+                if (list.size() < size) {
+                    mutex.wait();
+                } else {
+                    return true;
+                }
+            }
+        }
+    }
+</programlisting>
+<para>
+Note that enter() throws both KeeperException and InterruptedException, so it is
+the responsibility of the application to catch and handle such exceptions.</para>
+
+<para>
+Once the computation is finished, a process calls leave() to leave the barrier.
+First it deletes its corresponding node, and then it gets the children of the root
+node. If there is at least one child, then it waits for a notification (note
+that the second parameter of the call to getChildren() is true, meaning that
+ZooKeeper has to set a watch on the root node). Upon reception of a notification,
+it checks once more whether the root node has any children.</para>
+
+<programlisting>
+    /**
+     * Wait until all reach barrier
+     *
+     * @return
+     * @throws KeeperException
+     * @throws InterruptedException
+     */
+    boolean leave() throws KeeperException, InterruptedException {
+        zk.delete(root + "/" + name, 0);
+        while (true) {
+            synchronized (mutex) {
+                List<String> list = zk.getChildren(root, true);
+                if (list.size() > 0) {
+                    mutex.wait();
+                } else {
+                    return true;
+                }
+            }
+        }
+    }
+}
+</programlisting>
+</section>
+<section id="sc_producerConsumerQueues"><title>Producer-Consumer Queues</title>
+<para>
+A producer-consumer queue is a distributed data structure that a group of processes
+use to generate and consume items. Producer processes create new elements and add
+them to the queue. Consumer processes remove elements from the queue, and process them.
+In this implementation, the elements are simple integers. The queue is represented
+by a root node, and to add an element to the queue, a producer process creates a new node,
+a child of the root node.
+</para>
+
+<para>
+The following excerpt of code corresponds to the constructor of the object. As
+with Barrier objects, it first calls the constructor of the parent class, SyncPrimitive,
+that creates a ZooKeeper object if one doesn't exist. It then verifies if the root
+node of the queue exists, and creates it if it doesn't.
+</para>
+<programlisting>
+    /**
+     * Constructor of producer-consumer queue
+     *
+     * @param address address of a ZooKeeper server
+     * @param name path of the root node of the queue
+     */
+    Queue(String address, String name) {
+        super(address);
+        this.root = name;
+        // Create ZK node name
+        if (zk != null) {
+            try {
+                Stat s = zk.exists(root, false);
+                if (s == null) {
+                    zk.create(root, new byte[0], Ids.OPEN_ACL_UNSAFE,
+                            CreateMode.PERSISTENT);
+                }
+            } catch (KeeperException e) {
+                System.out
+                        .println("Keeper exception when instantiating queue: "
+                                + e.toString());
+            } catch (InterruptedException e) {
+                System.out.println("Interrupted exception");
+            }
+        }
+    }
+</programlisting>
+
+<para>
+A producer process calls "produce()" to add an element to the queue, and passes
+an integer as an argument. To add an element to the queue, the method creates a
+new node using "create()", and uses the sequential create mode
+(CreateMode.PERSISTENT_SEQUENTIAL) to instruct ZooKeeper to append the value of the
+sequence counter associated with the root node. In this way,
+we impose a total order on the elements of the queue, thus guaranteeing that the
+oldest element of the queue is the next one consumed.
+</para>
+
+<programlisting>
+    /**
+     * Add element to the queue.
+     *
+     * @param i
+     * @return
+     */
+    boolean produce(int i) throws KeeperException, InterruptedException {
+        ByteBuffer b = ByteBuffer.allocate(4);
+        byte[] value;
+
+        // Add child with value i
+        b.putInt(i);
+        value = b.array();
+        zk.create(root + "/element", value, Ids.OPEN_ACL_UNSAFE,
+                CreateMode.PERSISTENT_SEQUENTIAL);
+
+        return true;
+    }
+</programlisting>
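+<para>
+The element payload is just the integer packed into four bytes with ByteBuffer, and
+the consumer reverses the step with ByteBuffer.wrap(). The round trip can be checked
+in isolation; the class and method names in this sketch are hypothetical.
+</para>

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; it mirrors the encoding used by produce().
class IntPayloadSketch {

    // Packs an int into a 4-byte array (ByteBuffer defaults to big-endian order).
    static byte[] encode(int i) {
        ByteBuffer b = ByteBuffer.allocate(4);
        b.putInt(i);
        return b.array();
    }

    // Reads the int back from the 4-byte payload.
    static int decode(byte[] data) {
        return ByteBuffer.wrap(data).getInt();
    }

    public static void main(String[] args) {
        byte[] payload = encode(42);
        System.out.println("payload length: " + payload.length);
        System.out.println("decoded value: " + decode(payload));
    }
}
```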
+<para>
+To consume an element, a consumer process obtains the children of the root node,
+reads the node with the smallest counter value, deletes it, and returns the element. Note that
+if there is a conflict, then one of the two contending processes won't be able to
+delete the node and the delete operation will throw an exception.</para>
+
+<para>
+A call to getChildren() returns the list of children in no guaranteed order,
+so we need to decide which element is the smallest. To find the one with
+the smallest counter value, we traverse the list, remove the prefix "element"
+from each name, and compare the remaining counter values numerically.</para>
+
+<programlisting>
+    /**
+     * Remove first element from the queue.
+     *
+     * @return
+     * @throws KeeperException
+     * @throws InterruptedException
+     */
+    int consume() throws KeeperException, InterruptedException {
+        int retvalue = -1;
+        Stat stat = null;
+
+        // Get the first element available
+        while (true) {
+            synchronized (mutex) {
+                List<String> list = zk.getChildren(root, true);
+                if (list.size() == 0) {
+                    System.out.println("Going to wait");
+                    mutex.wait();
+                } else {
+                    // Child names are "element" plus a zero-padded counter, so
+                    // keep the full name of the smallest child for the path.
+                    String minNode = list.get(0);
+                    int min = Integer.parseInt(minNode.substring(7));
+                    for (String s : list) {
+                        int tempValue = Integer.parseInt(s.substring(7));
+                        if (tempValue < min) {
+                            min = tempValue;
+                            minNode = s;
+                        }
+                    }
+                    System.out.println("Temporary value: " + root + "/" + minNode);
+                    byte[] b = zk.getData(root + "/" + minNode, false, stat);
+                    zk.delete(root + "/" + minNode, 0);
+                    ByteBuffer buffer = ByteBuffer.wrap(b);
+                    retvalue = buffer.getInt();
+
+                    return retvalue;
+                }
+            }
+        }
+    }
+}
+</programlisting>
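+<para>
+The selection of the smallest element can be exercised without a server. The sketch
+below is hypothetical: it takes the child names as plain strings (the tutorial code
+gets them from getChildren()) and assumes names of the form prefix plus zero-padded
+counter. It returns the full child name, since that is what is needed to build the
+node path.
+</para>

```java
// Hypothetical helper for illustration only; not part of the tutorial's classes.
class SmallestChildSketch {

    // Returns the child whose numeric suffix after the prefix is smallest,
    // or null if no children are given.
    static String smallest(String prefix, String... children) {
        String minNode = null;
        int min = 0;
        for (String child : children) {
            int counter = Integer.parseInt(child.substring(prefix.length()));
            if (minNode == null || min > counter) {
                min = counter;
                minNode = child;
            }
        }
        return minNode;
    }

    public static void main(String[] args) {
        String first = smallest("element",
                "element0000000010", "element0000000002", "element0000000003");
        System.out.println("next element to consume: " + first);
    }
}
```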
+
+</section>
+
+<section>
+<title>Complete example</title>
+<para>
+In the following section you can find a complete command line application to demonstrate the above mentioned
+recipes. Use the following command to run it.
+</para>
+<programlisting>
+ZOOBINDIR="[path_to_distro]/bin"
+. "$ZOOBINDIR"/zkEnv.sh
+java SyncPrimitive [Test Type] [ZK server] [No of elements] [Client type]
+</programlisting>
+
+<section>
+<title>Queue test</title>
+<para>Start a producer to create 100 elements</para>
+<programlisting>
+java SyncPrimitive qTest localhost 100 p
+</programlisting>
+
+<para>Start a consumer to consume 100 elements</para>
+<programlisting>
+java SyncPrimitive qTest localhost 100 c
+</programlisting>
+</section>
+
+<section>
+<title>Barrier test</title>
+<para>Start a barrier with 2 participants (run the command once for each participant you'd like to enter)</para>
+<programlisting>
+java SyncPrimitive bTest localhost 2
+</programlisting>
+</section>
+
+<section id="sc_sourceListing"><title>Source Listing</title>
+<example id="eg_SyncPrimitive_java">
+<title>SyncPrimitive.java</title>
+<programlisting>
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.nio.ByteBuffer;
+import java.util.List;
+import java.util.Random;
+
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.KeeperException;
+import org.apache.zookeeper.WatchedEvent;
+import org.apache.zookeeper.Watcher;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.data.Stat;
+
+public class SyncPrimitive implements Watcher {
+
+ static ZooKeeper zk = null;
+ static Integer mutex;
+
+ String root;
+
+ SyncPrimitive(String address) {
+ if(zk == null){
+ try {
+ System.out.println("Starting ZK:");
+ zk = new ZooKeeper(address, 3000, this);
+ mutex = new Integer(-1);
+ System.out.println("Finished starting ZK: " + zk);
+ } catch (IOException e) {
+ System.out.println(e.toString());
+ zk = null;
+ }
+ }
+ //else mutex = new Integer(-1);
+ }
+
+ synchronized public void process(WatchedEvent event) {
+ synchronized (mutex) {
+ //System.out.println("Process: " + event.getType());
+ mutex.notify();
+ }
+ }
+
+ /**
+ * Barrier
+ */
+ static public class Barrier extends SyncPrimitive {
+ int size;
+ String name;
+
+ /**
+ * Barrier constructor
+ *
+ * @param address
+ * @param root
+ * @param size
+ */
+ Barrier(String address, String root, int size) {
+ super(address);
+ this.root = root;
+ this.size = size;
+
+ // Create barrier node
+ if (zk != null) {
+ try {
+ Stat s = zk.exists(root, false);
+ if (s == null) {
+ zk.create(root, new byte[0], Ids.OPEN_ACL_UNSAFE,
+ CreateMode.PERSISTENT);
+ }
+ } catch (KeeperException e) {
+ System.out
+ .println("Keeper exception when instantiating barrier: "
+ + e.toString());
+ } catch (InterruptedException e) {
+ System.out.println("Interrupted exception");
+ }
+ }
+
+ // My node name
+ try {
+ name = InetAddress.getLocalHost().getCanonicalHostName();
+ } catch (UnknownHostException e) {
+ System.out.println(e.toString());
+ }
+
+ }
+
+ /**
+ * Join barrier
+ *
+ * @return
+ * @throws KeeperException
+ * @throws InterruptedException
+ */
+
+ boolean enter() throws KeeperException, InterruptedException{
+ zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
+ CreateMode.EPHEMERAL);
+ while (true) {
+ synchronized (mutex) {
+ List<String> list = zk.getChildren(root, true);
+
+ if (list.size() < size) {
+ mutex.wait();
+ } else {
+ return true;
+ }
+ }
+ }
+ }
+
+ /**
+ * Leave barrier: delete this node, then wait until all nodes are gone
+ *
+ * @return
+ * @throws KeeperException
+ * @throws InterruptedException
+ */
+
+ boolean leave() throws KeeperException, InterruptedException{
+ zk.delete(root + "/" + name, 0);
+ while (true) {
+ synchronized (mutex) {
+ List<String> list = zk.getChildren(root, true);
+ if (list.size() > 0) {
+ mutex.wait();
+ } else {
+ return true;
+ }
+ }
+ }
+ }
+ }
+
+ /**
+ * Producer-Consumer queue
+ */
+ static public class Queue extends SyncPrimitive {
+
+ /**
+ * Constructor of producer-consumer queue
+ *
+ * @param address
+ * @param name
+ */
+ Queue(String address, String name) {
+ super(address);
+ this.root = name;
+ // Create ZK node name
+ if (zk != null) {
+ try {
+ Stat s = zk.exists(root, false);
+ if (s == null) {
+ zk.create(root, new byte[0], Ids.OPEN_ACL_UNSAFE,
+ CreateMode.PERSISTENT);
+ }
+ } catch (KeeperException e) {
+ System.out
+ .println("Keeper exception when instantiating queue: "
+ + e.toString());
+ } catch (InterruptedException e) {
+ System.out.println("Interrupted exception");
+ }
+ }
+ }
+
+ /**
+ * Add element to the queue.
+ *
+ * @param i
+ * @return
+ */
+
+ boolean produce(int i) throws KeeperException, InterruptedException{
+ ByteBuffer b = ByteBuffer.allocate(4);
+ byte[] value;
+
+ // Add child with value i
+ b.putInt(i);
+ value = b.array();
+ zk.create(root + "/element", value, Ids.OPEN_ACL_UNSAFE,
+ CreateMode.PERSISTENT_SEQUENTIAL);
+
+ return true;
+ }
+
+
+ /**
+ * Remove first element from the queue.
+ *
+ * @return
+ * @throws KeeperException
+ * @throws InterruptedException
+ */
+ int consume() throws KeeperException, InterruptedException{
+ int retvalue = -1;
+ Stat stat = null;
+
+ // Get the first element available
+ while (true) {
+ synchronized (mutex) {
+ List<String> list = zk.getChildren(root, true);
+ if (list.size() == 0) {
+ System.out.println("Going to wait");
+ mutex.wait();
+ } else {
+ Integer min = Integer.valueOf(list.get(0).substring(7));
+ String minNode = list.get(0);
+ for(String s : list){
+ Integer tempValue = Integer.valueOf(s.substring(7));
+ //System.out.println("Temporary value: " + tempValue);
+ if(tempValue < min) {
+ min = tempValue;
+ minNode = s;
+ }
+ }
+ System.out.println("Temporary value: " + root + "/" + minNode);
+ byte[] b = zk.getData(root + "/" + minNode,
+ false, stat);
+ zk.delete(root + "/" + minNode, 0);
+ ByteBuffer buffer = ByteBuffer.wrap(b);
+ retvalue = buffer.getInt();
+
+ return retvalue;
+ }
+ }
+ }
+ }
+ }
+
+ public static void main(String args[]) {
+ if (args[0].equals("qTest"))
+ queueTest(args);
+ else
+ barrierTest(args);
+
+ }
+
+ public static void queueTest(String args[]) {
+ Queue q = new Queue(args[1], "/app1");
+
+ System.out.println("Input: " + args[1]);
+ int i;
+ int max = Integer.parseInt(args[2]);
+
+ if (args[3].equals("p")) {
+ System.out.println("Producer");
+ for (i = 0; i < max; i++)
+ try{
+ q.produce(10 + i);
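+ } catch (KeeperException e){
+ System.out.println("Keeper exception while producing: " + e.toString());
+ } catch (InterruptedException e){
+ System.out.println("Interrupted while producing");
+ }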
+ } else {
+ System.out.println("Consumer");
+
+ for (i = 0; i < max; i++) {
+ try{
+ int r = q.consume();
+ System.out.println("Item: " + r);
+ } catch (KeeperException e){
+ i--;
+ } catch (InterruptedException e){
+
+ }
+ }
+ }
+ }
+
+ public static void barrierTest(String args[]) {
+ Barrier b = new Barrier(args[1], "/b1", Integer.parseInt(args[2]));
+ try{
+ boolean flag = b.enter();
+ System.out.println("Entered barrier: " + args[2]);
+ if(!flag) System.out.println("Error when entering the barrier");
+ } catch (KeeperException e){
+
+ } catch (InterruptedException e){
+
+ }
+
+ // Generate random integer
+ Random rand = new Random();
+ int r = rand.nextInt(100);
+ // Loop for rand iterations
+ for (int i = 0; i < r; i++) {
+ try {
+ Thread.sleep(100);
+ } catch (InterruptedException e) {
+
+ }
+ }
+ try{
+ b.leave();
+ } catch (KeeperException e){
+
+ } catch (InterruptedException e){
+
+ }
+ System.out.println("Left barrier");
+ }
+}
+</programlisting></example>
+</section>
+</section>
+
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/2pc.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/2pc.jpg b/zookeeper-docs/src/documentation/resources/images/2pc.jpg
new file mode 100755
index 0000000..fe4488f
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/2pc.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/bk-overview.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/bk-overview.jpg b/zookeeper-docs/src/documentation/resources/images/bk-overview.jpg
new file mode 100644
index 0000000..6e12fb4
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/bk-overview.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/favicon.ico
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/favicon.ico b/zookeeper-docs/src/documentation/resources/images/favicon.ico
new file mode 100644
index 0000000..161bcf7
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/favicon.ico differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/hadoop-logo.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/hadoop-logo.jpg b/zookeeper-docs/src/documentation/resources/images/hadoop-logo.jpg
new file mode 100644
index 0000000..809525d
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/hadoop-logo.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/state_dia.dia
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/state_dia.dia b/zookeeper-docs/src/documentation/resources/images/state_dia.dia
new file mode 100755
index 0000000..4a58a00
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/state_dia.dia differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/state_dia.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/state_dia.jpg b/zookeeper-docs/src/documentation/resources/images/state_dia.jpg
new file mode 100755
index 0000000..b6f4a8b
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/state_dia.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/zkarch.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/zkarch.jpg b/zookeeper-docs/src/documentation/resources/images/zkarch.jpg
new file mode 100644
index 0000000..a0e5fcc
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/zkarch.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/zkcomponents.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/zkcomponents.jpg b/zookeeper-docs/src/documentation/resources/images/zkcomponents.jpg
new file mode 100644
index 0000000..7690578
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/zkcomponents.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/zknamespace.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/zknamespace.jpg b/zookeeper-docs/src/documentation/resources/images/zknamespace.jpg
new file mode 100644
index 0000000..05534bc
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/zknamespace.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/zkperfRW-3.2.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/zkperfRW-3.2.jpg b/zookeeper-docs/src/documentation/resources/images/zkperfRW-3.2.jpg
new file mode 100644
index 0000000..594b50b
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/zkperfRW-3.2.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/zkperfRW.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/zkperfRW.jpg b/zookeeper-docs/src/documentation/resources/images/zkperfRW.jpg
new file mode 100644
index 0000000..ad3019f
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/zkperfRW.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/zkperfreliability.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/zkperfreliability.jpg b/zookeeper-docs/src/documentation/resources/images/zkperfreliability.jpg
new file mode 100644
index 0000000..232bba8
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/zkperfreliability.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/zkservice.jpg
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/zkservice.jpg b/zookeeper-docs/src/documentation/resources/images/zkservice.jpg
new file mode 100644
index 0000000..1ec9154
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/zkservice.jpg differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/resources/images/zookeeper_small.gif
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/resources/images/zookeeper_small.gif b/zookeeper-docs/src/documentation/resources/images/zookeeper_small.gif
new file mode 100644
index 0000000..4e8014f
Binary files /dev/null and b/zookeeper-docs/src/documentation/resources/images/zookeeper_small.gif differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/skinconf.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/skinconf.xml b/zookeeper-docs/src/documentation/skinconf.xml
new file mode 100644
index 0000000..43f3a49
--- /dev/null
+++ b/zookeeper-docs/src/documentation/skinconf.xml
@@ -0,0 +1,360 @@
+<?xml version="1.0"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!--
+Skin configuration file. This file contains details of your project,
+which will be used to configure the chosen Forrest skin.
+-->
+
+<!DOCTYPE skinconfig PUBLIC "-//APACHE//DTD Skin Configuration V0.6-3//EN" "http://forrest.apache.org/dtd/skinconfig-v06-3.dtd">
+<skinconfig>
+ <!-- To enable lucene search add provider="lucene" (default is google).
+ Add box-location="alt" to move the search box to an alternate location
+ (if the skin supports it) and box-location="all" to show it in all
+ available locations on the page. Remove the <search> element to show
+ no search box. @domain will enable sitesearch for the specific domain with google.
+ In other words google will search the @domain for the query string.
+
+ -->
+ <search name="ZooKeeper" domain="zookeeper.apache.org" provider="google"/>
+
+ <!-- Disable the print link? If enabled, invalid HTML 4.0.1 -->
+ <disable-print-link>true</disable-print-link>
+ <!-- Disable the PDF link? -->
+ <disable-pdf-link>false</disable-pdf-link>
+ <!-- Disable the POD link? -->
+ <disable-pod-link>true</disable-pod-link>
+ <!-- Disable the Text link? FIXME: NOT YET IMPLEMENTED. -->
+ <disable-txt-link>true</disable-txt-link>
+ <!-- Disable the xml source link? -->
+ <!-- The xml source link makes it possible to access the xml rendition
+ of the source from the html page, and to have it generated statically.
+ This can be used to enable other sites and services to reuse the
+ xml format for their uses. Keep this disabled if you don't want other
+ sites to easily reuse your pages.-->
+ <disable-xml-link>true</disable-xml-link>
+
+ <!-- Disable navigation icons on all external links? -->
+ <disable-external-link-image>true</disable-external-link-image>
+
+ <!-- Disable w3c compliance links?
+ Use e.g. align="center" to move the compliance link logos to
+ an alternate location (if the skin supports it); the default
+ is left. -->
+ <disable-compliance-links>true</disable-compliance-links>
+
+ <!-- Render mailto: links unrecognisable by spam harvesters? -->
+ <obfuscate-mail-links>false</obfuscate-mail-links>
+
+ <!-- Disable the javascript facility to change the font size -->
+ <disable-font-script>true</disable-font-script>
+
+ <!-- project logo -->
+ <project-name>ZooKeeper</project-name>
+ <project-description>ZooKeeper: distributed coordination</project-description>
+ <project-url>http://zookeeper.apache.org/</project-url>
+ <project-logo>images/zookeeper_small.gif</project-logo>
+
+ <!-- group logo -->
+ <group-name>Hadoop</group-name>
+ <group-description>Apache Hadoop</group-description>
+ <group-url>http://hadoop.apache.org/</group-url>
+ <group-logo>images/hadoop-logo.jpg</group-logo>
+
+ <!-- optional host logo (e.g. sourceforge logo)
+ default skin: renders it at the bottom-left corner -->
+ <host-url></host-url>
+ <host-logo></host-logo>
+
+ <!-- relative url of a favicon file, normally favicon.ico -->
+ <favicon-url>images/favicon.ico</favicon-url>
+
+ <!-- The following are used to construct a copyright statement -->
+ <year></year>
+ <vendor>The Apache Software Foundation.</vendor>
+ <copyright-link>http://www.apache.org/licenses/</copyright-link>
+
+ <!-- Some skins use this to form a 'breadcrumb trail' of links.
+ Use location="alt" to move the trail to an alternate location
+ (if the skin supports it).
+ Omit the location attribute to display the trail in the default location.
+ Use location="none" to not display the trail (if the skin supports it).
+ For some skins just set the attributes to blank.
+ -->
+ <trail>
+ <link1 name="Apache" href="http://www.apache.org/"/>
+ <link2 name="ZooKeeper" href="http://zookeeper.apache.org/"/>
+ <link3 name="ZooKeeper" href="http://zookeeper.apache.org/"/>
+ </trail>
+
+ <!-- Configure the TOC, i.e. the Table of Contents.
+ @max-depth
+ how many "section" levels need to be included in the
+ generated Table of Contents (TOC).
+ @min-sections
+ Minimum required to create a TOC.
+ @location ("page","menu","page,menu", "none")
+ Where to show the TOC.
+ -->
+ <toc max-depth="2" min-sections="1" location="page"/>
+
+ <!-- Heading types can be clean|underlined|boxed -->
+ <headings type="clean"/>
+
+ <!-- The optional feedback element will be used to construct a
+ feedback link in the footer with the page pathname appended:
+ <a href="@href">{@to}</a>
+ <feedback to="webmaster@foo.com"
+ href="mailto:webmaster@foo.com?subject=Feedback " >
+ Send feedback about the website to:
+ </feedback>
+ -->
+ <!--
+ extra-css - here you can define custom css-elements that are
+ a. overriding the fallback elements or
+ b. adding the css definition from new elements that you may have
+ used in your documentation.
+ -->
+ <extra-css>
+ <!--Example of b.
+ To define the css definition of a new element that you may have used
+ in the class attribute of a <p> node.
+ e.g. <p class="quote"/>
+ -->
+ p.quote {
+ margin-left: 2em;
+ padding: .5em;
+ background-color: #f0f0f0;
+ font-family: monospace;
+ }
+
+ pre.code {
+ margin-left: 0em;
+ padding: 0.5em;
+ background-color: #f0f0f0;
+ font-family: monospace;
+ }
+
+<!-- patricks
+ .code {
+ font-family: "Courier New", Courier, monospace;
+ font-size: 110%;
+ }
+-->
+
+ </extra-css>
+
+ <colors>
+ <!-- These values are used for the generated CSS files. -->
+
+ <!-- Krysalis -->
+<!--
+ <color name="header" value="#FFFFFF"/>
+
+ <color name="tab-selected" value="#a5b6c6" link="#000000" vlink="#000000" hlink="#000000"/>
+ <color name="tab-unselected" value="#F7F7F7" link="#000000" vlink="#000000" hlink="#000000"/>
+ <color name="subtab-selected" value="#a5b6c6" link="#000000" vlink="#000000" hlink="#000000"/>
+ <color name="subtab-unselected" value="#a5b6c6" link="#000000" vlink="#000000" hlink="#000000"/>
+
+ <color name="heading" value="#a5b6c6"/>
+ <color name="subheading" value="#CFDCED"/>
+
+ <color name="navstrip" value="#CFDCED" font="#000000" link="#000000" vlink="#000000" hlink="#000000"/>
+ <color name="toolbox" value="#a5b6c6"/>
+ <color name="border" value="#a5b6c6"/>
+
+ <color name="menu" value="#F7F7F7" link="#000000" vlink="#000000" hlink="#000000"/>
+ <color name="dialog" value="#F7F7F7"/>
+
+ <color name="body" value="#ffffff" link="#0F3660" vlink="#009999" hlink="#000066"/>
+
+ <color name="table" value="#a5b6c6"/>
+ <color name="table-cell" value="#ffffff"/>
+ <color name="highlight" value="#ffff00"/>
+ <color name="fixme" value="#cc6600"/>
+ <color name="note" value="#006699"/>
+ <color name="warning" value="#990000"/>
+ <color name="code" value="#a5b6c6"/>
+
+ <color name="footer" value="#a5b6c6"/>
+-->
+
+ <!-- Forrest -->
+<!--
+ <color name="header" value="#294563"/>
+
+ <color name="tab-selected" value="#4a6d8c" link="#0F3660" vlink="#0F3660" hlink="#000066"/>
+ <color name="tab-unselected" value="#b5c7e7" link="#0F3660" vlink="#0F3660" hlink="#000066"/>
+ <color name="subtab-selected" value="#4a6d8c" link="#0F3660" vlink="#0F3660" hlink="#000066"/>
+ <color name="subtab-unselected" value="#4a6d8c" link="#0F3660" vlink="#0F3660" hlink="#000066"/>
+
+ <color name="heading" value="#294563"/>
+ <color name="subheading" value="#4a6d8c"/>
+
+ <color name="navstrip" value="#cedfef" font="#0F3660" link="#0F3660" vlink="#0F3660" hlink="#000066"/>
+ <color name="toolbox" value="#4a6d8c"/>
+ <color name="border" value="#294563"/>
+
+ <color name="menu" value="#4a6d8c" font="#cedfef" link="#ffffff" vlink="#ffffff" hlink="#ffcf00"/>
+ <color name="dialog" value="#4a6d8c"/>
+
+ <color name="body" value="#ffffff" link="#0F3660" vlink="#009999" hlink="#000066"/>
+
+ <color name="table" value="#7099C5"/>
+ <color name="table-cell" value="#f0f0ff"/>
+ <color name="highlight" value="#ffff00"/>
+ <color name="fixme" value="#cc6600"/>
+ <color name="note" value="#006699"/>
+ <color name="warning" value="#990000"/>
+ <color name="code" value="#CFDCED"/>
+
+ <color name="footer" value="#cedfef"/>
+-->
+
+ <!-- Collabnet -->
+<!--
+ <color name="header" value="#003366"/>
+
+ <color name="tab-selected" value="#dddddd" link="#555555" vlink="#555555" hlink="#555555"/>
+ <color name="tab-unselected" value="#999999" link="#ffffff" vlink="#ffffff" hlink="#ffffff"/>
+ <color name="subtab-selected" value="#cccccc" link="#000000" vlink="#000000" hlink="#000000"/>
+ <color name="subtab-unselected" value="#cccccc" link="#555555" vlink="#555555" hlink="#555555"/>
+
+ <color name="heading" value="#003366"/>
+ <color name="subheading" value="#888888"/>
+
+ <color name="navstrip" value="#dddddd" font="#555555"/>
+ <color name="toolbox" value="#dddddd" font="#555555"/>
+ <color name="border" value="#999999"/>
+
+ <color name="menu" value="#ffffff"/>
+ <color name="dialog" value="#eeeeee"/>
+
+ <color name="body" value="#ffffff"/>
+
+ <color name="table" value="#ccc"/>
+ <color name="table-cell" value="#ffffff"/>
+ <color name="highlight" value="#ffff00"/>
+ <color name="fixme" value="#cc6600"/>
+ <color name="note" value="#006699"/>
+ <color name="warning" value="#990000"/>
+ <color name="code" value="#003366"/>
+
+ <color name="footer" value="#ffffff"/>
+-->
+ <!-- Lenya using pelt-->
+<!--
+ <color name="header" value="#ffffff"/>
+
+ <color name="tab-selected" value="#4C6C8F" link="#ffffff" vlink="#ffffff" hlink="#ffffff"/>
+ <color name="tab-unselected" value="#E5E4D9" link="#000000" vlink="#000000" hlink="#000000"/>
+ <color name="subtab-selected" value="#000000" link="#000000" vlink="#000000" hlink="#000000"/>
+ <color name="subtab-unselected" value="#E5E4D9" link="#000000" vlink="#000000" hlink="#000000"/>
+
+ <color name="heading" value="#E5E4D9"/>
+ <color name="subheading" value="#000000"/>
+ <color name="published" value="#4C6C8F" font="#FFFFFF"/>
+ <color name="feedback" value="#4C6C8F" font="#FFFFFF" align="center"/>
+ <color name="navstrip" value="#E5E4D9" font="#000000"/>
+
+ <color name="toolbox" value="#CFDCED" font="#000000"/>
+
+ <color name="border" value="#999999"/>
+ <color name="menu" value="#4C6C8F" font="#ffffff" link="#ffffff" vlink="#ffffff" hlink="#ffffff" current="#FFCC33" />
+ <color name="menuheading" value="#cfdced" font="#000000" />
+ <color name="searchbox" value="#E5E4D9" font="#000000"/>
+
+ <color name="dialog" value="#CFDCED"/>
+ <color name="body" value="#ffffff" />
+
+ <color name="table" value="#ccc"/>
+ <color name="table-cell" value="#ffffff"/>
+ <color name="highlight" value="#ffff00"/>
+ <color name="fixme" value="#cc6600"/>
+ <color name="note" value="#006699"/>
+ <color name="warning" value="#990000"/>
+ <color name="code" value="#003366"/>
+
+ <color name="footer" value="#E5E4D9"/>
+-->
+ </colors>
+
+ <!-- Settings specific to PDF output. -->
+ <pdf>
+ <!--
+ Supported page sizes are a0, a1, a2, a3, a4, a5, executive,
+ folio, legal, ledger, letter, quarto, tabloid (default letter).
+ Supported page orientations are portrait, landscape (default
+ portrait).
+ Supported text alignments are left, right, justify (default left).
+ -->
+ <page size="letter" orientation="portrait" text-align="left"/>
+
+ <!--
+ Margins can be specified for top, bottom, inner, and outer
+ edges. If double-sided="false", the inner edge is always left
+ and the outer is always right. If double-sided="true", the
+ inner edge will be left on odd pages, right on even pages,
+ the outer edge vice versa.
+ Specified below are the default settings.
+ -->
+ <margins double-sided="false">
+ <top>1in</top>
+ <bottom>1in</bottom>
+ <inner>1.25in</inner>
+ <outer>1in</outer>
+ </margins>
+
+ <!--
+ Print the URL text next to all links going outside the file
+ -->
+ <show-external-urls>false</show-external-urls>
+
+ <!--
+ Disable the copyright footer on each page of the PDF.
+ A footer is composed for each page. By default, a "credit" with role=pdf
+ will be used, as explained below. Otherwise a copyright statement
+ will be generated. This latter can be disabled.
+ -->
+ <disable-copyright-footer>false</disable-copyright-footer>
+ </pdf>
+
+ <!-- Credits are typically rendered as a set of small clickable
+ images in the page footer.
+ Use box-location="alt" to move the credit to an alternate location
+ (if the skin supports it).
+ -->
+ <credits>
+ <credit box-location="alt">
+ <name>Built with Apache Forrest</name>
+ <url>http://forrest.apache.org/</url>
+ <image>images/built-with-forrest-button.png</image>
+ <width>88</width>
+ <height>31</height>
+ </credit>
+ <!-- A credit with @role="pdf" will be used to compose a footer
+ for each page in the PDF, using either "name" or "url" or both.
+ -->
+ <!--
+ <credit role="pdf">
+ <name>Built with Apache Forrest</name>
+ <url>http://forrest.apache.org/</url>
+ </credit>
+ -->
+ </credits>
+
+</skinconfig>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/status.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/status.xml b/zookeeper-docs/status.xml
new file mode 100644
index 0000000..3ac3fda
--- /dev/null
+++ b/zookeeper-docs/status.xml
@@ -0,0 +1,74 @@
+<?xml version="1.0"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<status>
+
+ <developers>
+ <person name="Joe Bloggs" email="joe@joescompany.org" id="JB" />
+ <!-- Add more people here -->
+ </developers>
+
+ <changes>
+ <!-- Add new releases here -->
+ <release version="0.1" date="unreleased">
+ <!-- Some action types have associated images. By default, images are
+ defined for 'add', 'fix', 'remove', 'update' and 'hack'. If you add
+ src/documentation/resources/images/<foo>.jpg images, these will
+ automatically be used for entries of type <foo>. -->
+
+ <action dev="JB" type="add" context="admin">
+ Initial Import
+ </action>
+ <!-- Sample action:
+ <action dev="JB" type="fix" due-to="Joe Contributor"
+ due-to-email="joec@apache.org" fixes-bug="123">
+ Fixed a bug in the Foo class.
+ </action>
+ -->
+ </release>
+ </changes>
+
+ <todo>
+ <actions priority="high">
+ <action context="docs" dev="JB">
+ Customize this template project with your project's details. This
+ TODO list is generated from 'status.xml'.
+ </action>
+ <action context="docs" dev="JB">
+ Add lots of content. XML content goes in
+ <code>src/documentation/content/xdocs</code>, or wherever the
+ <code>${project.xdocs-dir}</code> property (set in
+ <code>forrest.properties</code>) points.
+ </action>
+ <action context="feedback" dev="JB">
+ Mail <link
+ href="mailto:forrest-dev@xml.apache.org">forrest-dev@xml.apache.org</link>
+ with feedback.
+ </action>
+ </actions>
+ <!-- Add todo items. @context is an arbitrary string. Eg:
+ <actions priority="high">
+ <action context="code" dev="SN">
+ </action>
+ </actions>
+ <actions priority="medium">
+ <action context="docs" dev="open">
+ </action>
+ </actions>
+ -->
+ </todo>
+
+</status>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-it/.empty
----------------------------------------------------------------------
diff --git a/zookeeper-it/.empty b/zookeeper-it/.empty
new file mode 100644
index 0000000..e69de29
[04/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml
new file mode 100644
index 0000000..d88ddbd
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml
@@ -0,0 +1,1861 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_Admin">
+ <title>ZooKeeper Administrator's Guide</title>
+
+ <subtitle>A Guide to Deployment and Administration</subtitle>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This document contains information about deploying, administering
+ and maintaining ZooKeeper. It also discusses best practices and common
+ problems.</para>
+ </abstract>
+ </articleinfo>
+
+ <section id="ch_deployment">
+ <title>Deployment</title>
+
+ <para>This section contains information about deploying ZooKeeper and
+ covers these topics:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><xref linkend="sc_systemReq" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_zkMulitServerSetup" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_singleAndDevSetup" /></para>
+ </listitem>
+ </itemizedlist>
+
+ <para>The first two sections assume you are interested in installing
+ ZooKeeper in a production environment such as a datacenter. The final
+ section covers situations in which you are setting up ZooKeeper on a
+ limited basis - for evaluation, testing, or development - but not in a
+ production environment.</para>
+
+ <section id="sc_systemReq">
+ <title>System Requirements</title>
+
+ <section id="sc_supportedPlatforms">
+ <title>Supported Platforms</title>
+
+ <para>ZooKeeper consists of multiple components. Some components are
+ supported broadly, and other components are supported only on a smaller
+ set of platforms.</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><emphasis role="bold">Client</emphasis> is the Java client
+ library, used by applications to connect to a ZooKeeper ensemble.
+ </para>
+ </listitem>
+ <listitem>
+ <para><emphasis role="bold">Server</emphasis> is the Java server
+ that runs on the ZooKeeper ensemble nodes.</para>
+ </listitem>
+ <listitem>
+ <para><emphasis role="bold">Native Client</emphasis> is a client
+ implemented in C, similar to the Java client, used by applications
+ to connect to a ZooKeeper ensemble.</para>
+ </listitem>
+ <listitem>
+ <para><emphasis role="bold">Contrib</emphasis> refers to multiple
+ optional add-on components.</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>The following matrix describes the level of support committed for
+ running each component on different operating system platforms.</para>
+
+ <table>
+ <title>Support Matrix</title>
+ <tgroup cols="5" align="left" colsep="1" rowsep="1">
+ <thead>
+ <row>
+ <entry>Operating System</entry>
+ <entry>Client</entry>
+ <entry>Server</entry>
+ <entry>Native Client</entry>
+ <entry>Contrib</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>GNU/Linux</entry>
+ <entry>Development and Production</entry>
+ <entry>Development and Production</entry>
+ <entry>Development and Production</entry>
+ <entry>Development and Production</entry>
+ </row>
+ <row>
+ <entry>Solaris</entry>
+ <entry>Development and Production</entry>
+ <entry>Development and Production</entry>
+ <entry>Not Supported</entry>
+ <entry>Not Supported</entry>
+ </row>
+ <row>
+ <entry>FreeBSD</entry>
+ <entry>Development and Production</entry>
+ <entry>Development and Production</entry>
+ <entry>Not Supported</entry>
+ <entry>Not Supported</entry>
+ </row>
+ <row>
+ <entry>Windows</entry>
+ <entry>Development and Production</entry>
+ <entry>Development and Production</entry>
+ <entry>Not Supported</entry>
+ <entry>Not Supported</entry>
+ </row>
+ <row>
+ <entry>Mac OS X</entry>
+ <entry>Development Only</entry>
+ <entry>Development Only</entry>
+ <entry>Not Supported</entry>
+ <entry>Not Supported</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>For any operating system not explicitly mentioned as supported in
+ the matrix, components may or may not work. The ZooKeeper community
+ will fix obvious bugs that are reported for other platforms, but there
+ is no full support.</para>
+ </section>
+
+ <section id="sc_requiredSoftware">
+ <title>Required Software</title>
+
+ <para>ZooKeeper runs in Java, release 1.6 or greater (JDK 6 or
+ greater). It runs as an <emphasis>ensemble</emphasis> of
+ ZooKeeper servers. Three ZooKeeper servers is the minimum
+ recommended size for an ensemble, and we also recommend that
+ they run on separate machines. At Yahoo!, ZooKeeper is
+ usually deployed on dedicated RHEL boxes, with dual-core
+ processors, 2GB of RAM, and 80GB IDE hard drives.</para>
+ </section>
+
+ </section>
+
+ <section id="sc_zkMulitServerSetup">
+ <title>Clustered (Multi-Server) Setup</title>
+
+ <para>For reliable ZooKeeper service, you should deploy ZooKeeper in a
+ cluster known as an <emphasis>ensemble</emphasis>. As long as a majority
+ of the ensemble are up, the service will be available. Because ZooKeeper
+ requires a majority, it is best to use an
+ odd number of machines. For example, with four machines ZooKeeper can
+ only handle the failure of a single machine; if two machines fail, the
+ remaining two machines do not constitute a majority. However, with five
+ machines ZooKeeper can handle the failure of two machines. </para>
+ <note>
+ <para>
+ As mentioned in the
+ <ulink url="zookeeperStarted.html">ZooKeeper Getting Started Guide</ulink>,
+ a minimum of three servers is required for a fault-tolerant
+ clustered setup, and it is strongly recommended that you have an
+ odd number of servers.
+ </para>
+ <para>Usually three servers is more than enough for a production
+ install, but for maximum reliability during maintenance, you may
+ wish to install five servers. With three servers, if you perform
+ maintenance on one of them, you are vulnerable to a failure on one
+ of the other two servers during that maintenance. If you have five
+ of them running, you can take one down for maintenance, and know
+ that you're still OK if one of the other four suddenly fails.
+ </para>
+ <para>Your redundancy considerations should include all aspects of
+ your environment. If you have three ZooKeeper servers, but their
+ network cables are all plugged into the same network switch, then
+ the failure of that switch will take down your entire ensemble.
+ </para>
+ </note>
+ <para>Here are the steps to set up a server that will be part of an
+ ensemble. These steps should be performed on every host in the
+ ensemble:</para>
+
+ <orderedlist>
+ <listitem>
+ <para>Install the Java JDK. You can use the native packaging system
+ for your system, or download the JDK from:</para>
+
+ <para><ulink
+ url="http://java.sun.com/javase/downloads/index.jsp">http://java.sun.com/javase/downloads/index.jsp</ulink></para>
+ </listitem>
+
+ <listitem>
+ <para>Set the Java heap size. This is very important to avoid
+ swapping, which will seriously degrade ZooKeeper performance. To
+ determine the correct value, use load tests, and make sure you are
+ well below the usage limit that would cause you to swap. Be
+ conservative - use a maximum heap size of 3GB for a 4GB
+ machine.</para>
+ </listitem>
+
+ <listitem>
+ <para>Install the ZooKeeper Server Package. It can be downloaded
+ from:
+ </para>
+ <para>
+ <ulink url="http://zookeeper.apache.org/releases.html">
+ http://zookeeper.apache.org/releases.html
+ </ulink>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>Create a configuration file. This file can be called anything.
+ Use the following settings as a starting point:</para>
+
+ <programlisting>
+tickTime=2000
+dataDir=/var/lib/zookeeper/
+clientPort=2181
+initLimit=5
+syncLimit=2
+server.1=zoo1:2888:3888
+server.2=zoo2:2888:3888
+server.3=zoo3:2888:3888</programlisting>
+
+ <para>You can find the meanings of these and other configuration
+ settings in the section <xref linkend="sc_configuration" />. A word
+ though about a few here:</para>
+
+ <para>Every machine that is part of the ZooKeeper ensemble should know
+ about every other machine in the ensemble. You accomplish this with
+ the series of lines of the form <emphasis
+ role="bold">server.id=host:port:port</emphasis>. The parameters <emphasis
+ role="bold">host</emphasis> and <emphasis
+ role="bold">port</emphasis> are straightforward. You attribute the
+ server id to each machine by creating a file named
+ <filename>myid</filename>, one for each server, which resides in
+ that server's data directory, as specified by the configuration file
+ parameter <emphasis role="bold">dataDir</emphasis>.</para></listitem>
+
+ <listitem><para>The myid file
+ consists of a single line containing only the text of that machine's
+ id. So <filename>myid</filename> of server 1 would contain the text
+ "1" and nothing else. The id must be unique within the
+ ensemble and should have a value between 1 and 255.</para>
+ </listitem>
+
+ <listitem>
+ <para>Once your configuration file is set up, you can start a
+ ZooKeeper server:</para>
+
+ <para><computeroutput>$ java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf \
+ org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg
+ </computeroutput></para>
+
+ <para>QuorumPeerMain starts a ZooKeeper server.
+ <ulink url="http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/">JMX</ulink>
+ management beans are also registered, which allows
+ management through a JMX management console.
+ The <ulink url="zookeeperJMX.html">ZooKeeper JMX
+ document</ulink> contains details on managing ZooKeeper with JMX.
+ </para>
+
+ <para>See the script <emphasis>bin/zkServer.sh</emphasis>,
+ which is included in the release, for an example
+ of starting server instances.</para>
+
+ </listitem>
+
+ <listitem>
+ <para>Test your deployment by connecting to the hosts:</para>
+
+ <para>In Java, you can run the following command to execute
+ simple operations:</para>
+
+ <para><computeroutput>$ bin/zkCli.sh -server 127.0.0.1:2181</computeroutput></para>
+ </listitem>
+ </orderedlist>
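+
+ <para>The configuration and <filename>myid</filename> steps above can
+ be sketched as a small script. This is only an illustration under
+ stated assumptions: the helper names
+ (<emphasis>render_config</emphasis>,
+ <emphasis>write_server_files</emphasis>) are hypothetical, the host
+ names are the ones from the sample configuration, and a scratch
+ directory stands in for the real data directory. It writes one
+ configuration file and one <filename>myid</filename> file for a
+ single server.</para>
+
+ <programlisting>
```python
import os
import tempfile

# Hypothetical ensemble: server id -> host name (from the sample configuration).
SERVERS = {1: "zoo1", 2: "zoo2", 3: "zoo3"}


def render_config(data_dir, servers):
    """Return configuration text modelled on the sample settings above."""
    lines = [
        "tickTime=2000",
        "dataDir={}".format(data_dir),
        "clientPort=2181",
        "initLimit=5",
        "syncLimit=2",
    ]
    for server_id in sorted(servers):
        # One server.id=host:port:port line per ensemble member.
        lines.append("server.{}={}:2888:3888".format(server_id, servers[server_id]))
    return "\n".join(lines) + "\n"


def write_server_files(base_dir, my_id, servers):
    """Write zoo.cfg and the myid file for one server under base_dir."""
    if my_id not in servers:
        raise ValueError("id {} is not listed in the ensemble".format(my_id))
    if my_id not in range(1, 256):
        raise ValueError("id should have a value between 1 and 255")
    data_dir = os.path.join(base_dir, "data")
    os.makedirs(data_dir, exist_ok=True)
    config_path = os.path.join(base_dir, "zoo.cfg")
    with open(config_path, "w") as handle:
        handle.write(render_config(data_dir, servers))
    # The myid file holds only the id of this server, on a single line.
    with open(os.path.join(data_dir, "myid"), "w") as handle:
        handle.write("{}\n".format(my_id))
    return config_path


if __name__ == "__main__":
    # A scratch directory is used here instead of a real data directory.
    with tempfile.TemporaryDirectory() as scratch:
        path = write_server_files(scratch, 1, SERVERS)
        with open(path) as handle:
            print(handle.read())
```
+ </programlisting>
+
+ <para>Running the same helper with a different id on each host yields
+ identical <emphasis role="bold">server.id</emphasis> lines everywhere
+ and a distinct <filename>myid</filename> per machine.</para>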
+ </section>
+
+ <section id="sc_singleAndDevSetup">
+ <title>Single Server and Developer Setup</title>
+
+ <para>If you want to set up ZooKeeper for development purposes, you will
+ probably want to set up a single server instance of ZooKeeper, and then
+ install either the Java or C client-side libraries and bindings on your
+ development machine.</para>
+
+ <para>The steps to setting up a single server instance are similar
+ to the above, except the configuration file is simpler. You can find the
+ complete instructions in the <ulink
+ url="zookeeperStarted.html#sc_InstallingSingleMode">Installing and
+ Running ZooKeeper in Single Server Mode</ulink> section of the <ulink
+ url="zookeeperStarted.html">ZooKeeper Getting Started
+ Guide</ulink>.</para>
+
+ <para>For information on installing the client side libraries, refer to
+ the <ulink url="zookeeperProgrammers.html#Bindings">Bindings</ulink>
+ section of the <ulink url="zookeeperProgrammers.html">ZooKeeper
+ Programmer's Guide</ulink>.</para>
+ </section>
+ </section>
+
+ <section id="ch_administration">
+ <title>Administration</title>
+
+ <para>This section contains information about running and maintaining
+ ZooKeeper and covers these topics: </para>
+ <itemizedlist>
+ <listitem>
+ <para><xref linkend="sc_designing" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_provisioning" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_strengthsAndLimitations" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_administering" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_maintenance" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_supervision" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_monitoring" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_logging" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_troubleshooting" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_configuration" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_zkCommands" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_dataFileManagement" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_commonProblems" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="sc_bestPractices" /></para>
+ </listitem>
+ </itemizedlist>
+
+ <section id="sc_designing">
+ <title>Designing a ZooKeeper Deployment</title>
+
+ <para>The reliability of ZooKeeper rests on two basic assumptions.</para>
+ <orderedlist>
+ <listitem><para> Only a minority of servers in a deployment
+ will fail. <emphasis>Failure</emphasis> in this context
+ means a machine crash, or some error in the network that
+ partitions a server off from the majority.</para>
+ </listitem>
+ <listitem><para> Deployed machines operate correctly. To
+ operate correctly means to execute code correctly, to have
+ clocks that work properly, and to have storage and network
+ components that perform consistently.</para>
+ </listitem>
+ </orderedlist>
+
+ <para>The sections below contain considerations for ZooKeeper
+ administrators to maximize the probability for these assumptions
+ to hold true. Some of these are cross-machine considerations,
+ and others are things you should consider for each and every
+ machine in your deployment.</para>
+
+ <section id="sc_CrossMachineRequirements">
+ <title>Cross Machine Requirements</title>
+
+ <para>For the ZooKeeper service to be active, there must be a
+ majority of non-failing machines that can communicate with
+ each other. To create a deployment that can tolerate the
+ failure of F machines, you should count on deploying 2xF+1
+ machines. Thus, a deployment that consists of three machines
+ can handle one failure, and a deployment of five machines can
+ handle two failures. Note that a deployment of six machines
+ can only handle two failures since three machines is not a
+ majority. For this reason, ZooKeeper deployments are usually
+ made up of an odd number of machines.</para>
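+
+ <para>The majority arithmetic above can be checked with a few lines
+ of code. The following is a minimal sketch; the function names are
+ chosen for illustration and are not part of ZooKeeper.</para>
+
+ <programlisting>
```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Number of servers that can fail while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


def machines_needed(failures):
    """Ensemble size needed to tolerate the given number of failures (2xF+1)."""
    return 2 * failures + 1


if __name__ == "__main__":
    for size in (3, 4, 5, 6):
        print(size, quorum_size(size), tolerated_failures(size))
```
+ </programlisting>
+
+ <para>For ensembles of three, four, five, and six machines the
+ tolerated failures come out as one, one, two, and two, which is why
+ an even-sized ensemble gains nothing over the next smaller odd
+ size.</para>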
+
+ <para>To achieve the highest probability of tolerating a failure
+ you should try to make machine failures independent. For
+ example, if most of the machines share the same switch,
+ failure of that switch could cause a correlated failure and
+ bring down the service. The same holds true of shared power
+ circuits, cooling systems, etc.</para>
+ </section>
+
+ <section>
+ <title>Single Machine Requirements</title>
+
+ <para>If ZooKeeper has to contend with other applications for
+ access to resources like storage media, CPU, network, or
+ memory, its performance will suffer markedly. ZooKeeper has
+ strong durability guarantees, which means it uses storage
+ media to log changes before the operation responsible for the
+ change is allowed to complete. You should be aware of this
+ dependency then, and take great care if you want to ensure
+ that ZooKeeper operations aren’t held up by your media. Here
+ are some things you can do to minimize that sort of
+ degradation:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>ZooKeeper's transaction log must be on a dedicated
+ device. (A dedicated partition is not enough.) ZooKeeper
+ writes the log sequentially, without seeking. Sharing your
+ log device with other processes can cause seeks and
+ contention, which in turn can cause multi-second
+ delays.</para>
+ </listitem>
+
+ <listitem>
+ <para>Do not put ZooKeeper in a situation that can cause a
+ swap. In order for ZooKeeper to function with any sort of
+ timeliness, it simply cannot be allowed to swap.
+ Therefore, make certain that the maximum heap size given
+ to ZooKeeper is not bigger than the amount of real memory
+ available to ZooKeeper. For more on this, see
+ <xref linkend="sc_commonProblems"/>
+ below. </para>
+ </listitem>
+ </itemizedlist>
+ </section>
+ </section>
+
+ <section id="sc_provisioning">
+ <title>Provisioning</title>
+
+ <para></para>
+ </section>
+
+ <section id="sc_strengthsAndLimitations">
+ <title>Things to Consider: ZooKeeper Strengths and Limitations</title>
+
+ <para></para>
+ </section>
+
+ <section id="sc_administering">
+ <title>Administering</title>
+
+ <para></para>
+ </section>
+
+ <section id="sc_maintenance">
+ <title>Maintenance</title>
+
+ <para>Little long-term maintenance is required for a ZooKeeper
+ cluster; however, you must be aware of the following:</para>
+
+ <section>
+ <title>Ongoing Data Directory Cleanup</title>
+
+ <para>The ZooKeeper <ulink url="#var_datadir">Data
+ Directory</ulink> contains files which are a persistent copy
+ of the znodes stored by a particular serving ensemble. These
+ are the snapshot and transactional log files. As changes are
+ made to the znodes these changes are appended to a
+ transaction log. Occasionally, when a log grows large, a
+ snapshot of the current state of all znodes will be written
+ to the filesystem and a new transaction log file is created
+ for future transactions. During snapshotting, ZooKeeper may
+ continue appending incoming transactions to the old log file.
+ Therefore, some transactions which are newer than a snapshot
+ may be found in the last transaction log preceding the
+ snapshot.
+ </para>
+
+ <para>A ZooKeeper server <emphasis role="bold">will not remove
+ old snapshots and log files</emphasis> when using the default
+ configuration (see autopurge below); this is the
+ responsibility of the operator. Every serving environment is
+ different and therefore the requirements of managing these
+ files may differ from install to install (backup for example).
+ </para>
+
+ <para>The PurgeTxnLog utility implements a simple retention
+ policy that administrators can use. The <ulink
+ url="ext:api/index">API docs</ulink> contain details on
+ calling conventions (arguments, etc.).
+ </para>
+
+ <para>In the following example the last count snapshots and
+ their corresponding logs are retained and the others are
+ deleted. The value of &lt;count&gt; should typically be
+ greater than 3 (although not required, this provides 3 backups
+ in the unlikely event a recent log has become corrupted). This
+ can be run as a cron job on the ZooKeeper server machines to
+ clean up the logs daily.</para>
+
+ <programlisting> java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.PurgeTxnLog &lt;dataDir&gt; &lt;snapDir&gt; -n &lt;count&gt;</programlisting>
+
+ <para>Automatic purging of the snapshots and corresponding
+ transaction logs was introduced in version 3.4.0 and can be
+ enabled via the following configuration parameters: <emphasis
+ role="bold">autopurge.snapRetainCount</emphasis> and <emphasis
+ role="bold">autopurge.purgeInterval</emphasis>. For more on
+ this, see <xref linkend="sc_advancedConfiguration"/>
+ below.</para>
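+
+ <para>To make the retention rule concrete, here is a simplified
+ model of a keep-the-last-count policy. It is a sketch, not the
+ PurgeTxnLog implementation. It assumes that snapshot and log file
+ names end in a hexadecimal transaction id (for example
+ snapshot.1e and log.1f), and the function names are hypothetical.
+ Because transactions newer than a snapshot may sit in the last log
+ preceding it, the model also keeps that one earlier log.</para>
+
+ <programlisting>
```python
def zxid_of(file_name):
    """Parse the hexadecimal suffix of names such as 'snapshot.1a' or 'log.5'."""
    return int(file_name.split(".", 1)[1], 16)


def files_to_retain(file_names, count):
    """Return the set of file names that a keep-the-last-count policy keeps."""
    if not count > 0:
        raise ValueError("count must be a positive integer")
    snapshots = sorted(
        (name for name in file_names if name.startswith("snapshot.")),
        key=zxid_of,
    )
    logs = sorted(
        (name for name in file_names if name.startswith("log.")),
        key=zxid_of,
    )
    kept_snapshots = snapshots[-count:]
    if not kept_snapshots:
        # Without any snapshot, every log is still needed.
        return set(logs)
    oldest = zxid_of(kept_snapshots[0])
    # Logs that start after the oldest retained snapshot are needed for replay.
    kept_logs = [name for name in logs if zxid_of(name) > oldest]
    # The last log starting at or before that snapshot may still hold
    # transactions newer than it, so it is kept as well.
    earlier = [name for name in logs if oldest >= zxid_of(name)]
    if earlier:
        kept_logs.append(earlier[-1])
    return set(kept_snapshots) | set(kept_logs)


if __name__ == "__main__":
    names = ["snapshot.5", "snapshot.a", "snapshot.14", "snapshot.1e",
             "log.1", "log.6", "log.b", "log.15", "log.1f"]
    print(sorted(files_to_retain(names, 3), key=zxid_of))
```
+ </programlisting>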
+ </section>
+
+ <section>
+ <title>Debug Log Cleanup (log4j)</title>
+
+ <para>See the section on <ulink
+ url="#sc_logging">logging</ulink> in this document. It is
+ expected that you will set up a rolling file appender using the
+ built-in log4j feature. The sample configuration file in the
+ release tar's conf/log4j.properties provides an example of
+ this.
+ </para>
+ </section>
+
+ </section>
+
+ <section id="sc_supervision">
+ <title>Supervision</title>
+
+ <para>You will want to have a supervisory process that manages
+ each of your ZooKeeper server processes (JVM). The ZK server is
+ designed to be "fail fast", meaning that it will shut down
+ (process exit) if an error occurs that it cannot recover
+ from. As a ZooKeeper serving cluster is highly reliable, this
+ means that while the server may go down, the cluster as a whole
+ is still active and serving requests. Additionally, as the
+ cluster is "self healing", the failed server once restarted will
+ automatically rejoin the ensemble without any manual
+ interaction.</para>
+
+ <para>Having a supervisory process such as <ulink
+ url="http://cr.yp.to/daemontools.html">daemontools</ulink> or
+ <ulink
+ url="http://en.wikipedia.org/wiki/Service_Management_Facility">SMF</ulink>
+ managing your ZooKeeper server ensures that if the
+ process does exit abnormally it will automatically be restarted
+ and will quickly rejoin the cluster. These are just two
+ examples; other supervisory processes are also available, and
+ it is up to you which one to use.</para>
+ </section>
+
+ <section id="sc_monitoring">
+ <title>Monitoring</title>
+
+ <para>The ZooKeeper service can be monitored in one of two
+ primary ways: 1) the command port through the use of <ulink
+ url="#sc_zkCommands">4 letter words</ulink> and 2) <ulink
+ url="zookeeperJMX.html">JMX</ulink>. See the appropriate section for
+ your environment/requirements.</para>
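+
+ <para>As an illustration of the first approach, a four letter word
+ can be sent over a plain TCP connection to the client port. The
+ helper below is a minimal sketch with a hypothetical name; it
+ assumes the server replies and then closes the connection, and it
+ does no parsing of the reply.</para>
+
+ <programlisting>
```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four letter word to the client port and return the reply text."""
    if len(command) != 4:
        raise ValueError("expected a four letter command such as 'stat'")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                # Assumption: the server closes the connection after replying.
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```
+ </programlisting>
+
+ <para>For example, calling the helper with host 127.0.0.1, port
+ 2181, and the command "stat" would return the text that the
+ troubleshooting procedure in this document asks you to inspect,
+ provided a server is listening there.</para>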
+ </section>
+
+ <section id="sc_logging">
+ <title>Logging</title>
+
+ <para>ZooKeeper uses <emphasis role="bold">log4j</emphasis> version 1.2 as
+ its logging infrastructure. The ZooKeeper default <filename>log4j.properties</filename>
+ file resides in the <filename>conf</filename> directory. Log4j requires that
+ <filename>log4j.properties</filename> either be in the working directory
+ (the directory from which ZooKeeper is run) or be accessible from the classpath.</para>
+
+ <para>For more information, see
+ <ulink url="http://logging.apache.org/log4j/1.2/manual.html#defaultInit">Log4j Default Initialization Procedure</ulink>
+ of the log4j manual.</para>
+
+ </section>
+
+ <section id="sc_troubleshooting">
+ <title>Troubleshooting</title>
+ <variablelist>
+ <varlistentry>
+ <term> Server not coming up because of file corruption</term>
+ <listitem>
+ <para>A server might not be able to read its database and fail to come up because of
+ some file corruption in the transaction logs of the ZooKeeper server. You will
+ see an IOException while loading the ZooKeeper database. In such a case,
+ make sure all the other servers in your ensemble are up and working. Use the "stat"
+ command on the command port to see if they are in good health. After you have verified that
+ all the other servers of the ensemble are up, you can go ahead and clean the database
+ of the corrupt server. Delete all the files in datadir/version-2 and datalogdir/version-2/.
+ Restart the server.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </section>
+
+ <section id="sc_configuration">
+ <title>Configuration Parameters</title>
+
+ <para>ZooKeeper's behavior is governed by the ZooKeeper configuration
+ file. This file is designed so that the exact same file can be used by
+ all the servers that make up a ZooKeeper ensemble, assuming the disk
+ layouts are the same. If servers use different configuration files, care
+ must be taken to ensure that the lists of servers in all of the different
+ configuration files match.</para>
+
+ <section id="sc_minimumConfiguration">
+ <title>Minimum Configuration</title>
+
+ <para>Here are the minimum configuration keywords that must be defined
+ in the configuration file:</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>clientPort</term>
+
+ <listitem>
+ <para>the port to listen for client connections; that is, the
+ port that clients attempt to connect to.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="var_datadir">
+ <term>dataDir</term>
+
+ <listitem>
+ <para>the location where ZooKeeper will store the in-memory
+ database snapshots and, unless specified otherwise, the
+ transaction log of updates to the database.</para>
+
+ <note>
+ <para>Be careful where you put the transaction log. A
+ dedicated transaction log device is key to consistent good
+ performance. Putting the log on a busy device will adversely
+ affect performance.</para>
+ </note>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="id_tickTime">
+ <term>tickTime</term>
+
+ <listitem>
+ <para>the length of a single tick, which is the basic time unit
+ used by ZooKeeper, as measured in milliseconds. It is used to
+ regulate heartbeats and timeouts. For example, the minimum
+ session timeout will be two ticks.</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </section>
+
+ <section id="sc_advancedConfiguration">
+ <title>Advanced Configuration</title>
+
+ <para>The configuration settings in this section are optional. You can
+ use them to further fine tune the behaviour of your ZooKeeper servers.
+ Some can also be set using Java system properties, generally of the
+ form <emphasis>zookeeper.keyword</emphasis>. The exact system
+ property, when available, is noted below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>dataLogDir</term>
+
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para>This option will direct the machine to write the
+ transaction log to the <emphasis
+ role="bold">dataLogDir</emphasis> rather than the <emphasis
+ role="bold">dataDir</emphasis>. This allows a dedicated log
+ device to be used, and helps avoid competition between logging
+ and snapshots.</para>
+
+ <note>
+ <para>Having a dedicated log device has a large impact on
+ throughput and stable latencies. It is highly recommended to
+ dedicate a log device and set <emphasis
+ role="bold">dataLogDir</emphasis> to point to a directory on
+ that device, and then make sure to point <emphasis
+ role="bold">dataDir</emphasis> to a directory
+ <emphasis>not</emphasis> residing on that device.</para>
+ </note>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>globalOutstandingLimit</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">zookeeper.globalOutstandingLimit</emphasis>)</para>
+
+ <para>Clients can submit requests faster than ZooKeeper can
+ process them, especially if there are a lot of clients. To
+ prevent ZooKeeper from running out of memory due to queued
+ requests, ZooKeeper will throttle clients so that there are no
+ more than globalOutstandingLimit outstanding requests in the
+ system. The default limit is 1,000.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>preAllocSize</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">zookeeper.preAllocSize</emphasis>)</para>
+
+ <para>To avoid seeks, ZooKeeper allocates space in the
+ transaction log file in blocks of preAllocSize kilobytes. The
+ default block size is 64M. One reason for changing the size of
+ the blocks is to reduce the block size if snapshots are taken
+ more often. (Also, see <emphasis
+ role="bold">snapCount</emphasis>).</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>snapCount</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">zookeeper.snapCount</emphasis>)</para>
+
+ <para>ZooKeeper records its transactions using snapshots and
+ a transaction log (think write-ahead log). The number of
+ transactions recorded in the transaction log before a snapshot
+ can be taken (and the transaction log rolled) is determined
+ by snapCount. In order to prevent all of the machines in the quorum
+ from taking a snapshot at the same time, each ZooKeeper server
+ will take a snapshot when the number of transactions in the transaction log
+ reaches a runtime generated random value in the [snapCount/2+1, snapCount]
+ range. The default snapCount is 100,000.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>maxClientCnxns</term>
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para>Limits the number of concurrent connections (at the socket
+ level) that a single client, identified by IP address, may make
+ to a single member of the ZooKeeper ensemble. This is used to
+ prevent certain classes of DoS attacks, including file
+ descriptor exhaustion. The default is 60. Setting this to 0
+ entirely removes the limit on concurrent connections.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>clientPortAddress</term>
+
+ <listitem>
+ <para><emphasis role="bold">New in 3.3.0:</emphasis> the
+ address (ipv4, ipv6 or hostname) to listen for client
+ connections; that is, the address that clients attempt
+ to connect to. This is optional; by default we bind in
+ such a way that any connection to the <emphasis
+ role="bold">clientPort</emphasis> for any
+ address/interface/nic on the server will be
+ accepted.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>minSessionTimeout</term>
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para><emphasis role="bold">New in 3.3.0:</emphasis> the
+ minimum session timeout in milliseconds that the server
+ will allow the client to negotiate. Defaults to 2 times
+ the <emphasis role="bold">tickTime</emphasis>.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>maxSessionTimeout</term>
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para><emphasis role="bold">New in 3.3.0:</emphasis> the
+ maximum session timeout in milliseconds that the server
+ will allow the client to negotiate. Defaults to 20 times
+ the <emphasis role="bold">tickTime</emphasis>.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>fsync.warningthresholdms</term>
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">zookeeper.fsync.warningthresholdms</emphasis>)</para>
+
+ <para><emphasis role="bold">New in 3.3.4:</emphasis> A
+ warning message will be output to the log whenever an
+ fsync in the Transactional Log (WAL) takes longer than
+ this value. The value is specified in milliseconds and
+ defaults to 1000. This value can only be set as a
+ system property.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>autopurge.snapRetainCount</term>
+
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para><emphasis role="bold">New in 3.4.0:</emphasis>
+ When enabled, the ZooKeeper auto purge feature retains
+ the <emphasis role="bold">autopurge.snapRetainCount</emphasis> most
+ recent snapshots and the corresponding transaction logs in the
+ <emphasis role="bold">dataDir</emphasis> and <emphasis
+ role="bold">dataLogDir</emphasis> respectively and deletes the rest.
+ Defaults to 3. Minimum value is 3.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>autopurge.purgeInterval</term>
+
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para><emphasis role="bold">New in 3.4.0:</emphasis> The
+ time interval in hours at which the purge task is
+ triggered. Set to a positive integer (1 and above)
+ to enable the auto purging. Defaults to 0.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>syncEnabled</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">zookeeper.observer.syncEnabled</emphasis>)</para>
+
+ <para><emphasis role="bold">New in 3.4.6, 3.5.0:</emphasis>
+ The observers now log transaction and write snapshot to disk
+ by default like the participants. This reduces the recovery time
+ of the observers on restart. Set to "false" to disable this
+ feature. Default is "true".</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
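+
+ <para>Two of the settings above involve simple arithmetic that is
+ easy to get wrong: the session timeout bounds derived from
+ <emphasis role="bold">tickTime</emphasis>, and the randomized
+ snapshot threshold derived from
+ <emphasis role="bold">snapCount</emphasis>. The sketch below models
+ both. The function names are hypothetical, and clamping a requested
+ timeout into the allowed range is an assumption about how the
+ negotiation resolves, not a description of the server code.</para>
+
+ <programlisting>
```python
import random


def session_timeout_bounds(tick_time_ms, min_timeout_ms=None, max_timeout_ms=None):
    """Return (min, max) session timeouts, applying the documented defaults."""
    # Defaults: 2 times tickTime for the minimum, 20 times for the maximum.
    low = 2 * tick_time_ms if min_timeout_ms is None else min_timeout_ms
    high = 20 * tick_time_ms if max_timeout_ms is None else max_timeout_ms
    return low, high


def negotiate_session_timeout(requested_ms, tick_time_ms):
    """Clamp the timeout requested by a client into the allowed range.

    Assumption: out-of-range requests are adjusted rather than rejected.
    """
    low, high = session_timeout_bounds(tick_time_ms)
    return max(low, min(high, requested_ms))


def snapshot_threshold(snap_count, rng=random):
    """Pick a value in the [snapCount/2+1, snapCount] range (both ends included)."""
    return rng.randint(snap_count // 2 + 1, snap_count)


if __name__ == "__main__":
    print(session_timeout_bounds(2000))
    print(negotiate_session_timeout(1000, 2000))
    print(snapshot_threshold(100000))
```
+ </programlisting>
+
+ <para>With a <emphasis role="bold">tickTime</emphasis> of 2000, the
+ bounds are 4000 and 40000 milliseconds, so a client asking for
+ 1000 milliseconds would end up with 4000 under this model.</para>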
+ </section>
+
+ <section id="sc_clusterOptions">
+ <title>Cluster Options</title>
+
+ <para>The options in this section are designed for use with an ensemble
+ of servers -- that is, when deploying clusters of servers.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>electionAlg</term>
+
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para>Election implementation to use. A value of "0" corresponds
+ to the original UDP-based version, "1" corresponds to the
+ non-authenticated UDP-based version of fast leader election, "2"
+ corresponds to the authenticated UDP-based version of fast
+ leader election, and "3" corresponds to the TCP-based version of
+ fast leader election. Currently, algorithm 3 is the default.</para>
+
+ <note>
+ <para> The implementations of leader election 0, 1, and 2 are now
+ <emphasis role="bold"> deprecated </emphasis>. We have the intention
+ of removing them in the next release, at which point only the
+ FastLeaderElection will be available.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>initLimit</term>
+
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para>Amount of time, in ticks (see <ulink
+ url="#id_tickTime">tickTime</ulink>), to allow followers to
+ connect and sync to a leader. Increase this value as needed, if
+ the amount of data managed by ZooKeeper is large.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>leaderServes</term>
+
+ <listitem>
+ <para>(Java system property: zookeeper.<emphasis
+ role="bold">leaderServes</emphasis>)</para>
+
+ <para>Leader accepts client connections. Default value is "yes".
+ The leader machine coordinates updates. For higher update
+ throughput at the slight expense of read throughput the leader
+ can be configured to not accept clients and focus on
+ coordination. The default for this option is yes, which means
+ that a leader will accept client connections.</para>
+
+ <note>
+ <para>Turning on leader selection is highly recommended when
+ you have more than three ZooKeeper servers in an ensemble.</para>
+ </note>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>server.x=[hostname]:nnnnn[:nnnnn], etc</term>
+
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para>servers making up the ZooKeeper ensemble. When the server
+ starts up, it determines which server it is by looking for the
+ file <filename>myid</filename> in the data directory. That file
+ contains the server number, in ASCII, and it should match
+ <emphasis role="bold">x</emphasis> in <emphasis
+ role="bold">server.x</emphasis> in the left hand side of this
+ setting.</para>
+
+ <para>The list of ZooKeeper servers used by the clients must
+ match the list of ZooKeeper servers that each ZooKeeper
+ server has.</para>
+
+ <para>There are two port numbers <emphasis role="bold">nnnnn</emphasis>.
+ Followers use the first to connect to the leader, and the second is for
+ leader election. The leader election port is only necessary if electionAlg
+ is 1, 2, or 3 (default). If electionAlg is 0, then the second port is not
+ necessary. If you want to test multiple servers on a single machine, then
+ different ports can be used for each server.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>syncLimit</term>
+
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para>Amount of time, in ticks (see <ulink
+ url="#id_tickTime">tickTime</ulink>), to allow followers to sync
+ with ZooKeeper. If followers fall too far behind a leader, they
+ will be dropped.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>group.x=nnnnn[:nnnnn]</term>
+
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para>Enables a hierarchical quorum construction. "x" is a group identifier
+ and the numbers following the "=" sign correspond to server identifiers.
+ The right-hand side of the assignment is a colon-separated list of server
+ identifiers. Note that groups must be disjoint and the union of all groups
+ must be the ZooKeeper ensemble. </para>
+
+ <para>You will find an example <ulink url="zookeeperHierarchicalQuorums.html">here</ulink>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>weight.x=nnnnn</term>
+
+ <listitem>
+ <para>(No Java system property)</para>
+
+ <para>Used along with "group", it assigns a weight to a server when
+ forming quorums. This value corresponds to the weight of a server
+ when voting. A few parts of ZooKeeper require voting, such as
+ leader election and the atomic broadcast protocol. By default
+ the weight of a server is 1. If the configuration defines groups, but not
+ weights, then a value of 1 will be assigned to all servers.
+ </para>
+
+ <para>You will find an example <ulink url="zookeeperHierarchicalQuorums.html">here</ulink>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>cnxTimeout</term>
+
+ <listitem>
+ <para>(Java system property: zookeeper.<emphasis
+ role="bold">cnxTimeout</emphasis>)</para>
+
+ <para>Sets the timeout value for opening connections for leader election notifications.
+ Only applicable if you are using electionAlg 3.
+ </para>
+
+ <note>
+ <para>Default value is 5 seconds.</para>
+ </note>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>4lw.commands.whitelist</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">zookeeper.4lw.commands.whitelist</emphasis>)</para>
+
+ <para><emphasis role="bold">New in 3.4.10:</emphasis>
+ This property contains a comma-separated list of
+ <ulink url="#sc_zkCommands">Four Letter Words</ulink> commands. It was introduced
+ to provide fine-grained control over the set of commands ZooKeeper can execute,
+ so users can turn off certain commands if necessary.
+ If the property is not specified, the whitelist defaults to all supported
+ four letter word commands except "wchp" and "wchc". If the property is specified,
+ then only the commands listed in the whitelist are enabled.
+ </para>
+
+ <para>Here's an example of a configuration that enables the stat, ruok, conf, and isro
+ commands while disabling the rest of the Four Letter Words commands:</para>
+ <programlisting>
+ 4lw.commands.whitelist=stat, ruok, conf, isro
+ </programlisting>
+
+ <para>Users can also use the asterisk option so they don't have to list every command one by one.
+ As an example, this will enable all four letter word commands:
+ </para>
+ <programlisting>
+ 4lw.commands.whitelist=*
+ </programlisting>
+
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>ipReachableTimeout</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">zookeeper.ipReachableTimeout</emphasis>)</para>
+
+ <para><emphasis role="bold">New in 3.4.11:</emphasis>
+ Sets the timeout, measured in milliseconds, for checking whether an IP address is
+ reachable when a hostname is resolved.
+ By default, ZooKeeper will use the first IP address of the hostname (without any reachability check).
+ When zookeeper.ipReachableTimeout is set (larger than 0), ZooKeeper will try to pick the first
+ IP address that is reachable. This is done by calling the Java API method
+ InetAddress.isReachable(int timeout), which uses this timeout value. If no reachable IP address
+ can be found, the first IP address of the hostname will be used anyway.
+ </para>
+
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>tcpKeepAlive</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">zookeeper.tcpKeepAlive</emphasis>)</para>
+
+ <para><emphasis role="bold">New in 3.4.11:</emphasis>
+ Setting this to true sets the TCP keepAlive flag on the
+ sockets used by quorum members to perform elections.
+ This will allow for connections between quorum members to
+ remain up when there is network infrastructure that may
+ otherwise break them. Some NATs and firewalls may terminate
+ or lose state for long running or idle connections.</para>
+
+ <para>This option relies on OS-level settings to work
+ properly; check your operating system's options regarding TCP
+ keepalive for more information. Defaults to
+ <emphasis role="bold">false</emphasis>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ <para></para>
+ </section>
+
+ <section id="sc_authOptions">
+ <title>Authentication &amp; Authorization Options</title>
+
+ <para>The options in this section allow control over
+ authentication/authorization performed by the service.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>zookeeper.DigestAuthenticationProvider.superDigest</term>
+
+ <listitem>
+ <para>(Java system property only: <emphasis
+ role="bold">zookeeper.DigestAuthenticationProvider.superDigest</emphasis>)</para>
+
+ <para>By default this feature is <emphasis
+ role="bold">disabled</emphasis>.</para>
+
+ <para><emphasis role="bold">New in 3.2:</emphasis>
+ Enables a ZooKeeper ensemble administrator to access the
+ znode hierarchy as a "super" user. In particular no ACL
+ checking occurs for a user authenticated as
+ super.</para>
+
+ <para>org.apache.zookeeper.server.auth.DigestAuthenticationProvider
+ can be used to generate the superDigest: call it with
+ one parameter of "super:&lt;password&gt;". Provide the
+ generated "super:&lt;data&gt;" as the system property value
+ when starting each server of the ensemble.</para>
+
+ <para>When authenticating to a ZooKeeper server (from a
+ ZooKeeper client) pass a scheme of "digest" and authdata
+ of "super:&lt;password&gt;". Note that digest auth passes
+ the authdata in plaintext to the server, so it would be
+ prudent to use this authentication method only on
+ localhost (not over the network) or over an encrypted
+ connection.</para>
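+
+ <para>A sketch of the workflow follows. The classpath
+ (zookeeper.jar plus the jars under lib/) and the password
+ "secret" are assumptions for illustration; adjust them to your
+ installation. The provider class is run with the single
+ "super:&lt;password&gt;" parameter, and its output includes the
+ generated "super:&lt;data&gt;" value to use in the second step:</para>
+ <programlisting>
+ # 1. generate the digest
+ $ java -cp "zookeeper.jar:lib/*" \
+ org.apache.zookeeper.server.auth.DigestAuthenticationProvider super:secret
+
+ # 2. JVM option to add when starting each server of the ensemble
+ -Dzookeeper.DigestAuthenticationProvider.superDigest=super:&lt;data&gt;
+
+ # 3. authenticate from the ZooKeeper command line client
+ addauth digest super:secret
+ </programlisting>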
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>isro</term>
+
+ <listitem>
+ <para><emphasis role="bold">New in 3.4.0:</emphasis> Tests if
+ server is running in read-only mode. The server will respond with
+ "ro" if in read-only mode or "rw" if not in read-only mode.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>gtmk</term>
+
+ <listitem>
+ <para>Gets the current trace mask as a 64-bit signed long value in
+ decimal format. See <command>stmk</command> for an explanation of
+ the possible values.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>stmk</term>
+
+ <listitem>
+ <para>Sets the current trace mask. The trace mask is 64 bits,
+ where each bit enables or disables a specific category of trace
+ logging on the server. Log4J must be configured to enable
+ <command>TRACE</command> level first in order to see trace logging
+ messages. The bits of the trace mask correspond to the following
+ trace logging categories.</para>
+
+ <table>
+ <title>Trace Mask Bit Values</title>
+ <tgroup cols="2" align="left" colsep="1" rowsep="1">
+ <tbody>
+ <row>
+ <entry>0b0000000001</entry>
+ <entry>Unused, reserved for future use.</entry>
+ </row>
+ <row>
+ <entry>0b0000000010</entry>
+ <entry>Logs client requests, excluding ping
+ requests.</entry>
+ </row>
+ <row>
+ <entry>0b0000000100</entry>
+ <entry>Unused, reserved for future use.</entry>
+ </row>
+ <row>
+ <entry>0b0000001000</entry>
+ <entry>Logs client ping requests.</entry>
+ </row>
+ <row>
+ <entry>0b0000010000</entry>
+ <entry>Logs packets received from the quorum peer that is
+ the current leader, excluding ping requests.</entry>
+ </row>
+ <row>
+ <entry>0b0000100000</entry>
+ <entry>Logs addition, removal and validation of client
+ sessions.</entry>
+ </row>
+ <row>
+ <entry>0b0001000000</entry>
+ <entry>Logs delivery of watch events to client
+ sessions.</entry>
+ </row>
+ <row>
+ <entry>0b0010000000</entry>
+ <entry>Logs ping packets received from the quorum peer
+ that is the current leader.</entry>
+ </row>
+ <row>
+ <entry>0b0100000000</entry>
+ <entry>Unused, reserved for future use.</entry>
+ </row>
+ <row>
+ <entry>0b1000000000</entry>
+ <entry>Unused, reserved for future use.</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>All remaining bits in the 64-bit value are unused and
+ reserved for future use. Multiple trace logging categories are
+ specified by calculating the bitwise OR of the documented values.
+ The default trace mask is 0b0100110010. Thus, by default, trace
+ logging includes client requests, packets received from the
+ leader and sessions.</para>
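+
+ <para>As a worked check of the default value, the bitwise OR of the
+ set bits can be written out as follows (the decimal equivalents are
+ added here for convenience). A server running with the default mask
+ would therefore be expected to report 306 in response to
+ <command>gtmk</command>:</para>
+ <programlisting>
+   0b0000000010   (2)    client requests, excluding pings
+ | 0b0000010000   (16)   packets received from the leader, excluding pings
+ | 0b0000100000   (32)   client sessions
+ | 0b0100000000   (256)  unused, reserved
+ = 0b0100110010   (306)
+ </programlisting>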
+
+ <para>To set a different trace mask, send a request containing the
+ <command>stmk</command> four-letter word followed by the trace
+ mask represented as a 64-bit signed long value. This example uses
+ the Perl <command>pack</command> function to construct a trace
+ mask that enables all trace logging categories described above and
+ convert it to a 64-bit signed long value with big-endian byte
+ order. The result is appended to <command>stmk</command> and sent
+ to the server using netcat. The server responds with the new
+ trace mask in decimal format.</para>
+
+ <programlisting>$ perl -e "print 'stmk', pack('q&gt;', 0b0011111010)" | nc localhost 2181
+250
+ </programlisting>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </section>
+
+ <section>
+ <title>Experimental Options/Features</title>
+
+ <para>New features that are currently considered experimental.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>Read Only Mode Server</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">readonlymode.enabled</emphasis>)</para>
+
+ <para><emphasis role="bold">New in 3.4.0:</emphasis>
+ Setting this value to true enables Read Only Mode server
+ support (disabled by default). ROM allows client
+ sessions that requested ROM support to connect to the
+ server even when the server might be partitioned from
+ the quorum. In this mode ROM clients can still read
+ values from the ZK service, but will be unable to write
+ values or see changes from other clients. See
+ ZOOKEEPER-784 for more details.
+ </para>
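+
+ <para>A minimal sketch, assuming the property is passed as a JVM
+ option when the server is started and that the client port is 2181
+ (both are assumptions for the example). The isro four letter word
+ can then be used to check which mode the server is in:</para>
+ <programlisting>
+ -Dreadonlymode.enabled=true
+
+ $ echo isro | nc localhost 2181
+ </programlisting>
+ <para>The server answers "ro" when it is in read-only mode and "rw" otherwise.</para>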
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </section>
+
+ <section>
+ <title>Unsafe Options</title>
+
+ <para>The following options can be useful, but be careful when you use
+ them. The risk of each is explained along with the explanation of what
+ the variable does.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>forceSync</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">zookeeper.forceSync</emphasis>)</para>
+
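+ <para>Requires updates to be synced to media of the transaction
+ log before finishing processing the update. If this option is
+ set to no, ZooKeeper will not require updates to be synced to
+ the media. The risk is that updates which were acknowledged but
+ not yet synced can be lost if the machine crashes or loses power.</para>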
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>jute.maxbuffer</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis role="bold">jute.maxbuffer</emphasis>)</para>
+
+ <para>This option can only be set as a Java system property.
+ There is no zookeeper prefix on it. It specifies the maximum
+ size of the data that can be stored in a znode. The default is
+ 0xfffff, or just under 1M. If this option is changed, the system
+ property must be set on all servers and clients; otherwise
+ problems will arise. This is really a sanity check. ZooKeeper is
+ designed to store data on the order of kilobytes in size.</para>
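+
+ <para>For illustration, the default 0xfffff is 1048575 bytes. A
+ hypothetical deployment that raised the limit to 4 MB
+ (4 x 1024 x 1024 = 4194304 bytes) would pass the same JVM option
+ to every server and every client:</para>
+ <programlisting>
+ -Djute.maxbuffer=4194304
+ </programlisting>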
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>skipACL</term>
+
+ <listitem>
+ <para>(Java system property: <emphasis
+ role="bold">zookeeper.skipACL</emphasis>)</para>
+
+ <para>Skips ACL checks. This results in a boost in throughput,
+ but opens up full access to the data tree to everyone.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>quorumListenOnAllIPs</term>
+
+ <listitem>
+ <para>When set to true the ZooKeeper server will listen
+ for connections from its peers on all available IP addresses,
+ and not only the address configured in the server list of the
+ configuration file. It affects the connections handling the
+ ZAB protocol and the Fast Leader Election protocol. Default
+ value is <emphasis role="bold">false</emphasis>.</para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </section>
+
+ <section>
+ <title>Communication using the Netty framework</title>
+
+ <para><emphasis role="bold">New in
+ 3.4:</emphasis> <ulink url="http://jboss.org/netty">Netty</ulink>
+ is an NIO-based client/server communication framework. It
+ simplifies (compared with using NIO directly) many of the
+ complexities of network-level communication for Java
+ applications. Additionally, the Netty framework has built-in
+ support for encryption (SSL) and authentication
+ (certificates). These are optional features and can be
+ turned on or off individually.
+ </para>
+ <para>Prior to version 3.4 ZooKeeper always used NIO
+ directly. In versions 3.4 and later, Netty is
+ supported as an alternative that replaces NIO. NIO continues to
+ be the default; however, Netty-based communication can be
+ used in place of NIO by setting the Java system property
+ "zookeeper.serverCnxnFactory" to
+ "org.apache.zookeeper.server.NettyServerCnxnFactory". You
+ have the option of setting this on either the client(s) or
+ server(s). Typically you would want to set this on both,
+ but that is at your discretion.
+ </para>
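+ <para>As a sketch, and assuming the value is supplied as a -D option
+ on the JVM command line of the server process (an assumption about
+ how your deployment passes it), the setting described above would
+ look like this:</para>
+ <programlisting>
+ -Dzookeeper.serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory
+ </programlisting>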
+ <para>
+ TBD - tuning options for netty - currently there are none that are netty specific but we should add some. Esp around max bound on the number of reader worker threads netty creates.
+ </para>
+ <para>
+ TBD - how to manage encryption
+ </para>
+ <para>
+ TBD - how to manage certificates
+ </para>
+
+ </section>
+
+ </section>
+
+ <section id="sc_zkCommands">
+ <title>ZooKeeper Commands: The Four Letter Words</title>
+
+ <para>ZooKeeper responds to a small set of commands. Each command is
+ composed of four letters. You issue the commands to ZooKeeper via telnet
+ or nc, at the client port.</para>
+
+ <para>Three of the more interesting commands: "stat" gives some
+ general information about the server and connected clients,
+ while "srvr" and "cons" give extended details on server and
+ connections respectively.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>conf</term>
+
+ <listitem>
+ <para><emphasis role="bold">New in 3.3.0:</emphasis> Print
+ details about serving configuration.</para>
+ </listitem>
+
+ </varlistentry>
+
+ <varlistentry>
+ <term>cons</term>
+
+ <listitem>
+ <para><emphasis role="bold">New in 3.3.0:</emphasis> List
+ full connection/session details for all clients connected
+ to this server. Includes information on numbers of packets
+ received/sent, session id, operation latencies, last
+ operation performed, etc.</para>
+ </listitem>
+
+ </varlistentry>
+
+ <varlistentry>
+ <term>crst</term>
+
+ <listitem>
+ <para><emphasis role="bold">New in 3.3.0:</emphasis> Reset
+ connection/session statistics for all connections.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>dump</term>
+
+ <listitem>
+ <para>Lists the outstanding sessions and ephemeral nodes. This
+ only works on the leader.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>envi</term>
+
+ <listitem>
+ <para>Print details about serving environment.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>ruok</term>
+
+ <listitem>
+ <para>Tests if server is running in a non-error state. The server
+ will respond with imok if it is running. Otherwise it will not
+ respond at all.</para>
+
+ <para>A response of "imok" does not necessarily indicate that the
+ server has joined the quorum, just that the server process is active
+ and bound to the specified client port. Use "stat" for details on
+ state with respect to quorum and client connection information.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>srst</term>
+
+ <listitem>
+ <para>Reset server statistics.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>srvr</term>
+
+ <listitem>
+ <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
+ full details for the server.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>stat</term>
+
+ <listitem>
+ <para>Lists brief details for the server and connected
+ clients.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>wchs</term>
+
+ <listitem>
+ <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
+ brief information on watches for the server.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>wchc</term>
+
+ <listitem>
+ <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
+ detailed information on watches for the server, by
+ session. This outputs a list of sessions (connections)
+ with associated watches (paths). Note that, depending on the
+ number of watches, this operation may be expensive (i.e., it
+ may impact server performance), so use it carefully.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>wchp</term>
+
+ <listitem>
+ <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
+ detailed information on watches for the server, by path.
+ This outputs a list of paths (znodes) with associated
+ sessions. Note that, depending on the number of watches, this
+ operation may be expensive (i.e., it may impact server performance),
+ so use it carefully.</para>
+ </listitem>
+ </varlistentry>
+
+
+ <varlistentry>
+ <term>mntr</term>
+
+ <listitem>
+ <para><emphasis role="bold">New in 3.4.0:</emphasis> Outputs a list
+ of variables that could be used for monitoring the health of the cluster.</para>
+
+ <programlisting>$ echo mntr | nc localhost 2185
+
+zk_version 3.4.0
+zk_avg_latency 0
+zk_max_latency 0
+zk_min_latency 0
+zk_packets_received 70
+zk_packets_sent 69
+zk_outstanding_requests 0
+zk_server_state leader
+zk_znode_count 4
+zk_watch_count 0
+zk_ephemerals_count 0
+zk_approximate_data_size 27
+zk_followers 4 - only exposed by the Leader
+zk_synced_followers 4 - only exposed by the Leader
+zk_pending_syncs 0 - only exposed by the Leader
+zk_open_file_descriptor_count 23 - only available on Unix platforms
+zk_max_file_descriptor_count 1024 - only available on Unix platforms
+zk_fsync_threshold_exceed_count 0
+</programlisting>
+
+ <para>The output is compatible with the Java properties format, and the content
+ may change over time (new keys added). Your scripts should expect changes.</para>
+
+ <para>ATTENTION: Some of the keys are platform specific and some of the keys are only exported by the Leader. </para>
+
+ <para>The output contains multiple lines with the following format:</para>
+ <programlisting>key \t value</programlisting>
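+
+ <para>Because each line is a tab-separated key and value, a
+ monitoring script can pick out individual keys with standard text
+ tools. The following is only a sketch: the use of awk and the port
+ 2181 are assumptions, and a real script should tolerate keys that
+ are missing on a given server or platform:</para>
+ <programlisting>$ echo mntr | nc localhost 2181 | awk -F'\t' '$1 == "zk_server_state" { print $2 }'
+</programlisting>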
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>Here's an example of the <emphasis role="bold">ruok</emphasis>
+ command:</para>
+
+ <programlisting>$ echo ruok | nc 127.0.0.1 5111
+imok
+</programlisting>
+
+
+ </section>
+
+ <section id="sc_dataFileManagement">
+ <title>Data File Management</title>
+
+ <para>ZooKeeper stores its data in a data directory and its transaction
+ log in a transaction log directory. By default these two directories are
+ the same. The server can (and should) be configured to store the
+ transaction log files in a separate directory from the data files.
+ Throughput increases and latency decreases when transaction logs reside
+ on a dedicated log device.</para>
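+
+ <para>For example, a configuration along the following lines keeps
+ the two apart (both paths are hypothetical; dataLogDir would
+ ideally point at a dedicated device):</para>
+ <programlisting>
+ dataDir=/var/lib/zookeeper/data
+ dataLogDir=/var/lib/zookeeper-txnlog
+ </programlisting>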
+
+ <section>
+ <title>The Data Directory</title>
+
+ <para>This directory has two files in it:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><filename>myid</filename> - contains a single integer in
+ human readable ASCII text that represents the server id.</para>
+ </listitem>
+
+ <listitem>
+ <para><filename>snapshot.&lt;zxid&gt;</filename> - holds the fuzzy
+ snapshot of a data tree.</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>Each ZooKeeper server has a unique id. This id is used in two
+ places: the <filename>myid</filename> file and the configuration file.
+ The <filename>myid</filename> file identifies the server that
+ corresponds to the given data directory. The configuration file lists
+ the contact information for each server identified by its server id.
+ When a ZooKeeper server instance starts, it reads its id from the
+ <filename>myid</filename> file and then, using that id, reads from the
+ configuration file, looking up the port on which it should
+ listen.</para>
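+
+ <para>As a sketch, assuming a data directory of
+ /var/lib/zookeeper/data (a hypothetical path) and a server that
+ appears as server.1 in the configuration file, the
+ <filename>myid</filename> file could be created like this:</para>
+ <programlisting>$ echo 1 &gt; /var/lib/zookeeper/data/myid
+</programlisting>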
+
+ <para>The <filename>snapshot</filename> files stored in the data
+ directory are fuzzy snapshots in the sense that during the time the
+ ZooKeeper server is taking the snapshot, updates are occurring to the
+ data tree. The suffix of the <filename>snapshot</filename> file names
+ is the <emphasis>zxid</emphasis>, the ZooKeeper transaction id, of the
+ last committed transaction at the start of the snapshot. Thus, the
+ snapshot includes a subset of the updates to the data tree that
+ occurred while the snapshot was in progress. The snapshot, then, may
+ not correspond to any data tree that actually existed, and for this
+ reason we refer to it as a fuzzy snapshot. Still, ZooKeeper can
+ recover using this snapshot because it takes advantage of the
+ idempotent nature of its updates. By replaying the transaction log
+ against fuzzy snapshots ZooKeeper gets the state of the system at the
+ end of the log.</para>
+ </section>
+
+ <section>
+ <title>The Log Directory</title>
+
+ <para>The Log Directory contains the ZooKeeper transaction logs.
+ Before any update takes place, ZooKeeper ensures that the transaction
+ that represents the update is written to non-volatile storage. A new
+ log file is started when the number of transactions written to the
+ current log file reaches a (variable) threshold. The threshold is
+ computed using the same parameter which influences the frequency of
+ snapshotting (see snapCount above). The log file's suffix is the first
+ zxid written to that log.</para>
+ </section>
+
+ <section id="sc_filemanagement">
+ <title>File Management</title>
+
+ <para>The format of snapshot and log files does not change between
+ standalone ZooKeeper servers and different configurations of
+ replicated ZooKeeper servers. Therefore, you can pull these files from
+ a running replicated ZooKeeper server to a development machine with a
+ stand-alone ZooKeeper server for troubleshooting.</para>
+
+ <para>Using older log and snapshot files, you can look at the previous
+ state of ZooKeeper servers and even restore that state. The
+ LogFormatter class allows an administrator to look at the transactions
+ in a log.</para>
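+
+ <para>A sketch of such an invocation, where the classpath and
+ the log file name are assumptions for the example:</para>
+ <programlisting>$ java -cp "zookeeper.jar:lib/*" org.apache.zookeeper.server.LogFormatter log.100000001
+</programlisting>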
+
+ <para>The ZooKeeper server creates snapshot and log files, but
+ never deletes them. The retention policy of the data and log
+ files is implemented outside of the ZooKeeper server. The
+ server itself only needs the latest complete fuzzy snapshot, all log
+ files following it, and the last log file preceding it. The latter
+ requirement is necessary to include updates which happened after this
+ snapshot was started but went into the existing log file at that time.
+ This is possible because snapshotting and rolling over of logs
+ proceed somewhat independently in ZooKeeper. See the
+ <ulink url="#sc_maintenance">maintenance</ulink> section in
+ this document for more details on setting a retention policy
+ and maintenance of ZooKeeper storage.
+ </para>
+ <note>
+ <para>The data stored in these files is not encrypted. In the case of
+ storing sensitive data in ZooKeeper, necessary measures need to be
+ taken to prevent unauthorized access. Such measures are external to
+ ZooKeeper (e.g., control access to the files) and depend on the
+ individual settings in which it is being deployed. </para>
+ </note>
+ </section>
+
+ <section>
+ <title>Recovery - TxnLogToolkit</title>
+
+ <para>TxnLogToolkit is a command line tool shipped with ZooKeeper which
+ is capable of recovering transaction log entries with broken CRC.</para>
+ <para>When run without any command line parameters or with the "-h,--help"
+ argument, it prints the following help page:</para>
+
+ <programlisting>
+ $ bin/zkTxnLogToolkit.sh
+
+ usage: TxnLogToolkit [-dhrv] txn_log_file_name
+ -d,--dump Dump mode. Dump all entries of the log file. (this is the default)
+ -h,--help Print help message
+ -r,--recover Recovery mode. Re-calculate CRC for broken entries.
+ -v,--verbose Be verbose in recovery mode: print all entries, not just fixed ones.
+ -y,--yes Non-interactive mode: repair all CRC errors without asking
+ </programlisting>
+
+ <para>The default behaviour is safe: it dumps the entries of the given
+ transaction log file to the screen (same as using the '-d,--dump' parameter):</para>
+
+ <programlisting>
+ $ bin/zkTxnLogToolkit.sh log.100000001
+ ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
+ 4/5/18 2:15:58 PM CEST session 0x16295bafcc40000 cxid 0x0 zxid 0x100000001 createSession 30000
+ <emphasis role="bold">CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null</emphasis>
+ 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
+ 4/5/18 2:16:12 PM CEST session 0x26295bafcc90000 cxid 0x0 zxid 0x100000003 createSession 30000
+ 4/5/18 2:17:34 PM CEST session 0x26295bafcc90000 cxid 0x0 zxid 0x200000001 closeSession null
+ 4/5/18 2:17:34 PM CEST session 0x16295bd23720000 cxid 0x0 zxid 0x200000002 createSession 30000
+ 4/5/18 2:18:02 PM CEST session 0x16295bd23720000 cxid 0x2 zxid 0x200000003 create '/andor,#626262,v{s{31,s{'world,'anyone}}},F,1
+ EOF reached after 6 txns.
+ </programlisting>
+
+ <para>There's a CRC error in the 2nd entry of the above transaction log file. In <emphasis role="bold">dump</emphasis>
+ mode, the toolkit only prints this information to the screen without touching the original file. In
+ <emphasis role="bold">recovery</emphasis> mode (-r,--recover flag) the original file still remains
+ untouched and all transactions are copied over to a new txn log file with a ".fixed" suffix. The tool recalculates
+ the CRC value of each entry and writes the recalculated value if it doesn't match the one stored in the original txn entry.
+ By default, the tool works interactively: it asks for confirmation whenever a CRC error is encountered.</para>
+
+ <programlisting>
+ $ bin/zkTxnLogToolkit.sh -r log.100000001
+ ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
+ CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
+ Would you like to fix it (Yes/No/Abort) ?
+ </programlisting>
+
+ <para>Answering <emphasis role="bold">Yes</emphasis> means the newly calculated CRC value will be written
+ to the new file. <emphasis role="bold">No</emphasis> means that the original CRC value will be copied over.
+ <emphasis role="bold">Abort</emphasis> aborts the entire operation and exits.
+ (In this case the ".fixed" file is not deleted and is left in a half-complete state: it contains only the entries which
+ have already been processed, or only the header if the operation was aborted at the first entry.)</para>
+
+ <programlisting>
+ $ bin/zkTxnLogToolkit.sh -r log.100000001
+ ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
+ CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
+ Would you like to fix it (Yes/No/Abort) ? y
+ EOF reached after 6 txns.
+ Recovery file log.100000001.fixed has been written with 1 fixed CRC error(s)
+ </programlisting>
+
+ <para>The default behaviour of recovery is to be silent: only entries with a CRC error get printed to the screen.
+ One can turn on verbose mode with the -v,--verbose parameter to see all records.
+ Interactive mode can be turned off with the -y,--yes parameter. In this case all CRC errors will be fixed
+ in the new transaction file.</para>
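+
+ <para>Combining the flags from the help page above, an unattended
+ repair that also prints every record might be run as follows (a
+ sketch; the log file name is taken from the earlier examples):</para>
+ <programlisting>
+ $ bin/zkTxnLogToolkit.sh -r -v -y log.100000001
+ </programlisting>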
+ </section>
+ </section>
+
+ <section id="sc_commonProblems">
+ <title>Things to Avoid</title>
+
+ <para>Here are some common problems you can avoid by configuring
+ ZooKeeper correctly:</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>inconsistent lists of servers</term>
+
+ <listitem>
+ <para>The list of ZooKeeper servers used by the clients must match
+ the list of ZooKeeper servers that each ZooKeeper server has.
+ Things work okay if the client list is a subset of the real list,
+ but things will behave strangely if clients have a list of
+ ZooKeeper servers that are in different ZooKeeper clusters. Also,
+ the server lists in each ZooKeeper server configuration file
+ should be consistent with one another.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>incorrect placement of transaction log</term>
+
+ <listitem>
+ <para>The most performance critical part of ZooKeeper is the
+ transaction log. ZooKeeper syncs transactions to media before it
+ returns a response. A dedicated transaction log device is key to
+ consistent good performance. Putting the log on a busy device will
+ adversely affect performance. If you only have one storage device,
+ put trace files on NFS and increase the snapCount; it doesn't
+ eliminate the problem, but it should mitigate it.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>incorrect Java heap size</term>
+
+ <listitem>
+ <para>You should take special care to set your Java max heap size
+ correctly. In particular, you should not create a situation in
+ which ZooKeeper swaps to disk. The disk is death to ZooKeeper.
+ Everything is ordered, so if processing one request swaps to
+ disk, all other queued requests will probably do the same.
+ DON'T SWAP.</para>
+
+ <para>Be conservative in your estimates: if you have 4G of RAM, do
+ not set the Java max heap size to 6G or even 4G. For example, it
+ is more likely you would use a 3G heap for a 4G machine, as the
+ operating system and the cache also need memory. The best and only
+ recommended practice for estimating the heap size your system needs
+ is to run load tests, and then make sure you are well below the
+ usage limit that would cause the system to swap.</para>
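+
+ <para>Following the 4G example above, the heap limit would be set
+ with the standard JVM option below. How the option reaches the
+ server JVM depends on your start scripts, so treat this as a
+ sketch and confirm the value with load tests:</para>
+ <programlisting>
+ -Xmx3g
+ </programlisting>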
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Publicly accessible deployment</term>
+ <listitem>
+ <para>
+ A ZooKeeper ensemble is expected to operate in a trusted computing environment.
+ It is thus recommended to deploy ZooKeeper behind a firewall.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </section>
+
+ <section id="sc_bestPractices">
+ <title>Best Practices</title>
+
+ <para>For best results, take note of the following list of good
+ ZooKeeper practices:</para>
+
+
+ <para>For multi-tenant installations see the <ulink
+ url="zookeeperProgrammers.html#ch_zkSessions">section</ulink>
+ detailing ZooKeeper "chroot" support. This can be very useful
+ when deploying many applications/services interfacing to a
+ single ZooKeeper cluster.</para>
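+
+ <para>As a sketch of what this looks like in practice, the chroot
+ is given as a path suffix on the client connection string. The
+ host names and the /app/a path below are hypothetical:</para>
+ <programlisting>
+ zoo1:2181,zoo2:2181,zoo3:2181/app/a
+ </programlisting>
+ <para>A client connecting with this string would have all of its
+ paths interpreted relative to /app/a.</para>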
+
+ </section>
+ </section>
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml
new file mode 100644
index 0000000..f71c4a8
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml
@@ -0,0 +1,75 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="zk_HierarchicalQuorums">
+ <title>Introduction to hierarchical quorums</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This document contains information about hierarchical quorums.</para>
+ </abstract>
+ </articleinfo>
+
+ <para>
+ This document gives an example of how to use hierarchical quorums. The basic idea is
+ very simple. First, we split servers into groups, and add a line for each group listing
+ the servers that form this group. Next we have to assign a weight to each server.
+ </para>
+
+ <para>
+ The following example shows how to configure a system with three groups of three servers
+ each, and we assign a weight of 1 to each server:
+ </para>
+
+ <programlisting>
+ group.1=1:2:3
+ group.2=4:5:6
+ group.3=7:8:9
+
+ weight.1=1
+ weight.2=1
+ weight.3=1
+ weight.4=1
+ weight.5=1
+ weight.6=1
+ weight.7=1
+ weight.8=1
+ weight.9=1
+ </programlisting>
+
+ <para>
+ When running the system, we are able to form a quorum once we have a majority of votes from
+ a majority of non-zero-weight groups. Groups that have zero weight are discarded and not
+ considered when forming quorums. Looking at the example, we are able to form a quorum once
+ we have votes from at least two servers from each of two different groups.
+ </para>
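+
+ <para>
+ The rule above can be sketched in a few lines of Java. This is only an illustration of the
+ counting rule, not the ZooKeeper implementation: the class name, the method names and the
+ simplification that all servers of a group share one weight are assumptions of this sketch.
+ </para>
+
+ <programlisting><![CDATA[
```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; all names here are hypothetical. */
final class HierarchicalQuorumSketch {
    private final Map<Long, Long> groupOfServer = new HashMap<>();  // server id -> group id
    private final Map<Long, Long> weightOfServer = new HashMap<>(); // server id -> weight

    /** Mirrors one "group.G=a:b:c" line plus the "weight.x=w" lines of its members. */
    void addGroup(long groupId, long weight, long... serverIds) {
        for (long sid : serverIds) {
            groupOfServer.put(sid, groupId);
            weightOfServer.put(sid, weight);
        }
    }

    /** True if the voters hold a majority of the weight in a majority of non-zero-weight groups. */
    boolean containsQuorum(Set<Long> votes) {
        Map<Long, Long> totalWeight = new HashMap<>();
        Map<Long, Long> votedWeight = new HashMap<>();
        for (Map.Entry<Long, Long> e : groupOfServer.entrySet()) {
            long sid = e.getKey();
            long gid = e.getValue();
            long weight = weightOfServer.get(sid);
            totalWeight.merge(gid, weight, Long::sum);
            if (votes.contains(sid)) {
                votedWeight.merge(gid, weight, Long::sum);
            }
        }
        int countedGroups = 0;
        int wonGroups = 0;
        for (Map.Entry<Long, Long> e : totalWeight.entrySet()) {
            long total = e.getValue();
            if (total == 0) {
                continue; // zero-weight groups are discarded
            }
            countedGroups++;
            long voted = votedWeight.getOrDefault(e.getKey(), 0L);
            if (2 * voted > total) {
                wonGroups++; // strict majority of this group's weight
            }
        }
        return 2 * wonGroups > countedGroups;
    }

    /** Convenience helper to build a set of voting server ids. */
    static Set<Long> votes(long... ids) {
        Set<Long> set = new HashSet<>();
        for (long id : ids) {
            set.add(id);
        }
        return set;
    }

    /** The configuration from the example: three groups of three servers, weight 1 each. */
    static HierarchicalQuorumSketch threeGroupsOfThree() {
        HierarchicalQuorumSketch q = new HierarchicalQuorumSketch();
        q.addGroup(1, 1, 1, 2, 3);
        q.addGroup(2, 1, 4, 5, 6);
        q.addGroup(3, 1, 7, 8, 9);
        return q;
    }

    public static void main(String[] args) {
        HierarchicalQuorumSketch q = threeGroupsOfThree();
        System.out.println(q.containsQuorum(votes(1, 2, 4, 5))); // true
        System.out.println(q.containsQuorum(votes(1, 2, 3, 4))); // false
    }
}
```
+ ]]></programlisting>
+
+ <para>
+ With the configuration above, votes from servers 1, 2, 4 and 5 form a quorum, whereas votes
+ from servers 1, 2, 3 and 4 do not, because only one group has reached a majority.
+ </para>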
+ </article>
\ No newline at end of file
[06/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/conf/cli.xconf
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/conf/cli.xconf b/zookeeper-docs/src/documentation/conf/cli.xconf
new file mode 100644
index 0000000..c671340
--- /dev/null
+++ b/zookeeper-docs/src/documentation/conf/cli.xconf
@@ -0,0 +1,328 @@
+<?xml version="1.0"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<!--+
+ | This is the Apache Cocoon command line configuration file.
+ | Here you give the command line interface details of where
+ | to find various aspects of your Cocoon installation.
+ |
+ | If you wish, you can also use this file to specify the URIs
+ | that you wish to generate.
+ |
+ | The current configuration information in this file is for
+ | building the Cocoon documentation. Therefore, all links here
+ | are relative to the build context dir, which, in the build.xml
+ | file, is set to ${build.context}
+ |
+ | Options:
+ | verbose: increase amount of information presented
+ | to standard output (default: false)
+ | follow-links: whether linked pages should also be
+ | generated (default: true)
+ | precompile-only: precompile sitemaps and XSP pages, but
+ | do not generate any pages (default: false)
+ | confirm-extensions: check the mime type for the generated page
+ | and adjust filename and links extensions
+ | to match the mime type
+ | (e.g. text/html->.html)
+ |
+ | Note: Whilst using an xconf file to configure the Cocoon
+ | Command Line gives access to more features, the use of
+ | command line parameters is more stable, as there are
+ | currently plans to improve the xconf format to allow
+ | greater flexibility. If you require a stable and
+ | consistent method for accessing the CLI, it is recommended
+ | that you use the command line parameters to configure
+ | the CLI. See documentation at:
+ | http://cocoon.apache.org/2.1/userdocs/offline/
+ | http://wiki.apache.org/cocoon/CommandLine
+ |
+ +-->
+
+<cocoon verbose="true"
+ follow-links="true"
+ precompile-only="false"
+ confirm-extensions="false">
+
+ <!--+
+ | The context directory is usually the webapp directory
+ | containing the sitemap.xmap file.
+ |
+ | The config file is the cocoon.xconf file.
+ |
+ | The work directory is used by Cocoon to store temporary
+ | files and cache files.
+ |
+ | The destination directory is where generated pages will
+ | be written (assuming the 'simple' mapper is used, see
+ | below)
+ +-->
+ <context-dir>.</context-dir>
+ <config-file>WEB-INF/cocoon.xconf</config-file>
+ <work-dir>../tmp/cocoon-work</work-dir>
+ <dest-dir>../site</dest-dir>
+
+ <!--+
+ | A checksum file can be used to store checksums for pages
+ | as they are generated. When the site is next generated,
+ | files will not be written if their checksum has not changed.
+ | This means that it will be easier to detect which files
+ | need to be uploaded to a server, using the timestamp.
+ |
+ | The default path is relative to the core webapp directory.
+ | An absolute path can be used.
+ +-->
+ <!-- <checksums-uri>build/work/checksums</checksums-uri>-->
+
+ <!--+
+ | Broken link reporting options:
+ | Report into a text file, one link per line:
+ | <broken-links type="text" report="filename"/>
+ | Report into an XML file:
+ | <broken-links type="xml" report="filename"/>
+ | Ignore broken links (default):
+ | <broken-links type="none"/>
+ |
+ | Two attributes to this node specify whether a page should
+ | be generated when an error has occurred. 'generate' specifies
+ | whether a page should be generated (default: true) and
+ | extension specifies an extension that should be appended
+ | to the generated page's filename (default: none)
+ |
+ | Using this, a quick scan through the destination directory
+ | will show broken links, by their filename extension.
+ +-->
+ <broken-links type="xml"
+ file="../brokenlinks.xml"
+ generate="false"
+ extension=".error"
+ show-referrers="true"/>
+
+ <!--+
+ | Load classes at startup. This is necessary for generating
+ | from sites that use SQL databases and JDBC.
+ | The <load-class> element can be repeated if multiple classes
+ | are needed.
+ +-->
+ <!--
+ <load-class>org.firebirdsql.jdbc.Driver</load-class>
+ -->
+
+ <!--+
+ | Configures logging.
+ | The 'log-kit' parameter specifies the location of the log kit
+ | configuration file (usually called logkit.xconf).
+ |
+ | Logger specifies the logging category (for all logging prior
+ | to other Cocoon logging categories taking over)
+ |
+ | Available log levels are:
+ | DEBUG: prints all level of log messages.
+ | INFO: prints all level of log messages except DEBUG
+ | ones.
+ | WARN: prints all level of log messages except DEBUG
+ | and INFO ones.
+ | ERROR: prints all level of log messages except DEBUG,
+ | INFO and WARN ones.
+ | FATAL_ERROR: prints only log messages of this level
+ +-->
+ <!-- <logging log-kit="WEB-INF/logkit.xconf" logger="cli" level="ERROR" /> -->
+
+ <!--+
+ | Specifies the filename to be appended to URIs that
+ | refer to a directory (i.e. end with a forward slash).
+ +-->
+ <default-filename>index.html</default-filename>
+
+ <!--+
+ | Specifies a user agent string to the sitemap when
+ | generating the site.
+ |
+ | A generic term for a web browser is "user agent". Any
+ | user agent, when connecting to a web server, will provide
+ | a string to identify itself (e.g. as Internet Explorer or
+ | Mozilla). It is possible to have Cocoon serve different
+ | content depending upon the user agent string provided by
+ | the browser. If your site does this, then you may want to
+ | use this <user-agent> entry to provide a 'fake' user agent
+ | to Cocoon, so that it generates the correct version of your
+ | site.
+ |
+ | For most sites, this can be ignored.
+ +-->
+ <!--
+ <user-agent>Cocoon Command Line Environment 2.1</user-agent>
+ -->
+
+ <!--+
+ | Specifies an accept string to the sitemap when generating
+ | the site.
+ | User agents can specify to an HTTP server what types of content
+ | (by mime-type) they are able to receive. E.g. a browser may be
+ | able to handle jpegs, but not pngs. The HTTP accept header
+ | allows the server to take the browser's capabilities into account,
+ | and only send back content that it can handle.
+ |
+ | For most sites, this can be ignored.
+ +-->
+
+ <accept>*/*</accept>
+
+ <!--+
+ | Specifies which URIs should be included or excluded, according
+ | to wildcard patterns.
+ |
+ | These includes/excludes are only relevant when you are following
+ | links. A link URI must match an include pattern (if one is given)
+ | and not match an exclude pattern, if it is to be followed by
+ | Cocoon. It can be useful, for example, where there are links in
+ | your site to pages that are not generated by Cocoon, such as
+ | references to api-documentation.
+ |
+ | By default, all URIs are included. If both include and exclude
+ | patterns are specified, a URI is first checked against the
+ | include patterns, and then against the exclude patterns.
+ |
+ | Multiple patterns can be given, using multiple include or exclude
+ | nodes.
+ |
+ | The order of the elements is not significant, as only the first
+ | successful match of each category is used.
+ |
+ | Currently, only the complete source URI can be matched (including
+ | any URI prefix). Future plans include destination URI matching
+ | and regexp matching. If you have requirements for these, contact
+ | dev@cocoon.apache.org.
+ +-->
+
+ <exclude pattern="**/"/>
+ <exclude pattern="**apidocs**"/>
+ <exclude pattern="api/**"/>
+
+ <!-- ZOOKEEPER-2364 - we build our own release notes separately -->
+ <exclude pattern="releasenotes.**"/>
+
+<!--
+ This is a workaround for FOR-284 "link rewriting broken when
+ linking to xml source views which contain site: links".
+ See the explanation there and in declare-broken-site-links.xsl
+-->
+ <exclude pattern="site:**"/>
+ <exclude pattern="ext:**"/>
+ <exclude pattern="lm:**"/>
+ <exclude pattern="**/site:**"/>
+ <exclude pattern="**/ext:**"/>
+ <exclude pattern="**/lm:**"/>
+
+ <!-- Exclude tokens used in URLs to ASF mirrors (interpreted by a CGI) -->
+ <exclude pattern="[preferred]/**"/>
+ <exclude pattern="[location]"/>
+
+ <!-- <include-links extension=".html"/>-->
+
+ <!--+
+ | <uri> nodes specify the URIs that should be generated, and
+ | where required, what should be done with the generated pages.
+ | They describe the way the URI of the generated file is created
+ | from the source page's URI. There are three ways that a generated
+ | file URI can be created: append, replace and insert.
+ |
+ | The "type" attribute specifies one of (append|replace|insert):
+ |
+ | append:
+ | Append the generated page's URI to the end of the source URI:
+ |
+ | <uri type="append" src-prefix="documents/" src="index.html"
+ | dest="build/dest/"/>
+ |
+ | This means that
+ | (1) the "documents/index.html" page is generated
+ | (2) the file will be written to "build/dest/documents/index.html"
+ |
+ | replace:
+ | Completely ignore the generated page's URI - just
+ | use the destination URI:
+ |
+ | <uri type="replace" src-prefix="documents/" src="index.html"
+ | dest="build/dest/docs.html"/>
+ |
+ | This means that
+ | (1) the "documents/index.html" page is generated
+ | (2) the result is written to "build/dest/docs.html"
+ | (3) this works only for "single" pages - and not when links
+ | are followed
+ |
+ | insert:
+ | Insert generated page's URI into the destination
+ | URI at the point marked with a * (example uses fictional
+ | zip protocol)
+ |
+ | <uri type="insert" src-prefix="documents/" src="index.html"
+ | dest="zip://*.zip/page.html"/>
+ |
+ | This means that
+ | (1)
+ |
+ | In any of these scenarios, if the dest attribute is omitted,
+ | the value provided globally using the <dest-dir> node will
+ | be used instead.
+ +-->
+ <!--
+ <uri type="replace"
+ src-prefix="samples/"
+ src="hello-world/hello.html"
+ dest="build/dest/hello-world.html"/>
+ -->
+
+ <!--+
+ | <uri> nodes can be grouped together in a <uris> node. This
+ | enables a group of URIs to share properties. The following
+ | properties can be set for a group of URIs:
+ | * follow-links: should pages be crawled for links
+ | * confirm-extensions: should file extensions be checked
+ | for the correct mime type
+ | * src-prefix: all source URIs should be
+ | pre-pended with this prefix before
+ | generation. The prefix is not
+ | included when calculating the
+ | destination URI
+ | * dest: the base destination URI to be
+ | shared by all pages in this group
+ | * type: the method to be used to calculate
+ | the destination URI. See above
+ | section on <uri> node for details.
+ |
+ | Each <uris> node can have a name attribute. When a name
+ | attribute has been specified, the -n switch on the command
+ | line can be used to tell Cocoon to only process the URIs
+ | within this URI group. When no -n switch is given, all
+ | <uris> nodes are processed. Thus, one xconf file can be
+ | used to manage multiple sites.
+ +-->
+ <!--
+ <uris name="mirrors" follow-links="false">
+ <uri type="append" src="mirrors.html"/>
+ </uris>
+ -->
+
+ <!--+
+ | File containing URIs (plain text, one per line).
+ +-->
+ <!--
+ <uri-file>uris.txt</uri-file>
+ -->
+</cocoon>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/bookkeeperConfig.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/bookkeeperConfig.xml b/zookeeper-docs/src/documentation/content/xdocs/bookkeeperConfig.xml
new file mode 100644
index 0000000..7a80949
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/bookkeeperConfig.xml
@@ -0,0 +1,156 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_Admin">
+ <title>BookKeeper Administrator's Guide</title>
+
+ <subtitle>Setup Guide</subtitle>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.
+ </para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.
+ </para>
+ </legalnotice>
+
+ <abstract>
+ <para>This document contains information about deploying, administering
+ and maintaining BookKeeper. It also discusses best practices and common
+ problems.
+ </para>
+ <para> As BookKeeper is still a prototype, this article is likely to change
+ significantly over time.
+ </para>
+ </abstract>
+ </articleinfo>
+
+ <section id="bk_deployment">
+ <title>Deployment</title>
+
+ <para>This section contains information about deploying BookKeeper and
+ covers these topics:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><xref linkend="bk_sysReq" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="bk_runningBookies" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="bk_zkMetadata" /></para>
+ </listitem>
+ </itemizedlist>
+
+ <para> The first section tells you how many machines you need. The second explains how to bootstrap bookies
+ (BookKeeper storage servers). The third section explains how we use ZooKeeper and our requirements with
+ respect to ZooKeeper.
+ </para>
+
+ <section id="bk_sysReq">
+ <title>System requirements</title>
+ <para> A typical BookKeeper installation comprises a set of bookies and a set of ZooKeeper replicas. The exact number of bookies
+ depends on the quorum mode, desired throughput, and number of clients using this installation simultaneously. The minimum number of
+ bookies is three for self-verifying (stores a message authentication code along with each entry) and four for generic (does not
+ store a message authentication code with each entry), and there is no upper limit on the number of bookies. Increasing the number of
+ bookies, in fact, enables higher throughput.
+ </para>
+
+ <para> For performance, we require each server to have at least two disks. It is possible to run a bookie with a single disk, but
+ performance will be significantly lower in this case.
+ </para>
+
+ <para> For ZooKeeper, there is no constraint with respect to the number of replicas. Having a single machine running ZooKeeper
+ in standalone mode is sufficient for BookKeeper. For resilience purposes, it might be a good idea to run ZooKeeper in quorum
+ mode with multiple servers. Please refer to the ZooKeeper documentation for details on how to configure ZooKeeper with multiple
+ replicas.
+ </para>
+ </section>
+
+ <section id="bk_runningBookies">
+ <title>Running bookies</title>
+ <para>
+ To run a bookie, we execute the following command:
+ </para>
+
+ <para><computeroutput>
+ java -cp .:./zookeeper-&lt;version&gt;-bookkeeper.jar:./zookeeper-&lt;version&gt;.jar\
+ :../log4j/apache-log4j-1.2.15/log4j-1.2.15.jar -Dlog4j.configuration=log4j.properties\
+ org.apache.bookkeeper.proto.BookieServer 3181 127.0.0.1:2181 /path_to_log_device/\
+ /path_to_ledger_device/
+ </computeroutput></para>
+
+ <para>
+ The parameters are:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Port number that the bookie listens on;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Comma separated list of ZooKeeper servers with a hostname:port format;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Path for Log Device (stores bookie write-ahead log);
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Path for Ledger Device (stores ledger entries);
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ Ideally, <computeroutput>/path_to_log_device/ </computeroutput> and <computeroutput>/path_to_ledger_device/ </computeroutput> are each
+ on a different device.
+ </para>
+ </section>
+
+ <section id="bk_zkMetadata">
+ <title>ZooKeeper Metadata</title>
+ <para>
+ BookKeeper requires a ZooKeeper installation to store metadata, and the list
+ of ZooKeeper servers must be passed as a parameter to the constructor of the BookKeeper class (<computeroutput>
+ org.apache.bookkeeper.client.BookKeeper</computeroutput>).
+ To set up ZooKeeper, please check the <ulink url="index.html">
+ ZooKeeper documentation</ulink>.
+ </para>
+ </section>
+ </section>
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/bookkeeperOverview.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/bookkeeperOverview.xml b/zookeeper-docs/src/documentation/content/xdocs/bookkeeperOverview.xml
new file mode 100644
index 0000000..cdc1878
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/bookkeeperOverview.xml
@@ -0,0 +1,419 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_GettStartedGuide">
+ <title>BookKeeper overview</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This guide contains detailed information about using BookKeeper
+ for logging. It discusses the basic operations BookKeeper supports,
+ and how to create logs and perform basic read and write operations on these
+ logs.</para>
+ </abstract>
+ </articleinfo>
+ <section id="bk_Overview">
+ <title>BookKeeper overview</title>
+
+ <section id="bk_Intro">
+ <title>BookKeeper introduction</title>
+ <para>
+ BookKeeper is a replicated service to reliably log streams of records. In BookKeeper,
+ servers are "bookies", log streams are "ledgers", and each unit of a log (aka record) is a
+ "ledger entry". BookKeeper is designed to be reliable; bookies, the servers that store
+ ledgers, can crash, corrupt data, discard data, but as long as there are enough bookies
+ behaving correctly the service as a whole behaves correctly.
+ </para>
+
+ <para>
+ The initial motivation for BookKeeper comes from the namenode of HDFS. Namenodes have to
+ log operations in a reliable fashion so that recovery is possible in the case of crashes.
+ We have found the applications for BookKeeper extend far beyond HDFS, however. Essentially,
+ any application that requires append-only storage can replace its implementation with
+ BookKeeper. BookKeeper has the advantage of scaling throughput with the number of servers.
+ </para>
+
+ <para>
+ At a high level, a BookKeeper client receives entries from a client application and stores them on
+ sets of bookies, and there are a few advantages in having such a service:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ We can use hardware that is optimized for such a service. We currently believe that such a
+ system has to be optimized only for disk I/O;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ We can have a pool of servers implementing such a log system, shared among a number of servers;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ We can have a higher degree of replication with such a pool, which makes sense if the hardware necessary for it is cheaper than the hardware the application uses.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ </section>
+
+ <section id="bk_moreDetail">
+ <title>In slightly more detail...</title>
+
+ <para> BookKeeper implements highly available logs, and it has been designed with write-ahead logging in mind. Besides high availability
+ due to the replicated nature of the service, it provides high throughput due to striping. As we write entries in a subset of bookies of an
+ ensemble and rotate writes across available quorums, we are able to increase throughput with the number of servers for both reads and writes.
+ Scalability is a property that is possible to achieve in this case due to the use of quorums. Other replication techniques, such as
+ state-machine replication, do not enable such a property.
+ </para>
+
+ <para> An application first creates a ledger before writing to bookies through a local BookKeeper client instance.
+ Upon creating a ledger, a BookKeeper client writes metadata about the ledger to ZooKeeper. Each ledger currently
+ has a single writer. This writer has to execute a close ledger operation before any other client can read from it.
+ If the writer of a ledger does not close a ledger properly because, for example, it has crashed before having the
+ opportunity of closing the ledger, then the next client that tries to open a ledger executes a procedure to recover
+ it. As closing a ledger consists essentially of writing the last entry written to a ledger to ZooKeeper, the recovery
+ procedure simply finds the last entry written correctly and writes it to ZooKeeper.
+ </para>
+
+ <para>
+ Note that currently this recovery procedure is executed automatically upon trying to open a ledger and no explicit action is necessary.
+ Although two clients may try to recover a ledger concurrently, only one will succeed: the first one that is able to create the close znode
+ for the ledger.
+ </para>
+ </section>
+
+ <section id="bk_basicComponents">
+ <title>BookKeeper elements and concepts</title>
+ <para>
+ BookKeeper uses four basic elements:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ <emphasis role="bold">Ledger</emphasis>: A ledger is a sequence of entries, and each entry is a sequence of bytes. Entries are
+ written sequentially to a ledger and at most once. Consequently, ledgers have append-only semantics;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <emphasis role="bold">BookKeeper client</emphasis>: A client runs along with a BookKeeper application, and it enables applications
+ to execute operations on ledgers, such as creating a ledger and writing to it;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <emphasis role="bold">Bookie</emphasis>: A bookie is a BookKeeper storage server. Bookies store the content of ledgers. For any given
+ ledger L, we call an <emphasis>ensemble</emphasis> the group of bookies storing the content of L. For performance, we store on
+ each bookie of an ensemble only a fragment of a ledger. That is, we stripe when writing entries to a ledger such that
+ each entry is written to a sub-group of bookies of the ensemble.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <emphasis role="bold">Metadata storage service</emphasis>: BookKeeper requires a metadata storage service to store information related
+ to ledgers and available bookies. We currently use ZooKeeper for such a task.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </section>
+
+ <section id="bk_initialDesign">
+ <title>BookKeeper initial design</title>
+ <para>
+ A set of bookies implements BookKeeper, and we use a quorum-based protocol to replicate data across the bookies.
+ There are basically two operations to an existing ledger: read and append. Here is the complete API list
+ (more detail <ulink url="bookkeeperProgrammer.html">
+ here</ulink>):
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Create ledger: creates a new empty ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Open ledger: opens an existing ledger for reading;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Add entry: adds a record to a ledger either synchronously or asynchronously;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Read entries: reads a sequence of entries from a ledger either synchronously or asynchronously.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ There is only a single client that can write to a ledger. Once that ledger is closed or the client fails,
+ no more entries can be added. (We take advantage of this behavior to provide our strong guarantees.)
+ There will not be gaps in the ledger. Fingers get broken, people get roughed up or end up in prison when
+ books are manipulated, so there is no deleting or changing of entries.
+ </para>
+
+ <figure>
+ <title>BookKeeper Overview</title>
+
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="images/bk-overview.jpg" width="3in" depth="3in" contentwidth="3in" contentdepth="3in" scalefit="0"/>
+ </imageobject>
+ </mediaobject>
+ </figure>
+
+ <para>
+ A simple use of BookKeeper is to implement a write-ahead transaction log. A server maintains an in-memory data structure
+ (with periodic snapshots, for example) and logs changes to that structure before it applies them. The application
+ server creates a ledger at startup and stores the ledger id and password in a well-known place (ZooKeeper, for example). When
+ it needs to make a change, the server adds an entry with the change information to the ledger and applies the change once
+ BookKeeper has added the entry successfully. The server can even use asyncAddEntry to queue up many changes for high change
+ throughput. BookKeeper meticulously logs the changes in order and calls the completion functions in order.
+ </para>
+
+ <para>
+ When the application server dies, a backup server will come online, get the last snapshot and then it will open the
+ ledger of the old server and read all the entries from the time the snapshot was taken. (Since it doesn't know the
+ last entry number it will use MAX_INTEGER). Once all the entries have been processed, it will close the ledger and
+ start a new one for its use.
+ </para>
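+
+ <para>
+ The write-ahead pattern described above can be sketched as follows. The
+ <computeroutput>AppendLog</computeroutput> interface and the in-memory class behind it are
+ hypothetical stand-ins for a ledger and are not the BookKeeper client API; snapshots, ledger
+ ids and passwords are left out. The sketch only shows the ordering: log the change first,
+ apply it once the append has succeeded, and replay the log on a backup server.
+ </para>
+
+ <programlisting><![CDATA[
```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; none of these types belong to BookKeeper. */
final class WriteAheadSketch {

    /** Hypothetical append-only log standing in for a ledger. */
    interface AppendLog {
        long append(String record);              // returns the entry number
        List<String> readFrom(long firstEntry);  // entries from firstEntry onwards
    }

    /** In-memory stand-in so that the sketch runs without any servers. */
    static final class InMemoryLog implements AppendLog {
        private final List<String> entries = new ArrayList<>();

        @Override
        public long append(String record) {
            entries.add(record);
            return entries.size() - 1;
        }

        @Override
        public List<String> readFrom(long firstEntry) {
            return new ArrayList<>(entries.subList((int) firstEntry, entries.size()));
        }
    }

    /** A server that keeps a map in memory and logs every change before applying it. */
    static final class Server {
        private final AppendLog log;
        final Map<String, String> state = new HashMap<>();

        Server(AppendLog log) {
            this.log = log;
        }

        void put(String key, String value) {
            log.append(key + "=" + value); // 1. record the change in the log
            state.put(key, value);         // 2. apply it only after the append returned
        }

        /** What a backup server does: replay every entry written since the snapshot. */
        void recover(long firstEntryAfterSnapshot) {
            for (String record : log.readFrom(firstEntryAfterSnapshot)) {
                int sep = record.indexOf('=');
                state.put(record.substring(0, sep), record.substring(sep + 1));
            }
        }
    }

    public static void main(String[] args) {
        AppendLog log = new InMemoryLog();
        Server primary = new Server(log);
        primary.put("a", "1");
        primary.put("b", "2");
        primary.put("a", "3");

        Server backup = new Server(log); // the primary "dies", a backup takes over
        backup.recover(0);               // no snapshot in this sketch, so replay from entry 0
        System.out.println(backup.state.equals(primary.state)); // true
    }
}
```
+ ]]></programlisting>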
+
+ <para>
+ A client library takes care of communicating with bookies and managing entry numbers. An entry has the following fields:
+ </para>
+
+ <table frame='all'><title>Entry fields</title>
+ <tgroup cols='3' align='left' colsep='1' rowsep='1'>
+ <colspec colname='Field'/>
+ <colspec colname='Type'/>
+ <colspec colname='Description'/>
+ <colspec colnum='5' colname='c5'/>
+ <thead>
+ <row>
+ <entry>Field</entry>
+ <entry>Type</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>Ledger number</entry>
+ <entry>long</entry>
+ <entry>The id of the ledger of this entry</entry>
+ </row>
+ <row>
+ <entry>Entry number</entry>
+ <entry>long</entry>
+ <entry>The id of this entry</entry>
+ </row>
+ <row>
+ <entry>last confirmed (<emphasis>LC</emphasis>)</entry>
+ <entry>long</entry>
+ <entry>id of the last recorded entry</entry>
+ </row>
+ <row>
+ <entry>data</entry>
+ <entry>byte[]</entry>
+ <entry>the entry data (supplied by application)</entry>
+ </row>
+ <row>
+ <entry>authentication code</entry>
+ <entry>byte[]</entry>
+ <entry>Message authentication code that includes all other fields of the entry</entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>
+ The client library generates a ledger entry. None of the fields are modified by the bookies and only the first three
+ fields are interpreted by the bookies.
+ </para>
+
+ <para>
+ To add to a ledger, the client generates the entry above using the ledger number. The entry number will be one more
+ than the last entry generated. The <emphasis>LC</emphasis> field contains the last entry that has been successfully recorded by BookKeeper.
+ If the client writes entries one at a time, <emphasis>LC</emphasis> is the last entry id. But, if the client is using asyncAddEntry, there
+ may be many entries in flight. An entry is considered recorded when both of the following conditions are met:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ the entry has been accepted by a quorum of bookies
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ all entries with a lower entry id have been accepted by a quorum of bookies
+ </para>
+ </listitem>
+ </itemizedlist>
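+
+ <para>
+ The two conditions above can be sketched in Java. This is a minimal illustration, not BookKeeper's client code:
+ the <computeroutput>LastConfirmedTracker</computeroutput> class and its methods are hypothetical, entry ids are assumed
+ to start at 0, and an <computeroutput>int</computeroutput> id is used instead of a <computeroutput>long</computeroutput> to keep the sketch short.
+ </para>
+
+ <programlisting>
+import java.util.BitSet;
+
+/** Hypothetical helper that tracks the last confirmed (LC) entry id. */
+public class LastConfirmedTracker {
+    // Bit i is set once entry i has been accepted by a quorum of bookies.
+    private final BitSet quorumAcked = new BitSet();
+    // Highest id such that it and every lower id are quorum-acked; -1 if none.
+    private int lastConfirmed = -1;
+
+    /** Records that a quorum accepted entryId and returns the updated LC. */
+    public int onQuorumAck(int entryId) {
+        quorumAcked.set(entryId);
+        // Advance LC across the contiguous run of acknowledged entries.
+        lastConfirmed = quorumAcked.nextClearBit(lastConfirmed + 1) - 1;
+        return lastConfirmed;
+    }
+
+    public int getLastConfirmed() {
+        return lastConfirmed;
+    }
+}
+ </programlisting>
+
+ <para>
+ If entries 0, 2 and 1 are acknowledged in that order, the tracked value stays at 0 after the second acknowledgement
+ and moves to 2 after the third, because entry 2 only counts once entry 1 has also reached a quorum.
+ </para>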
+
+ <para>
+ <emphasis>LC</emphasis> seems mysterious right now, but it is too early to explain how we use it; just smile and move on.
+ </para>
+
+ <para>
+ Once all the other fields have been filled in, the client generates an authentication code over all of the previous fields.
+ The entry is then sent to a quorum of bookies to be recorded. Any failures will result in the entry being sent to a new
+ quorum of bookies.
+ </para>
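+
+ <para>
+ One way to picture the authentication code is as a keyed MAC computed over the serialized fields. The sketch
+ below is an assumption made for illustration only: the <computeroutput>EntryMac</computeroutput> class, the HmacSHA1 algorithm,
+ the field order, and the big-endian encoding are not taken from BookKeeper's actual wire format.
+ </para>
+
+ <programlisting>
+import java.nio.ByteBuffer;
+import java.security.GeneralSecurityException;
+import javax.crypto.Mac;
+import javax.crypto.spec.SecretKeySpec;
+
+/** Hypothetical illustration of a MAC that covers every other entry field. */
+public class EntryMac {
+    public static byte[] compute(byte[] key, long ledgerId, long entryId,
+                                 long lastConfirmed, byte[] data) {
+        try {
+            Mac mac = Mac.getInstance("HmacSHA1");
+            mac.init(new SecretKeySpec(key, "HmacSHA1"));
+            // Serialize the three long fields, then append the application data.
+            ByteBuffer header = ByteBuffer.allocate(3 * Long.BYTES);
+            header.putLong(ledgerId).putLong(entryId).putLong(lastConfirmed);
+            mac.update(header.array());
+            mac.update(data);
+            return mac.doFinal();
+        } catch (GeneralSecurityException e) {
+            throw new IllegalStateException("HmacSHA1 unavailable or key rejected", e);
+        }
+    }
+}
+ </programlisting>
+
+ <para>
+ A reader that holds the same key can recompute the code and reject any entry whose stored code differs.
+ Because the code covers the ledger number, entry number and <emphasis>LC</emphasis> as well as the data,
+ changing any one of them invalidates the entry.
+ </para>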
+
+ <para>
+ To read, the client library initially contacts a bookie and starts requesting entries. If an entry is missing or
+ invalid (a bad MAC for example), the client will make a request to a different bookie. By using quorum writes,
+ as long as enough bookies are up we are guaranteed to eventually be able to read an entry.
+ </para>
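+
+ <para>
+ The read path can be sketched as a loop over the bookies that may hold the entry. The
+ <computeroutput>Bookie</computeroutput> interface and <computeroutput>FallbackReader</computeroutput> class are hypothetical stand-ins,
+ and the sketch assumes that a bookie reports a missing or invalid entry by returning <computeroutput>null</computeroutput>.
+ </para>
+
+ <programlisting>
+public class FallbackReader {
+    /** Hypothetical stand-in for a connection to one bookie. */
+    public interface Bookie {
+        /** Returns the verified entry data, or null if it is missing or invalid. */
+        byte[] readEntry(long ledgerId, long entryId);
+    }
+
+    /** Tries each bookie in turn; returns null only if none has a valid copy. */
+    public static byte[] read(Bookie[] bookies, long ledgerId, long entryId) {
+        for (Bookie bookie : bookies) {
+            byte[] data = bookie.readEntry(ledgerId, entryId);
+            if (data != null) {
+                return data;
+            }
+        }
+        return null;
+    }
+}
+ </programlisting>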
+
+ </section>
+
+ <section id="bk_metadata">
+ <title>BookKeeper metadata management</title>
+
+ <para>
+ Some metadata needs to be made available to BookKeeper clients:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ The available bookies;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The list of ledgers;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The list of bookies that have been used for a given ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The last entry of a ledger;
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ We maintain this information in ZooKeeper. Bookies use ephemeral nodes to indicate their availability. Clients
+ use znodes to track ledger creation and deletion and also to know the end of the ledger and the bookies that
+ were used to store the ledger. Bookies also watch the ledger list so that they can clean up ledgers that get deleted.
+ </para>
+
+ </section>
+
+ <section id="bk_closingOut">
+ <title>Closing out ledgers</title>
+
+ <para>
+ The process of closing out the ledger and finding its last entry is difficult due to the durability guarantees of BookKeeper:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ If an entry has been successfully recorded, it must be readable.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ If an entry is read once, it must always be available to be read.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ If the ledger was closed gracefully, ZooKeeper will have the last entry and everything will work well. But, if the
+ BookKeeper client that was writing the ledger dies, there is some recovery that needs to take place.
+ </para>
+
+ <para>
+ The problematic entries are the ones at the end of the ledger. There can be entries in flight when a BookKeeper client
+ dies. If an entry only reached one bookie, it should not be readable, since it would disappear if that bookie
+ failed. However, finding an entry on only one bookie does not mean that it was not recorded successfully; the other
+ bookies that recorded the entry might have failed.
+ </para>
+
+ <para>
+ The trick to making everything work is to have a correct idea of the last entry. We do it in roughly three steps:
+ </para>
+ <orderedlist>
+ <listitem>
+ <para>
+ Find the entry with the highest last recorded entry, <emphasis>LC</emphasis>;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Find the highest consecutively recorded entry, <emphasis>LR</emphasis>;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Make sure that all entries between <emphasis>LC</emphasis> and <emphasis>LR</emphasis> are on a quorum of bookies.
+ </para>
+ </listitem>
+
+ </orderedlist>
+ </section>
+ </section>
+</article>
\ No newline at end of file
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/bookkeeperProgrammer.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/bookkeeperProgrammer.xml b/zookeeper-docs/src/documentation/content/xdocs/bookkeeperProgrammer.xml
new file mode 100644
index 0000000..5f330e1
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/bookkeeperProgrammer.xml
@@ -0,0 +1,678 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_GettStartedGuide">
+ <title>BookKeeper Programmer's Guide</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This guide contains detailed information about using BookKeeper
+ for logging. It discusses the basic operations BookKeeper supports,
+ and how to create logs and perform basic read and write operations on these
+ logs.</para>
+ </abstract>
+ </articleinfo>
+ <section id="bk_GettingStarted">
+ <title>Programming with BookKeeper</title>
+
+ <itemizedlist>
+ <listitem>
+ <para><xref linkend="bk_instance" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="bk_createLedger" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="bk_writeLedger" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="bk_closeLedger" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="bk_openLedger" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="bk_readLedger" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="bk_deleteLedger" /></para>
+ </listitem>
+
+ </itemizedlist>
+
+ <section id="bk_instance">
+ <title> Instantiating BookKeeper.</title>
+ <para>
+ The first step in using BookKeeper is to instantiate a BookKeeper object:
+ </para>
+ <para>
+ <computeroutput>
+ org.apache.bookkeeper.client.BookKeeper
+ </computeroutput>
+ </para>
+
+ <para>
+ There are three BookKeeper constructors:
+ </para>
+
+ <para>
+ <computeroutput>
+ public BookKeeper(String servers)
+ throws KeeperException, IOException
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>servers</computeroutput> is a comma-separated list of ZooKeeper servers.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ <computeroutput>
+ public BookKeeper(ZooKeeper zk)
+ throws InterruptedException, KeeperException
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>zk</computeroutput> is a ZooKeeper object. This constructor is useful when
+ the application is also using ZooKeeper and wants to have a single instance of ZooKeeper.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+
+ <para>
+ <computeroutput>
+ public BookKeeper(ZooKeeper zk, ClientSocketChannelFactory channelFactory)
+ throws InterruptedException, KeeperException
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>zk</computeroutput> is a ZooKeeper object. This constructor is useful when
+ the application is also using ZooKeeper and wants to have a single instance of ZooKeeper.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>channelFactory</computeroutput> is a Netty channel factory object
+ (from the <computeroutput>org.jboss.netty.channel.socket</computeroutput> package).
+ </para>
+ </listitem>
+ </itemizedlist>
+
+
+
+ </section>
+
+ <section id="bk_createLedger">
+ <title> Creating a ledger. </title>
+
+ <para> Before writing entries to BookKeeper, it is necessary to create a ledger.
+ With the current BookKeeper API, it is possible to create a ledger either synchronously
+ or asynchronously. The following methods belong
+ to <computeroutput>org.apache.bookkeeper.client.BookKeeper</computeroutput>.
+ </para>
+
+ <para>
+ <emphasis role="bold">Synchronous call:</emphasis>
+ </para>
+
+ <para>
+ <computeroutput>
+ public LedgerHandle createLedger(int ensSize, int qSize, DigestType type, byte passwd[])
+ throws KeeperException, InterruptedException,
+ IOException, BKException
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>ensSize</computeroutput> is the number of bookies (ensemble size);
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>qSize</computeroutput> is the write quorum size;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>type</computeroutput> is the type of digest used with entries: either MAC or CRC32.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>passwd</computeroutput> is a password that authorizes the client to write to the
+ ledger being created.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ All further operations on a ledger are invoked through the <computeroutput>LedgerHandle</computeroutput>
+ object returned.
+ </para>
+
+ <para>
+ As a convenience, we provide a <computeroutput>createLedger</computeroutput> with default parameters (3,2,VERIFIABLE),
+ and the only two input parameters it requires are a digest type and a password.
+ </para>
+
+ <para>
+ <emphasis role="bold">Asynchronous call:</emphasis>
+ </para>
+
+ <para>
+ <computeroutput>
+ public void asyncCreateLedger(int ensSize,
+ int qSize,
+ DigestType type,
+ byte passwd[],
+ CreateCallback cb,
+ Object ctx
+ )
+ </computeroutput>
+ </para>
+
+ <para>
+ The parameters are the same as in the synchronous version, with the
+ exception of <computeroutput>cb</computeroutput> and <computeroutput>ctx</computeroutput>. <computeroutput>CreateCallback</computeroutput>
+ is an interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
+ a class implementing it has to implement a method called <computeroutput>createComplete</computeroutput>
+ that has the following signature:
+ </para>
+
+ <para>
+ <computeroutput>
+ void createComplete(int rc, LedgerHandle lh, Object ctx);
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKException</computeroutput> for a list);
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>lh</computeroutput> is a <computeroutput>LedgerHandle</computeroutput> object to manipulate a ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>ctx</computeroutput> is a control object for accountability purposes. It can be essentially any object the application is happy with.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ The <computeroutput>ctx</computeroutput> object passed as a parameter to the call to create a ledger
+ is the same one returned in the callback.
+ </para>
+ </section>
+
+ <section id="bk_writeLedger">
+ <title> Adding entries to a ledger. </title>
+ <para>
+ Once we have a ledger handle <computeroutput>lh</computeroutput> obtained through a call to create a ledger, we
+ can start writing entries. As with creating ledgers, we can write both synchronously and
+ asynchronously. The following methods belong
+ to <computeroutput>org.apache.bookkeeper.client.LedgerHandle</computeroutput>.
+ </para>
+
+ <para>
+ <emphasis role="bold">Synchronous call:</emphasis>
+ </para>
+
+ <para>
+ <computeroutput>
+ public long addEntry(byte[] data)
+ throws InterruptedException
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>data</computeroutput> is a byte array;
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ A call to <computeroutput>addEntry</computeroutput> returns the status of the operation (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list).
+ </para>
+
+ <para>
+ <emphasis role="bold">Asynchronous call:</emphasis>
+ </para>
+
+ <para>
+ <computeroutput>
+ public void asyncAddEntry(byte[] data, AddCallback cb, Object ctx)
+ </computeroutput>
+ </para>
+
+ <para>
+ It also takes a byte array as the sequence of bytes to be stored as an entry. Additionally, it takes
+ a callback object <computeroutput>cb</computeroutput> and a control object <computeroutput>ctx</computeroutput>. The callback object must implement
+ the <computeroutput>AddCallback</computeroutput> interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
+ a class implementing it has to implement a method called <computeroutput>addComplete</computeroutput>
+ that has the following signature:
+ </para>
+
+ <para>
+ <computeroutput>
+ void addComplete(int rc, LedgerHandle lh, long entryId, Object ctx);
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list);
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>lh</computeroutput> is a <computeroutput>LedgerHandle</computeroutput> object to manipulate a ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>entryId</computeroutput> is the identifier of the entry associated with this request;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>ctx</computeroutput> is a control object used for accountability purposes. It can be any object the application is happy with.
+ </para>
+ </listitem>
+ </itemizedlist>
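+
+ <para>
+ A common use of the control object is to let the caller wait for a batch of asynchronous adds. The sketch below
+ shows the pattern only: its <computeroutput>AddCallback</computeroutput> interface is a local stand-in that mirrors the
+ signature above with <computeroutput>Object</computeroutput> in place of <computeroutput>LedgerHandle</computeroutput>, so that it compiles
+ without the BookKeeper library, and treating 0 as the success code is an assumption to check against the return-code list.
+ </para>
+
+ <programlisting>
+import java.util.concurrent.CountDownLatch;
+
+public class AddTracker {
+    /** Local stand-in with the same shape as the addComplete signature. */
+    public interface AddCallback {
+        void addComplete(int rc, Object lh, long entryId, Object ctx);
+    }
+
+    /** Assumption: 0 means success; consult the real return-code list. */
+    static final int OK = 0;
+
+    /** Counts down the latch passed as ctx and remembers the first failure. */
+    public static class LatchCallback implements AddCallback {
+        private volatile int firstError = OK;
+
+        public void addComplete(int rc, Object lh, long entryId, Object ctx) {
+            if (rc != OK) {
+                if (firstError == OK) {
+                    firstError = rc;
+                }
+            }
+            // ctx is whatever the caller handed to the asynchronous call.
+            ((CountDownLatch) ctx).countDown();
+        }
+
+        public int getFirstError() {
+            return firstError;
+        }
+    }
+}
+ </programlisting>
+
+ <para>
+ The application would create a latch sized to the number of entries, pass it as <computeroutput>ctx</computeroutput> on every
+ asynchronous add, and then wait on the latch before inspecting the recorded error code.
+ </para>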
+ </section>
+
+ <section id="bk_closeLedger">
+ <title> Closing a ledger. </title>
+ <para>
+ Once a client is done writing, it closes the ledger. The following methods belong
+ to <computeroutput>org.apache.bookkeeper.client.LedgerHandle</computeroutput>.
+ </para>
+ <para>
+ <emphasis role="bold">Synchronous close:</emphasis>
+ </para>
+
+ <para>
+ <computeroutput>
+ public void close()
+ throws InterruptedException
+ </computeroutput>
+ </para>
+
+ <para>
+ It takes no input parameters.
+ </para>
+
+ <para>
+ <emphasis role="bold">Asynchronous close:</emphasis>
+ </para>
+ <para>
+ <computeroutput>
+ public void asyncClose(CloseCallback cb, Object ctx)
+ throws InterruptedException
+ </computeroutput>
+ </para>
+
+ <para>
+ It takes a callback object <computeroutput>cb</computeroutput> and a control object <computeroutput>ctx</computeroutput>. The callback object must implement
+ the <computeroutput>CloseCallback</computeroutput> interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
+ a class implementing it has to implement a method called <computeroutput>closeComplete</computeroutput>
+ that has the following signature:
+ </para>
+
+ <para>
+ <computeroutput>
+ void closeComplete(int rc, LedgerHandle lh, Object ctx)
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list);
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>lh</computeroutput> is a <computeroutput>LedgerHandle</computeroutput> object to manipulate a ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>ctx</computeroutput> is a control object used for accountability purposes.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ </section>
+
+ <section id="bk_openLedger">
+ <title> Opening a ledger. </title>
+ <para>
+ To read from a ledger, a client must open it first. The following methods belong
+ to <computeroutput>org.apache.bookkeeper.client.BookKeeper</computeroutput>.
+ </para>
+
+ <para>
+ <emphasis role="bold">Synchronous open:</emphasis>
+ </para>
+
+ <para>
+ <computeroutput>
+ public LedgerHandle openLedger(long lId, DigestType type, byte passwd[])
+ throws InterruptedException, BKException
+ </computeroutput>
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>lId</computeroutput> is the ledger identifier;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>type</computeroutput> is the type of digest used with entries: either MAC or CRC32.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>passwd</computeroutput> is a password to access the ledger (used only in the case of <computeroutput>VERIFIABLE</computeroutput> ledgers);
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ <emphasis role="bold">Asynchronous open:</emphasis>
+ </para>
+ <para>
+ <computeroutput>
+ public void asyncOpenLedger(long lId, DigestType type, byte passwd[], OpenCallback cb, Object ctx)
+ </computeroutput>
+ </para>
+
+ <para>
+ It also takes a ledger identifier, a digest type, and a password. Additionally, it takes a callback object
+ <computeroutput>cb</computeroutput> and a control object <computeroutput>ctx</computeroutput>. The callback object must implement
+ the <computeroutput>OpenCallback</computeroutput> interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
+ a class implementing it has to implement a method called <computeroutput>openComplete</computeroutput>
+ that has the following signature:
+ </para>
+
+ <para>
+ <computeroutput>
+ public void openComplete(int rc, LedgerHandle lh, Object ctx)
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list);
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>lh</computeroutput> is a <computeroutput>LedgerHandle</computeroutput> object to manipulate a ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>ctx</computeroutput> is a control object used for accountability purposes.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </section>
+
+ <section id="bk_readLedger">
+ <title> Reading from ledger </title>
+ <para>
+ Read calls may request one or more consecutive entries. The following methods belong
+ to <computeroutput>org.apache.bookkeeper.client.LedgerHandle</computeroutput>.
+ </para>
+
+ <para>
+ <emphasis role="bold">Synchronous read:</emphasis>
+ </para>
+
+ <para>
+ <computeroutput>
+ public Enumeration&lt;LedgerEntry&gt; readEntries(long firstEntry, long lastEntry)
+ throws InterruptedException, BKException
+ </computeroutput>
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>firstEntry</computeroutput> is the identifier of the first entry in the sequence of entries to read;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>lastEntry</computeroutput> is the identifier of the last entry in the sequence of entries to read.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ <emphasis role="bold">Asynchronous read:</emphasis>
+ </para>
+ <para>
+ <computeroutput>
+ public void asyncReadEntries(long firstEntry,
+ long lastEntry, ReadCallback cb, Object ctx)
+ throws BKException, InterruptedException
+ </computeroutput>
+ </para>
+
+ <para>
+ It also takes first and last entry identifiers. Additionally, it takes a callback object
+ <computeroutput>cb</computeroutput> and a control object <computeroutput>ctx</computeroutput>. The callback object must implement
+ the <computeroutput>ReadCallback</computeroutput> interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
+ a class implementing it has to implement a method called <computeroutput>readComplete</computeroutput>
+ that has the following signature:
+ </para>
+
+ <para>
+ <computeroutput>
+ void readComplete(int rc, LedgerHandle lh, Enumeration&lt;LedgerEntry&gt; seq, Object ctx)
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list);
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>lh</computeroutput> is a <computeroutput>LedgerHandle</computeroutput> object to manipulate a ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>seq</computeroutput> is an <computeroutput>Enumeration&lt;LedgerEntry&gt;</computeroutput> object containing the list of entries requested;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>ctx</computeroutput> is a control object used for accountability purposes.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </section>
+
+ <section id="bk_deleteLedger">
+ <title> Deleting a ledger </title>
+ <para>
+ Once a client is done with a ledger and is sure that nobody will ever need to read from it again, they can delete the ledger.
+ The following methods belong to <computeroutput>org.apache.bookkeeper.client.BookKeeper</computeroutput>.
+ </para>
+
+ <para>
+ <emphasis role="bold">Synchronous delete:</emphasis>
+ </para>
+
+ <para>
+ <computeroutput>
+ public void deleteLedger(long lId) throws InterruptedException, BKException
+ </computeroutput>
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>lId</computeroutput> is the ledger identifier;
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ <emphasis role="bold">Asynchronous delete:</emphasis>
+ </para>
+ <para>
+ <computeroutput>
+ public void asyncDeleteLedger(long lId, DeleteCallback cb, Object ctx)
+ </computeroutput>
+ </para>
+
+ <para>
+ It takes a ledger identifier. Additionally, it takes a callback object
+ <computeroutput>cb</computeroutput> and a control object <computeroutput>ctx</computeroutput>. The callback object must implement
+ the <computeroutput>DeleteCallback</computeroutput> interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
+ a class implementing it has to implement a method called <computeroutput>deleteComplete</computeroutput>
+ that has the following signature:
+ </para>
+
+ <para>
+ <computeroutput>
+ void deleteComplete(int rc, Object ctx)
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list);
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>ctx</computeroutput> is a control object used for accountability purposes.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </section>
+ </section>
+</article>
\ No newline at end of file
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/bookkeeperStarted.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/bookkeeperStarted.xml b/zookeeper-docs/src/documentation/content/xdocs/bookkeeperStarted.xml
new file mode 100644
index 0000000..74f6f7e
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/bookkeeperStarted.xml
@@ -0,0 +1,208 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_GettStartedGuide">
+ <title>BookKeeper Getting Started Guide</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This guide contains detailed information about using BookKeeper
+ for logging. It discusses the basic operations BookKeeper supports,
+ and how to create logs and perform basic read and write operations on these
+ logs.</para>
+ </abstract>
+ </articleinfo>
+ <section id="bk_GettingStarted">
+ <title>Getting Started: Setting up BookKeeper to write logs.</title>
+
+ <para>This document contains information to get you started quickly with
+ BookKeeper. It is aimed primarily at developers who want to try it out, and
+ contains installation instructions for a minimal BookKeeper setup
+ and a short programming example. For further programming detail, please refer to
+ <ulink url="bookkeeperProgrammer.html">BookKeeper Programmer's Guide</ulink>.
+ </para>
+
+ <section id="bk_Prerequisites">
+ <title>Pre-requisites</title>
+ <para>See <ulink url="bookkeeperConfig.html#bk_sysReq">
+ System Requirements</ulink> in the Admin guide.</para>
+ </section>
+
+ <section id="bk_Download">
+ <title>Download</title>
+ <para> BookKeeper is distributed along with ZooKeeper. To get a ZooKeeper distribution,
+ download a recent
+ <ulink url="http://zookeeper.apache.org/releases.html">
+ stable</ulink> release from one of the Apache Download
+ Mirrors.</para>
+ </section>
+
+ <section id="bk_localBK">
+ <title>LocalBookKeeper</title>
+ <para> Under org.apache.bookkeeper.util, you'll find a Java program
+ called LocalBookKeeper.java that sets you up to run BookKeeper on a
+ single machine. This is far from ideal from a performance perspective,
+ but the program is useful for both test and educational purposes.
+ </para>
+ </section>
+
+ <section id="bk_setupBookies">
+ <title>Setting up bookies</title>
+ <para> If you're bold and you want more than just running things locally, then
+ you'll need to run bookies on different servers. You'll need at least three bookies
+ to start with.
+ </para>
+
+ <para>
+ For each bookie, we need to execute a command like the following:
+ </para>
+
+ <para><computeroutput>
+ java -cp .:./zookeeper-&lt;version&gt;-bookkeeper.jar:./zookeeper-&lt;version&gt;.jar\
+ :lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar -Dlog4j.configuration=log4j.properties\
+ org.apache.bookkeeper.proto.BookieServer 3181 127.0.0.1:2181 /path_to_log_device/\
+ /path_to_ledger_device/
+ </computeroutput></para>
+
+ <para> "/path_to_log_device/" and "/path_to_ledger_device/" are different paths. Also, port 3181
+ is the port that a bookie listens on for connection requests from clients. 127.0.0.1:2181 is the hostname:port
+ for the ZooKeeper server. In this example, the standalone ZooKeeper server is running locally on port 2181.
+ If we had multiple ZooKeeper servers, this parameter would be a comma-separated list of all the hostname:port
+ values corresponding to them.
+ </para>
+ </section>
+
+ <section id="bk_setupZK">
+ <title>Setting up ZooKeeper</title>
+ <para> ZooKeeper stores metadata on behalf of BookKeeper clients and bookies. To get a minimal
+ ZooKeeper installation to work with BookKeeper, we can set up one server running in
+ standalone mode. Once we have the server running, we need to create a few znodes:
+ </para>
+
+ <orderedlist>
+ <listitem>
+ <para><computeroutput>
+ /ledgers
+ </computeroutput></para>
+ </listitem>
+
+ <listitem>
+ <para><computeroutput>
+ /ledgers/available
+ </computeroutput></para>
+ </listitem>
+
+ <listitem>
+ <para> For each bookie, we add one znode such that the name of the znode is the
+ concatenation of the machine name and the port number that the bookie is
+ listening on. For example, if a bookie is running on bookie.foo.com and is listening
+ on port 3181, we add a znode
+ <computeroutput>/ledgers/available/bookie.foo.com:3181</computeroutput>.
+ </para>
+ </listitem>
+ </orderedlist>
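+
+ <para>
+ The naming rule in the last step can be captured in a small helper. The <computeroutput>BookieZnode</computeroutput>
+ class is hypothetical and only builds the path string; creating the znode itself is left to whichever
+ ZooKeeper client the installation uses.
+ </para>
+
+ <programlisting>
+public class BookieZnode {
+    /** Parent znode under which each bookie is registered. */
+    static final String AVAILABLE_ROOT = "/ledgers/available";
+
+    /** Returns the znode path for a bookie, e.g. host "h" and port 1 give ".../h:1". */
+    public static String pathFor(String host, int port) {
+        return AVAILABLE_ROOT + "/" + host + ":" + port;
+    }
+}
+ </programlisting>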
+ </section>
+
+ <section id="bk_example">
+ <title>Example</title>
+ <para>
+ In the following excerpt of code, we:
+ </para>
+
+ <orderedlist>
+ <listitem>
+ <para>
+ Create a ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Write to the ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Close the ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Open the same ledger for reading;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Read from the ledger;
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Close the ledger again;
+ </para>
+ </listitem>
+ </orderedlist>
+
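+ <programlisting>
+// Excerpt: assumes imports from java.nio, java.util and org.apache.bookkeeper.client,
+// and a ZooKeeper server reachable at 127.0.0.1:2181. DigestType is the digest
+// enumeration named in the method signatures of the Programmer's Guide.
+BookKeeper bkc = new BookKeeper("127.0.0.1:2181");
+byte[] ledgerPassword = "some-password".getBytes();
+List&lt;byte[]&gt; entries = new ArrayList&lt;byte[]&gt;();
+
+// Create a ledger with the default ensemble and quorum sizes.
+LedgerHandle lh = bkc.createLedger(DigestType.MAC, ledgerPassword);
+long ledgerId = lh.getId();
+ByteBuffer entry = ByteBuffer.allocate(4);
+
+// Write ten 4-byte entries, keeping a copy of each for later comparison.
+for(int i = 0; i &lt; 10; i++){
+    entry.putInt(i);
+    entry.position(0);
+    entries.add(entry.array().clone());
+    lh.addEntry(entry.array());
+}
+lh.close();
+
+// Reopen the same ledger for reading.
+lh = bkc.openLedger(ledgerId, DigestType.MAC, ledgerPassword);
+
+Enumeration&lt;LedgerEntry&gt; ls = lh.readEntries(0, 9);
+int i = 0;
+while(ls.hasMoreElements()){
+    // Value originally written at position i.
+    ByteBuffer origbb = ByteBuffer.wrap(entries.get(i++));
+    Integer origEntry = origbb.getInt();
+    // Value read back from the ledger; it should equal origEntry.
+    ByteBuffer result = ByteBuffer.wrap(ls.nextElement().getEntry());
+    Integer retrEntry = result.getInt();
+}
+lh.close();
+ </programlisting>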
+ </section>
+ </section>
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/bookkeeperStream.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/bookkeeperStream.xml b/zookeeper-docs/src/documentation/content/xdocs/bookkeeperStream.xml
new file mode 100644
index 0000000..9db605a
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/bookkeeperStream.xml
@@ -0,0 +1,331 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_Stream">
+ <title>Streaming with BookKeeper</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This guide contains detailed information about how to stream bytes
+ on top of BookKeeper. It motivates and discusses the basic stream
+ operations currently supported.</para>
+ </abstract>
+ </articleinfo>
+ <section id="bk_StreamSummary">
+ <title>Summary</title>
+
+ <para>
+ When using the BookKeeper API, an application has to split the data to write into entries, each
+ entry being a byte array. This is natural for many applications. For example, when using BookKeeper
+ for write-ahead logging, an application typically wants to write the modifications corresponding
+ to a command or a transaction. Some other applications, however, might not have a natural boundary
+ for entries, and may prefer to write and read streams of bytes. This is exactly the purpose of the
+ stream API we have implemented on top of BookKeeper.
+ </para>
+
+ <para>
+ The stream API is implemented in the package <computeroutput>Streaming</computeroutput>, and it contains two main classes: <computeroutput>LedgerOutputStream</computeroutput> and
+ <computeroutput>LedgerInputStream</computeroutput>. The class names are indicative of what they do.
+ </para>
+ </section>
+
+ <section id="bk_LedgerOutputStream">
+ <title>Writing a stream of bytes</title>
+ <para>
+ Class <computeroutput>LedgerOutputStream</computeroutput> implements two constructors and five public methods:
+ </para>
+
+ <para>
+ <computeroutput>
+ public LedgerOutputStream(LedgerHandle lh)
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>lh</computeroutput> is a ledger handle for a previously created and open ledger.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ <computeroutput>
+ public LedgerOutputStream(LedgerHandle lh, int size)
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>lh</computeroutput> is a ledger handle for a previously created and open ledger.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>size</computeroutput> is the size of the byte buffer to store written bytes before flushing.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+
+ <para>
+ <emphasis role="bold">Closing a stream.</emphasis> This call closes the stream by flushing the write buffer.
+ </para>
+ <para>
+ <computeroutput>
+ public void close()
+ </computeroutput>
+ </para>
+
+ <para>
+ which has no parameters.
+ </para>
+
+ <para>
+ <emphasis role="bold">Flushing a stream.</emphasis> This call essentially flushes the write buffer.
+ </para>
+ <para>
+ <computeroutput>
+ public synchronized void flush()
+ </computeroutput>
+ </para>
+
+ <para>
+ which has no parameters.
+ </para>
+
+ <para>
+ <emphasis role="bold">Writing bytes.</emphasis> There are three calls for writing bytes to a stream.
+ </para>
+
+ <para>
+ <computeroutput>
+ public synchronized void write(byte[] b)
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>b</computeroutput> is an array of bytes to write.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ <computeroutput>
+ public synchronized void write(byte[] b, int off, int len)
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>b</computeroutput> is an array of bytes to write.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>off</computeroutput> is the offset in <computeroutput>b</computeroutput> at which the bytes to write start.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>len</computeroutput> is the number of bytes to write.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ <computeroutput>
+ public synchronized void write(int b)
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>b</computeroutput> contains a byte to write. The method writes only the least significant byte of the four-byte integer.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </section>
+
+ <section id="bk_LedgerInputStream">
+ <title>Reading a stream of bytes</title>
+
+ <para>
+ Class <computeroutput>LedgerInputStream</computeroutput> implements two constructors and four public methods:
+ </para>
+
+ <para>
+ <computeroutput>
+ public LedgerInputStream(LedgerHandle lh)
+ throws BKException, InterruptedException
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>lh</computeroutput> is a ledger handle for a previously created and open ledger.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ <computeroutput>
+ public LedgerInputStream(LedgerHandle lh, int size)
+ throws BKException, InterruptedException
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>lh</computeroutput> is a ledger handle for a previously created and open ledger.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>size</computeroutput> is the size of the byte buffer to store bytes that the application
+ will eventually read.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ <emphasis role="bold">Closing.</emphasis> There is one call to close an input stream, but the call
+ is currently empty and the application is responsible for closing the ledger handle.
+ </para>
+ <para>
+ <computeroutput>
+ public void close()
+ </computeroutput>
+ </para>
+
+ <para>
+ which has no parameters.
+ </para>
+
+ <para>
+ <emphasis role="bold">Reading.</emphasis> There are three calls to read from the stream.
+ </para>
+ <para>
+ <computeroutput>
+ public synchronized int read()
+ throws IOException
+ </computeroutput>
+ </para>
+
+ <para>
+ which has no parameters.
+ </para>
+
+ <para>
+ <computeroutput>
+ public synchronized int read(byte[] b)
+ throws IOException
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>b</computeroutput> is the byte array that receives the bytes read.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+
+ <para>
+ <computeroutput>
+ public synchronized int read(byte[] b, int off, int len)
+ throws IOException
+ </computeroutput>
+ </para>
+
+ <para>
+ where:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <computeroutput>b</computeroutput> is the byte array that receives the bytes read.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>off</computeroutput> is an offset for byte array <computeroutput>b</computeroutput>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <computeroutput>len</computeroutput> is the length, in bytes, of the data to read into <computeroutput>b</computeroutput>.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+
+ </section>
+ </article>
\ No newline at end of file
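Note on bookkeeperStream.xml: the guide describes an output stream that collects written bytes in a buffer of a given size and writes them out on flush or close. The following is a minimal, stdlib-only sketch of that idea. FakeLedger, BufferingStream, store, and readAll are hypothetical names invented for illustration; this is not the LedgerOutputStream implementation, and the mapping of one flush to one ledger entry is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EntryBufferingDemo {
    /** Hypothetical stand-in for a ledger: an in-memory, append-only list of entries. */
    static final class FakeLedger {
        final List<byte[]> entries = new ArrayList<>();
        void addEntry(byte[] data) { entries.add(data); }
    }

    /** Buffers written bytes and appends one entry per non-empty flush. */
    static final class BufferingStream extends OutputStream {
        private final FakeLedger ledger;
        private final byte[] buf;
        private int count = 0;

        BufferingStream(FakeLedger ledger, int size) {
            if (size <= 0) throw new IllegalArgumentException("size must be positive");
            this.ledger = ledger;
            this.buf = new byte[size];
        }

        @Override
        public synchronized void write(int b) {
            if (count == buf.length) flush(); // buffer full: emit it as an entry
            buf[count++] = (byte) b;          // only the low-order byte is kept
        }

        @Override
        public synchronized void flush() {
            if (count > 0) {
                ledger.addEntry(Arrays.copyOf(buf, count));
                count = 0;
            }
        }

        @Override
        public void close() { flush(); }
    }

    /** Writes data through a BufferingStream of the given size and returns the ledger. */
    static FakeLedger store(byte[] data, int size) {
        FakeLedger ledger = new FakeLedger();
        BufferingStream s = new BufferingStream(ledger, size);
        for (byte b : data) s.write(b);
        s.close();
        return ledger;
    }

    /** Reassembles the byte stream by concatenating the entries in order. */
    static byte[] readAll(FakeLedger ledger) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] e : ledger.entries) out.write(e, 0, e.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = "hello, ledger".getBytes(StandardCharsets.UTF_8);
        FakeLedger ledger = store(data, 4);
        System.out.println(ledger.entries.size()); // prints 4
        System.out.println(new String(readAll(ledger), StandardCharsets.UTF_8)); // prints hello, ledger
    }
}
```

In this sketch the reader never sees entry boundaries: it gets back the same byte sequence whatever buffer size the writer chose.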
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/index.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/index.xml b/zookeeper-docs/src/documentation/content/xdocs/index.xml
new file mode 100644
index 0000000..8ed4702
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/index.xml
@@ -0,0 +1,98 @@
+<?xml version="1.0"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
+
+<document>
+
+ <header>
+ <title>ZooKeeper: Because Coordinating Distributed Systems is a Zoo</title>
+ </header>
+
+ <body>
+ <p>ZooKeeper is a high-performance coordination service for
+ distributed applications. It exposes common services - such as
+ naming, configuration management, synchronization, and group
+ services - in a simple interface so you don't have to write them
+ from scratch. You can use it off-the-shelf to implement
+ consensus, group management, leader election, and presence
+ protocols. And you can build on it for your own, specific needs.
+ </p>
+
+ <p>
+ The following documents describe concepts and procedures to get
+ you started using ZooKeeper. If you have more questions, please
+ ask the <a href="ext:lists">mailing list</a> or browse the
+ archives.
+ </p>
+ <ul>
+
+ <li><strong>ZooKeeper Overview</strong><p>Technical Overview Documents for Client Developers, Administrators, and Contributors</p>
+ <ul><li><a href="zookeeperOver.html">Overview</a> - a bird's eye view of ZooKeeper, including design concepts and architecture</li>
+ <li><a href="zookeeperStarted.html">Getting Started</a> - a tutorial-style guide for developers to install, run, and program to ZooKeeper</li>
+ <li><a href="ext:relnotes">Release Notes</a> - new developer- and user-facing features, improvements, and incompatibilities</li>
+ </ul>
+ </li>
+
+ <li><strong>Developers</strong><p> Documents for Developers using the ZooKeeper Client API</p>
+ <ul>
+ <li><a href="ext:api/index">API Docs</a> - the technical reference to ZooKeeper Client APIs</li>
+ <li><a href="zookeeperProgrammers.html">Programmer's Guide</a> - a client application developer's guide to ZooKeeper</li>
+ <li><a href="javaExample.html">ZooKeeper Java Example</a> - a simple ZooKeeper client application, written in Java</li>
+ <li><a href="zookeeperTutorial.html">Barrier and Queue Tutorial</a> - sample implementations of barriers and queues</li>
+ <li><a href="recipes.html">ZooKeeper Recipes</a> - higher level solutions to common problems in distributed applications</li>
+ </ul>
+ </li>
+
+ <li><strong>Administrators &amp; Operators</strong> <p> Documents for Administrators and Operations Engineers of ZooKeeper Deployments</p>
+ <ul>
+ <li><a href="zookeeperAdmin.html">Administrator's Guide</a> - a guide for system administrators and anyone else who might deploy ZooKeeper</li>
+ <li><a href="zookeeperQuotas.html">Quota Guide</a> - a guide for system administrators on Quotas in ZooKeeper. </li>
+ <li><a href="zookeeperJMX.html">JMX</a> - how to enable JMX in ZooKeeper</li>
+ <li><a href="zookeeperHierarchicalQuorums.html">Hierarchical quorums</a></li>
+ <li><a href="zookeeperObservers.html">Observers</a> - non-voting ensemble members that easily improve ZooKeeper's scalability</li>
+ </ul>
+ </li>
+
+ <li><strong>Contributors</strong><p> Documents for Developers Contributing to the ZooKeeper Open Source Project</p>
+ <ul>
+ <li><a href="zookeeperInternals.html">ZooKeeper Internals</a> - assorted topics on the inner workings of ZooKeeper</li>
+ </ul>
+ </li>
+
+ <li><strong>Miscellaneous ZooKeeper Documentation</strong>
+ <ul>
+ <li><a href="ext:wiki">Wiki</a></li>
+ <li><a href="ext:faq">FAQ</a></li>
+ </ul>
+ </li>
+
+ <li><strong>BookKeeper Documentation</strong>
+ <p> BookKeeper is a highly available system that implements high-performance write-ahead logging. It uses ZooKeeper for metadata,
+ which is the main reason it is a ZooKeeper contrib.
+ </p>
+ <ul>
+ <li><a href="bookkeeperOverview.html">henn, what's it again?</a></li>
+ <li><a href="bookkeeperStarted.html">Ok, now how do I try it out?</a></li>
+ <li><a href="bookkeeperProgrammer.html">Awesome, but how do I integrate it with my app?</a></li>
+ <li><a href="bookkeeperStream.html">Can I stream bytes instead of entries?</a></li>
+ </ul>
+ </li>
+ </ul>
+ </body>
+
+</document>
[11/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/bookkeeperStarted.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/bookkeeperStarted.xml b/src/docs/src/documentation/content/xdocs/bookkeeperStarted.xml
deleted file mode 100644
index 74f6f7e..0000000
--- a/src/docs/src/documentation/content/xdocs/bookkeeperStarted.xml
+++ /dev/null
@@ -1,208 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_GettStartedGuide">
- <title>BookKeeper Getting Started Guide</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This guide contains detailed information about using BookKeeper
- for logging. It discusses the basic operations BookKeeper supports,
- and how to create logs and perform basic read and write operations on these
- logs.</para>
- </abstract>
- </articleinfo>
- <section id="bk_GettingStarted">
- <title>Getting Started: Setting up BookKeeper to write logs.</title>
-
- <para>This document contains information to get you started quickly with
- BookKeeper. It is aimed primarily at developers willing to try it out, and
- contains simple installation instructions for a simple BookKeeper installation
- and a simple programming example. For further programming detail, please refer to
- <ulink url="bookkeeperProgrammer.html">BookKeeper Programmer's Guide</ulink>.
- </para>
-
- <section id="bk_Prerequisites">
- <title>Pre-requisites</title>
- <para>See <ulink url="bookkeeperConfig.html#bk_sysReq">
- System Requirements</ulink> in the Admin guide.</para>
- </section>
-
- <section id="bk_Download">
- <title>Download</title>
- <para> BookKeeper is distributed along with ZooKeeper. To get a ZooKeeper distribution,
- download a recent
- <ulink url="http://zookeeper.apache.org/releases.html">
- stable</ulink> release from one of the Apache Download
- Mirrors.</para>
- </section>
-
- <section id="bk_localBK">
- <title>LocalBookKeeper</title>
- <para> Under org.apache.bookkeeper.util, you'll find a java program
- called LocalBookKeeper.java that sets you up to run BookKeeper on a
- single machine. This is far from ideal from a performance perspective,
- but the program is useful for both test and educational purposes.
- </para>
- </section>
-
- <section id="bk_setupBookies">
- <title>Setting up bookies</title>
- <para> If you're bold and you want more than just running things locally, then
- you'll need to run bookies in different servers. You'll need at least three bookies
- to start with.
- </para>
-
- <para>
- For each bookie, we need to execute a command like the following:
- </para>
-
- <para><computeroutput>
- java -cp .:./zookeeper-<version>-bookkeeper.jar:./zookeeper-<version>.jar\
- :lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar -Dlog4j.configuration=log4j.properties\
- org.apache.bookkeeper.proto.BookieServer 3181 127.0.0.1:2181 /path_to_log_device/\
- /path_to_ledger_device/
- </computeroutput></para>
-
- <para> "/path_to_log_device/" and "/path_to_ledger_device/" are different paths. Also, port 3181
- is the port that a bookie listens on for connection requests from clients. 127.0.0.1:2181 is the hostname:port
- for the ZooKeeper server. In this example, the standalone ZooKeeper server is running locally on port 2181.
- If we had multiple ZooKeeper servers, this parameter would be a comma separated list of all the hostname:port
- values corresponding to them.
- </para>
- </section>
-
- <section id="bk_setupZK">
- <title>Setting up ZooKeeper</title>
- <para> ZooKeeper stores metadata on behalf of BookKeeper clients and bookies. To get a minimal
- ZooKeeper installation to work with BookKeeper, we can set up one server running in
- standalone mode. Once we have the server running, we need to create a few znodes:
- </para>
-
- <orderedlist>
- <listitem>
- <para><computeroutput>
- /ledgers
- </computeroutput></para>
- </listitem>
-
- <listitem>
- <para><computeroutput>
- /ledgers/available
- </computeroutput></para>
- </listitem>
-
- <listitem>
- <para> For each bookie, we add one znode such that the name of the znode is the
- concatenation of the machine name and the port number that the bookie is
- listening on. For example, if a bookie is running on bookie.foo.com an is listening
- on port 3181, we add a znode
- <computeroutput>/ledgers/available/bookie.foo.com:3181</computeroutput>.
- </para>
- </listitem>
- </orderedlist>
- </section>
-
- <section id="bk_example">
- <title>Example</title>
- <para>
- In the following excerpt of code, we:
- </para>
-
- <orderedlist>
- <listitem>
- <para>
- Create a ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Write to the ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Close the ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Open the same ledger for reading;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Read from the ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Close the ledger again;
- </para>
- </listitem>
- </orderedlist>
-
- <programlisting>
-LedgerHandle lh = bkc.createLedger(ledgerPassword);
-ledgerId = lh.getId();
-ByteBuffer entry = ByteBuffer.allocate(4);
-
-for(int i = 0; i < 10; i++){
- entry.putInt(i);
- entry.position(0);
- entries.add(entry.array());
- lh.addEntry(entry.array());
-}
-lh.close();
-lh = bkc.openLedger(ledgerId, ledgerPassword);
-
-Enumeration<LedgerEntry> ls = lh.readEntries(0, 9);
-int i = 0;
-while(ls.hasMoreElements()){
- ByteBuffer origbb = ByteBuffer.wrap(
- entries.get(i++));
- Integer origEntry = origbb.getInt();
- ByteBuffer result = ByteBuffer.wrap(
- ls.nextElement().getEntry());
-
- Integer retrEntry = result.getInt();
-}
-lh.close();
- </programlisting>
- </section>
- </section>
-</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/bookkeeperStream.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/bookkeeperStream.xml b/src/docs/src/documentation/content/xdocs/bookkeeperStream.xml
deleted file mode 100644
index 9db605a..0000000
--- a/src/docs/src/documentation/content/xdocs/bookkeeperStream.xml
+++ /dev/null
@@ -1,331 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_Stream">
- <title>Streaming with BookKeeper</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This guide contains detailed information about using how to stream bytes
- on top of BookKeeper. It essentially motivates and discusses the basic stream
- operations currently supported.</para>
- </abstract>
- </articleinfo>
- <section id="bk_StreamSummary">
- <title>Summary</title>
-
- <para>
- When using the BookKeeper API, an application has to split the data to write into entries, each
- entry being a byte array. This is natural for many applications. For example, when using BookKeeper
- for write-ahead logging, an application typically wants to write the modifications corresponding
- to a command or a transaction. Some other applications, however, might not have a natural boundary
- for entries, and may prefer to write and read streams of bytes. This is exactly the purpose of the
- stream API we have implemented on top of BookKeeper.
- </para>
-
- <para>
- The stream API is implemented in the package <computeroutput>Streaming</computeroutput>, and it contains two main classes: <computeroutput>LedgerOutputStream</computeroutput> and
- <computeroutput>LedgerInputStream</computeroutput>. The class names are indicative of what they do.
- </para>
- </section>
-
- <section id="bk_LedgerOutputStream">
- <title>Writing a stream of bytes</title>
- <para>
- Class <computeroutput>LedgerOutputStream</computeroutput> implements two constructors and five public methods:
- </para>
-
- <para>
- <computeroutput>
- public LedgerOutputStream(LedgerHandle lh)
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>lh</computeroutput> is a ledger handle for a previously created and open ledger.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- <computeroutput>
- public LedgerOutputStream(LedgerHandle lh, int size)
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>lh</computeroutput> is a ledger handle for a previously created and open ledger.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>size</computeroutput> is the size of the byte buffer to store written bytes before flushing.
- </para>
- </listitem>
- </itemizedlist>
-
-
- <para>
- <emphasis role="bold">Closing a stream.</emphasis> This call closes the stream by flushing the write buffer.
- </para>
- <para>
- <computeroutput>
- public void close()
- </computeroutput>
- </para>
-
- <para>
- which has no parameters.
- </para>
-
- <para>
- <emphasis role="bold">Flushing a stream.</emphasis> This call essentially flushes the write buffer.
- </para>
- <para>
- <computeroutput>
- public synchronized void flush()
- </computeroutput>
- </para>
-
- <para>
- which has no parameters.
- </para>
-
- <para>
- <emphasis role="bold">Writing bytes.</emphasis> There are three calls for writing bytes to a stream.
- </para>
-
- <para>
- <computeroutput>
- public synchronized void write(byte[] b)
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>b</computeroutput> is an array of bytes to write.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- <computeroutput>
- public synchronized void write(byte[] b, int off, int len)
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>b</computeroutput> is an array of bytes to write.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>off</computeroutput> is a buffer offset.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>len</computeroutput> is the length to write.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- <computeroutput>
- public synchronized void write(int b)
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>b</computeroutput> contains a byte to write. The method writes the least significant byte of the integer four bytes.
- </para>
- </listitem>
- </itemizedlist>
- </section>
-
- <section id="bk_LedgerInputStream">
- <title>Reading a stream of bytes</title>
-
- <para>
- Class <computeroutput>LedgerOutputStream</computeroutput> implements two constructors and four public methods:
- </para>
-
- <para>
- <computeroutput>
- public LedgerInputStream(LedgerHandle lh)
- throws BKException, InterruptedException
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>lh</computeroutput> is a ledger handle for a previously created and open ledger.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- <computeroutput>
- public LedgerInputStream(LedgerHandle lh, int size)
- throws BKException, InterruptedException
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>lh</computeroutput> is a ledger handle for a previously created and open ledger.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>size</computeroutput> is the size of the byte buffer to store bytes that the application
- will eventually read.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- <emphasis role="bold">Closing.</emphasis> There is one call to close an input stream, but the call
- is currently empty and the application is responsible for closing the ledger handle.
- </para>
- <para>
- <computeroutput>
- public void close()
- </computeroutput>
- </para>
-
- <para>
- which has no parameters.
- </para>
-
- <para>
- <emphasis role="bold">Reading.</emphasis> There are three calls to read from the stream.
- </para>
- <para>
- <computeroutput>
- public synchronized int read()
- throws IOException
- </computeroutput>
- </para>
-
- <para>
- which has no parameters.
- </para>
-
- <para>
- <computeroutput>
- public synchronized int read(byte[] b)
- throws IOException
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>b</computeroutput> is a byte array to write to.
- </para>
- </listitem>
- </itemizedlist>
-
-
- <para>
- <computeroutput>
- public synchronized int read(byte[] b, int off, int len)
- throws IOException
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>b</computeroutput> is a byte array to write to.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>off</computeroutput> is an offset for byte array <computeroutput>b</computeroutput>.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>len</computeroutput> is the length in bytes to write to <computeroutput>b</computeroutput>.
- </para>
- </listitem>
- </itemizedlist>
-
-
- </section>
- </article>
\ No newline at end of file
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/index.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/index.xml b/src/docs/src/documentation/content/xdocs/index.xml
deleted file mode 100644
index 8ed4702..0000000
--- a/src/docs/src/documentation/content/xdocs/index.xml
+++ /dev/null
@@ -1,98 +0,0 @@
-<?xml version="1.0"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
-
-<document>
-
- <header>
- <title>ZooKeeper: Because Coordinating Distributed Systems is a Zoo</title>
- </header>
-
- <body>
- <p>ZooKeeper is a high-performance coordination service for
- distributed applications. It exposes common services - such as
- naming, configuration management, synchronization, and group
- services - in a simple interface so you don't have to write them
- from scratch. You can use it off-the-shelf to implement
- consensus, group management, leader election, and presence
- protocols. And you can build on it for your own, specific needs.
- </p>
-
- <p>
- The following documents describe concepts and procedures to get
- you started using ZooKeeper. If you have more questions, please
- ask the <a href="ext:lists">mailing list</a> or browse the
- archives.
- </p>
- <ul>
-
- <li><strong>ZooKeeper Overview</strong><p>Technical Overview Documents for Client Developers, Adminstrators, and Contributors</p>
- <ul><li><a href="zookeeperOver.html">Overview</a> - a bird's eye view of ZooKeeper, including design concepts and architecture</li>
- <li><a href="zookeeperStarted.html">Getting Started</a> - a tutorial-style guide for developers to install, run, and program to ZooKeeper</li>
- <li><a href="ext:relnotes">Release Notes</a> - new developer and user facing features, improvements, and incompatibilities</li>
- </ul>
- </li>
-
- <li><strong>Developers</strong><p> Documents for Developers using the ZooKeeper Client API</p>
- <ul>
- <li><a href="ext:api/index">API Docs</a> - the technical reference to ZooKeeper Client APIs</li>
- <li><a href="zookeeperProgrammers.html">Programmer's Guide</a> - a client application developer's guide to ZooKeeper</li>
- <li><a href="javaExample.html">ZooKeeper Java Example</a> - a simple ZooKeeper client application, written in Java</li>
- <li><a href="zookeeperTutorial.html">Barrier and Queue Tutorial</a> - sample implementations of barriers and queues</li>
- <li><a href="recipes.html">ZooKeeper Recipes</a> - higher level solutions to common problems in distributed applications</li>
- </ul>
- </li>
-
- <li><strong>Administrators & Operators</strong> <p> Documents for Administrators and Operations Engineers of ZooKeeper Deployments</p>
- <ul>
- <li><a href="zookeeperAdmin.html">Administrator's Guide</a> - a guide for system administrators and anyone else who might deploy ZooKeeper</li>
- <li><a href="zookeeperQuotas.html">Quota Guide</a> - a guide for system administrators on Quotas in ZooKeeper. </li>
- <li><a href="zookeeperJMX.html">JMX</a> - how to enable JMX in ZooKeeper</li>
- <li><a href="zookeeperHierarchicalQuorums.html">Hierarchical quorums</a></li>
- <li><a href="zookeeperObservers.html">Observers</a> - non-voting ensemble members that easily improve ZooKeeper's scalability</li>
- </ul>
- </li>
-
- <li><strong>Contributors</strong><p> Documents for Developers Contributing to the ZooKeeper Open Source Project</p>
- <ul>
- <li><a href="zookeeperInternals.html">ZooKeeper Internals</a> - assorted topics on the inner workings of ZooKeeper</li>
- </ul>
- </li>
-
- <li><strong>Miscellaneous ZooKeeper Documentation</strong>
- <ul>
- <li><a href="ext:wiki">Wiki</a></li>
- <li><a href="ext:faq">FAQ</a></li>
- </ul>
- </li>
-
- <li><strong>BookKeeper Documentation</strong>
- <p> BookKeeper is a highly-available system that implements high-performance write-ahead logging. It uses ZooKeeper for metadata,
- which is the main reason for being a ZooKeeper contrib.
- </p>
- <ul>
- <li><a href="bookkeeperOverview.html">henn, what's it again?</a></li>
- <li><a href="bookkeeperStarted.html">Ok, now how do I try it out</a></li>
- <li><a href="bookkeeperProgrammer.html">Awesome, but how do I integrate it with my app?</a></li>
- <li><a href="bookkeeperStream.html">Can I stream bytes instead of entries?</a></li>
- </ul>
- </li>
- </ul>
- </body>
-
-</document>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/javaExample.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/javaExample.xml b/src/docs/src/documentation/content/xdocs/javaExample.xml
deleted file mode 100644
index c992282..0000000
--- a/src/docs/src/documentation/content/xdocs/javaExample.xml
+++ /dev/null
@@ -1,663 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="ar_JavaExample">
- <title>ZooKeeper Java Example</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This article contains sample Java code for a simple watch client.</para>
-
- </abstract>
- </articleinfo>
-
- <section id="ch_Introduction">
- <title>A Simple Watch Client</title>
-
- <para>To introduce you to the ZooKeeper Java API, we develop here a very simple
- watch client. This ZooKeeper client watches a ZooKeeper node for changes
- and responds by starting or stopping a program.</para>
-
- <section id="sc_requirements"><title>Requirements</title>
-
- <para>The client has four requirements:</para>
-
- <itemizedlist><listitem><para>It takes as parameters:</para>
- <itemizedlist>
- <listitem><para>the address of the ZooKeeper service</para></listitem>
- <listitem> <para>the name of a znode - the one to be watched</para></listitem>
- <listitem><para>an executable with arguments.</para></listitem></itemizedlist></listitem>
- <listitem><para>It fetches the data associated with the znode and starts the executable.</para></listitem>
- <listitem><para>If the znode changes, the client refetches the contents and restarts the executable.</para></listitem>
- <listitem><para>If the znode disappears, the client kills the executable.</para></listitem></itemizedlist>
-
- </section>
-
- <section id="sc_design">
- <title>Program Design</title>
-
- <para>Conventionally, ZooKeeper applications are broken into two units, one which maintains the connection,
- and the other which monitors data. In this application, the class called the <emphasis role="bold">Executor</emphasis>
- maintains the ZooKeeper connection, and the class called the <emphasis role="bold">DataMonitor</emphasis> monitors the data
- in the ZooKeeper tree. Executor also contains the main thread and the execution logic.
- It is responsible for what little user interaction there is, as well as for interaction with the executable program you
- pass in as an argument, which the sample (per the requirements) shuts down and restarts according to the
- state of the znode.</para>
-
- </section>
-
- </section>
-
- <section id="sc_executor"><title>The Executor Class</title>
- <para>The Executor object is the primary container of the sample application. It contains
- both the <emphasis role="bold">ZooKeeper</emphasis> object and the <emphasis role="bold">DataMonitor</emphasis>, as described above in
- <xref linkend="sc_design"/>. </para>
-
- <programlisting>
- // from the Executor class...
-
- public static void main(String[] args) {
- if (args.length < 4) {
- System.err
- .println("USAGE: Executor hostPort znode filename program [args ...]");
- System.exit(2);
- }
- String hostPort = args[0];
- String znode = args[1];
- String filename = args[2];
- String exec[] = new String[args.length - 3];
- System.arraycopy(args, 3, exec, 0, exec.length);
- try {
- new Executor(hostPort, znode, filename, exec).run();
- } catch (Exception e) {
- e.printStackTrace();
- }
- }
-
- public Executor(String hostPort, String znode, String filename,
- String exec[]) throws KeeperException, IOException {
- this.filename = filename;
- this.exec = exec;
- zk = new ZooKeeper(hostPort, 3000, this);
- dm = new DataMonitor(zk, znode, null, this);
- }
-
- public void run() {
- try {
- synchronized (this) {
- while (!dm.dead) {
- wait();
- }
- }
- } catch (InterruptedException e) {
- }
- }
-</programlisting>
-
-
- <para>
- Recall that the Executor's job is to start and stop the executable whose name you pass in on the command line.
- It does this in response to events fired by the ZooKeeper object. As you can see in the code above, the Executor passes
- a reference to itself as the Watcher argument in the ZooKeeper constructor. It also passes a reference to itself
- as the DataMonitorListener argument to the DataMonitor constructor. Per the Executor's definition, it implements both these
- interfaces:
- </para>
-
- <programlisting>
-public class Executor implements Watcher, Runnable, DataMonitor.DataMonitorListener {
-...</programlisting>
-
- <para>The <emphasis role="bold">Watcher</emphasis> interface is defined by the ZooKeeper Java API.
- ZooKeeper uses it to communicate back to its container. It supports only one method, <command>process()</command>, and ZooKeeper uses
- it to communicate generic events that the main thread would be interested in, such as the state of the ZooKeeper connection or the ZooKeeper session. The Executor
- in this example simply forwards those events down to the DataMonitor to decide what to do with them. It does this simply to illustrate
- the point that, by convention, the Executor or some Executor-like object "owns" the ZooKeeper connection, but it is free to delegate the
- events to other objects. It also uses this as the default channel on which to fire watch events. (More on this later.)</para>
-
-<programlisting>
- public void process(WatchedEvent event) {
- dm.process(event);
- }
-</programlisting>
-
- <para>The <emphasis role="bold">DataMonitorListener</emphasis>
- interface, on the other hand, is not part of the ZooKeeper API. It is a completely custom interface,
- designed for this sample application. The DataMonitor object uses it to communicate back to its container, which
- is also the Executor object. The DataMonitorListener interface looks like this:</para>
- <programlisting>
-public interface DataMonitorListener {
- /**
- * The existence status of the node has changed.
- */
- void exists(byte data[]);
-
- /**
- * The ZooKeeper session is no longer valid.
- *
- * @param rc
- * the ZooKeeper reason code
- */
- void closing(int rc);
-}
-</programlisting>
- <para>This interface is defined in the DataMonitor class and implemented in the Executor class.
- When <command>Executor.exists()</command> is invoked,
- the Executor decides whether to start up or shut down per the requirements. Recall that the requirements say to kill the executable when the
- znode ceases to <emphasis>exist</emphasis>. </para>
-
- <para>When <command>Executor.closing()</command>
- is invoked, the Executor decides whether or not to shut itself down in response to the ZooKeeper connection permanently disappearing.</para>
-
- <para>As you might have guessed, DataMonitor is the object that invokes
- these methods, in response to changes in ZooKeeper's state.</para>
-
- <para>Here are Executor's implementations of
- <command>DataMonitorListener.exists()</command> and <command>DataMonitorListener.closing()</command>:
- </para>
- <programlisting>
-public void exists( byte[] data ) {
- if (data == null) {
- if (child != null) {
- System.out.println("Killing process");
- child.destroy();
- try {
- child.waitFor();
- } catch (InterruptedException e) {
- }
- }
- child = null;
- } else {
- if (child != null) {
- System.out.println("Stopping child");
- child.destroy();
- try {
- child.waitFor();
- } catch (InterruptedException e) {
- e.printStackTrace();
- }
- }
- try {
- FileOutputStream fos = new FileOutputStream(filename);
- fos.write(data);
- fos.close();
- } catch (IOException e) {
- e.printStackTrace();
- }
- try {
- System.out.println("Starting child");
- child = Runtime.getRuntime().exec(exec);
- new StreamWriter(child.getInputStream(), System.out);
- new StreamWriter(child.getErrorStream(), System.err);
- } catch (IOException e) {
- e.printStackTrace();
- }
- }
-}
-
-public void closing(int rc) {
- synchronized (this) {
- notifyAll();
- }
-}
-</programlisting>
-
-</section>
-<section id="sc_DataMonitor"><title>The DataMonitor Class</title>
-<para>
-The DataMonitor class has the meat of the ZooKeeper logic. It is mostly
-asynchronous and event driven. DataMonitor kicks things off in the constructor with:</para>
-<programlisting>
-public DataMonitor(ZooKeeper zk, String znode, Watcher chainedWatcher,
- DataMonitorListener listener) {
- this.zk = zk;
- this.znode = znode;
- this.chainedWatcher = chainedWatcher;
- this.listener = listener;
-
- // Get things started by checking if the node exists. We are going
- // to be completely event driven
- <emphasis role="bold">zk.exists(znode, true, this, null);</emphasis>
-}
-</programlisting>
-
-<para>The call to <command>ZooKeeper.exists()</command> checks for the existence of the znode,
-sets a watch, and passes a reference to itself (<command>this</command>)
-as the completion callback object. In this sense, it kicks things off, since the
-real processing happens when the watch is triggered.</para>
-
-<note>
-<para>Don't confuse the completion callback with the watch callback. The <command>ZooKeeper.exists()</command>
-completion callback, which happens to be the method <command>StatCallback.processResult()</command> implemented
-in the DataMonitor object, is invoked when the asynchronous <emphasis>setting of the watch</emphasis> operation
-(by <command>ZooKeeper.exists()</command>) completes on the server. </para>
-<para>
-The triggering of the watch, on the other hand, sends an event to the <emphasis>Executor</emphasis> object, since
-the Executor registered as the Watcher of the ZooKeeper object.</para>
-
-<para>As an aside, you might note that the DataMonitor could also register itself as the Watcher
-for this particular watch event. This is new to ZooKeeper 3.0.0 (the support of multiple Watchers). In this
-example, however, DataMonitor does not register as the Watcher.</para>
-</note>
-
-<para>When the <command>ZooKeeper.exists()</command> operation completes on the server, the ZooKeeper API invokes this completion callback on
-the client:</para>
-
-<programlisting>
-public void processResult(int rc, String path, Object ctx, Stat stat) {
- boolean exists;
- switch (rc) {
- case Code.Ok:
- exists = true;
- break;
- case Code.NoNode:
- exists = false;
- break;
- case Code.SessionExpired:
- case Code.NoAuth:
- dead = true;
- listener.closing(rc);
- return;
- default:
- // Retry errors
- zk.exists(znode, true, this, null);
- return;
- }
-
- byte b[] = null;
- if (exists) {
- try {
- <emphasis role="bold">b = zk.getData(znode, false, null);</emphasis>
- } catch (KeeperException e) {
- // We don't need to worry about recovering now. The watch
- // callbacks will kick off any exception handling
- e.printStackTrace();
- } catch (InterruptedException e) {
- return;
- }
- }
- if ((b == null && b != prevData)
- || (b != null && !Arrays.equals(prevData, b))) {
- <emphasis role="bold">listener.exists(b);</emphasis>
- prevData = b;
- }
-}
-</programlisting>
-
-<para>
-The code first checks the error codes for znode existence, fatal errors, and
-recoverable errors. If the file (or znode) exists, it gets the data from the znode, and
-then invokes the exists() callback of Executor if the state has changed. Note,
-it doesn't have to do any Exception processing for the getData call because it
-has watches pending for anything that could cause an error: if the node is deleted
-before it calls <command>ZooKeeper.getData()</command>, the watch event set by
-the <command>ZooKeeper.exists()</command> triggers a callback;
-if there is a communication error, a connection watch event fires when
-the connection comes back up.
-</para>
-
-<para>Finally, notice how DataMonitor processes watch events: </para>
-<programlisting>
- public void process(WatchedEvent event) {
- String path = event.getPath();
- if (event.getType() == Event.EventType.None) {
- // We are being told that the state of the
- // connection has changed
- switch (event.getState()) {
- case SyncConnected:
- // In this particular example we don't need to do anything
- // here - watches are automatically re-registered with
- // server and any watches triggered while the client was
- // disconnected will be delivered (in order of course)
- break;
- case Expired:
- // It's all over
- dead = true;
- listener.closing(KeeperException.Code.SessionExpired);
- break;
- }
- } else {
- if (path != null && path.equals(znode)) {
- // Something has changed on the node, let's find out
- zk.exists(znode, true, this, null);
- }
- }
- if (chainedWatcher != null) {
- chainedWatcher.process(event);
- }
- }
-</programlisting>
-<para>
-If the client-side ZooKeeper libraries can re-establish the
-communication channel (SyncConnected event) to ZooKeeper before
-session expiration (Expired event), all of the session's watches will
-automatically be re-established with the server (auto-reset of watches
-is new in ZooKeeper 3.0.0). See <ulink
-url="zookeeperProgrammers.html#ch_zkWatches">ZooKeeper Watches</ulink>
-in the programmer guide for more on this. A bit lower down in this
-function, when DataMonitor gets an event for a znode, it calls
-<command>ZooKeeper.exists()</command> to find out what has changed.
-</para>
-</section>
-
-<section id="sc_completeSourceCode">
- <title>Complete Source Listings</title>
- <example id="eg_Executor_java"><title>Executor.java</title><programlisting>
-/**
- * A simple example program to use DataMonitor to start and
- * stop executables based on a znode. The program watches the
- * specified znode and saves the data that corresponds to the
- * znode in the filesystem. It also starts the specified program
- * with the specified arguments when the znode exists and kills
- * the program if the znode goes away.
- */
-import java.io.FileOutputStream;
-import java.io.IOException;
-import java.io.InputStream;
-import java.io.OutputStream;
-
-import org.apache.zookeeper.KeeperException;
-import org.apache.zookeeper.WatchedEvent;
-import org.apache.zookeeper.Watcher;
-import org.apache.zookeeper.ZooKeeper;
-
-public class Executor
- implements Watcher, Runnable, DataMonitor.DataMonitorListener
-{
- String znode;
-
- DataMonitor dm;
-
- ZooKeeper zk;
-
- String filename;
-
- String exec[];
-
- Process child;
-
- public Executor(String hostPort, String znode, String filename,
- String exec[]) throws KeeperException, IOException {
- this.filename = filename;
- this.exec = exec;
- zk = new ZooKeeper(hostPort, 3000, this);
- dm = new DataMonitor(zk, znode, null, this);
- }
-
- /**
- * @param args
- */
- public static void main(String[] args) {
- if (args.length < 4) {
- System.err
- .println("USAGE: Executor hostPort znode filename program [args ...]");
- System.exit(2);
- }
- String hostPort = args[0];
- String znode = args[1];
- String filename = args[2];
- String exec[] = new String[args.length - 3];
- System.arraycopy(args, 3, exec, 0, exec.length);
- try {
- new Executor(hostPort, znode, filename, exec).run();
- } catch (Exception e) {
- e.printStackTrace();
- }
- }
-
- /***************************************************************************
- * We do not process any events ourselves, we just need to forward them on.
- *
- * @see org.apache.zookeeper.Watcher#process(org.apache.zookeeper.WatchedEvent)
- */
- public void process(WatchedEvent event) {
- dm.process(event);
- }
-
- public void run() {
- try {
- synchronized (this) {
- while (!dm.dead) {
- wait();
- }
- }
- } catch (InterruptedException e) {
- }
- }
-
- public void closing(int rc) {
- synchronized (this) {
- notifyAll();
- }
- }
-
- static class StreamWriter extends Thread {
- OutputStream os;
-
- InputStream is;
-
- StreamWriter(InputStream is, OutputStream os) {
- this.is = is;
- this.os = os;
- start();
- }
-
- public void run() {
- byte b[] = new byte[80];
- int rc;
- try {
- while ((rc = is.read(b)) > 0) {
- os.write(b, 0, rc);
- }
- } catch (IOException e) {
- }
-
- }
- }
-
- public void exists(byte[] data) {
- if (data == null) {
- if (child != null) {
- System.out.println("Killing process");
- child.destroy();
- try {
- child.waitFor();
- } catch (InterruptedException e) {
- }
- }
- child = null;
- } else {
- if (child != null) {
- System.out.println("Stopping child");
- child.destroy();
- try {
- child.waitFor();
- } catch (InterruptedException e) {
- e.printStackTrace();
- }
- }
- try {
- FileOutputStream fos = new FileOutputStream(filename);
- fos.write(data);
- fos.close();
- } catch (IOException e) {
- e.printStackTrace();
- }
- try {
- System.out.println("Starting child");
- child = Runtime.getRuntime().exec(exec);
- new StreamWriter(child.getInputStream(), System.out);
- new StreamWriter(child.getErrorStream(), System.err);
- } catch (IOException e) {
- e.printStackTrace();
- }
- }
- }
-}
-</programlisting>
-
-</example>
-
-<example id="eg_DataMonitor_java">
- <title>DataMonitor.java</title>
- <programlisting>
-/**
- * A simple class that monitors the data and existence of a ZooKeeper
- * node. It uses asynchronous ZooKeeper APIs.
- */
-import java.util.Arrays;
-
-import org.apache.zookeeper.KeeperException;
-import org.apache.zookeeper.WatchedEvent;
-import org.apache.zookeeper.Watcher;
-import org.apache.zookeeper.ZooKeeper;
-import org.apache.zookeeper.AsyncCallback.StatCallback;
-import org.apache.zookeeper.KeeperException.Code;
-import org.apache.zookeeper.data.Stat;
-
-public class DataMonitor implements Watcher, StatCallback {
-
- ZooKeeper zk;
-
- String znode;
-
- Watcher chainedWatcher;
-
- boolean dead;
-
- DataMonitorListener listener;
-
- byte prevData[];
-
- public DataMonitor(ZooKeeper zk, String znode, Watcher chainedWatcher,
- DataMonitorListener listener) {
- this.zk = zk;
- this.znode = znode;
- this.chainedWatcher = chainedWatcher;
- this.listener = listener;
- // Get things started by checking if the node exists. We are going
- // to be completely event driven
- zk.exists(znode, true, this, null);
- }
-
- /**
- * Other classes use the DataMonitor by implementing this interface
- */
- public interface DataMonitorListener {
- /**
- * The existence status of the node has changed.
- */
- void exists(byte data[]);
-
- /**
- * The ZooKeeper session is no longer valid.
- *
- * @param rc
- * the ZooKeeper reason code
- */
- void closing(int rc);
- }
-
- public void process(WatchedEvent event) {
- String path = event.getPath();
- if (event.getType() == Event.EventType.None) {
- // We are being told that the state of the
- // connection has changed
- switch (event.getState()) {
- case SyncConnected:
- // In this particular example we don't need to do anything
- // here - watches are automatically re-registered with
- // server and any watches triggered while the client was
- // disconnected will be delivered (in order of course)
- break;
- case Expired:
- // It's all over
- dead = true;
- listener.closing(KeeperException.Code.SessionExpired);
- break;
- }
- } else {
- if (path != null && path.equals(znode)) {
- // Something has changed on the node, let's find out
- zk.exists(znode, true, this, null);
- }
- }
- if (chainedWatcher != null) {
- chainedWatcher.process(event);
- }
- }
-
- public void processResult(int rc, String path, Object ctx, Stat stat) {
- boolean exists;
- switch (rc) {
- case Code.Ok:
- exists = true;
- break;
- case Code.NoNode:
- exists = false;
- break;
- case Code.SessionExpired:
- case Code.NoAuth:
- dead = true;
- listener.closing(rc);
- return;
- default:
- // Retry errors
- zk.exists(znode, true, this, null);
- return;
- }
-
- byte b[] = null;
- if (exists) {
- try {
- b = zk.getData(znode, false, null);
- } catch (KeeperException e) {
- // We don't need to worry about recovering now. The watch
- // callbacks will kick off any exception handling
- e.printStackTrace();
- } catch (InterruptedException e) {
- return;
- }
- }
- if ((b == null && b != prevData)
- || (b != null && !Arrays.equals(prevData, b))) {
- listener.exists(b);
- prevData = b;
- }
- }
-}
-</programlisting>
-</example>
-</section>
-
-
-
-</article>
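Note on the removed javaExample.xml above: the condition at the end of DataMonitor.processResult() decides when the listener is told that the znode's data changed. The following is only a minimal, stand-alone sketch of that condition, assuming nothing beyond the JDK. The class name ChangeDetector and its update() method are hypothetical and do not appear in the removed sources; a null payload stands for "znode absent", as it does in the listing.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/**
 * Hypothetical helper (not part of the ZooKeeper sources) that isolates the
 * change-detection condition used at the end of DataMonitor.processResult().
 */
public class ChangeDetector {
    // Last payload that was reported; null means "znode absent".
    private byte[] prevData;

    /**
     * Returns true when the payload differs from the last one reported,
     * and remembers it in that case.
     */
    public boolean update(byte[] b) {
        boolean changed = (b == null && b != prevData)
                || (b != null && !Arrays.equals(prevData, b));
        if (changed) {
            prevData = b;
        }
        return changed;
    }

    public static void main(String[] args) {
        ChangeDetector d = new ChangeDetector();
        byte[] v1 = "v1".getBytes(StandardCharsets.UTF_8);
        byte[] v1Copy = "v1".getBytes(StandardCharsets.UTF_8);
        byte[] v2 = "v2".getBytes(StandardCharsets.UTF_8);

        System.out.println(d.update(null));   // absent, nothing reported before
        System.out.println(d.update(v1));     // znode created with data
        System.out.println(d.update(v1Copy)); // same bytes in a different array
        System.out.println(d.update(v2));     // data modified
        System.out.println(d.update(null));   // znode deleted
    }
}
```

The comparison is by content (Arrays.equals), not by reference, so a refetch that returns identical bytes does not restart the child process, while a transition to or from "absent" is always reported.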
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/recipes.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/recipes.xml b/src/docs/src/documentation/content/xdocs/recipes.xml
deleted file mode 100644
index ead041b..0000000
--- a/src/docs/src/documentation/content/xdocs/recipes.xml
+++ /dev/null
@@ -1,637 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="ar_Recipes">
- <title>ZooKeeper Recipes and Solutions</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This guide contains pseudocode and guidelines for using ZooKeeper to
- solve common problems in Distributed Application Coordination. It
- discusses such problems as event handlers, queues, and locks.</para>
-
- <para>$Revision: 1.6 $ $Date: 2008/09/19 03:46:18 $</para>
- </abstract>
- </articleinfo>
-
- <section id="ch_recipes">
- <title>A Guide to Creating Higher-level Constructs with ZooKeeper</title>
-
- <para>In this article, you'll find guidelines for using
- ZooKeeper to implement higher order functions. All of them are conventions
- implemented at the client and do not require special support from
- ZooKeeper. Hopefully the community will capture these conventions in client-side libraries
- to ease their use and to encourage standardization.</para>
-
- <para>One of the most interesting things about ZooKeeper is that even
- though ZooKeeper uses <emphasis>asynchronous</emphasis> notifications, you
- can use it to build <emphasis>synchronous</emphasis> consistency
- primitives, such as queues and locks. As you will see, this is possible
- because ZooKeeper imposes an overall order on updates, and has mechanisms
- to expose this ordering.</para>
-
- <para>Note that the recipes below attempt to employ best practices. In
- particular, they avoid polling, timers or anything else that would result
- in a "herd effect", causing bursts of traffic and limiting
- scalability.</para>
-
- <para>There are many useful functions that can be imagined that aren't
- included here - revocable read-write priority locks, as just one example.
- And some of the constructs mentioned here - locks, in particular -
- illustrate certain points, even though you may find other constructs, such
- as event handles or queues, a more practical means of performing the same
- function. In general, the examples in this section are designed to
- stimulate thought.</para>
-
-
- <section id="sc_outOfTheBox">
- <title>Out of the Box Applications: Name Service, Configuration, Group
- Membership</title>
-
- <para>Name service and configuration are two of the primary applications
- of ZooKeeper. These two functions are provided directly by the ZooKeeper
- API.</para>
-
- <para>Another function directly provided by ZooKeeper is <emphasis>group
- membership</emphasis>. The group is represented by a node. Members of the
- group create ephemeral nodes under the group node. Nodes of the members
- that fail abnormally will be removed automatically when ZooKeeper detects
- the failure.</para>
- </section>
-
- <section id="sc_recipes_eventHandles">
- <title>Barriers</title>
-
- <para>Distributed systems use <emphasis>barriers</emphasis>
- to block processing of a set of nodes until a condition is met
- at which time all the nodes are allowed to proceed. Barriers are
- implemented in ZooKeeper by designating a barrier node. The
- barrier is in place if the barrier node exists. Here's the
- pseudo code:</para>
-
- <orderedlist>
- <listitem>
- <para>Client calls the ZooKeeper API's <emphasis
- role="bold">exists()</emphasis> function on the barrier node, with
- <emphasis>watch</emphasis> set to true.</para>
- </listitem>
-
- <listitem>
- <para>If <emphasis role="bold">exists()</emphasis> returns false, the
- barrier is gone and the client proceeds.</para>
- </listitem>
-
- <listitem>
- <para>Else, if <emphasis role="bold">exists()</emphasis> returns true,
- the client waits for a watch event from ZooKeeper for the barrier
- node.</para>
- </listitem>
-
- <listitem>
- <para>When the watch event is triggered, the client reissues the
- <emphasis role="bold">exists()</emphasis> call, again waiting until
- the barrier node is removed.</para>
- </listitem>
- </orderedlist>
-
- <section id="sc_doubleBarriers">
- <title>Double Barriers</title>
-
- <para>Double barriers enable clients to synchronize the beginning and
- the end of a computation. When enough processes have joined the barrier,
- processes start their computation and leave the barrier once they have
- finished. This recipe shows how to use a ZooKeeper node as a
- barrier.</para>
-
- <para>The pseudo code in this recipe represents the barrier node as
- <emphasis>b</emphasis>. Every client process <emphasis>p</emphasis>
- registers with the barrier node on entry and unregisters when it is
- ready to leave. A node registers with the barrier node via the <emphasis
- role="bold">Enter</emphasis> procedure below and waits until
- <emphasis>x</emphasis> client processes register before proceeding with
- the computation. (The <emphasis>x</emphasis> here is up to you to
- determine for your system.)</para>
-
- <informaltable colsep="0" frame="none" rowsep="0">
- <tgroup cols="2">
- <tbody>
- <row>
- <entry align="center"><emphasis
- role="bold">Enter</emphasis></entry>
-
- <entry align="center"><emphasis
- role="bold">Leave</emphasis></entry>
- </row>
-
- <row>
- <entry align="left"><orderedlist>
- <listitem>
- <para>Create a name <emphasis><emphasis>n</emphasis> =
- <emphasis>b</emphasis> + "/" + <emphasis>p</emphasis></emphasis></para>
- </listitem>
-
- <listitem>
- <para>Set watch: <emphasis
- role="bold">exists(<emphasis>b</emphasis> + "/ready",
- true)</emphasis></para>
- </listitem>
-
- <listitem>
- <para>Create child: <emphasis role="bold">create(
- <emphasis>n</emphasis>, EPHEMERAL)</emphasis></para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">L = getChildren(b,
- false)</emphasis></para>
- </listitem>
-
- <listitem>
- <para>if fewer children in L than <emphasis>x</emphasis>,
- wait for watch event</para>
- </listitem>
-
- <listitem>
- <para>else <emphasis role="bold">create(b + "/ready",
- REGULAR)</emphasis></para>
- </listitem>
- </orderedlist></entry>
-
- <entry><orderedlist>
- <listitem>
- <para><emphasis role="bold">L = getChildren(b,
- false)</emphasis></para>
- </listitem>
-
- <listitem>
- <para>if no children, exit</para>
- </listitem>
-
- <listitem>
- <para>if <emphasis>p</emphasis> is the only process node in
- L, delete(n) and exit</para>
- </listitem>
-
- <listitem>
- <para>if <emphasis>p</emphasis> is the lowest process
- node in L, wait on highest process node in L</para>
- </listitem>
-
- <listitem>
- <para>else <emphasis
- role="bold">delete(<emphasis>n</emphasis>) </emphasis>if
- it still exists and wait on lowest process node in L</para>
- </listitem>
-
- <listitem>
- <para>goto 1</para>
- </listitem>
- </orderedlist></entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
- <para>On entering, all processes set a watch on the ready node and
- create an ephemeral node as a child of the barrier node. Each process
- but the last enters the barrier and waits for the ready node to appear
- at line 5. The process that creates the xth node, the last process, will
- see x nodes in the list of children and create the ready node, waking up
- the other processes. Note that waiting processes wake up only when it is
- time to exit, so waiting is efficient.
- </para>
-
- <para>On exit, you can't use a flag such as <emphasis>ready</emphasis>
- because you are watching for process nodes to go away. By using
- ephemeral nodes, processes that fail after the barrier has been entered
- do not prevent correct processes from finishing. When processes are
- ready to leave, they need to delete their process nodes and wait for all
- other processes to do the same.</para>
-
- <para>Processes exit when there are no process nodes left as children of
- <emphasis>b</emphasis>. However, as an efficiency, you can use the
- lowest process node as the ready flag. All other processes that are
- ready to exit watch for the lowest existing process node to go away, and
- the owner of the lowest process watches for any other process node
- (picking the highest for simplicity) to go away. This means that only a
- single process wakes up on each node deletion except for the last node,
- which wakes up everyone when it is removed.</para>
- </section>
- </section>
-
- <section id="sc_recipes_Queues">
- <title>Queues</title>
-
- <para>Distributed queues are a common data structure. To implement a
- distributed queue in ZooKeeper, first designate a znode to hold the queue,
- the queue node. The distributed clients put something into the queue by
- calling create() with a pathname ending in "queue-", with the
- <emphasis>sequence</emphasis> and <emphasis>ephemeral</emphasis> flags in
- the create() call set to true. Because the <emphasis>sequence</emphasis>
- flag is set, the new pathnames will have the form
- _path-to-queue-node_/queue-X, where X is a monotonically increasing number. A
- client that wants to remove an element from the queue calls ZooKeeper's <emphasis
- role="bold">getChildren( )</emphasis> function, with
- <emphasis>watch</emphasis> set to true on the queue node, and begins
- processing nodes with the lowest number. The client does not need to issue
- another <emphasis role="bold">getChildren( )</emphasis> until it exhausts
- the list obtained from the first <emphasis role="bold">getChildren(
- )</emphasis> call. If there are no children in the queue node, the
- reader waits for a watch notification to check the queue again.</para>
-
- <note>
- <para>A Queue implementation now exists in the ZooKeeper
- recipes directory. It is distributed with the release, in the
- src/recipes/queue directory of the release artifact.
- </para>
- </note>
-
- <section id="sc_recipes_priorityQueues">
- <title>Priority Queues</title>
-
- <para>To implement a priority queue, you need only make two simple
- changes to the generic <ulink url="#sc_recipes_Queues">queue
- recipe</ulink>. First, to add to a queue, the pathname ends with
- "queue-YY" where YY is the priority of the element with lower numbers
- representing higher priority (just like UNIX). Second, when removing
- from the queue, a client uses an up-to-date children list meaning that
- the client will invalidate previously obtained children lists if a watch
- notification triggers for the queue node.</para>
- </section>
- </section>
-
- <section id="sc_recipes_Locks">
- <title>Locks</title>
-
- <para>ZooKeeper can be used to implement fully distributed locks that are
- globally synchronous, meaning that at any snapshot in time no two clients
- think they hold the same lock. As with priority queues, first define
- a lock node.</para>
-
- <note>
- <para>A Lock implementation now exists in the ZooKeeper
- recipes directory. It is distributed with the release, in the
- src/recipes/lock directory of the release artifact.
- </para>
- </note>
-
- <para>Clients wishing to obtain a lock do the following:</para>
-
- <orderedlist>
- <listitem>
- <para>Call <emphasis role="bold">create( )</emphasis> with a pathname
- of "_locknode_/lock-" and the <emphasis>sequence</emphasis> and
- <emphasis>ephemeral</emphasis> flags set.</para>
- </listitem>
-
- <listitem>
- <para>Call <emphasis role="bold">getChildren( )</emphasis> on the lock
- node <emphasis>without</emphasis> setting the watch flag (this is
- important to avoid the herd effect).</para>
- </listitem>
-
- <listitem>
- <para>If the pathname created in step <emphasis
- role="bold">1</emphasis> has the lowest sequence number suffix, the
- client has the lock and the client exits the protocol.</para>
- </listitem>
-
- <listitem>
- <para>The client calls <emphasis role="bold">exists( )</emphasis> with
- the watch flag set on the path in the lock directory with the next
- lowest sequence number.</para>
- </listitem>
-
- <listitem>
- <para>If <emphasis role="bold">exists( )</emphasis> returns false, go
- to step <emphasis role="bold">2</emphasis>. Otherwise, wait for a
- notification for the pathname from the previous step before going to
- step <emphasis role="bold">2</emphasis>.</para>
- </listitem>
- </orderedlist>
-
- <para>The unlock protocol is very simple: clients wishing to release a
- lock simply delete the node they created in step 1.</para>
-
- <para>Here are a few things to notice:</para>
-
- <itemizedlist>
- <listitem>
- <para>The removal of a node will only cause one client to wake up
- since each node is watched by exactly one client. In this way, you
- avoid the herd effect.</para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>There is no polling or timeouts.</para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>Because of the way you implement locking, it is easy to see the
- amount of lock contention, break locks, debug locking problems,
- etc.</para>
- </listitem>
- </itemizedlist>
-
- <section>
- <title>Shared Locks</title>
-
- <para>You can implement shared locks by making a few changes to the lock
- protocol:</para>
-
- <informaltable colsep="0" frame="none" rowsep="0">
- <tgroup cols="2">
- <tbody>
- <row>
- <entry align="center"><emphasis role="bold">Obtaining a read
- lock:</emphasis></entry>
-
- <entry align="center"><emphasis role="bold">Obtaining a write
- lock:</emphasis></entry>
- </row>
-
- <row>
- <entry align="left"><orderedlist>
- <listitem>
- <para>Call <emphasis role="bold">create( )</emphasis> to
- create a node with pathname
- "<filename>_locknode_/read-</filename>". This is the
- lock node used later in the protocol. Make sure to set both
- the <emphasis>sequence</emphasis> and
- <emphasis>ephemeral</emphasis> flags.</para>
- </listitem>
-
- <listitem>
- <para>Call <emphasis role="bold">getChildren( )</emphasis>
- on the lock node <emphasis>without</emphasis> setting the
- <emphasis>watch</emphasis> flag - this is important, as it
- avoids the herd effect.</para>
- </listitem>
-
- <listitem>
- <para>If there are no children with a pathname starting
- with "<filename>write-</filename>" and having a lower
- sequence number than the node created in step <emphasis
- role="bold">1</emphasis>, the client has the lock and can
- exit the protocol. </para>
- </listitem>
-
- <listitem>
- <para>Otherwise, call <emphasis role="bold">exists(
- )</emphasis>, with the <emphasis>watch</emphasis> flag set, on
- the node in the lock directory with a pathname starting with
- "<filename>write-</filename>" and having the next lowest
- sequence number.</para>
- </listitem>
-
- <listitem>
- <para>If <emphasis role="bold">exists( )</emphasis>
- returns <emphasis>false</emphasis>, goto step <emphasis
- role="bold">2</emphasis>.</para>
- </listitem>
-
- <listitem>
- <para>Otherwise, wait for a notification for the pathname
- from the previous step before going to step <emphasis
- role="bold">2</emphasis>.</para>
- </listitem>
- </orderedlist></entry>
-
- <entry><orderedlist>
- <listitem>
- <para>Call <emphasis role="bold">create( )</emphasis> to
- create a node with pathname
- "<filename>_locknode_/write-</filename>". This is the
- lock node spoken of later in the protocol. Make sure to
- set both <emphasis>sequence</emphasis> and
- <emphasis>ephemeral</emphasis> flags.</para>
- </listitem>
-
- <listitem>
- <para>Call <emphasis role="bold">getChildren( )
- </emphasis> on the lock node <emphasis>without</emphasis>
- setting the <emphasis>watch</emphasis> flag - this is
- important, as it avoids the herd effect.</para>
- </listitem>
-
- <listitem>
- <para>If there are no children with a lower sequence
- number than the node created in step <emphasis
- role="bold">1</emphasis>, the client has the lock and the
- client exits the protocol.</para>
- </listitem>
-
- <listitem>
- <para>Call <emphasis role="bold">exists( ),</emphasis>
- with <emphasis>watch</emphasis> flag set, on the node with
- the pathname that has the next lowest sequence
- number.</para>
- </listitem>
-
- <listitem>
- <para>If <emphasis role="bold">exists( )</emphasis>
- returns <emphasis>false</emphasis>, goto step <emphasis
- role="bold">2</emphasis>. Otherwise, wait for a
- notification for the pathname from the previous step
- before going to step <emphasis
- role="bold">2</emphasis>.</para>
- </listitem>
- </orderedlist></entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
-
- <note>
- <para>It might appear that this recipe creates a herd effect
- when a large group of clients is waiting for a read
- lock and all are notified more or less simultaneously
- when the "<filename>write-</filename>" node with the lowest
- sequence number is deleted. In fact, that is valid behavior:
- all those waiting reader clients should be released, since
- they now hold the lock. The herd effect refers to releasing a
- "herd" when in fact only a single machine or a small number of
- machines can proceed.
- </para>
- </note>
- </section>
-
- <section id="sc_recoverableSharedLocks">
- <title>Revocable Shared Locks</title>
-
- <para>With minor modifications to the shared lock protocol, you can make
- shared locks revocable:</para>
-
- <para>In step <emphasis role="bold">1</emphasis> of both the read
- and write lock protocols, call <emphasis role="bold">getData(
- )</emphasis> with <emphasis>watch</emphasis> set, immediately after the
- call to <emphasis role="bold">create( )</emphasis>. If the client
- subsequently receives notification for the node it created in step
- <emphasis role="bold">1</emphasis>, it does another <emphasis
- role="bold">getData( )</emphasis> on that node, with
- <emphasis>watch</emphasis> set and looks for the string "unlock", which
- signals to the client that it must release the lock. This is because,
- according to this shared lock protocol, you can request the client with
- the lock give up the lock by calling <emphasis role="bold">setData()
- </emphasis> on the lock node, writing "unlock" to that node.</para>
-
- <para>Note that this protocol requires the lock holder to consent to
- releasing the lock. Such consent is important, especially if the lock
- holder needs to do some processing before releasing the lock. Of course
- you can always implement <emphasis>Revocable Shared Locks with Freaking
- Laser Beams</emphasis> by stipulating in your protocol that the revoker
- is allowed to delete the lock node if after some length of time the lock
- isn't deleted by the lock holder.</para>
- </section>
- </section>
-
- <section id="sc_recipes_twoPhasedCommit">
- <title>Two-phased Commit</title>
-
- <para>A two-phase commit protocol is an algorithm that lets all clients in
- a distributed system agree either to commit a transaction or abort.</para>
-
- <para>In ZooKeeper, you can implement a two-phased commit by having a
- coordinator create a transaction node, say "/app/Tx", and one child node
- per participating site, say "/app/Tx/s_i". When the coordinator creates the
- child node, it leaves the content undefined. Once each site involved in
- the transaction receives the transaction from the coordinator, the site
- reads each child node and sets a watch. Each site then processes the query
- and votes "commit" or "abort" by writing to its respective node. Once the
- write completes, the other sites are notified, and as soon as all sites
- have all votes, they can decide either "abort" or "commit". Note that a
- node can decide "abort" earlier if some site votes for "abort".</para>
-
- <para>An interesting aspect of this implementation is that the only role
- of the coordinator is to decide upon the group of sites, to create the
- ZooKeeper nodes, and to propagate the transaction to the corresponding
- sites. In fact, even propagating the transaction can be done through
- ZooKeeper by writing it in the transaction node.</para>
-
- <para>There are two important drawbacks of the approach described above.
- One is the message complexity, which is O(n²). The second is the
- impossibility of detecting failures of sites through ephemeral nodes. To
- detect the failure of a site using ephemeral nodes, it is necessary that
- the site create the node.</para>
-
- <para>To solve the first problem, you can have only the coordinator
- notified of changes to the transaction nodes, and then notify the sites
- once the coordinator reaches a decision. Note that this approach is scalable,
- but it is slower too, as it requires all communication to go through the
- coordinator.</para>
-
- <para>To address the second problem, you can have the coordinator
- propagate the transaction to the sites, and have each site create its
- own ephemeral node.</para>
- </section>
-
- <section id="sc_leaderElection">
- <title>Leader Election</title>
-
- <para>A simple way of doing leader election with ZooKeeper is to use the
- <emphasis role="bold">SEQUENCE|EPHEMERAL</emphasis> flags when creating
- znodes that represent "proposals" of clients. The idea is to have a znode,
- say "/election", such that each client creates a child znode "/election/n_"
- with both flags SEQUENCE|EPHEMERAL. With the sequence flag, ZooKeeper
- automatically appends a sequence number that is greater than any one
- previously appended to a child of "/election". The process that created
- the znode with the smallest appended sequence number is the leader.
- </para>
-
- <para>That's not all, though. It is important to watch for failures of the
- leader, so that a new client arises as the new leader in the case the
- current leader fails. A trivial solution is to have all application
- processes watching upon the current smallest znode, and checking if they
- are the new leader when the smallest znode goes away (note that the
- smallest znode will go away if the leader fails because the node is
- ephemeral). But this causes a herd effect: upon failure of the current
- leader, all other processes receive a notification, and execute
- getChildren on "/election" to obtain the current list of children of
- "/election". If the number of clients is large, it causes a spike on the
- number of operations that ZooKeeper servers have to process. To avoid the
- herd effect, it is sufficient to watch for the next znode down on the
- sequence of znodes. If a client receives a notification that the znode it
- is watching is gone, then it becomes the new leader in the case that there
- is no smaller znode. Note that this avoids the herd effect by not having
- all clients watching the same znode. </para>
-
- <para>Here's the pseudo code:</para>
-
- <para>Let ELECTION be a path of choice of the application. To volunteer to
- be a leader: </para>
-
- <orderedlist>
- <listitem>
- <para>Create znode z with path "ELECTION/n_" with both SEQUENCE and
- EPHEMERAL flags;</para>
- </listitem>
-
- <listitem>
- <para>Let C be the children of "ELECTION", and i be the sequence
- number of z;</para>
- </listitem>
-
- <listitem>
- <para>Watch for changes on "ELECTION/n_j", where j is the largest
- sequence number such that j &lt; i and n_j is a znode in C;</para>
- </listitem>
- </orderedlist>
-
- <para>Upon receiving a notification of znode deletion: </para>
-
- <orderedlist>
- <listitem>
- <para>Let C be the new set of children of ELECTION; </para>
- </listitem>
-
- <listitem>
- <para>If z is the smallest node in C, then execute leader
- procedure;</para>
- </listitem>
-
- <listitem>
- <para>Otherwise, watch for changes on "ELECTION/n_j", where j is the
- largest sequence number such that j &lt; i and n_j is a znode in C;
- </para>
- </listitem>
- </orderedlist>
-
- <para>Note that the znode having no preceding znode on the list of
- children does not imply that the creator of this znode is aware that it is
- the current leader. Applications may consider creating a separate znode
- to acknowledge that the leader has executed the leader procedure. </para>
- </section>
- </section>
-</article>
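
The leader election pseudo code above can be sketched as a small Java helper that runs without a ZooKeeper connection. This is an illustrative sketch, not code from the ZooKeeper distribution: the class name ElectionSketch and its two methods are hypothetical, and it assumes that child names end in "_" followed by the decimal sequence number that ZooKeeper appends (as in "n_0000000007"). Given the children of ELECTION and the name of the client's own znode z, it picks the znode n_j with the largest j < i. An empty result means z is the smallest node, so the client should run its leader procedure.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of the ZooKeeper API. */
final class ElectionSketch {

    /** Parses the sequence number at the end of a child name such as "n_0000000007". */
    static long sequenceOf(String childName) {
        return Long.parseLong(childName.substring(childName.lastIndexOf('_') + 1));
    }

    /**
     * Returns the child to watch: the one with the largest sequence number j
     * such that j < i, where i is the sequence number of our own znode.
     * An empty result means no smaller znode exists, so we are the leader.
     */
    static Optional<String> nodeToWatch(List<String> children, String ownName) {
        long own = sequenceOf(ownName);
        String best = null;
        for (String child : children) {
            long seq = sequenceOf(child);
            // Keep the closest predecessor seen so far.
            if (seq < own && (best == null || seq > sequenceOf(best))) {
                best = child;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // getChildren() returns names in no particular order, so no sorting is assumed.
        List<String> children = Arrays.asList("n_0000000004", "n_0000000002", "n_0000000007");
        System.out.println(nodeToWatch(children, "n_0000000007")); // prints Optional[n_0000000004]
        System.out.println(nodeToWatch(children, "n_0000000002")); // prints Optional.empty
    }
}
```

In a real client the list would come from getChildren() on ELECTION, and the returned name would be passed to exists() with a watch set. Both calls are left out here so that the sketch runs on its own. Watching only the closest predecessor is what keeps each znode watched by a single client and so avoids the herd effect.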
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/site.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/site.xml b/src/docs/src/documentation/content/xdocs/site.xml
deleted file mode 100644
index e49d92c..0000000
--- a/src/docs/src/documentation/content/xdocs/site.xml
+++ /dev/null
@@ -1,103 +0,0 @@
-<?xml version="1.0"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!--
-Forrest site.xml
-
-This file contains an outline of the site's information content. It is used to:
-- Generate the website menus (though these can be overridden - see docs)
-- Provide semantic, location-independent aliases for internal 'site:' URIs, eg
-<link href="site:changes"> links to changes.html (or ../changes.html if in
- subdir).
-- Provide aliases for external URLs in the external-refs section. Eg, <link
- href="ext:cocoon"> links to http://xml.apache.org/cocoon/
-
-See http://forrest.apache.org/docs/linking.html for more info.
--->
-
-<site label="ZooKeeper" href="" xmlns="http://apache.org/forrest/linkmap/1.0">
-
- <docs label="Overview">
- <welcome label="Welcome" href="index.html" />
- <overview label="Overview" href="zookeeperOver.html" />
- <started label="Getting Started" href="zookeeperStarted.html" />
- <relnotes label="Release Notes" href="ext:relnotes" />
- </docs>
-
- <docs label="Developer">
- <api label="API Docs" href="ext:api/index" />
- <program label="Programmer's Guide" href="zookeeperProgrammers.html" />
- <javaEx label="Java Example" href="javaExample.html" />
- <barTutor label="Barrier and Queue Tutorial" href="zookeeperTutorial.html" />
- <recipes label="Recipes" href="recipes.html" />
- </docs>
-
- <docs label="BookKeeper">
- <bkStarted label="Getting started" href="bookkeeperStarted.html" />
- <bkOverview label="Overview" href="bookkeeperOverview.html" />
- <bkProgrammer label="Setup guide" href="bookkeeperConfig.html" />
- <bkProgrammer label="Programmer's guide" href="bookkeeperProgrammer.html" />
- </docs>
-
- <docs label="Admin &amp; Ops">
- <admin label="Administrator's Guide" href="zookeeperAdmin.html" />
- <quota label="Quota Guide" href="zookeeperQuotas.html" />
- <jmx label="JMX" href="zookeeperJMX.html" />
- <observers label="Observers Guide" href="zookeeperObservers.html" />
- </docs>
-
- <docs label="Contributor">
- <internals label="ZooKeeper Internals" href="zookeeperInternals.html" />
- </docs>
-
- <docs label="Miscellaneous">
- <wiki label="Wiki" href="ext:wiki" />
- <faq label="FAQ" href="ext:faq" />
- <lists label="Mailing Lists" href="ext:lists" />
- <!--<other label="Other Info" href="zookeeperOtherInfo.html" />-->
- </docs>
-
-
-
- <external-refs>
- <site href="http://zookeeper.apache.org/"/>
- <lists href="http://zookeeper.apache.org/mailing_lists.html"/>
- <releases href="http://zookeeper.apache.org/releases.html">
- <download href="#Download" />
- </releases>
- <jira href="http://zookeeper.apache.org/issue_tracking.html"/>
- <wiki href="https://cwiki.apache.org/confluence/display/ZOOKEEPER" />
- <faq href="https://cwiki.apache.org/confluence/display/ZOOKEEPER/FAQ" />
- <zlib href="http://www.zlib.net/" />
- <lzo href="http://www.oberhumer.com/opensource/lzo/" />
- <gzip href="http://www.gzip.org/" />
- <cygwin href="http://www.cygwin.com/" />
- <osx href="http://www.apple.com/macosx" />
- <relnotes href="releasenotes.html" />
- <api href="api/">
- <started href="overview-summary.html#overview_description" />
- <index href="index.html" />
- <org href="org/">
- <apache href="apache/">
- <zookeeper href="zookeeper/">
- </zookeeper>
- </apache>
- </org>
- </api>
- </external-refs>
-
-</site>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/tabs.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/tabs.xml b/src/docs/src/documentation/content/xdocs/tabs.xml
deleted file mode 100644
index aef7e59..0000000
--- a/src/docs/src/documentation/content/xdocs/tabs.xml
+++ /dev/null
@@ -1,36 +0,0 @@
-<?xml version="1.0"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE tabs PUBLIC "-//APACHE//DTD Cocoon Documentation Tab V1.0//EN"
- "http://forrest.apache.org/dtd/tab-cocoon-v10.dtd">
-
-<tabs software="ZooKeeper"
- title="ZooKeeper"
- copyright="The Apache Software Foundation"
- xmlns:xlink="http://www.w3.org/1999/xlink">
-
- <!-- The rules are:
- @dir will always have /index.html added.
- @href is not modified unless it is root-relative and obviously specifies a
- directory (ends in '/'), in which case /index.html will be added
- -->
-
- <tab label="Project" href="http://zookeeper.apache.org/" />
- <tab label="Wiki" href="https://cwiki.apache.org/confluence/display/ZOOKEEPER/" />
- <tab label="ZooKeeper 3.4 Documentation" dir="" />
-
-</tabs>
[10/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml b/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml
deleted file mode 100644
index d88ddbd..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml
+++ /dev/null
@@ -1,1861 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_Admin">
- <title>ZooKeeper Administrator's Guide</title>
-
- <subtitle>A Guide to Deployment and Administration</subtitle>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This document contains information about deploying, administering
- and maintaining ZooKeeper. It also discusses best practices and common
- problems.</para>
- </abstract>
- </articleinfo>
-
- <section id="ch_deployment">
- <title>Deployment</title>
-
- <para>This section contains information about deploying ZooKeeper and
- covers these topics:</para>
-
- <itemizedlist>
- <listitem>
- <para><xref linkend="sc_systemReq" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_zkMulitServerSetup" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_singleAndDevSetup" /></para>
- </listitem>
- </itemizedlist>
-
- <para>The first two sections assume you are interested in installing
- ZooKeeper in a production environment such as a datacenter. The final
- section covers situations in which you are setting up ZooKeeper on a
- limited basis - for evaluation, testing, or development - but not in a
- production environment.</para>
-
- <section id="sc_systemReq">
- <title>System Requirements</title>
-
- <section id="sc_supportedPlatforms">
- <title>Supported Platforms</title>
-
- <para>ZooKeeper consists of multiple components. Some components are
- supported broadly, and other components are supported only on a smaller
- set of platforms.</para>
-
- <itemizedlist>
- <listitem>
- <para><emphasis role="bold">Client</emphasis> is the Java client
- library, used by applications to connect to a ZooKeeper ensemble.
- </para>
- </listitem>
- <listitem>
- <para><emphasis role="bold">Server</emphasis> is the Java server
- that runs on the ZooKeeper ensemble nodes.</para>
- </listitem>
- <listitem>
- <para><emphasis role="bold">Native Client</emphasis> is a client
- implemented in C, similar to the Java client, used by applications
- to connect to a ZooKeeper ensemble.</para>
- </listitem>
- <listitem>
- <para><emphasis role="bold">Contrib</emphasis> refers to multiple
- optional add-on components.</para>
- </listitem>
- </itemizedlist>
-
- <para>The following matrix describes the level of support committed for
- running each component on different operating system platforms.</para>
-
- <table>
- <title>Support Matrix</title>
- <tgroup cols="5" align="left" colsep="1" rowsep="1">
- <thead>
- <row>
- <entry>Operating System</entry>
- <entry>Client</entry>
- <entry>Server</entry>
- <entry>Native Client</entry>
- <entry>Contrib</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>GNU/Linux</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- </row>
- <row>
- <entry>Solaris</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- <entry>Not Supported</entry>
- <entry>Not Supported</entry>
- </row>
- <row>
- <entry>FreeBSD</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- <entry>Not Supported</entry>
- <entry>Not Supported</entry>
- </row>
- <row>
- <entry>Windows</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- <entry>Not Supported</entry>
- <entry>Not Supported</entry>
- </row>
- <row>
- <entry>Mac OS X</entry>
- <entry>Development Only</entry>
- <entry>Development Only</entry>
- <entry>Not Supported</entry>
- <entry>Not Supported</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- <para>For any operating system not explicitly mentioned as supported in
- the matrix, components may or may not work. The ZooKeeper community
- will fix obvious bugs that are reported for other platforms, but there
- is no full support.</para>
- </section>
-
- <section id="sc_requiredSoftware">
- <title>Required Software </title>
-
- <para>ZooKeeper runs in Java, release 1.6 or greater (JDK 6 or
- greater). It runs as an <emphasis>ensemble</emphasis> of
- ZooKeeper servers. Three ZooKeeper servers is the minimum
- recommended size for an ensemble, and we also recommend that
- they run on separate machines. At Yahoo!, ZooKeeper is
- usually deployed on dedicated RHEL boxes, with dual-core
- processors, 2GB of RAM, and 80GB IDE hard drives.</para>
- </section>
-
- </section>
-
- <section id="sc_zkMulitServerSetup">
- <title>Clustered (Multi-Server) Setup</title>
-
- <para>For reliable ZooKeeper service, you should deploy ZooKeeper in a
- cluster known as an <emphasis>ensemble</emphasis>. As long as a majority
- of the ensemble are up, the service will be available. Because ZooKeeper
- requires a majority, it is best to use an
- odd number of machines. For example, with four machines ZooKeeper can
- only handle the failure of a single machine; if two machines fail, the
- remaining two machines do not constitute a majority. However, with five
- machines ZooKeeper can handle the failure of two machines. </para>
- <note>
- <para>
- As mentioned in the
- <ulink url="zookeeperStarted.html">ZooKeeper Getting Started Guide</ulink>,
- a minimum of three servers is required for a fault tolerant
- clustered setup, and it is strongly recommended that you have an
- odd number of servers.
- </para>
- <para>Usually three servers is more than enough for a production
- install, but for maximum reliability during maintenance, you may
- wish to install five servers. With three servers, if you perform
- maintenance on one of them, you are vulnerable to a failure on one
- of the other two servers during that maintenance. If you have five
- of them running, you can take one down for maintenance, and know
- that you're still OK if one of the other four suddenly fails.
- </para>
- <para>Your redundancy considerations should include all aspects of
- your environment. If you have three ZooKeeper servers, but their
- network cables are all plugged into the same network switch, then
- the failure of that switch will take down your entire ensemble.
- </para>
- </note>
- <para>Here are the steps for setting up a server that will be part of an
- ensemble. These steps should be performed on every host in the
- ensemble:</para>
-
- <orderedlist>
- <listitem>
- <para>Install the Java JDK. You can use the native packaging system
- for your system, or download the JDK from:</para>
-
- <para><ulink
- url="http://java.sun.com/javase/downloads/index.jsp">http://java.sun.com/javase/downloads/index.jsp</ulink></para>
- </listitem>
-
- <listitem>
- <para>Set the Java heap size. This is very important to avoid
- swapping, which will seriously degrade ZooKeeper performance. To
- determine the correct value, use load tests, and make sure you are
- well below the usage limit that would cause you to swap. Be
- conservative - use a maximum heap size of 3GB for a 4GB
- machine.</para>
- </listitem>
-
- <listitem>
- <para>Install the ZooKeeper Server Package. It can be downloaded
- from:
- </para>
- <para>
- <ulink url="http://zookeeper.apache.org/releases.html">
- http://zookeeper.apache.org/releases.html
- </ulink>
- </para>
- </listitem>
-
- <listitem>
- <para>Create a configuration file. This file can be called anything.
- Use the following settings as a starting point:</para>
-
- <programlisting>
-tickTime=2000
-dataDir=/var/lib/zookeeper/
-clientPort=2181
-initLimit=5
-syncLimit=2
-server.1=zoo1:2888:3888
-server.2=zoo2:2888:3888
-server.3=zoo3:2888:3888</programlisting>
-
- <para>You can find the meanings of these and other configuration
- settings in the section <xref linkend="sc_configuration" />. A word
- though about a few here:</para>
-
- <para>Every machine that is part of the ZooKeeper ensemble should know
- about every other machine in the ensemble. You accomplish this with
- the series of lines of the form <emphasis
- role="bold">server.id=host:port:port</emphasis>. The parameters <emphasis
- role="bold">host</emphasis> and <emphasis
- role="bold">port</emphasis> are straightforward. You attribute the
- server id to each machine by creating a file named
- <filename>myid</filename>, one for each server, which resides in
- that server's data directory, as specified by the configuration file
- parameter <emphasis role="bold">dataDir</emphasis>.</para></listitem>
-
- <listitem><para>The myid file
- consists of a single line containing only the text of that machine's
- id. So <filename>myid</filename> of server 1 would contain the text
- "1" and nothing else. The id must be unique within the
- ensemble and should have a value between 1 and 255.</para>
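-
- <para>For instance, assuming the <emphasis role="bold">dataDir</emphasis>
- from the sample configuration above, the file for server 1 could be
- created like this (use 2 and 3 on the other two hosts):</para>
-
- <programlisting>
-$ echo 1 &gt; /var/lib/zookeeper/myid</programlisting>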
- </listitem>
-
- <listitem>
- <para>If your configuration file is set up, you can start a
- ZooKeeper server:</para>
-
- <para><computeroutput>$ java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf \
- org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg
- </computeroutput></para>
-
- <para>QuorumPeerMain starts a ZooKeeper server.
- <ulink url="http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/">JMX</ulink>
- management beans are also registered, which allows
- management through a JMX management console.
- The <ulink url="zookeeperJMX.html">ZooKeeper JMX
- document</ulink> contains details on managing ZooKeeper with JMX.
- </para>
-
- <para>See the script <emphasis>bin/zkServer.sh</emphasis>,
- which is included in the release, for an example
- of starting server instances.</para>
-
- </listitem>
-
- <listitem>
- <para>Test your deployment by connecting to the hosts:</para>
-
- <para>In Java, you can run the following command to execute
- simple operations:</para>
-
- <para><computeroutput>$ bin/zkCli.sh -server 127.0.0.1:2181</computeroutput></para>
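-
- <para>Once connected, a few basic commands at the client prompt are
- enough to confirm that the ensemble accepts reads and writes. The
- znode name <filename>/zk_test</filename> and its data are arbitrary
- placeholders:</para>
-
- <programlisting>
-ls /
-create /zk_test my_data
-get /zk_test
-delete /zk_test</programlisting>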
- </listitem>
- </orderedlist>
- </section>
-
- <section id="sc_singleAndDevSetup">
- <title>Single Server and Developer Setup</title>
-
- <para>If you want to set up ZooKeeper for development purposes, you will
- probably want to set up a single server instance of ZooKeeper, and then
- install either the Java or C client-side libraries and bindings on your
- development machine.</para>
-
- <para>The steps for setting up a single server instance are similar
- to the above, except the configuration file is simpler. You can find the
- complete instructions in the <ulink
- url="zookeeperStarted.html#sc_InstallingSingleMode">Installing and
- Running ZooKeeper in Single Server Mode</ulink> section of the <ulink
- url="zookeeperStarted.html">ZooKeeper Getting Started
- Guide</ulink>.</para>
-
- <para>For information on installing the client side libraries, refer to
- the <ulink url="zookeeperProgrammers.html#Bindings">Bindings</ulink>
- section of the <ulink url="zookeeperProgrammers.html">ZooKeeper
- Programmer's Guide</ulink>.</para>
- </section>
- </section>
-
- <section id="ch_administration">
- <title>Administration</title>
-
- <para>This section contains information about running and maintaining
- ZooKeeper and covers these topics: </para>
- <itemizedlist>
- <listitem>
- <para><xref linkend="sc_designing" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_provisioning" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_strengthsAndLimitations" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_administering" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_maintenance" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_supervision" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_monitoring" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_logging" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_troubleshooting" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_configuration" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_zkCommands" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_dataFileManagement" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_commonProblems" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_bestPractices" /></para>
- </listitem>
- </itemizedlist>
-
- <section id="sc_designing">
- <title>Designing a ZooKeeper Deployment</title>
-
- <para>The reliability of ZooKeeper rests on two basic assumptions.</para>
- <orderedlist>
- <listitem><para> Only a minority of servers in a deployment
- will fail. <emphasis>Failure</emphasis> in this context
- means a machine crash, or some error in the network that
- partitions a server off from the majority.</para>
- </listitem>
- <listitem><para> Deployed machines operate correctly. To
- operate correctly means to execute code correctly, to have
- clocks that work properly, and to have storage and network
- components that perform consistently.</para>
- </listitem>
- </orderedlist>
-
- <para>The sections below contain considerations for ZooKeeper
- administrators to maximize the probability for these assumptions
- to hold true. Some of these are cross-machine considerations,
- and others are things you should consider for each and every
- machine in your deployment.</para>
-
- <section id="sc_CrossMachineRequirements">
- <title>Cross Machine Requirements</title>
-
- <para>For the ZooKeeper service to be active, there must be a
- majority of non-failing machines that can communicate with
- each other. To create a deployment that can tolerate the
- failure of F machines, you should count on deploying 2xF+1
- machines. Thus, a deployment that consists of three machines
- can handle one failure, and a deployment of five machines can
- handle two failures. Note that a deployment of six machines
- can only handle two failures since three machines is not a
- majority. For this reason, ZooKeeper deployments are usually
- made up of an odd number of machines.</para>
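-
- <para>The arithmetic behind these figures, as a small worked table:
- a majority of N servers is floor(N/2)+1, and the ensemble tolerates
- N minus that many failures.</para>
-
- <programlisting>
-servers (N)   majority (floor(N/2)+1)   failures tolerated
-3             2                         1
-4             3                         1
-5             3                         2
-6             4                         2
-7             4                         3</programlisting>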
-
- <para>To achieve the highest probability of tolerating a failure
- you should try to make machine failures independent. For
- example, if most of the machines share the same switch,
- failure of that switch could cause a correlated failure and
- bring down the service. The same holds true of shared power
- circuits, cooling systems, etc.</para>
- </section>
-
- <section>
- <title>Single Machine Requirements</title>
-
- <para>If ZooKeeper has to contend with other applications for
- access to resources like storage media, CPU, network, or
- memory, its performance will suffer markedly. ZooKeeper has
- strong durability guarantees, which means it uses storage
- media to log changes before the operation responsible for the
- change is allowed to complete. You should be aware of this
- dependency then, and take great care if you want to ensure
- that ZooKeeper operations aren’t held up by your media. Here
- are some things you can do to minimize that sort of
- degradation:
- </para>
-
- <itemizedlist>
- <listitem>
- <para>ZooKeeper's transaction log must be on a dedicated
- device. (A dedicated partition is not enough.) ZooKeeper
- writes the log sequentially, without seeking. Sharing your
- log device with other processes can cause seeks and
- contention, which in turn can cause multi-second
- delays.</para>
- </listitem>
-
- <listitem>
- <para>Do not put ZooKeeper in a situation that can cause a
- swap. In order for ZooKeeper to function with any sort of
- timeliness, it simply cannot be allowed to swap.
- Therefore, make certain that the maximum heap size given
- to ZooKeeper is not bigger than the amount of real memory
- available to ZooKeeper. For more on this, see
- <xref linkend="sc_commonProblems"/>
- below. </para>
- </listitem>
- </itemizedlist>
- </section>
- </section>
-
- <section id="sc_provisioning">
- <title>Provisioning</title>
-
- <para></para>
- </section>
-
- <section id="sc_strengthsAndLimitations">
- <title>Things to Consider: ZooKeeper Strengths and Limitations</title>
-
- <para></para>
- </section>
-
- <section id="sc_administering">
- <title>Administering</title>
-
- <para></para>
- </section>
-
- <section id="sc_maintenance">
- <title>Maintenance</title>
-
- <para>Little long-term maintenance is required for a ZooKeeper
- cluster; however, you must be aware of the following:</para>
-
- <section>
- <title>Ongoing Data Directory Cleanup</title>
-
- <para>The ZooKeeper <ulink url="#var_datadir">Data
- Directory</ulink> contains files which are a persistent copy
- of the znodes stored by a particular serving ensemble. These
- are the snapshot and transactional log files. As changes are
- made to the znodes these changes are appended to a
- transaction log. Occasionally, when a log grows large, a
- snapshot of the current state of all znodes will be written
- to the filesystem and a new transaction log file is created
- for future transactions. During snapshotting, ZooKeeper may
- continue appending incoming transactions to the old log file.
- Therefore, some transactions which are newer than a snapshot
- may be found in the last transaction log preceding the
- snapshot.
- </para>
-
- <para>A ZooKeeper server <emphasis role="bold">will not remove
- old snapshots and log files</emphasis> when using the default
- configuration (see autopurge below); this is the
- responsibility of the operator. Every serving environment is
- different and therefore the requirements of managing these
- files may differ from install to install (backup for example).
- </para>
-
- <para>The PurgeTxnLog utility implements a simple retention
- policy that administrators can use. The <ulink
- url="ext:api/index">API docs</ulink> contain details on
- calling conventions (arguments, etc.).
- </para>
-
- <para>In the following example the last &lt;count&gt; snapshots and
- their corresponding logs are retained and the others are
- deleted. The value of &lt;count&gt; should typically be
- greater than 3 (although not required, this provides 3 backups
- in the unlikely event a recent log has become corrupted). This
- can be run as a cron job on the ZooKeeper server machines to
- clean up the logs daily.</para>
-
- <programlisting> java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.PurgeTxnLog &lt;dataDir&gt; &lt;snapDir&gt; -n &lt;count&gt;</programlisting>
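-
- <para>As a sketch of the daily cron job mentioned above, a crontab
- entry might look like the following. The installation directory
- <filename>/opt/zookeeper</filename>, the data directory, the 02:30
- schedule and the retention count of 5 are all assumptions to adapt
- to your install; here the snapshots and logs share one directory:</para>
-
- <programlisting>
-30 2 * * * cd /opt/zookeeper &amp;&amp; java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.PurgeTxnLog /var/lib/zookeeper /var/lib/zookeeper -n 5</programlisting>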
-
- <para>Automatic purging of the snapshots and corresponding
- transaction logs was introduced in version 3.4.0 and can be
- enabled via the following configuration parameters <emphasis
- role="bold">autopurge.snapRetainCount</emphasis> and <emphasis
- role="bold">autopurge.purgeInterval</emphasis>. For more on
- this, see <xref linkend="sc_advancedConfiguration"/>
- below.</para>
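-
- <para>A minimal sketch of enabling automatic purging in the
- configuration file; the one-hour interval is only an example
- value:</para>
-
- <programlisting>
-autopurge.snapRetainCount=3
-autopurge.purgeInterval=1</programlisting>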
- </section>
-
- <section>
- <title>Debug Log Cleanup (log4j)</title>
-
- <para>See the section on <ulink
- url="#sc_logging">logging</ulink> in this document. It is
- expected that you will set up a rolling file appender using the
- built-in log4j feature. The sample configuration file in the
- release tar's conf/log4j.properties provides an example of
- this.
- </para>
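-
- <para>For orientation, a rolling file appender in log4j 1.2
- properties syntax looks roughly like this. The appender name, log
- path, size limit, backup count and pattern are illustrative
- assumptions and not necessarily the values shipped in
- <filename>conf/log4j.properties</filename>:</para>
-
- <programlisting>
-log4j.rootLogger=INFO, ROLLINGFILE
-log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
-log4j.appender.ROLLINGFILE.File=/var/log/zookeeper/zookeeper.log
-log4j.appender.ROLLINGFILE.MaxFileSize=10MB
-log4j.appender.ROLLINGFILE.MaxBackupIndex=10
-log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
-log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t] - %m%n</programlisting>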
- </section>
-
- </section>
-
- <section id="sc_supervision">
- <title>Supervision</title>
-
- <para>You will want to have a supervisory process that manages
- each of your ZooKeeper server processes (JVM). The ZK server is
- designed to be "fail fast", meaning that it will shut down
- (process exit) if an error occurs that it cannot recover
- from. As a ZooKeeper serving cluster is highly reliable, this
- means that while the server may go down the cluster as a whole
- is still active and serving requests. Additionally, as the
- cluster is "self healing" the failed server once restarted will
- automatically rejoin the ensemble without any manual
- interaction.</para>
-
- <para>Having a supervisory process such as <ulink
- url="http://cr.yp.to/daemontools.html">daemontools</ulink> or
- <ulink
- url="http://en.wikipedia.org/wiki/Service_Management_Facility">SMF</ulink>
- manage your ZooKeeper server ensures that if the
- process does exit abnormally it will automatically be restarted
- and will quickly rejoin the cluster. These are just two examples;
- other supervisors are available, and the choice is up to you.</para>
- </section>
-
- <section id="sc_monitoring">
- <title>Monitoring</title>
-
- <para>The ZooKeeper service can be monitored in one of two
- primary ways: 1) the command port through the use of <ulink
- url="#sc_zkCommands">4 letter words</ulink> and 2) <ulink
- url="zookeeperJMX.html">JMX</ulink>. See the appropriate section for
- your environment/requirements.</para>
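-
- <para>As a quick illustration of the first approach, and assuming
- the <command>nc</command> utility is installed and a server is
- listening on the default client port on the local machine, a
- healthy server answers <command>ruok</command> with
- <computeroutput>imok</computeroutput>:</para>
-
- <programlisting>
-$ echo ruok | nc 127.0.0.1 2181</programlisting>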
- </section>
-
- <section id="sc_logging">
- <title>Logging</title>
-
- <para>ZooKeeper uses <emphasis role="bold">log4j</emphasis> version 1.2 as
- its logging infrastructure. The ZooKeeper default <filename>log4j.properties</filename>
- file resides in the <filename>conf</filename> directory. Log4j requires that
- <filename>log4j.properties</filename> either be in the working directory
- (the directory from which ZooKeeper is run) or be accessible from the classpath.</para>
-
- <para>For more information, see
- <ulink url="http://logging.apache.org/log4j/1.2/manual.html#defaultInit">Log4j Default Initialization Procedure</ulink>
- of the log4j manual.</para>
-
- </section>
-
- <section id="sc_troubleshooting">
- <title>Troubleshooting</title>
- <variablelist>
- <varlistentry>
- <term> Server not coming up because of file corruption</term>
- <listitem>
- <para>A server might not be able to read its database and fail to come up because of
- some file corruption in the transaction logs of the ZooKeeper server. You will
- see an IOException while the ZooKeeper database is loading. In such a case,
- make sure all the other servers in your ensemble are up and working. Use the "stat"
- command on the command port to see if they are in good health. After you have verified that
- all the other servers of the ensemble are up, you can go ahead and clean the database
- of the corrupt server. Delete all the files in datadir/version-2 and datalogdir/version-2/.
- Restart the server.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
-
- <section id="sc_configuration">
- <title>Configuration Parameters</title>
-
- <para>ZooKeeper's behavior is governed by the ZooKeeper configuration
- file. This file is designed so that the exact same file can be used by
- all the servers that make up a ZooKeeper ensemble, assuming the disk
- layouts are the same. If servers use different configuration files, care
- must be taken to ensure that the list of servers in all of the different
- configuration files matches.</para>
-
- <section id="sc_minimumConfiguration">
- <title>Minimum Configuration</title>
-
- <para>Here are the minimum configuration keywords that must be defined
- in the configuration file:</para>
-
- <variablelist>
- <varlistentry>
- <term>clientPort</term>
-
- <listitem>
- <para>the port to listen for client connections; that is, the
- port that clients attempt to connect to.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry id="var_datadir">
- <term>dataDir</term>
-
- <listitem>
- <para>the location where ZooKeeper will store the in-memory
- database snapshots and, unless specified otherwise, the
- transaction log of updates to the database.</para>
-
- <note>
- <para>Be careful where you put the transaction log. A
- dedicated transaction log device is key to consistent good
- performance. Putting the log on a busy device will adversely
- affect performance.</para>
- </note>
- </listitem>
- </varlistentry>
-
- <varlistentry id="id_tickTime">
- <term>tickTime</term>
-
- <listitem>
- <para>the length of a single tick, which is the basic time unit
- used by ZooKeeper, as measured in milliseconds. It is used to
- regulate heartbeats, and timeouts. For example, the minimum
- session timeout will be two ticks.</para>
- </listitem>
- </varlistentry>
- </variablelist>
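-
- <para>Put together, a minimal configuration file containing only
- these three keywords could look like this (the values are the ones
- used in the sample configuration earlier in this guide):</para>
-
- <programlisting>
-tickTime=2000
-dataDir=/var/lib/zookeeper/
-clientPort=2181</programlisting>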
- </section>
-
- <section id="sc_advancedConfiguration">
- <title>Advanced Configuration</title>
-
- <para>The configuration settings in this section are optional. You can
- use them to further fine tune the behaviour of your ZooKeeper servers.
- Some can also be set using Java system properties, generally of the
- form <emphasis>zookeeper.keyword</emphasis>. The exact system
- property, when available, is noted below.</para>
-
- <variablelist>
- <varlistentry>
- <term>dataLogDir</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>This option will direct the machine to write the
- transaction log to the <emphasis
- role="bold">dataLogDir</emphasis> rather than the <emphasis
- role="bold">dataDir</emphasis>. This allows a dedicated log
- device to be used, and helps avoid competition between logging
- and snapshots.</para>
-
- <note>
- <para>Having a dedicated log device has a large impact on
- throughput and stable latencies. It is highly recommended to
- dedicate a log device and set <emphasis
- role="bold">dataLogDir</emphasis> to point to a directory on
- that device, and then make sure to point <emphasis
- role="bold">dataDir</emphasis> to a directory
- <emphasis>not</emphasis> residing on that device.</para>
- </note>
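-
- <para>A sketch of such a layout, assuming a dedicated device is
- mounted at the hypothetical path <filename>/zk-txlog</filename>:</para>
-
- <programlisting>
-dataDir=/var/lib/zookeeper
-dataLogDir=/zk-txlog</programlisting>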
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>globalOutstandingLimit</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.globalOutstandingLimit.</emphasis>)</para>
-
- <para>Clients can submit requests faster than ZooKeeper can
- process them, especially if there are a lot of clients. To
- prevent ZooKeeper from running out of memory due to queued
- requests, ZooKeeper will throttle clients so that there are no
- more than globalOutstandingLimit outstanding requests in the
- system. The default limit is 1,000.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>preAllocSize</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.preAllocSize</emphasis>)</para>
-
- <para>To avoid seeks ZooKeeper allocates space in the
- transaction log file in blocks of preAllocSize kilobytes. The
- default block size is 64M. One reason for changing the size of
- the blocks is to reduce the block size if snapshots are taken
- more often. (Also, see <emphasis
- role="bold">snapCount</emphasis>).</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>snapCount</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.snapCount</emphasis>)</para>
-
- <para>ZooKeeper records its transactions using snapshots and
- a transaction log (think write-ahead log). The number of
- transactions recorded in the transaction log before a snapshot
- can be taken (and the transaction log rolled) is determined
- by snapCount. In order to prevent all of the machines in the quorum
- from taking a snapshot at the same time, each ZooKeeper server
- will take a snapshot when the number of transactions in the transaction log
- reaches a runtime generated random value in the [snapCount/2+1, snapCount]
- range. The default snapCount is 100,000.</para>
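-
- <para>As a worked example of that range, with the default value
- the lower bound is 100,000/2+1 = 50,001, so each server picks its
- own threshold somewhere between 50,001 and 100,000
- transactions:</para>
-
- <programlisting>
-snapCount=100000
-# per-server snapshot threshold: random value in [50001, 100000]</programlisting>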
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>maxClientCnxns</term>
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Limits the number of concurrent connections (at the socket
- level) that a single client, identified by IP address, may make
- to a single member of the ZooKeeper ensemble. This is used to
- prevent certain classes of DoS attacks, including file
- descriptor exhaustion. The default is 60. Setting this to 0
- entirely removes the limit on concurrent connections.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>clientPortAddress</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> the
- address (ipv4, ipv6 or hostname) to listen for client
- connections; that is, the address that clients attempt
- to connect to. This is optional; by default we bind in
- such a way that any connection to the <emphasis
- role="bold">clientPort</emphasis> for any
- address/interface/nic on the server will be
- accepted.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>minSessionTimeout</term>
- <listitem>
- <para>(No Java system property)</para>
-
- <para><emphasis role="bold">New in 3.3.0:</emphasis> the
- minimum session timeout in milliseconds that the server
- will allow the client to negotiate. Defaults to 2 times
- the <emphasis role="bold">tickTime</emphasis>.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>maxSessionTimeout</term>
- <listitem>
- <para>(No Java system property)</para>
-
- <para><emphasis role="bold">New in 3.3.0:</emphasis> the
- maximum session timeout in milliseconds that the server
- will allow the client to negotiate. Defaults to 20 times
- the <emphasis role="bold">tickTime</emphasis>.</para>
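-
- <para>As a worked example, with tickTime=2000 the defaults come to
- 2 x 2000 = 4000 ms for the minimum and 20 x 2000 = 40000 ms for the
- maximum. Written out explicitly in the configuration file, those
- same values would be:</para>
-
- <programlisting>
-tickTime=2000
-minSessionTimeout=4000
-maxSessionTimeout=40000</programlisting>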
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>fsync.warningthresholdms</term>
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.fsync.warningthresholdms</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.3.4:</emphasis> A
- warning message will be output to the log whenever an
- fsync in the Transactional Log (WAL) takes longer than
- this value. The value is specified in milliseconds and
- defaults to 1000. This value can only be set as a
- system property.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>autopurge.snapRetainCount</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para><emphasis role="bold">New in 3.4.0:</emphasis>
- When enabled, ZooKeeper auto purge feature retains
- the <emphasis role="bold">autopurge.snapRetainCount</emphasis> most
- recent snapshots and the corresponding transaction logs in the
- <emphasis role="bold">dataDir</emphasis> and <emphasis
- role="bold">dataLogDir</emphasis> respectively and deletes the rest.
- Defaults to 3. Minimum value is 3.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>autopurge.purgeInterval</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para><emphasis role="bold">New in 3.4.0:</emphasis> The
- time interval in hours for which the purge task has to
- be triggered. Set to a positive integer (1 and above)
- to enable the auto purging. Defaults to 0.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>syncEnabled</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.observer.syncEnabled</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.4.6, 3.5.0:</emphasis>
- The observers now log transaction and write snapshot to disk
- by default like the participants. This reduces the recovery time
- of the observers on restart. Set to "false" to disable this
- feature. Default is "true"</para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
-
- <section id="sc_clusterOptions">
- <title>Cluster Options</title>
-
- <para>The options in this section are designed for use with an ensemble
- of servers -- that is, when deploying clusters of servers.</para>
-
- <variablelist>
- <varlistentry>
- <term>electionAlg</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Election implementation to use. A value of "0" corresponds
- to the original UDP-based version, "1" corresponds to the
- non-authenticated UDP-based version of fast leader election, "2"
- corresponds to the authenticated UDP-based version of fast
- leader election, and "3" corresponds to TCP-based version of
- fast leader election. Currently, algorithm 3 is the default.</para>
-
- <note>
- <para> The implementations of leader election 0, 1, and 2 are now
- <emphasis role="bold"> deprecated </emphasis>. We have the intention
- of removing them in the next release, at which point only the
- FastLeaderElection will be available.
- </para>
- </note>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>initLimit</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Amount of time, in ticks (see <ulink
- url="#id_tickTime">tickTime</ulink>), to allow followers to
- connect and sync to a leader. Increase this value as needed, if
- the amount of data managed by ZooKeeper is large.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>leaderServes</term>
-
- <listitem>
- <para>(Java system property: zookeeper.<emphasis
- role="bold">leaderServes</emphasis>)</para>
-
- <para>Leader accepts client connections. Default value is "yes".
- The leader machine coordinates updates. For higher update
- throughput at the slight expense of read throughput the leader
- can be configured to not accept clients and focus on
- coordination. The default to this option is yes, which means
- that a leader will accept client connections.</para>
-
- <note>
- <para>Turning on leader selection is highly recommended when
- you have more than three ZooKeeper servers in an ensemble.</para>
- </note>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>server.x=[hostname]:nnnnn[:nnnnn], etc</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>servers making up the ZooKeeper ensemble. When the server
- starts up, it determines which server it is by looking for the
- file <filename>myid</filename> in the data directory. That file
- contains the server number, in ASCII, and it should match
- <emphasis role="bold">x</emphasis> in <emphasis
- role="bold">server.x</emphasis> in the left hand side of this
- setting.</para>
-
- <para>The list of ZooKeeper servers that clients use to connect
- must match the list of ZooKeeper servers
- that each ZooKeeper server has.</para>
-
- <para>There are two port numbers <emphasis role="bold">nnnnn</emphasis>.
- Followers use the first to connect to the leader, and the second is for
- leader election. The leader election port is only necessary if electionAlg
- is 1, 2, or 3 (default). If electionAlg is 0, then the second port is not
- necessary. If you want to test multiple servers on a single machine, then
- different ports can be used for each server.</para>
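-
- <para>For instance, a three-server test ensemble on one machine
- could use distinct port pairs like these. The port numbers are
- arbitrary, and each instance would also need its own
- <emphasis role="bold">dataDir</emphasis> and
- <emphasis role="bold">clientPort</emphasis>:</para>
-
- <programlisting>
-server.1=localhost:2888:3888
-server.2=localhost:2889:3889
-server.3=localhost:2890:3890</programlisting>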
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>syncLimit</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Amount of time, in ticks (see <ulink
- url="#id_tickTime">tickTime</ulink>), to allow followers to sync
- with ZooKeeper. If followers fall too far behind a leader, they
- will be dropped.</para>
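-
- <para>Because this limit and initLimit are expressed in ticks, the
- wall-clock values depend on tickTime. With the sample settings used
- earlier in this guide:</para>
-
- <programlisting>
-tickTime=2000
-# initLimit: 5 ticks x 2000 ms = 10 seconds
-initLimit=5
-# syncLimit: 2 ticks x 2000 ms = 4 seconds
-syncLimit=2</programlisting>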
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>group.x=nnnnn[:nnnnn]</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Enables a hierarchical quorum construction. "x" is a group identifier
- and the numbers following the "=" sign correspond to server identifiers.
- The right-hand side of the assignment is a colon-separated list of server
- identifiers. Note that groups must be disjoint and the union of all groups
- must be the ZooKeeper ensemble. </para>
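-
- <para>As an illustration, a nine-server ensemble split into three
- disjoint groups of three could be declared as follows (the server
- ids are assumed to match the <emphasis role="bold">server.x</emphasis>
- entries in the same file):</para>
-
- <programlisting>
-group.1=1:2:3
-group.2=4:5:6
-group.3=7:8:9</programlisting>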
-
- <para> You will find an example <ulink url="zookeeperHierarchicalQuorums.html">here</ulink>
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>weight.x=nnnnn</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Used along with "group", it assigns a weight to a server when
- forming quorums. Such a value corresponds to the weight of a server
- when voting. There are a few parts of ZooKeeper that require voting
- such as leader election and the atomic broadcast protocol. By default
- the weight of a server is 1. If the configuration defines groups, but not
- weights, then a value of 1 will be assigned to all servers.
- </para>
-
- <para> You will find an example <ulink url="zookeeperHierarchicalQuorums.html">here</ulink>
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>cnxTimeout</term>
-
- <listitem>
- <para>(Java system property: zookeeper.<emphasis
- role="bold">cnxTimeout</emphasis>)</para>
-
- <para>Sets the timeout value for opening connections for leader election notifications.
- Only applicable if you are using electionAlg 3.
- </para>
-
- <note>
- <para>Default value is 5 seconds.</para>
- </note>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>4lw.commands.whitelist</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.4lw.commands.whitelist</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.4.10:</emphasis>
- This property contains a list of comma separated
- <ulink url="#sc_zkCommands">Four Letter Words</ulink> commands. It is introduced
- to provide fine grained control over the set of commands ZooKeeper can execute,
- so users can turn off certain commands if necessary.
- By default it contains all supported four letter word commands except "wchp" and "wchc",
- if the property is not specified. If the property is specified, then only commands listed
- in the whitelist are enabled.
- </para>
-
- <para>Here's an example of the configuration that enables the stat, ruok, conf, and isro
- commands while disabling the rest of the Four Letter Words commands:</para>
- <programlisting>
- 4lw.commands.whitelist=stat, ruok, conf, isro
- </programlisting>
-
- <para>Users can also use the asterisk option so they don't have to include every command one by one in the list.
- As an example, this will enable all four letter word commands:
- </para>
- <programlisting>
- 4lw.commands.whitelist=*
- </programlisting>
-
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>ipReachableTimeout</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.ipReachableTimeout</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.4.11:</emphasis>
- Sets the timeout, measured in milliseconds, for checking whether an IP address is
- reachable when a hostname is resolved.
- By default, ZooKeeper will use the first IP address of the hostname (without any reachability check).
- When zookeeper.ipReachableTimeout is set (larger than 0), ZooKeeper will try to pick the first
- IP address which is reachable. This is done by calling the Java API InetAddress.isReachable(int timeout)
- function, in which this timeout value is used. If no reachable IP address can be found, the
- first IP address of the hostname will be used anyway.
- </para>
-
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>tcpKeepAlive</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.tcpKeepAlive</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.4.11:</emphasis>
- Setting this to true sets the TCP keepAlive flag on the
- sockets used by quorum members to perform elections.
- This will allow for connections between quorum members to
- remain up when there is network infrastructure that may
- otherwise break them. Some NATs and firewalls may terminate
- or lose state for long running or idle connections.</para>
-
- <para>This option relies on OS-level settings to work
- properly; check your operating system's TCP
- keepalive options for more information. Defaults to
- <emphasis role="bold">false</emphasis>.
- </para>
- </listitem>
- </varlistentry>
-
- </variablelist>
- <para></para>
- </section>
-
- <section id="sc_authOptions">
- <title>Authentication &amp; Authorization Options</title>
-
- <para>The options in this section allow control over
- authentication/authorization performed by the service.</para>
-
- <variablelist>
- <varlistentry>
- <term>zookeeper.DigestAuthenticationProvider.superDigest</term>
-
- <listitem>
- <para>(Java system property only: <emphasis
- role="bold">zookeeper.DigestAuthenticationProvider.superDigest</emphasis>)</para>
-
- <para>By default this feature is <emphasis
- role="bold">disabled</emphasis></para>
-
- <para><emphasis role="bold">New in 3.2:</emphasis>
- Enables a ZooKeeper ensemble administrator to access the
- znode hierarchy as a "super" user. In particular no ACL
- checking occurs for a user authenticated as
- super.</para>
-
- <para>org.apache.zookeeper.server.auth.DigestAuthenticationProvider
- can be used to generate the superDigest. Call it with
- one parameter of "super:&lt;password&gt;". Provide the
- generated "super:&lt;data&gt;" as the system property value
- when starting each server of the ensemble.</para>
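- <para>As a rough illustration of what the generated value looks like, the following Python sketch
- builds a digest string of the same shape. It assumes that the provider hashes the whole
- "user:password" string with SHA-1 and Base64-encodes the result; that is an assumption of this
- sketch, and the function name is hypothetical. Compare its output with the value produced by
- DigestAuthenticationProvider itself before relying on it.</para>

```python
import base64
import hashlib


def generate_super_digest(id_password: str) -> str:
    """Return 'user:base64(sha1(user:password))' for input such as 'super:secret'.

    Assumption: SHA-1 over the full 'user:password' string, then Base64.
    """
    user, sep, _password = id_password.partition(":")
    if not sep:
        raise ValueError("expected input of the form user:password")
    digest = hashlib.sha1(id_password.encode("utf-8")).digest()
    return user + ":" + base64.b64encode(digest).decode("ascii")


# The password here is a placeholder chosen for illustration only.
print(generate_super_digest("super:example-password"))
```

- <para>The printed string is what would be passed as the value of the
- zookeeper.DigestAuthenticationProvider.superDigest system property.</para>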
-
- <para>When authenticating to a ZooKeeper server (from a
- ZooKeeper client) pass a scheme of "digest" and authdata
- of "super:&lt;password&gt;". Note that digest auth passes
- the authdata in plaintext to the server, so it is
- prudent to use this authentication method only on
- localhost (not over the network) or over an encrypted
- connection.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>isro</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.4.0:</emphasis> Tests if
- server is running in read-only mode. The server will respond with
- "ro" if in read-only mode or "rw" if not in read-only mode.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>gtmk</term>
-
- <listitem>
- <para>Gets the current trace mask as a 64-bit signed long value in
- decimal format. See <command>stmk</command> for an explanation of
- the possible values.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>stmk</term>
-
- <listitem>
- <para>Sets the current trace mask. The trace mask is 64 bits,
- where each bit enables or disables a specific category of trace
- logging on the server. Log4J must be configured to enable
- <command>TRACE</command> level first in order to see trace logging
- messages. The bits of the trace mask correspond to the following
- trace logging categories.</para>
-
- <table>
- <title>Trace Mask Bit Values</title>
- <tgroup cols="2" align="left" colsep="1" rowsep="1">
- <tbody>
- <row>
- <entry>0b0000000001</entry>
- <entry>Unused, reserved for future use.</entry>
- </row>
- <row>
- <entry>0b0000000010</entry>
- <entry>Logs client requests, excluding ping
- requests.</entry>
- </row>
- <row>
- <entry>0b0000000100</entry>
- <entry>Unused, reserved for future use.</entry>
- </row>
- <row>
- <entry>0b0000001000</entry>
- <entry>Logs client ping requests.</entry>
- </row>
- <row>
- <entry>0b0000010000</entry>
- <entry>Logs packets received from the quorum peer that is
- the current leader, excluding ping requests.</entry>
- </row>
- <row>
- <entry>0b0000100000</entry>
- <entry>Logs addition, removal and validation of client
- sessions.</entry>
- </row>
- <row>
- <entry>0b0001000000</entry>
- <entry>Logs delivery of watch events to client
- sessions.</entry>
- </row>
- <row>
- <entry>0b0010000000</entry>
- <entry>Logs ping packets received from the quorum peer
- that is the current leader.</entry>
- </row>
- <row>
- <entry>0b0100000000</entry>
- <entry>Unused, reserved for future use.</entry>
- </row>
- <row>
- <entry>0b1000000000</entry>
- <entry>Unused, reserved for future use.</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- <para>All remaining bits in the 64-bit value are unused and
- reserved for future use. Multiple trace logging categories are
- specified by calculating the bitwise OR of the documented values.
- The default trace mask is 0b0100110010. Thus, by default, trace
- logging includes client requests, packets received from the
- leader and sessions.</para>
-
- <para>To set a different trace mask, send a request containing the
- <command>stmk</command> four-letter word followed by the trace
- mask represented as a 64-bit signed long value. This example uses
- the Perl <command>pack</command> function to construct a trace
- mask that enables all trace logging categories described above and
- convert it to a 64-bit signed long value with big-endian byte
- order. The result is appended to <command>stmk</command> and sent
- to the server using netcat. The server responds with the new
- trace mask in decimal format.</para>
-
- <programlisting>$ perl -e "print 'stmk', pack('q>', 0b0011111010)" | nc localhost 2181
-250
- </programlisting>
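- <para>The same request can be sketched in Python. This is a minimal illustration that assumes only
- what the text above states: the payload is "stmk" followed by the mask as a big-endian signed
- 64-bit value. The constant and function names are hypothetical; the bit values are copied from the
- table above.</para>

```python
import struct

# Bit values copied from the "Trace Mask Bit Values" table; names are illustrative.
CLIENT_REQUESTS = 0b0000000010
CLIENT_PINGS = 0b0000001000
LEADER_PACKETS = 0b0000010000
SESSIONS = 0b0000100000
WATCH_EVENTS = 0b0001000000
LEADER_PINGS = 0b0010000000


def build_stmk_request(mask: int) -> bytes:
    """Return 'stmk' followed by the mask as a big-endian signed 64-bit value."""
    return b"stmk" + struct.pack(">q", mask)


# Multiple categories are combined with bitwise OR.
all_documented = (CLIENT_REQUESTS | CLIENT_PINGS | LEADER_PACKETS
                  | SESSIONS | WATCH_EVENTS | LEADER_PINGS)
request = build_stmk_request(all_documented)
print(all_documented)  # 250
print(request.hex())   # 73746d6b00000000000000fa
```

- <para>The resulting 12 bytes are what the Perl example pipes into netcat.</para>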
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
-
- <section>
- <title>Experimental Options/Features</title>
-
- <para>New features that are currently considered experimental.</para>
-
- <variablelist>
- <varlistentry>
- <term>Read Only Mode Server</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">readonlymode.enabled</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.4.0:</emphasis>
- Setting this value to true enables Read Only Mode server
- support (disabled by default). ROM allows client
- sessions that requested ROM support to connect to the
- server even when the server might be partitioned from
- the quorum. In this mode ROM clients can still read
- values from the ZK service, but will be unable to write
- values or see changes from other clients. See
- ZOOKEEPER-784 for more details.
- </para>
- </listitem>
- </varlistentry>
-
- </variablelist>
- </section>
-
- <section>
- <title>Unsafe Options</title>
-
- <para>The following options can be useful, but be careful when you use
- them. The risk of each is explained along with the explanation of what
- the variable does.</para>
-
- <variablelist>
- <varlistentry>
- <term>forceSync</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.forceSync</emphasis>)</para>
-
- <para>Requires updates to be synced to media of the transaction
- log before finishing processing the update. If this option is
- set to no, ZooKeeper will not require updates to be synced to
- the media.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>jute.maxbuffer</term>
-
- <listitem>
- <para>(Java system property: <emphasis role="bold">jute.maxbuffer</emphasis>)</para>
-
- <para>This option can only be set as a Java system property.
- There is no zookeeper prefix on it. It specifies the maximum
- size of the data that can be stored in a znode. The default is
- 0xfffff (1,048,575 bytes), or just under 1M. If this option is changed, the system
- property must be set on all servers and clients; otherwise
- problems will arise. This is really a sanity check. ZooKeeper is
- designed to store data on the order of kilobytes in size.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>skipACL</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.skipACL</emphasis>)</para>
-
- <para>Skips ACL checks. This results in a boost in throughput,
- but opens up full access to the data tree to everyone.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>quorumListenOnAllIPs</term>
-
- <listitem>
- <para>When set to true the ZooKeeper server will listen
- for connections from its peers on all available IP addresses,
- and not only the address configured in the server list of the
- configuration file. It affects the connections handling the
- ZAB protocol and the Fast Leader Election protocol. Default
- value is <emphasis role="bold">false</emphasis>.</para>
- </listitem>
- </varlistentry>
-
- </variablelist>
- </section>
-
- <section>
- <title>Communication using the Netty framework</title>
-
- <para><emphasis role="bold">New in
- 3.4:</emphasis> <ulink url="http://jboss.org/netty">Netty</ulink>
- is an NIO-based client/server communication framework. It
- simplifies (compared with using NIO directly) many of the
- complexities of network-level communication for Java
- applications. Additionally, the Netty framework has built-in
- support for encryption (SSL) and authentication
- (certificates). These are optional features and can be
- turned on or off individually.
- </para>
- <para>Prior to version 3.4 ZooKeeper always used NIO
- directly. In versions 3.4 and later, Netty is
- supported as an alternative that replaces NIO. NIO continues to
- be the default; however, Netty-based communication can be
- used in place of NIO by setting the Java system property
- "zookeeper.serverCnxnFactory" to
- "org.apache.zookeeper.server.NettyServerCnxnFactory". You
- have the option of setting this on either the client(s) or
- the server(s). Typically you would want to set this on both,
- but that is at your discretion.
- </para>
- <para>
- TBD - tuning options for netty - currently there are none that are netty specific but we should add some. Esp around max bound on the number of reader worker threads netty creates.
- </para>
- <para>
- TBD - how to manage encryption
- </para>
- <para>
- TBD - how to manage certificates
- </para>
-
- </section>
-
- </section>
-
- <section id="sc_zkCommands">
- <title>ZooKeeper Commands: The Four Letter Words</title>
-
- <para>ZooKeeper responds to a small set of commands. Each command is
- composed of four letters. You issue the commands to ZooKeeper via telnet
- or nc, at the client port.</para>
-
- <para>Three of the more interesting commands: "stat" gives some
- general information about the server and connected clients,
- while "srvr" and "cons" give extended details on server and
- connections respectively.</para>
-
- <variablelist>
- <varlistentry>
- <term>conf</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Print
- details about serving configuration.</para>
- </listitem>
-
- </varlistentry>
-
- <varlistentry>
- <term>cons</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> List
- full connection/session details for all clients connected
- to this server. Includes information on numbers of packets
- received/sent, session id, operation latencies, last
- operation performed, etc...</para>
- </listitem>
-
- </varlistentry>
-
- <varlistentry>
- <term>crst</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Reset
- connection/session statistics for all connections.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>dump</term>
-
- <listitem>
- <para>Lists the outstanding sessions and ephemeral nodes. This
- only works on the leader.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>envi</term>
-
- <listitem>
- <para>Print details about the serving environment.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>ruok</term>
-
- <listitem>
- <para>Tests if server is running in a non-error state. The server
- will respond with imok if it is running. Otherwise it will not
- respond at all.</para>
-
- <para>A response of "imok" does not necessarily indicate that the
- server has joined the quorum, just that the server process is active
- and bound to the specified client port. Use "stat" for details on
- state with respect to quorum and client connection information.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>srst</term>
-
- <listitem>
- <para>Reset server statistics.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>srvr</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
- full details for the server.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>stat</term>
-
- <listitem>
- <para>Lists brief details for the server and connected
- clients.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>wchs</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
- brief information on watches for the server.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>wchc</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
- detailed information on watches for the server, by
- session. This outputs a list of sessions (connections)
- with associated watches (paths). Note that, depending on the
- number of watches, this operation may be expensive (i.e.,
- impact server performance), so use it carefully.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>wchp</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
- detailed information on watches for the server, by path.
- This outputs a list of paths (znodes) with associated
- sessions. Note that, depending on the number of watches, this
- operation may be expensive (i.e., impact server performance),
- so use it carefully.</para>
- </listitem>
- </varlistentry>
-
-
- <varlistentry>
- <term>mntr</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.4.0:</emphasis> Outputs a list
- of variables that could be used for monitoring the health of the cluster.</para>
-
- <programlisting>$ echo mntr | nc localhost 2185
-
-zk_version 3.4.0
-zk_avg_latency 0
-zk_max_latency 0
-zk_min_latency 0
-zk_packets_received 70
-zk_packets_sent 69
-zk_outstanding_requests 0
-zk_server_state leader
-zk_znode_count 4
-zk_watch_count 0
-zk_ephemerals_count 0
-zk_approximate_data_size 27
-zk_followers 4 - only exposed by the Leader
-zk_synced_followers 4 - only exposed by the Leader
-zk_pending_syncs 0 - only exposed by the Leader
-zk_open_file_descriptor_count 23 - only available on Unix platforms
-zk_max_file_descriptor_count 1024 - only available on Unix platforms
-zk_fsync_threshold_exceed_count 0
-</programlisting>
-
- <para>The output is compatible with the Java properties format, and the content
- may change over time (new keys may be added). Your scripts should expect changes.</para>
-
- <para>ATTENTION: Some of the keys are platform specific and some of the keys are only exported by the Leader. </para>
-
- <para>The output contains multiple lines with the following format:</para>
- <programlisting>key \t value</programlisting>
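- <para>A monitoring script might read this output along the following lines. This is a minimal
- Python sketch that assumes the tab-separated "key, tab, value" layout described above; the function
- name and the sample text are hypothetical.</para>

```python
def parse_mntr(text: str) -> dict:
    """Parse mntr output into a dict, assuming one tab-separated key/value pair per line."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # skip blank or unexpected lines
        metrics[key.strip()] = value.strip()
    return metrics


# Hypothetical sample shaped like the mntr output shown above.
sample = "zk_version\t3.4.0\nzk_avg_latency\t0\nzk_server_state\tleader\nzk_followers\t4\n"
metrics = parse_mntr(sample)

# Keys may be added over time or be absent on some servers and platforms,
# so look them up with a default instead of indexing directly.
print(metrics.get("zk_server_state"))
print(metrics.get("zk_open_file_descriptor_count", "n/a"))
```

- <para>Looking keys up with a default value keeps the script working when a key is missing, for
- example on a follower or on a non-Unix platform.</para>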
- </listitem>
- </varlistentry>
- </variablelist>
-
- <para>Here's an example of the <emphasis role="bold">ruok</emphasis>
- command:</para>
-
- <programlisting>$ echo ruok | nc 127.0.0.1 5111
-imok
-</programlisting>
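- <para>Where telnet or nc is not available, the same exchange can be sketched with a plain TCP
- socket. This Python sketch assumes that the server writes its reply and then closes the
- connection; the function name is hypothetical, and the sketch does not handle commands that carry
- a payload, such as stmk.</para>

```python
import socket


def send_four_letter_word(host: str, port: int, command: str, timeout: float = 5.0) -> str:
    """Send one four letter word and return everything the server writes back."""
    if len(command) != 4:
        raise ValueError("expected a four letter command")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # assumption: the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

- <para>For the example above, send_four_letter_word("127.0.0.1", 5111, "ruok") would be expected to
- return "imok" when the server is running.</para>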
-
-
- </section>
-
- <section id="sc_dataFileManagement">
- <title>Data File Management</title>
-
- <para>ZooKeeper stores its data in a data directory and its transaction
- log in a transaction log directory. By default these two directories are
- the same. The server can (and should) be configured to store the
- transaction log files in a directory separate from the data files.
- Throughput increases and latency decreases when transaction logs reside
- on a dedicated log device.</para>
-
- <section>
- <title>The Data Directory</title>
-
- <para>This directory has two files in it:</para>
-
- <itemizedlist>
- <listitem>
- <para><filename>myid</filename> - contains a single integer in
- human readable ASCII text that represents the server id.</para>
- </listitem>
-
- <listitem>
- <para><filename>snapshot.&lt;zxid&gt;</filename> - holds the fuzzy
- snapshot of a data tree.</para>
- </listitem>
- </itemizedlist>
-
- <para>Each ZooKeeper server has a unique id. This id is used in two
- places: the <filename>myid</filename> file and the configuration file.
- The <filename>myid</filename> file identifies the server that
- corresponds to the given data directory. The configuration file lists
- the contact information for each server identified by its server id.
- When a ZooKeeper server instance starts, it reads its id from the
- <filename>myid</filename> file and then, using that id, reads from the
- configuration file, looking up the port on which it should
- listen.</para>
-
- <para>The <filename>snapshot</filename> files stored in the data
- directory are fuzzy snapshots in the sense that during the time the
- ZooKeeper server is taking the snapshot, updates are occurring to the
- data tree. The suffix of the <filename>snapshot</filename> file names
- is the <emphasis>zxid</emphasis>, the ZooKeeper transaction id, of the
- last committed transaction at the start of the snapshot. Thus, the
- snapshot includes a subset of the updates to the data tree that
- occurred while the snapshot was in progress. The snapshot, then, may
- not correspond to any data tree that actually existed, and for this
- reason we refer to it as a fuzzy snapshot. Still, ZooKeeper can
- recover using this snapshot because it takes advantage of the
- idempotent nature of its updates. By replaying the transaction log
- against fuzzy snapshots ZooKeeper gets the state of the system at the
- end of the log.</para>
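- <para>The recovery argument can be illustrated with a toy model. The Python sketch below is not
- ZooKeeper's implementation: it assumes a data tree modelled as a dictionary and a log of
- idempotent "set" and "delete" transactions, and all names in it are hypothetical. It shows a
- snapshot that never matched any real state still converging to the correct final state once the
- log is replayed.</para>

```python
def apply_txn(tree: dict, txn: tuple) -> None:
    """Apply one idempotent transaction: ('set', path, value) or ('delete', path)."""
    if txn[0] == "set":
        tree[txn[1]] = txn[2]
    elif txn[0] == "delete":
        tree.pop(txn[1], None)


def recover(snapshot: dict, snapshot_zxid: int, log: list) -> dict:
    """Replay every logged transaction newer than the zxid in the snapshot's suffix."""
    tree = dict(snapshot)
    for zxid, txn in log:
        if zxid > snapshot_zxid:
            apply_txn(tree, txn)
    return tree


log = [
    (1, ("set", "/a", 1)),
    (2, ("set", "/b", 1)),
    (3, ("set", "/a", 2)),  # committed while the snapshot was being written
    (4, ("set", "/b", 2)),  # committed while the snapshot was being written
]
# The snapshot started after zxid 2. It serialized /a before txn 3 and /b after
# txn 4, so it holds a combination that never existed as a real tree state.
fuzzy_snapshot = {"/a": 1, "/b": 2}
recovered = recover(fuzzy_snapshot, snapshot_zxid=2, log=log)
print(recovered)  # {'/a': 2, '/b': 2}
```

- <para>Transaction 4 is applied twice in effect (once in the snapshot, once in the replay), which is
- harmless only because setting a value is idempotent.</para>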
- </section>
-
- <section>
- <title>The Log Directory</title>
-
- <para>The Log Directory contains the ZooKeeper transaction logs.
- Before any update takes place, ZooKeeper ensures that the transaction
- that represents the update is written to non-volatile storage. A new
- log file is started when the number of transactions written to the
- current log file reaches a (variable) threshold. The threshold is
- computed using the same parameter which influences the frequency of
- snapshotting (see snapCount above). The log file's suffix is the first
- zxid written to that log.</para>
- </section>
-
- <section id="sc_filemanagement">
- <title>File Management</title>
-
- <para>The format of snapshot and log files does not change between
- standalone ZooKeeper servers and different configurations of
- replicated ZooKeeper servers. Therefore, you can pull these files from
- a running replicated ZooKeeper server to a development machine with a
- standalone ZooKeeper server for troubleshooting.</para>
-
- <para>Using older log and snapshot files, you can look at the previous
- state of ZooKeeper servers and even restore that state. The
- LogFormatter class allows an administrator to look at the transactions
- in a log.</para>
-
- <para>The ZooKeeper server creates snapshot and log files, but
- never deletes them. The retention policy of the data and log
- files is implemented outside of the ZooKeeper server. The
- server itself only needs the latest complete fuzzy snapshot, all log
- files following it, and the last log file preceding it. The latter
- requirement is necessary to include updates which happened after this
- snapshot was started but went into the existing log file at that time.
- This is possible because snapshotting and rolling over of logs
- proceed somewhat independently in ZooKeeper. See the
- <ulink url="#sc_maintenance">maintenance</ulink> section in
- this document for more details on setting a retention policy
- and maintenance of ZooKeeper storage.
- </para>
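- <para>The rule above for which files the server still needs can be sketched as follows. This Python
- sketch assumes file names of the form snapshot.zxid and log.zxid with a hexadecimal zxid suffix,
- and it assumes that the newest snapshot is complete, which cannot be checked from the name alone.
- The function name is hypothetical, and the sketch is an illustration of the rule, not a
- replacement for the procedure in the maintenance section.</para>

```python
def files_to_keep(names: list) -> set:
    """Return the snapshot and log files that the rule above says are still needed."""
    snapshots, logs = [], []
    for name in names:
        prefix, _, suffix = name.partition(".")
        if prefix == "snapshot":
            snapshots.append((int(suffix, 16), name))
        elif prefix == "log":
            logs.append((int(suffix, 16), name))
    if not snapshots:
        return {name for _, name in logs}  # no snapshot: every log is still needed
    snap_zxid, snap_name = max(snapshots)  # latest snapshot (assumed complete)
    keep = {snap_name}
    following = [(z, n) for z, n in logs if z > snap_zxid]
    preceding = [(z, n) for z, n in logs if z <= snap_zxid]
    keep.update(n for _, n in following)
    if preceding:
        keep.add(max(preceding)[1])  # last log file started before the snapshot
    return keep


# Hypothetical directory listing.
names = ["snapshot.0", "log.1", "log.5", "snapshot.a", "log.c", "log.f"]
print(sorted(files_to_keep(names)))  # ['log.5', 'log.c', 'log.f', 'snapshot.a']
```

- <para>Here log.5 is kept although it starts before snapshot.a, because it may contain updates that
- were written after the snapshot began.</para>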
- <note>
- <para>The data stored in these files is not encrypted. In the case of
- storing sensitive data in ZooKeeper, necessary measures need to be
- taken to prevent unauthorized access. Such measures are external to
- ZooKeeper (e.g., control access to the files) and depend on the
- individual settings in which it is being deployed. </para>
- </note>
- </section>
-
- <section>
- <title>Recovery - TxnLogToolkit</title>
-
- <para>TxnLogToolkit is a command line tool shipped with ZooKeeper which
- is capable of recovering transaction log entries with broken CRC.</para>
- <para>When run without any command line parameters or with the "-h,--help"
- argument, it outputs the following help page:</para>
-
- <programlisting>
- $ bin/zkTxnLogToolkit.sh
-
- usage: TxnLogToolkit [-dhrv] txn_log_file_name
- -d,--dump Dump mode. Dump all entries of the log file. (this is the default)
- -h,--help Print help message
- -r,--recover Recovery mode. Re-calculate CRC for broken entries.
- -v,--verbose Be verbose in recovery mode: print all entries, not just fixed ones.
- -y,--yes Non-interactive mode: repair all CRC errors without asking
- </programlisting>
-
- <para>The default behaviour is safe: it dumps the entries of the given
- transaction log file to the screen: (same as using '-d,--dump' parameter)</para>
-
- <programlisting>
- $ bin/zkTxnLogToolkit.sh log.100000001
- ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
- 4/5/18 2:15:58 PM CEST session 0x16295bafcc40000 cxid 0x0 zxid 0x100000001 createSession 30000
- <emphasis role="bold">CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null</emphasis>
- 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
- 4/5/18 2:16:12 PM CEST session 0x26295bafcc90000 cxid 0x0 zxid 0x100000003 createSession 30000
- 4/5/18 2:17:34 PM CEST session 0x26295bafcc90000 cxid 0x0 zxid 0x200000001 closeSession null
- 4/5/18 2:17:34 PM CEST session 0x16295bd23720000 cxid 0x0 zxid 0x200000002 createSession 30000
- 4/5/18 2:18:02 PM CEST session 0x16295bd23720000 cxid 0x2 zxid 0x200000003 create '/andor,#626262,v{s{31,s{'world,'anyone}}},F,1
- EOF reached after 6 txns.
- </programlisting>
-
- <para>There is a CRC error in the second entry of the above transaction log file. In <emphasis role="bold">dump</emphasis>
- mode, the toolkit only prints this information to the screen without touching the original file. In
- <emphasis role="bold">recovery</emphasis> mode (-r,--recover flag) the original file still remains
- untouched and all transactions are copied over to a new txn log file with the ".fixed" suffix. The tool recalculates
- the CRC value of each entry and writes the calculated value if it does not match the one in the original txn entry.
- By default, the tool works interactively: it asks for confirmation whenever a CRC error is encountered.</para>
-
- <programlisting>
- $ bin/zkTxnLogToolkit.sh -r log.100000001
- ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
- CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
- Would you like to fix it (Yes/No/Abort) ?
- </programlisting>
-
- <para>Answering <emphasis role="bold">Yes</emphasis> means the newly calculated CRC value will be written
- to the new file. <emphasis role="bold">No</emphasis> means that the original CRC value will be copied over.
- <emphasis role="bold">Abort</emphasis> aborts the entire operation and exits.
- (In this case the ".fixed" file is not deleted and is left in a half-complete state: it contains only the entries that
- had already been processed, or only the header if the operation was aborted at the first entry.)</para>
-
- <programlisting>
- $ bin/zkTxnLogToolkit.sh -r log.100000001
- ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
- CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
- Would you like to fix it (Yes/No/Abort) ? y
- EOF reached after 6 txns.
- Recovery file log.100000001.fixed has been written with 1 fixed CRC error(s)
- </programlisting>
-
- <para>The default behaviour of recovery is to be silent: only entries with a CRC error are printed to the screen.
- Verbose mode can be turned on with the -v,--verbose parameter to see all records.
- Interactive mode can be turned off with the -y,--yes parameter. In this case all CRC errors are fixed
- in the new transaction file.</para>
- </section>
- </section>
-
- <section id="sc_commonProblems">
- <title>Things to Avoid</title>
-
- <para>Here are some common problems you can avoid by configuring
- ZooKeeper correctly:</para>
-
- <variablelist>
- <varlistentry>
- <term>inconsistent lists of servers</term>
-
- <listitem>
- <para>The list of ZooKeeper servers used by the clients must match
- the list of ZooKeeper servers that each ZooKeeper server has.
- Things work okay if the client list is a subset of the real list,
- but things will behave strangely if clients have a list of
- ZooKeeper servers that are in different ZooKeeper clusters. Also,
- the server lists in each ZooKeeper server configuration file
- should be consistent with one another.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>incorrect placement of transaction log</term>
-
- <listitem>
- <para>The most performance-critical part of ZooKeeper is the
- transaction log. ZooKeeper syncs transactions to media before it
- returns a response. A dedicated transaction log device is key to
- consistently good performance. Putting the log on a busy device will
- adversely affect performance. If you only have one storage device,
- put trace files on NFS and increase the snapCount; it doesn't
- eliminate the problem, but it should mitigate it.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>incorrect Java heap size</term>
-
- <listitem>
- <para>You should take special care to set your Java max heap size
- correctly. In particular, you should not create a situation in
- which ZooKeeper swaps to disk. The disk is death to ZooKeeper.
- Everything is ordered, so if processing one request swaps to
- disk, all other queued requests will probably do the same.
- DON'T SWAP.</para>
-
- <para>Be conservative in your estimates: if you have 4G of RAM, do
- not set the Java max heap size to 6G or even 4G. For example, it
- is more likely you would use a 3G heap for a 4G machine, as the
- operating system and the cache also need memory. The best and only
- recommended practice for estimating the heap size your system needs
- is to run load tests, and then make sure you are well below the
- usage limit that would cause the system to swap.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Publicly accessible deployment</term>
- <listitem>
- <para>
- A ZooKeeper ensemble is expected to operate in a trusted computing environment.
- It is thus recommended to deploy ZooKeeper behind a firewall.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
-
- <section id="sc_bestPractices">
- <title>Best Practices</title>
-
- <para>For best results, take note of the following list of good
- ZooKeeper practices:</para>
-
-
- <para>For multi-tenant installations see the <ulink
- url="zookeeperProgrammers.html#ch_zkSessions">section</ulink>
- detailing ZooKeeper "chroot" support. This can be very useful
- when deploying many applications/services interfacing to a
- single ZooKeeper cluster.</para>
-
- </section>
- </section>
-</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml b/src/docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml
deleted file mode 100644
index f71c4a8..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperHierarchicalQuorums.xml
+++ /dev/null
@@ -1,75 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="zk_HierarchicalQuorums">
- <title>Introduction to hierarchical quorums</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This document contains information about hierarchical quorums.</para>
- </abstract>
- </articleinfo>
-
- <para>
- This document gives an example of how to use hierarchical quorums. The basic idea is
- very simple. First, we split servers into groups, and add a line for each group listing
- the servers that form this group. Next we have to assign a weight to each server.
- </para>
-
- <para>
- The following example shows how to configure a system with three groups of three servers
- each, and we assign a weight of 1 to each server:
- </para>
-
- <programlisting>
- group.1=1:2:3
- group.2=4:5:6
- group.3=7:8:9
-
- weight.1=1
- weight.2=1
- weight.3=1
- weight.4=1
- weight.5=1
- weight.6=1
- weight.7=1
- weight.8=1
- weight.9=1
- </programlisting>
-
- <para>
- When running the system, we are able to form a quorum once we have a majority of votes from
- a majority of non-zero-weight groups. Groups that have zero weight are discarded and not
- considered when forming quorums. Looking at the example, we are able to form a quorum once
- we have votes from at least two servers from each of two different groups.
- </para>
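- <para>The quorum rule described above can be sketched in Python. This is an illustration of the
- rule as stated in this document (a majority of the weight in a majority of the non-zero-weight
- groups), not the server's own code; the function name and the data layout are hypothetical.</para>

```python
def contains_quorum(groups: dict, weights: dict, votes: set) -> bool:
    """Return True if the votes hold a weighted majority in a majority of groups."""
    satisfied = 0
    counted_groups = 0
    for members in groups.values():
        total = sum(weights[server] for server in members)
        if total == 0:
            continue  # zero-weight groups are discarded
        counted_groups += 1
        voted = sum(weights[server] for server in members if server in votes)
        if voted > total / 2:
            satisfied += 1
    return satisfied > counted_groups / 2


# The configuration from the example above: three groups of three servers, weight 1 each.
groups = {1: [1, 2, 3], 2: [4, 5, 6], 3: [7, 8, 9]}
weights = {server: 1 for server in range(1, 10)}
print(contains_quorum(groups, weights, {1, 2, 4, 5}))  # True
print(contains_quorum(groups, weights, {1, 2, 3, 4}))  # False
print(contains_quorum(groups, weights, {1, 4, 7}))     # False
```

- <para>The second call fails even though four servers voted, because only one group reaches a
- majority; the third fails because no group does.</para>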
- </article>
\ No newline at end of file
[09/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperInternals.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperInternals.xml b/src/docs/src/documentation/content/xdocs/zookeeperInternals.xml
deleted file mode 100644
index 4954123..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperInternals.xml
+++ /dev/null
@@ -1,487 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="ar_ZooKeeperInternals">
- <title>ZooKeeper Internals</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This article contains topics which discuss the inner workings of
- ZooKeeper. So far, that's logging and atomic broadcast. </para>
-
- </abstract>
- </articleinfo>
-
- <section id="ch_Introduction">
- <title>Introduction</title>
-
- <para>This document contains information on the inner workings of ZooKeeper.
- So far, it discusses these topics:
- </para>
-
-<itemizedlist>
-<listitem><para><xref linkend="sc_atomicBroadcast"/></para></listitem>
-<listitem><para><xref linkend="sc_logging"/></para></listitem>
-</itemizedlist>
-
-</section>
-
-<section id="sc_atomicBroadcast">
-<title>Atomic Broadcast</title>
-
-<para>
-At the heart of ZooKeeper is an atomic messaging system that keeps all of the servers in sync.</para>
-
-<section id="sc_guaranteesPropertiesDefinitions"><title>Guarantees, Properties, and Definitions</title>
-<para>
-The specific guarantees provided by the messaging system used by ZooKeeper are the following:</para>
-
-<variablelist>
-
-<varlistentry><term><emphasis >Reliable delivery</emphasis></term>
-<listitem><para>If a message, m, is delivered
-by one server, it will be eventually delivered by all servers.</para></listitem></varlistentry>
-
-<varlistentry><term><emphasis >Total order</emphasis></term>
-<listitem><para>If a message a is
-delivered before message b by one server, a will be delivered before b by all
-servers. If a and b are delivered messages, either a will be delivered before b
-or b will be delivered before a.</para></listitem></varlistentry>
-
-<varlistentry><term><emphasis >Causal order</emphasis> </term>
-
-<listitem><para>
-If a message b is sent after a message a has been delivered by the sender of b,
-a must be ordered before b. If a sender sends c after sending b, c must be ordered after b.
-</para></listitem></varlistentry>
-
-</variablelist>
-
-
-<para>
-The ZooKeeper messaging system also needs to be efficient, reliable, and easy to
-implement and maintain. We make heavy use of messaging, so we need the system to
-be able to handle thousands of requests per second. Although we can require at
-least k+1 correct servers to send new messages, we must be able to recover from
-correlated failures such as power outages. When we implemented the system we had
-little time and few engineering resources, so we needed a protocol that is
-accessible to engineers and is easy to implement. We found that our protocol
-satisfied all of these goals.
-
-</para>
-
-<para>
-Our protocol assumes that we can construct point-to-point FIFO channels between
-the servers. While similar services usually assume message delivery that can
-lose or reorder messages, our assumption of FIFO channels is very practical
-given that we use TCP for communication. Specifically, we rely on the following properties of TCP:</para>
-
-<variablelist>
-
-<varlistentry>
-<term><emphasis >Ordered delivery</emphasis></term>
-<listitem><para>Data is delivered in the same order it is sent and a message m is
-delivered only after all messages sent before m have been delivered.
-(The corollary to this is that if message m is lost all messages after m will be lost.)</para></listitem></varlistentry>
-
-<varlistentry><term><emphasis >No message after close</emphasis></term>
-<listitem><para>Once a FIFO channel is closed, no messages will be received from it.</para></listitem></varlistentry>
-
-</variablelist>
-
-<para>
-FLP proved that consensus cannot be achieved in asynchronous distributed systems
-if failures are possible. To ensure we achieve consensus in the presence of failures
-we use timeouts. However, we rely on timeouts for liveness, not for correctness. So,
-if timeouts stop working (clocks malfunction for example) the messaging system may
-hang, but it will not violate its guarantees.</para>
-
-<para>When describing the ZooKeeper messaging protocol we will talk of packets,
-proposals, and messages:</para>
-<variablelist>
-<varlistentry><term><emphasis >Packet</emphasis></term>
-<listitem><para>a sequence of bytes sent through a FIFO channel</para></listitem></varlistentry><varlistentry>
-
-<term><emphasis >Proposal</emphasis></term>
-<listitem><para>a unit of agreement. Proposals are agreed upon by exchanging packets
-with a quorum of ZooKeeper servers. Most proposals contain messages, however the
-NEW_LEADER proposal is an example of a proposal that does not correspond to a message.</para></listitem>
-</varlistentry><varlistentry>
-
-<term><emphasis >Message</emphasis></term>
-<listitem><para>a sequence of bytes to be atomically broadcast to all ZooKeeper
-servers. A message is put into a proposal and agreed upon before it is delivered.</para></listitem>
-</varlistentry>
-
-</variablelist>
-
-<para>
-As stated above, ZooKeeper guarantees a total order of messages, and it also
-guarantees a total order of proposals. ZooKeeper exposes the total ordering using
-a ZooKeeper transaction id (<emphasis>zxid</emphasis>). Every proposal is stamped with a zxid when
-it is proposed, and the zxid exactly reflects the total ordering. Proposals are sent to all
-ZooKeeper servers and committed when a quorum of them acknowledge the proposal.
-If a proposal contains a message, the message will be delivered when the proposal
-is committed. Acknowledgement means the server has recorded the proposal to persistent storage.
-Our quorums have the requirement that any pair of quorums must have at least one server
-in common. We ensure this by requiring that all quorums have size (<emphasis>n/2+1</emphasis>) where
-n is the number of servers that make up a ZooKeeper service.
-</para>
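-
-<para>
-As a minimal sketch of the quorum-size rule above (not ZooKeeper's actual code),
-the following Java computes the majority quorum size with integer division and
-the smallest possible overlap between two quorums. The class and method names
-are hypothetical and exist only for illustration.
-</para>

```java
/** Illustrative helper for majority quorums; names are hypothetical. */
final class MajorityQuorum {
    private MajorityQuorum() {}

    /** Quorum size n/2+1, using integer division as in the text. */
    static int quorumSize(int n) {
        return n / 2 + 1;
    }

    /** True if the number of acknowledgements reaches a quorum of n servers. */
    static boolean isQuorum(int acks, int n) {
        return acks >= quorumSize(n);
    }

    /** Smallest number of servers two minimum-size quorums must share. */
    static int minOverlap(int n) {
        return 2 * quorumSize(n) - n;
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 7; n++) {
            System.out.println(n + " servers: quorum " + quorumSize(n)
                    + ", minimum overlap " + minOverlap(n));
        }
    }
}
```

-<para>
-Because the minimum overlap is always at least one server, any committed
-proposal has been recorded by some member of every later quorum.
-</para>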
-
-<para>
-The zxid has two parts: the epoch and a counter. In our implementation the zxid
-is a 64-bit number. We use the high order 32-bits for the epoch and the low order
-32-bits for the counter. Because it has two parts, we represent the zxid both as a
-number and as a pair of integers, (<emphasis>epoch, count</emphasis>). The epoch number represents a
-change in leadership. Each time a new leader comes into power it will have its
-own epoch number. We have a simple algorithm to assign a unique zxid to a proposal:
-the leader simply increments the zxid to obtain a unique zxid for each proposal.
-<emphasis>Leadership activation will ensure that only one leader uses a given epoch, so our
-simple algorithm guarantees that every proposal will have a unique id.</emphasis>
-</para>
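-
-<para>
-The two-part layout can be illustrated with a short Java sketch. It assumes
-only what the paragraph above states (a 64-bit value with the epoch in the high
-32 bits and the counter in the low 32 bits); the class and method names are
-hypothetical and are not taken from the ZooKeeper code base.
-</para>

```java
/** Sketch of the (epoch, counter) zxid layout; names are hypothetical. */
final class Zxid {
    private Zxid() {}

    /** Packs an epoch and a counter into one 64-bit zxid. */
    static long make(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    /** High-order 32 bits: the leadership epoch. */
    static long epoch(long zxid) {
        return zxid >>> 32;
    }

    /** Low-order 32 bits: the per-epoch proposal counter. */
    static long counter(long zxid) {
        return zxid & 0xffffffffL;
    }

    public static void main(String[] args) {
        long zxid = make(3, 41);
        long next = zxid + 1; // the leader increments the zxid for each proposal
        System.out.println("(" + epoch(next) + ", " + counter(next) + ")");
    }
}
```

-<para>
-Comparing two zxids as plain numbers orders them first by epoch and then by
-counter, which is why a single increment is enough to keep proposals totally ordered
-within an epoch.
-</para>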
-
-<para>
-ZooKeeper messaging consists of two phases:</para>
-
-<variablelist>
-<varlistentry><term><emphasis >Leader activation</emphasis></term>
-<listitem><para>In this phase a leader establishes the correct state of the system
-and gets ready to start making proposals.</para></listitem>
-</varlistentry>
-
-<varlistentry><term><emphasis >Active messaging</emphasis></term>
-<listitem><para>In this phase a leader accepts messages to propose and coordinates message delivery.</para></listitem>
-</varlistentry>
-</variablelist>
-
-<para>
-ZooKeeper is a holistic protocol. We do not focus on individual proposals, but rather
-look at the stream of proposals as a whole. Our strict ordering allows us to do this
-efficiently and greatly simplifies our protocol. Leadership activation embodies
-this holistic concept. A leader becomes active only when a quorum of followers
-(the leader counts as a follower as well; you can always vote for yourself) has synced
-up with the leader, that is, they have the same state. This state consists of all of the
-proposals that the leader believes have been committed and the proposal to follow
-the leader, the NEW_LEADER proposal. (Hopefully you are thinking to
-yourself, <emphasis>Does the set of proposals that the leader believes have been committed
-include all the proposals that really have been committed?</emphasis> The answer is <emphasis>yes</emphasis>.
-Below, we make clear why.)
-</para>
-
-</section>
-
-<section id="sc_leaderElection">
-
-<title>Leader Activation</title>
-<para>
-Leader activation includes leader election. We currently have two leader election
-algorithms in ZooKeeper: LeaderElection and FastLeaderElection (AuthFastLeaderElection
-is a variant of FastLeaderElection that uses UDP and allows servers to perform a simple
-form of authentication to avoid IP spoofing). ZooKeeper messaging doesn't care about the
-exact method of electing a leader as long as the following holds:
-</para>
-
-<itemizedlist>
-
-<listitem><para>The leader has seen the highest zxid of all the followers.</para></listitem>
-<listitem><para>A quorum of servers have committed to following the leader.</para></listitem>
-
-</itemizedlist>
-
-<para>
-Of these two requirements only the first, the highest zxid among the followers,
-needs to hold for correct operation. The second requirement, a quorum of followers,
-just needs to hold with high probability. We are going to recheck the second requirement,
-so if a failure happens during or after the leader election and quorum is lost,
-we will recover by abandoning leader activation and running another election.
-</para>
-
-<para>
-After leader election a single server will be designated as a leader and start
-waiting for followers to connect. The rest of the servers will try to connect to
-the leader. The leader will sync up with followers by sending any proposals they
-are missing, or if a follower is missing too many proposals, it will send a full
-snapshot of the state to the follower.
-</para>
-
-<para>
-There is a corner case in which a follower that has proposals, U, not seen
-by a leader arrives. Proposals are seen in order, so the proposals of U will have zxids
-higher than zxids seen by the leader. The follower must have arrived after the
-leader election, otherwise the follower would have been elected leader given that
-it has seen a higher zxid. Since committed proposals must be seen by a quorum of
-servers, and a quorum of servers that elected the leader did not see U, the proposals
-of U have not been committed, so they can be discarded. When the follower connects
-to the leader, the leader will tell the follower to discard U.
-</para>
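-
-<para>
-The corner case above can be sketched as follows. This is a simplified
-illustration under the assumption that a follower's log is just an ordered list
-of zxids; the class and method names are hypothetical and the real
-synchronization exchange between leader and follower is not shown.
-</para>

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of discarding proposals the leader never saw; names are hypothetical. */
final class FollowerSync {
    private FollowerSync() {}

    /**
     * Keeps the proposals whose zxid the leader has seen and drops the
     * suffix U, whose zxids are higher than the leader's last zxid.
     */
    static List<Long> discardUnseen(List<Long> followerLog, long leaderLastZxid) {
        List<Long> kept = new ArrayList<>();
        for (long zxid : followerLog) {
            if (zxid <= leaderLastZxid) {
                kept.add(zxid);
            }
        }
        return kept;
    }
}
```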
-
-<para>
-A new leader establishes a zxid to start using for new proposals by getting the
-epoch, e, of the highest zxid it has seen and setting the next zxid to use to be
-(e+1, 0). After the leader syncs with a follower, it will propose a NEW_LEADER
-proposal. Once the NEW_LEADER proposal has been committed, the leader will activate
-and start receiving and issuing proposals.
-</para>
-
-<para>
-It all sounds complicated but here are the basic rules of operation during leader
-activation:
-</para>
-
-<itemizedlist>
-<listitem><para>A follower will ACK the NEW_LEADER proposal after it has synced with the leader.</para></listitem>
-<listitem><para>A follower will only ACK a NEW_LEADER proposal with a given zxid from a single server.</para></listitem>
-<listitem><para>A new leader will COMMIT the NEW_LEADER proposal when a quorum of followers have ACKed it.</para></listitem>
-<listitem><para>A follower will commit any state it received from the leader when the NEW_LEADER proposal is COMMITTED.</para></listitem>
-<listitem><para>A new leader will not accept new proposals until the NEW_LEADER proposal has been COMMITTED.</para></listitem>
-</itemizedlist>
-
-<para>
-If leader election terminates erroneously, we don't have a problem: the
-NEW_LEADER proposal will not be committed because the leader will not have quorum.
-When this happens, the leader and any remaining followers will time out and go back
-to leader election.
-</para>
-
-</section>
-
-<section id="sc_activeMessaging">
-<title>Active Messaging</title>
-<para>
-Leader Activation does all the heavy lifting. Once the leader is coronated he can
-start blasting out proposals. As long as he remains the leader no other leader can
-emerge since no other leader will be able to get a quorum of followers. If a new
-leader does emerge,
-it means that the leader has lost quorum, and the new leader will clean up any
-mess left over during her leadership activation.
-</para>
-
-<para>ZooKeeper messaging operates similarly to a classic two-phase commit.</para>
-
-<mediaobject id="fg_2phaseCommit" >
- <imageobject>
- <imagedata fileref="images/2pc.jpg"/>
- </imageobject>
-</mediaobject>
-
-<para>
-All communication channels are FIFO, so everything is done in order. Specifically
-the following operating constraints are observed:</para>
-
-<itemizedlist>
-
-<listitem><para>The leader sends proposals to all followers using
-the same order. Moreover, this order follows the order in which requests have been
-received. Because we use FIFO channels this means that followers also receive proposals in order.
-</para></listitem>
-
-<listitem><para>Followers process messages in the order they are received. This
-means that messages will be ACKed in order and the leader will receive ACKs from
-followers in order, due to the FIFO channels. It also means that if message m
-has been written to non-volatile storage, all messages that were proposed before
-m have been written to non-volatile storage.</para></listitem>
-
-<listitem><para>The leader will issue a COMMIT to all followers as soon as a
-quorum of followers have ACKed a message. Since messages are ACKed in order,
-COMMITs will be sent by the leader, and received by the followers, in order.</para></listitem>
-
-<listitem><para>COMMITs are processed in order. Followers deliver a proposal's
-message when that proposal is committed.</para></listitem>
-
-</itemizedlist>
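-
-<para>
-The constraints above can be sketched from the leader's side. This is an
-illustrative model only, assuming majority quorums and integer server ids; the
-class and its methods are hypothetical and do not mirror ZooKeeper's internal
-classes. It records ACKs per proposal and releases COMMITs strictly from the
-head of the outstanding list, so commits leave in zxid order.
-</para>

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

/** Leader-side ACK bookkeeping sketch; names are hypothetical. */
final class CommitTracker {
    private final int quorumSize;
    // Outstanding proposals keyed by zxid; TreeMap keeps them in zxid order.
    private final TreeMap<Long, Set<Integer>> acks = new TreeMap<>();

    CommitTracker(int ensembleSize) {
        this.quorumSize = ensembleSize / 2 + 1;
    }

    /** Registers a new outstanding proposal. */
    void propose(long zxid) {
        acks.put(zxid, new HashSet<>());
    }

    /** Records an ACK and returns the zxids that become committed, in order. */
    List<Long> ack(long zxid, int serverId) {
        List<Long> committed = new ArrayList<>();
        Set<Integer> voters = acks.get(zxid);
        if (voters == null) {
            return committed; // unknown or already committed proposal
        }
        voters.add(serverId);
        // Commit only from the head so COMMITs are issued in zxid order.
        while (!acks.isEmpty()
                && acks.firstEntry().getValue().size() >= quorumSize) {
            committed.add(acks.pollFirstEntry().getKey());
        }
        return committed;
    }
}
```

-<para>
-With FIFO channels a follower never ACKs a proposal before an earlier one, so
-in practice the head of the list is always the first to reach a quorum; the
-loop simply makes that ordering explicit.
-</para>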
-
-</section>
-
-<section id="sc_summary">
-<title>Summary</title>
-<para>So there you go. Why does it work? Specifically, why does the set of proposals
-believed by a new leader always contain any proposal that has actually been committed?
-First, all proposals have a unique zxid, so unlike other protocols, we never have
-to worry about two different values being proposed for the same zxid; followers
-(a leader is also a follower) see and record proposals in order; proposals are
-committed in order; there is only one active leader at a time since followers only
-follow a single leader at a time; a new leader has seen all committed proposals
-from the previous epoch since it has seen the highest zxid from a quorum of servers;
-any uncommitted proposals from a previous epoch seen by a new leader will be committed
-by that leader before it becomes active.</para></section>
-
-<section id="sc_comparisons"><title>Comparisons</title>
-<para>
-Isn't this just Multi-Paxos? No, Multi-Paxos requires some way of assuring that
-there is only a single coordinator. We do not count on such assurances. Instead
-we use the leader activation to recover from leadership change or old leaders
-believing they are still active.
-</para>
-
-<para>
-Isn't this just Paxos? Your active messaging phase looks just like phase 2 of Paxos.
-Actually, to us active messaging looks just like two-phase commit without the need to
-handle aborts. Active messaging is different from both in the sense that it has
-cross proposal ordering requirements. If we do not maintain strict FIFO ordering of
-all packets, it all falls apart. Also, our leader activation phase is different from
-both of them. In particular, our use of epochs allows us to skip blocks of uncommitted
-proposals and to not worry about duplicate proposals for a given zxid.
-</para>
-
-</section>
-
-</section>
-
-<section id="sc_quorum">
-<title>Quorums</title>
-
-<para>
-Atomic broadcast and leader election use the notion of quorum to guarantee a consistent
-view of the system. By default, ZooKeeper uses majority quorums, which means that every
-vote that happens in one of these protocols requires a majority. One example is
-acknowledging a leader proposal: the leader can only commit once it receives an
-acknowledgement from a quorum of servers.
-</para>
-
-<para>
-If we extract the properties that we really need from our use of majorities, we have that we only
-need to guarantee that groups of processes used to validate an operation by voting (e.g., acknowledging
-a leader proposal) pairwise intersect in at least one server. Using majorities guarantees such a property.
-However, there are other ways of constructing quorums different from majorities. For example, we can assign
-weights to the votes of servers, and say that the votes of some servers are more important. To obtain a quorum,
-we get enough votes so that the sum of weights of all votes is larger than half of the total sum of all weights.
-</para>
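-
-<para>
-A weighted quorum check along the lines described above might look like the
-following sketch. It assumes that servers are numbered from 0, that the voter
-list contains no duplicates, and that weights are non-negative; the class and
-method names are hypothetical.
-</para>

```java
/** Weighted quorum sketch; names and representation are hypothetical. */
final class WeightedQuorum {
    private WeightedQuorum() {}

    /**
     * True if the weights of the distinct voters sum to more than half of
     * the total weight of all servers.
     */
    static boolean isQuorum(long[] weights, int[] voters) {
        long total = 0;
        for (long w : weights) {
            total += w;
        }
        long got = 0;
        for (int id : voters) {
            got += weights[id];
        }
        return 2 * got > total;
    }
}
```

-<para>
-With every weight set to 1 this reduces to the ordinary majority rule.
-</para>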
-
-<para>
-A different construction that uses weights and is useful in wide-area deployments (co-locations) is a hierarchical
-one. With this construction, we split the servers into disjoint groups and assign weights to processes. To form
-a quorum, we have to get a hold of enough servers from a majority of groups G, such that for each group g in G,
-the sum of votes from g is larger than half of the sum of weights in g. Interestingly, this construction enables
-smaller quorums. If we have, for example, 9 servers, we split them into 3 groups, and assign a weight of 1 to each
-server, then we are able to form quorums of size 4. Note that two subsets of processes composed each of a majority
-of servers from each of a majority of groups necessarily have a non-empty intersection. It is reasonable to expect
-that a majority of co-locations will have a majority of servers available with high probability.
-</para>
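-
-<para>
-The hierarchical construction, including the 9-server example with quorums of
-size 4, can be checked with a small sketch. The array-based representation and
-all names here are assumptions made for illustration; voters are assumed to be
-distinct server ids.
-</para>

```java
/** Hierarchical quorum sketch; names and representation are hypothetical. */
final class HierarchicalQuorum {
    private HierarchicalQuorum() {}

    /**
     * group[i] is the group of server i and weight[i] is its weight.
     * A set of voters is a quorum if, in a majority of groups, the voters'
     * weight exceeds half of that group's total weight.
     */
    static boolean isQuorum(int[] group, long[] weight, int numGroups, int[] voters) {
        long[] total = new long[numGroups];
        long[] got = new long[numGroups];
        for (int i = 0; i < group.length; i++) {
            total[group[i]] += weight[i];
        }
        for (int id : voters) {
            got[group[id]] += weight[id];
        }
        int satisfied = 0;
        for (int g = 0; g < numGroups; g++) {
            if (2 * got[g] > total[g]) {
                satisfied++;
            }
        }
        return 2 * satisfied > numGroups;
    }

    public static void main(String[] args) {
        int[] group = {0, 0, 0, 1, 1, 1, 2, 2, 2};
        long[] weight = {1, 1, 1, 1, 1, 1, 1, 1, 1};
        // Two servers from each of two groups: four servers in total.
        System.out.println(isQuorum(group, weight, 3, new int[] {0, 1, 3, 4}));
    }
}
```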
-
-<para>
-With ZooKeeper, we provide a user with the ability of configuring servers to use majority quorums, weights, or a
-hierarchy of groups.
-</para>
-</section>
-
-<section id="sc_logging">
-
-<title>Logging</title>
-<para>
-ZooKeeper uses
-<ulink url="http://www.slf4j.org/index.html">slf4j</ulink> as an abstraction layer for logging.
-<ulink url="http://logging.apache.org/log4j">log4j</ulink> in version 1.2 is chosen as the final logging implementation for now.
-For better embedding support, it is planned in the future to leave the decision of choosing the final logging implementation to the end user.
-Therefore, always use the slf4j api to write log statements in the code, but configure log4j for how to log at runtime.
-Note that slf4j has no FATAL level; former messages at FATAL level have been moved to ERROR level.
-For information on configuring log4j for
-ZooKeeper, see the <ulink url="zookeeperAdmin.html#sc_logging">Logging</ulink> section
-of the <ulink url="zookeeperAdmin.html">ZooKeeper Administrator's Guide.</ulink>
-
-</para>
-
-<section id="sc_developerGuidelines"><title>Developer Guidelines</title>
-
-<para>Please follow the
-<ulink url="http://www.slf4j.org/manual.html">slf4j manual</ulink> when creating log statements within code.
-Also read the
-<ulink url="http://www.slf4j.org/faq.html#logging_performance">FAQ on performance</ulink>
-when creating log statements. Patch reviewers will look for the following:</para>
-<section id="sc_rightLevel"><title>Logging at the Right Level</title>
-<para>
-There are several levels of logging in slf4j.
-It's important to pick the right one. In order from highest to lowest severity:</para>
-<orderedlist>
- <listitem><para>ERROR level designates error events that might still allow the application to continue running.</para></listitem>
- <listitem><para>WARN level designates potentially harmful situations.</para></listitem>
- <listitem><para>INFO level designates informational messages that highlight the progress of the application at coarse-grained level.</para></listitem>
- <listitem><para>DEBUG level designates fine-grained informational events that are most useful to debug an application.</para></listitem>
- <listitem><para>TRACE level designates finer-grained informational events than DEBUG.</para></listitem>
-</orderedlist>
-
-<para>
-ZooKeeper is typically run in production such that log messages of INFO level
-severity and higher (more severe) are output to the log.</para>
-
-
-</section>
-
-<section id="sc_slf4jIdioms"><title>Use of Standard slf4j Idioms</title>
-
-<para><emphasis>Static Message Logging</emphasis></para>
-<programlisting>
-LOG.debug("process completed successfully!");
-</programlisting>
-
-<para>
-However, when parameterized messages are required, use formatting anchors.
-</para>
-
-<programlisting>
-LOG.debug("got {} messages in {} minutes",new Object[]{count,time});
-</programlisting>
-
-
-<para><emphasis>Naming</emphasis></para>
-
-<para>
-Loggers should be named after the class in which they are used.
-</para>
-
-<programlisting>
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-public class Foo {
- private static final Logger LOG = LoggerFactory.getLogger(Foo.class);
- ....
- public Foo() {
- LOG.info("constructing Foo");
- }
-}
-</programlisting>
-
-<para><emphasis>Exception handling</emphasis></para>
-<programlisting>
-try {
- // code
-} catch (XYZException e) {
- // do this
- LOG.error("Something bad happened", e);
- // don't do this (generally)
- // LOG.error(e);
- // why? because "don't do" case hides the stack trace
-
- // continue process here as you need... recover or (re)throw
-}
-</programlisting>
-</section>
-</section>
-
-</section>
-
-</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperJMX.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperJMX.xml b/src/docs/src/documentation/content/xdocs/zookeeperJMX.xml
deleted file mode 100644
index f0ea636..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperJMX.xml
+++ /dev/null
@@ -1,236 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_zookeeperjmx">
- <title>ZooKeeper JMX</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>ZooKeeper support for JMX</para>
- </abstract>
- </articleinfo>
-
- <section id="ch_jmx">
- <title>JMX</title>
- <para>Apache ZooKeeper has extensive support for JMX, allowing you
- to view and manage a ZooKeeper serving ensemble.</para>
-
- <para>This document assumes that you have basic knowledge of
- JMX. See the <ulink
- url="http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/">
- Sun JMX Technology</ulink> page to get started with JMX.
- </para>
-
- <para>See the <ulink
- url="http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html">
- JMX Management Guide</ulink> for details on setting up local and
- remote management of VM instances. By default the included
- <emphasis>zkServer.sh</emphasis> supports only local management -
- review the linked document to enable support for remote management
- (beyond the scope of this document).
- </para>
-
- </section>
-
- <section id="ch_starting">
- <title>Starting ZooKeeper with JMX enabled</title>
-
- <para>The class
- <emphasis>org.apache.zookeeper.server.quorum.QuorumPeerMain</emphasis>
- will start a JMX manageable ZooKeeper server. This class
- registers the proper MBeans during initialization to support JMX
- monitoring and management of the
- instance. See <emphasis>bin/zkServer.sh</emphasis> for one
- example of starting ZooKeeper using QuorumPeerMain.</para>
- </section>
-
- <section id="ch_console">
- <title>Run a JMX console</title>
-
- <para>There are a number of JMX consoles available which can connect
- to the running server. For this example we will use Sun's
- <emphasis>jconsole</emphasis>.</para>
-
- <para>The Java JDK ships with a simple JMX console
- named <ulink url="http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html">jconsole</ulink>
- which can be used to connect to ZooKeeper and inspect a running
- server. Once you've started ZooKeeper using QuorumPeerMain,
- start <emphasis>jconsole</emphasis>, which typically resides in
- <emphasis>JDK_HOME/bin/jconsole</emphasis>.</para>
-
- <para>When the "new connection" window is displayed, either connect
- to the local process (if jconsole was started on the same host as the server) or
- use the remote process connection.</para>
-
- <para>By default the "overview" tab for the VM is displayed (this
- is a great way to get insight into the VM, by the way). Select
- the "MBeans" tab.</para>
-
- <para>You should now see <emphasis>org.apache.ZooKeeperService</emphasis>
- on the left hand side. Expand this item and depending on how you've
- started the server you will be able to monitor and manage various
- service related features.</para>
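-
- <para>The same MBeans can also be listed programmatically. The sketch
- below uses only the standard <emphasis>javax.management</emphasis> API and
- assumes it runs inside the JVM that hosts the ZooKeeper server (for
- example, an embedded server) and that the MBeans are registered with the
- platform MBean server; a separate process would need a remote JMX
- connection instead. The class name is hypothetical.</para>

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Lists MBeans under the ZooKeeper JMX domain; class name is hypothetical. */
final class ListZooKeeperMBeans {
    private ListZooKeeperMBeans() {}

    /** Returns the object names in the org.apache.ZooKeeperService domain. */
    static Set<ObjectName> zooKeeperMBeans() {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            // Pattern matching every MBean registered in the domain.
            ObjectName pattern = new ObjectName("org.apache.ZooKeeperService:*");
            return server.queryNames(pattern, null);
        } catch (MalformedObjectNameException e) {
            // Cannot happen for the constant pattern above.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (ObjectName name : zooKeeperMBeans()) {
            System.out.println(name);
        }
    }
}
```

- <para>In a JVM where no ZooKeeper server is running, the returned set is
- simply empty.</para>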
-
- <para>Also note that ZooKeeper will register log4j MBeans as
- well. In the same section along the left hand side you will see
- "log4j". Expand that to manage log4j through JMX. Of particular
- interest is the ability to dynamically change the logging levels
- used by editing the appender and root thresholds. Log4j MBean
- registration can be disabled by passing
- <emphasis>-Dzookeeper.jmx.log4j.disable=true</emphasis> to the JVM
- when starting ZooKeeper.
- </para>
-
- </section>
-
- <section id="ch_reference">
- <title>ZooKeeper MBean Reference</title>
-
- <para>This table details JMX for a server participating in a
- replicated ZooKeeper ensemble (i.e., not standalone). This is the
- typical case for a production environment.</para>
-
- <table>
- <title>MBeans, their names and description</title>
-
- <tgroup cols='3'>
- <thead>
- <row>
- <entry>MBean</entry>
- <entry>MBean Object Name</entry>
- <entry>Description</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>Quorum</entry>
- <entry>ReplicatedServer_id&lt;#&gt;</entry>
- <entry>Represents the Quorum, or Ensemble - parent of all
- cluster members. Note that the object name includes the
- "myid" of the server (name suffix) that your JMX agent has
- connected to.</entry>
- </row>
- <row>
- <entry>LocalPeer|RemotePeer</entry>
- <entry>replica.&lt;#&gt;</entry>
- <entry>Represents a local or remote peer (i.e., a server
- participating in the ensemble). Note that the object name
- includes the "myid" of the server (name suffix).</entry>
- </row>
- <row>
- <entry>LeaderElection</entry>
- <entry>LeaderElection</entry>
- <entry>Represents a ZooKeeper cluster leader election which is
- in progress. Provides information about the election, such as
- when it started.</entry>
- </row>
- <row>
- <entry>Leader</entry>
- <entry>Leader</entry>
- <entry>Indicates that the parent replica is the leader and
- provides attributes/operations for that server. Note that
- Leader is a subclass of ZooKeeperServer, so it provides
- all of the information normally associated with a
- ZooKeeperServer node.</entry>
- </row>
- <row>
- <entry>Follower</entry>
- <entry>Follower</entry>
- <entry>Indicates that the parent replica is a follower and
- provides attributes/operations for that server. Note that
- Follower is a subclass of ZooKeeperServer, so it provides
- all of the information normally associated with a
- ZooKeeperServer node.</entry>
- </row>
- <row>
- <entry>DataTree</entry>
- <entry>InMemoryDataTree</entry>
- <entry>Statistics on the in memory znode database, also
- operations to access finer (and more computationally
- intensive) statistics on the data (such as ephemeral
- count). InMemoryDataTrees are children of ZooKeeperServer
- nodes.</entry>
- </row>
- <row>
- <entry>ServerCnxn</entry>
- <entry>&lt;session_id&gt;</entry>
- <entry>Statistics on each client connection, also
- operations on those connections (such as
- termination). Note the object name is the session id of
- the connection in hex form.</entry>
- </row>
- </tbody></tgroup></table>
-
- <para>This table details JMX for a standalone server. Typically
- standalone is only used in development situations.</para>
-
- <table>
- <title>MBeans, their names and description</title>
-
- <tgroup cols='3'>
- <thead>
- <row>
- <entry>MBean</entry>
- <entry>MBean Object Name</entry>
- <entry>Description</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>ZooKeeperServer</entry>
- <entry>StandaloneServer_port&lt;#&gt;</entry>
- <entry>Statistics on the running server, also operations
- to reset these attributes. Note that the object name
- includes the client port of the server (name
- suffix).</entry>
- </row>
- <row>
- <entry>DataTree</entry>
- <entry>InMemoryDataTree</entry>
- <entry>Statistics on the in memory znode database, also
- operations to access finer (and more computationally
- intensive) statistics on the data (such as ephemeral
- count).</entry>
- </row>
- <row>
- <entry>ServerCnxn</entry>
- <entry>&lt;session_id&gt;</entry>
- <entry>Statistics on each client connection, also
- operations on those connections (such as
- termination). Note the object name is the session id of
- the connection in hex form.</entry>
- </row>
- </tbody></tgroup></table>
-
- </section>
-
-</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperObservers.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperObservers.xml b/src/docs/src/documentation/content/xdocs/zookeeperObservers.xml
deleted file mode 100644
index 3955f3d..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperObservers.xml
+++ /dev/null
@@ -1,145 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_GettStartedGuide">
- <title>ZooKeeper Observers</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License"); you
- may not use this file except in compliance with the License. You may
- obtain a copy of the License
- at <ulink url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
- WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
- License for the specific language governing permissions and limitations
- under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This guide contains information about using non-voting servers, or
- observers, in your ZooKeeper ensembles.</para>
- </abstract>
- </articleinfo>
-
- <section id="ch_Introduction">
- <title>Observers: Scaling ZooKeeper Without Hurting Write Performance
- </title>
- <para>
- Although ZooKeeper performs very well by having clients connect directly
- to voting members of the ensemble, this architecture makes it hard to
- scale out to huge numbers of clients. The problem is that as we add more
- voting members, the write performance drops. This is due to the fact that
- a write operation requires the agreement of (in general) at least half the
- nodes in an ensemble and therefore the cost of a vote can increase
- significantly as more voters are added.
- </para>
- <para>
- We have introduced a new type of ZooKeeper node called
- an <emphasis>Observer</emphasis> which helps address this problem and
- further improves ZooKeeper's scalability. Observers are non-voting members
- of an ensemble which only hear the results of votes, not the agreement
- protocol that leads up to them. Other than this simple distinction,
- Observers function exactly the same as Followers - clients may connect to
- them and send read and write requests to them. Observers forward these
- requests to the Leader like Followers do, but they then simply wait to
- hear the result of the vote. Because of this, we can increase the number
- of Observers as much as we like without harming the performance of votes.
- </para>
- <para>
- Observers have other advantages. Because they do not vote, they are not a
- critical part of the ZooKeeper ensemble. Therefore they can fail, or be
- disconnected from the cluster, without harming the availability of the
- ZooKeeper service. The benefit to the user is that Observers may connect
- over less reliable network links than Followers. In fact, Observers may be
- used to talk to a ZooKeeper server from another data center. Clients of
- the Observer will see fast reads, as all reads are served locally, and
- writes result in minimal network traffic as the number of messages
- required in the absence of the vote protocol is smaller.
- </para>
- </section>
- <section id="sc_UsingObservers">
- <title>How to use Observers</title>
- <para>Setting up a ZooKeeper ensemble that uses Observers is very simple,
- and requires just two changes to your config files. Firstly, in the config
- file of every node that is to be an Observer, you must place this line:
- </para>
- <programlisting>
- peerType=observer
- </programlisting>
-
- <para>
- This line tells ZooKeeper that the server is to be an Observer. Secondly,
- in every server config file, you must add :observer to the server
- definition line of each Observer. For example:
- </para>
-
- <programlisting>
- server.1:localhost:2181:3181:observer
- </programlisting>
-
- <para>
- This tells every other server that server.1 is an Observer, and that they
- should not expect it to vote. This is all the configuration you need to do
- to add an Observer to your ZooKeeper cluster. Now you can connect to it as
- though it were an ordinary Follower. Try it out, by running:</para>
- <programlisting>
- $ bin/zkCli.sh -server localhost:2181
- </programlisting>
- <para>
- where localhost:2181 is the hostname and port number of the Observer as
- specified in every config file. You should see a command line prompt
- through which you can issue commands like <emphasis>ls</emphasis> to query
- the ZooKeeper service.
- </para>
- </section>
-
- <section id="ch_UseCases">
- <title>Example use cases</title>
- <para>
- Two example use cases for Observers are listed below. In fact, wherever
- you wish to scale the number of clients of your ZooKeeper ensemble, or
- where you wish to insulate the critical part of an ensemble from the load
- of dealing with client requests, Observers are a good architectural
- choice.
- </para>
- <itemizedlist>
- <listitem>
- <para> As a datacenter bridge: Forming a ZK ensemble between two
- datacenters is a problematic endeavour as the high variance in latency
- between the datacenters could lead to false positive failure detection
- and partitioning. However, if the ensemble runs entirely in one
- datacenter, and the second datacenter runs only Observers, partitions
- aren't problematic as the ensemble remains connected. Clients of the
- Observers may still see and issue proposals.</para>
- </listitem>
- <listitem>
- <para>As a link to a message bus: Some companies have expressed an
- interest in using ZK as a component of a persistent reliable message
- bus. Observers would give a natural integration point for this work: a
- plug-in mechanism could be used to attach the stream of proposals an
- Observer sees to a publish-subscribe system, again without loading the
- core ensemble.
- </para>
- </listitem>
- </itemizedlist>
- </section>
-</article>
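
The Observer configuration described in the removed zookeeperObservers.xml above can be illustrated with a small, self-contained Java sketch. It assumes the server definition lines are read as a Java properties file, in which either ':' or '=' may separate a key from its value, and it assumes a simple-majority quorum among voting members. The class name, host names, and ports are hypothetical and are not taken from the ZooKeeper code base.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class ObserverConfigSketch {
    // Hypothetical four-server ensemble; host names and ports are made up.
    // The last server carries the ":observer" suffix described in the guide.
    static final String SAMPLE =
        "server.1:zk1.example.com:2888:3888\n"
        + "server.2:zk2.example.com:2888:3888\n"
        + "server.3:zk3.example.com:2888:3888\n"
        + "server.4:zk4.example.com:2888:3888:observer\n";

    // Parse the text with java.util.Properties; the key ends at the first ':' or '='.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Count server entries that are not marked as observers.
    static int countVoters(Properties props) {
        int voters = 0;
        for (String key : props.stringPropertyNames()) {
            boolean isServer = key.startsWith("server.");
            boolean isObserver = props.getProperty(key).endsWith(":observer");
            if (isServer && !isObserver) {
                voters++;
            }
        }
        return voters;
    }

    // Smallest number of voters that forms a strict majority.
    static int majority(int voters) {
        return voters / 2 + 1;
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        int voters = countVoters(props);
        System.out.println("voters=" + voters + " majority=" + majority(voters));
    }
}
```

In this sketch, adding further ":observer" entries leaves both the voter count and the majority size unchanged, which is the property the guide relies on when it says Observers can be added without harming the performance of votes.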
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperOtherInfo.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperOtherInfo.xml b/src/docs/src/documentation/content/xdocs/zookeeperOtherInfo.xml
deleted file mode 100644
index a2445b1..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperOtherInfo.xml
+++ /dev/null
@@ -1,46 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_OtherInfo">
- <title>ZooKeeper</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para> currently empty </para>
- </abstract>
- </articleinfo>
-
- <section id="ch_placeholder">
- <title>Other Info</title>
- <para> currently empty </para>
- </section>
-</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperOver.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperOver.xml b/src/docs/src/documentation/content/xdocs/zookeeperOver.xml
deleted file mode 100644
index 7a0444c..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperOver.xml
+++ /dev/null
@@ -1,464 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_Overview">
- <title>ZooKeeper</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This document contains overview information about ZooKeeper. It
- discusses design goals, key concepts, implementation, and
- performance.</para>
- </abstract>
- </articleinfo>
-
- <section id="ch_DesignOverview">
- <title>ZooKeeper: A Distributed Coordination Service for Distributed
- Applications</title>
-
- <para>ZooKeeper is a distributed, open-source coordination service for
- distributed applications. It exposes a simple set of primitives that
- distributed applications can build upon to implement higher level services
- for synchronization, configuration maintenance, and groups and naming. It
- is designed to be easy to program to, and uses a data model styled after
- the familiar directory tree structure of file systems. It runs in Java and
- has bindings for both Java and C.</para>
-
- <para>Coordination services are notoriously hard to get right. They are
- especially prone to errors such as race conditions and deadlock. The
- motivation behind ZooKeeper is to relieve distributed applications of the
- responsibility of implementing coordination services from scratch.</para>
-
- <section id="sc_designGoals">
- <title>Design Goals</title>
-
- <para><emphasis role="bold">ZooKeeper is simple.</emphasis> ZooKeeper
- allows distributed processes to coordinate with each other through a
- shared hierarchical namespace which is organized similarly to a standard
- file system. The name space consists of data registers - called znodes,
- in ZooKeeper parlance - and these are similar to files and directories.
- Unlike a typical file system, which is designed for storage, ZooKeeper
- data is kept in-memory, which means ZooKeeper can achieve high
- throughput and low latency numbers.</para>
-
- <para>The ZooKeeper implementation puts a premium on high performance,
- highly available, strictly ordered access. The performance aspects of
- ZooKeeper mean it can be used in large, distributed systems. The
- reliability aspects keep it from being a single point of failure. The
- strict ordering means that sophisticated synchronization primitives can
- be implemented at the client.</para>
-
- <para><emphasis role="bold">ZooKeeper is replicated.</emphasis> Like the
- distributed processes it coordinates, ZooKeeper itself is intended to be
- replicated over a set of hosts called an ensemble.</para>
-
- <figure>
- <title>ZooKeeper Service</title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/zkservice.jpg" />
- </imageobject>
- </mediaobject>
- </figure>
-
- <para>The servers that make up the ZooKeeper service must all know about
- each other. They maintain an in-memory image of state, along with
- transaction logs and snapshots in a persistent store. As long as a
- majority of the servers are available, the ZooKeeper service will be
- available.</para>
-
- <para>Clients connect to a single ZooKeeper server. The client maintains
- a TCP connection through which it sends requests, gets responses, gets
- watch events, and sends heart beats. If the TCP connection to the server
- breaks, the client will connect to a different server.</para>
-
- <para><emphasis role="bold">ZooKeeper is ordered.</emphasis> ZooKeeper
- stamps each update with a number that reflects the order of all
- ZooKeeper transactions. Subsequent operations can use the order to
- implement higher-level abstractions, such as synchronization
- primitives.</para>
-
- <para><emphasis role="bold">ZooKeeper is fast.</emphasis> It is
- especially fast in "read-dominant" workloads. ZooKeeper applications run
- on thousands of machines, and it performs best where reads are more
- common than writes, at ratios of around 10:1.</para>
- </section>
-
- <section id="sc_dataModelNameSpace">
- <title>Data model and the hierarchical namespace</title>
-
- <para>The name space provided by ZooKeeper is much like that of a
- standard file system. A name is a sequence of path elements separated by
- a slash (/). Every node in ZooKeeper's name space is identified by a
- path.</para>
-
- <figure>
- <title>ZooKeeper's Hierarchical Namespace</title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/zknamespace.jpg" />
- </imageobject>
- </mediaobject>
- </figure>
- </section>
-
- <section>
- <title>Nodes and ephemeral nodes</title>
-
- <para>Unlike in standard file systems, each node in a ZooKeeper
- namespace can have data associated with it as well as children. It is
- like having a file-system that allows a file to also be a directory.
- (ZooKeeper was designed to store coordination data: status information,
- configuration, location information, etc., so the data stored at each
- node is usually small, in the byte to kilobyte range.) We use the term
- <emphasis>znode</emphasis> to make it clear that we are talking about
- ZooKeeper data nodes.</para>
-
- <para>Znodes maintain a stat structure that includes version numbers for
- data changes, ACL changes, and timestamps, to allow cache validations
- and coordinated updates. Each time a znode's data changes, the version
- number increases. For instance, whenever a client retrieves data it also
- receives the version of the data.</para>
-
- <para>The data stored at each znode in a namespace is read and written
- atomically. Reads get all the data bytes associated with a znode and a
- write replaces all the data. Each node has an Access Control List (ACL)
- that restricts who can do what.</para>
-
- <para>ZooKeeper also has the notion of ephemeral nodes. These znodes
- exist as long as the session that created the znode is active. When the
- session ends the znode is deleted. Ephemeral nodes are useful when you
- want to implement <emphasis>[tbd]</emphasis>.</para>
- </section>
-
- <section>
- <title>Conditional updates and watches</title>
-
- <para>ZooKeeper supports the concept of <emphasis>watches</emphasis>.
- Clients can set a watch on a znode. A watch will be triggered and
- removed when the znode changes. When a watch is triggered the client
- receives a packet saying that the znode has changed. And if the
- connection between the client and one of the ZooKeeper servers is
- broken, the client will receive a local notification. These can be used
- to <emphasis>[tbd]</emphasis>.</para>
- </section>
-
- <section>
- <title>Guarantees</title>
-
- <para>ZooKeeper is very fast and very simple. Since its goal, though, is
- to be a basis for the construction of more complicated services, such as
- synchronization, it provides a set of guarantees. These are:</para>
-
- <itemizedlist>
- <listitem>
- <para>Sequential Consistency - Updates from a client will be applied
- in the order that they were sent.</para>
- </listitem>
-
- <listitem>
- <para>Atomicity - Updates either succeed or fail. No partial
- results.</para>
- </listitem>
-
- <listitem>
- <para>Single System Image - A client will see the same view of the
- service regardless of the server that it connects to.</para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>Reliability - Once an update has been applied, it will persist
- from that time forward until a client overwrites the update.</para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>Timeliness - The client's view of the system is guaranteed to
- be up-to-date within a certain time bound.</para>
- </listitem>
- </itemizedlist>
-
- <para>For more information on these, and how they can be used, see
- <emphasis>[tbd]</emphasis></para>
- </section>
-
- <section>
- <title>Simple API</title>
-
- <para>One of the design goals of ZooKeeper is to provide a very simple
- programming interface. As a result, it supports only these
- operations:</para>
-
- <variablelist>
- <varlistentry>
- <term>create</term>
-
- <listitem>
- <para>creates a node at a location in the tree</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>delete</term>
-
- <listitem>
- <para>deletes a node</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>exists</term>
-
- <listitem>
- <para>tests if a node exists at a location</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>get data</term>
-
- <listitem>
- <para>reads the data from a node</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>set data</term>
-
- <listitem>
- <para>writes data to a node</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>get children</term>
-
- <listitem>
- <para>retrieves a list of children of a node</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>sync</term>
-
- <listitem>
- <para>waits for data to be propagated</para>
- </listitem>
- </varlistentry>
- </variablelist>
-
- <para>For a more in-depth discussion on these, and how they can be used
- to implement higher level operations, please refer to
- <emphasis>[tbd]</emphasis></para>
- </section>
-
- <section>
- <title>Implementation</title>
-
- <para><xref linkend="fg_zkComponents" /> shows the high-level components
- of the ZooKeeper service. With the exception of the request processor,
- each of
- the servers that make up the ZooKeeper service replicates its own copy
- of each of the components.</para>
-
- <figure id="fg_zkComponents">
- <title>ZooKeeper Components</title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/zkcomponents.jpg" />
- </imageobject>
- </mediaobject>
- </figure>
-
- <para>The replicated database is an in-memory database containing the
- entire data tree. Updates are logged to disk for recoverability, and
- writes are serialized to disk before they are applied to the in-memory
- database.</para>
-
- <para>Every ZooKeeper server services clients. Clients connect to
- exactly one server to submit requests. Read requests are serviced from
- the local replica of each server database. Requests that change the
- state of the service, write requests, are processed by an agreement
- protocol.</para>
-
- <para>As part of the agreement protocol all write requests from clients
- are forwarded to a single server, called the
- <emphasis>leader</emphasis>. The rest of the ZooKeeper servers, called
- <emphasis>followers</emphasis>, receive message proposals from the
- leader and agree upon message delivery. The messaging layer takes care
- of replacing leaders on failures and syncing followers with
- leaders.</para>
-
- <para>ZooKeeper uses a custom atomic messaging protocol. Since the
- messaging layer is atomic, ZooKeeper can guarantee that the local
- replicas never diverge. When the leader receives a write request, it
- calculates what the state of the system is when the write is to be
- applied and transforms this into a transaction that captures this new
- state.</para>
- </section>
-
- <section>
- <title>Uses</title>
-
- <para>The programming interface to ZooKeeper is deliberately simple.
- With it, however, you can implement higher order operations, such as
- synchronization primitives, group membership, ownership, etc. Some
- distributed applications have used it to: <emphasis>[tbd: add uses from
- white paper and video presentation.]</emphasis> For more information, see
- <emphasis>[tbd]</emphasis></para>
- </section>
-
- <section>
- <title>Performance</title>
-
- <para>ZooKeeper is designed to be highly performant. But is it? The
- results of the ZooKeeper development team at Yahoo! Research indicate
- that it is. (See <xref linkend="fg_zkPerfRW" />.) It is especially high
- performance in applications where reads outnumber writes, since writes
- involve synchronizing the state of all servers. (Reads outnumbering
- writes is typically the case for a coordination service.)</para>
-
- <figure id="fg_zkPerfRW">
- <title>ZooKeeper Throughput as the Read-Write Ratio Varies</title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/zkperfRW-3.2.jpg" />
- </imageobject>
- </mediaobject>
- </figure>
- <para>The figure <xref linkend="fg_zkPerfRW"/> is a throughput
- graph of ZooKeeper release 3.2 running on servers with dual 2GHz
- Xeon and two SATA 15K RPM drives. One drive was used as a
- dedicated ZooKeeper log device. The snapshots were written to
- the OS drive. Write requests were 1K writes and the reads were
- 1K reads. "Servers" indicate the size of the ZooKeeper
- ensemble, the number of servers that make up the
- service. Approximately 30 other servers were used to simulate
- the clients. The ZooKeeper ensemble was configured such that
- leaders do not allow connections from clients.</para>
-
- <note><para>In version 3.2 r/w performance improved by ~2x
- compared to the <ulink
- url="http://zookeeper.apache.org/docs/r3.1.1/zookeeperOver.html#Performance">previous
- 3.1 release</ulink>.</para></note>
-
- <para>Benchmarks also indicate that it is reliable. <xref
- linkend="fg_zkPerfReliability" /> shows how a deployment responds to
- various failures. The events marked in the figure are the
- following:</para>
-
- <orderedlist>
- <listitem>
- <para>Failure and recovery of a follower</para>
- </listitem>
-
- <listitem>
- <para>Failure and recovery of a different follower</para>
- </listitem>
-
- <listitem>
- <para>Failure of the leader</para>
- </listitem>
-
- <listitem>
- <para>Failure and recovery of two followers</para>
- </listitem>
-
- <listitem>
- <para>Failure of another leader</para>
- </listitem>
- </orderedlist>
- </section>
-
- <section>
- <title>Reliability</title>
-
- <para>To show the behavior of the system over time as
- failures are injected we ran a ZooKeeper service made up of
- 7 machines. We ran the same saturation benchmark as before,
- but this time we kept the write percentage at a constant
- 30%, which is a conservative ratio of our expected
- workloads.
- </para>
- <figure id="fg_zkPerfReliability">
- <title>Reliability in the Presence of Errors</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/zkperfreliability.jpg" />
- </imageobject>
- </mediaobject>
- </figure>
-
- <para>There are a few important observations from this graph. First, if
- followers fail and recover quickly, then ZooKeeper is able to sustain a
- high throughput despite the failure. Second, and maybe more importantly, the
- leader election algorithm allows for the system to recover fast enough
- to prevent throughput from dropping substantially. In our observations,
- ZooKeeper takes less than 200ms to elect a new leader. Third, as
- followers recover, ZooKeeper is able to raise throughput again once they
- start processing requests.</para>
- </section>
-
- <section>
- <title>The ZooKeeper Project</title>
-
- <para>ZooKeeper has been
- <ulink url="https://cwiki.apache.org/confluence/display/ZOOKEEPER/PoweredBy">
- successfully used
- </ulink>
- in many industrial applications. It is used at Yahoo! as the
- coordination and failure recovery service for Yahoo! Message
- Broker, which is a highly scalable publish-subscribe system
- managing thousands of topics for replication and data
- delivery. It is used by the Fetching Service for Yahoo!
- crawler, where it also manages failure recovery. A number of
- Yahoo! advertising systems also use ZooKeeper to implement
- reliable services.
- </para>
-
- <para>All users and developers are encouraged to join the
- community and contribute their expertise. See the
- <ulink url="http://zookeeper.apache.org/">
- ZooKeeper Project on Apache
- </ulink>
- for more information.
- </para>
- </section>
- </section>
-</article>
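
The removed overview above describes znodes whose data is read and written atomically and whose version number increases on every data change, which is what makes conditional updates possible. The following Java sketch is a toy in-memory model of that idea, not the ZooKeeper client API: the class and method names are hypothetical, and treating an expected version of -1 as "match any version" is an assumption modelled on how the real client's setData call is commonly used.

```java
import java.util.HashMap;
import java.util.Map;

public class VersionedStoreSketch {
    // Immutable pair of data bytes and the data version, as held by one node.
    static final class Node {
        final byte[] data;
        final int version;

        Node(byte[] data, int version) {
            this.data = data;
            this.version = version;
        }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    // Create a node at the given path with version 0.
    public synchronized void create(String path, byte[] data) {
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("node already exists: " + path);
        }
        nodes.put(path, new Node(data.clone(), 0));
    }

    // Return the current data version of the node.
    public synchronized int getVersion(String path) {
        Node node = nodes.get(path);
        if (node == null) {
            throw new IllegalStateException("no such node: " + path);
        }
        return node.version;
    }

    // Replace all of the node's data, but only if the caller's expected
    // version matches (or is -1). Returns the new version.
    public synchronized int setData(String path, byte[] data, int expectedVersion) {
        Node current = nodes.get(path);
        if (current == null) {
            throw new IllegalStateException("no such node: " + path);
        }
        if (expectedVersion != -1 && expectedVersion != current.version) {
            throw new IllegalStateException("version mismatch on " + path);
        }
        Node next = new Node(data.clone(), current.version + 1);
        nodes.put(path, next);
        return next.version;
    }

    public static void main(String[] args) {
        VersionedStoreSketch store = new VersionedStoreSketch();
        store.create("/config", "a".getBytes());
        int seen = store.getVersion("/config");
        System.out.println("new version: " + store.setData("/config", "b".getBytes(), seen));
    }
}
```

A writer that read the data at an older version has its update rejected instead of silently overwriting a newer value, so a client can retry with fresh data; this is the kind of coordinated update the stat structure's version numbers are meant to allow.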
[05/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/javaExample.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/javaExample.xml b/zookeeper-docs/src/documentation/content/xdocs/javaExample.xml
new file mode 100644
index 0000000..c992282
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/javaExample.xml
@@ -0,0 +1,663 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="ar_JavaExample">
+ <title>ZooKeeper Java Example</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This article contains sample Java code for a simple watch client.</para>
+
+ </abstract>
+ </articleinfo>
+
+ <section id="ch_Introduction">
+ <title>A Simple Watch Client</title>
+
+ <para>To introduce you to the ZooKeeper Java API, we develop here a very simple
+ watch client. This ZooKeeper client watches a ZooKeeper node for changes
+ and responds by starting or stopping a program.</para>
+
+ <section id="sc_requirements"><title>Requirements</title>
+
+ <para>The client has four requirements:</para>
+
+ <itemizedlist><listitem><para>It takes as parameters:</para>
+ <itemizedlist>
+ <listitem><para>the address of the ZooKeeper service</para></listitem>
+ <listitem> <para>the name of a znode - the one to be watched</para></listitem>
+ <listitem><para>an executable with arguments.</para></listitem></itemizedlist></listitem>
+ <listitem><para>It fetches the data associated with the znode and starts the executable.</para></listitem>
+ <listitem><para>If the znode changes, the client refetches the contents and restarts the executable.</para></listitem>
+ <listitem><para>If the znode disappears, the client kills the executable.</para></listitem></itemizedlist>
+
+ </section>
+
+ <section id="sc_design">
+ <title>Program Design</title>
+
+ <para>Conventionally, ZooKeeper applications are broken into two units, one which maintains the connection,
+ and the other which monitors data. In this application, the class called the <emphasis role="bold">Executor</emphasis>
+ maintains the ZooKeeper connection, and the class called the <emphasis role="bold">DataMonitor</emphasis> monitors the data
+ in the ZooKeeper tree. Also, Executor contains the main thread and contains the execution logic.
+ It is responsible for what little user interaction there is, as well as interaction with the executable program you
+ pass in as an argument and which the sample (per the requirements) shuts down and restarts, according to the
+ state of the znode.</para>
+
+ </section>
+
+ </section>
+
+ <section id="sc_executor"><title>The Executor Class</title>
+ <para>The Executor object is the primary container of the sample application. It contains
+ both the <emphasis role="bold">ZooKeeper</emphasis> object and the <emphasis role="bold">DataMonitor</emphasis>, as described above in
+ <xref linkend="sc_design"/>. </para>
+
+ <programlisting>
+ // from the Executor class...
+
+ public static void main(String[] args) {
+ if (args.length &lt; 4) {
+ System.err
+ .println("USAGE: Executor hostPort znode filename program [args ...]");
+ System.exit(2);
+ }
+ String hostPort = args[0];
+ String znode = args[1];
+ String filename = args[2];
+ String exec[] = new String[args.length - 3];
+ System.arraycopy(args, 3, exec, 0, exec.length);
+ try {
+ new Executor(hostPort, znode, filename, exec).run();
+ } catch (Exception e) {
+ e.printStackTrace();
+ }
+ }
+
+ public Executor(String hostPort, String znode, String filename,
+ String exec[]) throws KeeperException, IOException {
+ this.filename = filename;
+ this.exec = exec;
+ zk = new ZooKeeper(hostPort, 3000, this);
+ dm = new DataMonitor(zk, znode, null, this);
+ }
+
+ public void run() {
+ try {
+ synchronized (this) {
+ while (!dm.dead) {
+ wait();
+ }
+ }
+ } catch (InterruptedException e) {
+ }
+ }
+</programlisting>
+
+
+ <para>
+ Recall that the Executor's job is to start and stop the executable whose name you pass in on the command line.
+ It does this in response to events fired by the ZooKeeper object. As you can see in the code above, the Executor passes
+ a reference to itself as the Watcher argument in the ZooKeeper constructor. It also passes a reference to itself
+ as DataMonitorListener argument to the DataMonitor constructor. Per the Executor's definition, it implements both these
+ interfaces:
+ </para>
+
+ <programlisting>
+public class Executor implements Watcher, Runnable, DataMonitor.DataMonitorListener {
+...</programlisting>
+
+ <para>The <emphasis role="bold">Watcher</emphasis> interface is defined by the ZooKeeper Java API.
+ ZooKeeper uses it to communicate back to its container. It supports only one method, <command>process()</command>, and ZooKeeper uses
+ it to communicate generic events that the main thread would be interested in, such as the state of the ZooKeeper connection or the ZooKeeper session. The Executor
+ in this example simply forwards those events down to the DataMonitor to decide what to do with them. It does this simply to illustrate
+ the point that, by convention, the Executor or some Executor-like object "owns" the ZooKeeper connection, but it is free to delegate the events to
+ other objects. It also uses this as the default channel on which to fire watch events. (More on this later.)</para>
+
+<programlisting>
+ public void process(WatchedEvent event) {
+ dm.process(event);
+ }
+</programlisting>
+
+ <para>The <emphasis role="bold">DataMonitorListener</emphasis>
+ interface, on the other hand, is not part of the ZooKeeper API. It is a completely custom interface,
+ designed for this sample application. The DataMonitor object uses it to communicate back to its container, which
+ is also the Executor object. The DataMonitorListener interface looks like this:</para>
+ <programlisting>
+public interface DataMonitorListener {
+ /**
+ * The existence status of the node has changed.
+ */
+ void exists(byte data[]);
+
+ /**
+ * The ZooKeeper session is no longer valid.
+ *
+ * @param rc
+ * the ZooKeeper reason code
+ */
+ void closing(int rc);
+}
+</programlisting>
+ <para>This interface is defined in the DataMonitor class and implemented in the Executor class.
+ When <command>Executor.exists()</command> is invoked,
+ the Executor decides whether to start up or shut down per the requirements. Recall that the requirements say to kill the executable when the
+ znode ceases to <emphasis>exist</emphasis>. </para>
+
+ <para>When <command>Executor.closing()</command>
+ is invoked, the Executor decides whether or not to shut itself down in response to the ZooKeeper connection permanently disappearing.</para>
+
+ <para>As you might have guessed, DataMonitor is the object that invokes
+ these methods, in response to changes in ZooKeeper's state.</para>
+
+ <para>Here are Executor's implementations of
+ <command>DataMonitorListener.exists()</command> and <command>DataMonitorListener.closing()</command>:
+ </para>
+ <programlisting>
+public void exists( byte[] data ) {
+ if (data == null) {
+ if (child != null) {
+ System.out.println("Killing process");
+ child.destroy();
+ try {
+ child.waitFor();
+ } catch (InterruptedException e) {
+ }
+ }
+ child = null;
+ } else {
+ if (child != null) {
+ System.out.println("Stopping child");
+ child.destroy();
+ try {
+ child.waitFor();
+ } catch (InterruptedException e) {
+ e.printStackTrace();
+ }
+ }
+ try {
+ FileOutputStream fos = new FileOutputStream(filename);
+ fos.write(data);
+ fos.close();
+ } catch (IOException e) {
+ e.printStackTrace();
+ }
+ try {
+ System.out.println("Starting child");
+ child = Runtime.getRuntime().exec(exec);
+ new StreamWriter(child.getInputStream(), System.out);
+ new StreamWriter(child.getErrorStream(), System.err);
+ } catch (IOException e) {
+ e.printStackTrace();
+ }
+ }
+}
+
+public void closing(int rc) {
+ synchronized (this) {
+ notifyAll();
+ }
+}
+</programlisting>
+
+</section>
+<section id="sc_DataMonitor"><title>The DataMonitor Class</title>
+<para>
+The DataMonitor class has the meat of the ZooKeeper logic. It is mostly
+asynchronous and event driven. DataMonitor kicks things off in the constructor with:</para>
+<programlisting>
+public DataMonitor(ZooKeeper zk, String znode, Watcher chainedWatcher,
+ DataMonitorListener listener) {
+ this.zk = zk;
+ this.znode = znode;
+ this.chainedWatcher = chainedWatcher;
+ this.listener = listener;
+
+ // Get things started by checking if the node exists. We are going
+ // to be completely event driven
+ <emphasis role="bold">zk.exists(znode, true, this, null);</emphasis>
+}
+</programlisting>
+
+<para>The call to <command>ZooKeeper.exists()</command> checks for the existence of the znode,
+sets a watch, and passes a reference to itself (<command>this</command>)
+as the completion callback object. In this sense, it kicks things off, since the
+real processing happens when the watch is triggered.</para>
+
+<note>
+<para>Don't confuse the completion callback with the watch callback. The <command>ZooKeeper.exists()</command>
+completion callback, which happens to be the method <command>StatCallback.processResult()</command> implemented
+in the DataMonitor object, is invoked when the asynchronous <emphasis>setting of the watch</emphasis> operation
+(by <command>ZooKeeper.exists()</command>) completes on the server. </para>
+<para>
+The triggering of the watch, on the other hand, sends an event to the <emphasis>Executor</emphasis> object, since
+the Executor registered as the Watcher of the ZooKeeper object.</para>
+
+<para>As an aside, you might note that the DataMonitor could also register itself as the Watcher
+for this particular watch event. This is new to ZooKeeper 3.0.0 (the support of multiple Watchers). In this
+example, however, DataMonitor does not register as the Watcher.</para>
+</note>
+
+<para>When the <command>ZooKeeper.exists()</command> operation completes on the server, the ZooKeeper API invokes this completion callback on
+the client:</para>
+
+<programlisting>
+public void processResult(int rc, String path, Object ctx, Stat stat) {
+ boolean exists;
+ switch (rc) {
+ case Code.Ok:
+ exists = true;
+ break;
+ case Code.NoNode:
+ exists = false;
+ break;
+ case Code.SessionExpired:
+ case Code.NoAuth:
+ dead = true;
+ listener.closing(rc);
+ return;
+ default:
+ // Retry errors
+ zk.exists(znode, true, this, null);
+ return;
+ }
+
+ byte b[] = null;
+ if (exists) {
+ try {
+ <emphasis role="bold">b = zk.getData(znode, false, null);</emphasis>
+ } catch (KeeperException e) {
+ // We don't need to worry about recovering now. The watch
+ // callbacks will kick off any exception handling
+ e.printStackTrace();
+ } catch (InterruptedException e) {
+ return;
+ }
+ }
+ if ((b == null && b != prevData)
+ || (b != null && !Arrays.equals(prevData, b))) {
+ <emphasis role="bold">listener.exists(b);</emphasis>
+ prevData = b;
+ }
+}
+</programlisting>
+
+<para>
+The code first checks the error codes for znode existence, fatal errors, and
+recoverable errors. If the file (or znode) exists, it gets the data from the znode, and
+then invokes the exists() callback of Executor if the state has changed. Note,
+it doesn't have to do any Exception processing for the getData call because it
+has watches pending for anything that could cause an error: if the node is deleted
+before it calls <command>ZooKeeper.getData()</command>, the watch event set by
+the <command>ZooKeeper.exists()</command> triggers a callback;
+if there is a communication error, a connection watch event fires when
+the connection comes back up.
+</para>
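+<para>
+The change check at the end of <command>processResult()</command> can be read in isolation.
+The following is a minimal sketch and not part of the sample application: the class name
+<command>ChangeDetector</command> and its <command>update()</command> method are hypothetical,
+and a null array is assumed to stand for "the znode does not exist", as it does in the listing above.
+</para>

```java
import java.util.Arrays;

/** Stdlib-only model of the change check at the end of processResult(). */
final class ChangeDetector {
    // Last value reported to the listener; null means "no znode".
    private byte[] prevData;

    /**
     * Returns true when the listener should be notified, and records the
     * new value in that case. A null array stands for a missing znode.
     */
    boolean update(byte[] current) {
        boolean changed = (current == null && prevData != null)
                || (current != null && !Arrays.equals(prevData, current));
        if (changed) {
            prevData = current;
        }
        return changed;
    }
}
```

+<para>
+Comparing contents with <command>Arrays.equals()</command> rather than comparing references means that
+a watch event which leaves the data unchanged does not restart the child process.
+</para>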
+
+<para>Finally, notice how DataMonitor processes watch events: </para>
+<programlisting>
+ public void process(WatchedEvent event) {
+ String path = event.getPath();
+ if (event.getType() == Event.EventType.None) {
+ // We are being told that the state of the
+ // connection has changed
+ switch (event.getState()) {
+ case SyncConnected:
+ // In this particular example we don't need to do anything
+ // here - watches are automatically re-registered with
+ // server and any watches triggered while the client was
+ // disconnected will be delivered (in order of course)
+ break;
+ case Expired:
+ // It's all over
+ dead = true;
+ listener.closing(KeeperException.Code.SessionExpired);
+ break;
+ }
+ } else {
+ if (path != null && path.equals(znode)) {
+ // Something has changed on the node, let's find out
+ zk.exists(znode, true, this, null);
+ }
+ }
+ if (chainedWatcher != null) {
+ chainedWatcher.process(event);
+ }
+ }
+</programlisting>
+<para>
+If the client-side ZooKeeper libraries can re-establish the
+communication channel (SyncConnected event) to ZooKeeper before
+session expiration (Expired event), all of the session's watches will
+automatically be re-established with the server (auto-reset of watches
+is new in ZooKeeper 3.0.0). See <ulink
+url="zookeeperProgrammers.html#ch_zkWatches">ZooKeeper Watches</ulink>
+in the programmer guide for more on this. A bit lower down in this
+function, when DataMonitor gets an event for a znode, it calls
+<command>ZooKeeper.exists()</command> to find out what has changed.
+</para>
+</section>
+
+<section id="sc_completeSourceCode">
+ <title>Complete Source Listings</title>
+ <example id="eg_Executor_java"><title>Executor.java</title><programlisting>
+/**
+ * A simple example program to use DataMonitor to start and
+ * stop executables based on a znode. The program watches the
+ * specified znode and saves the data that corresponds to the
+ * znode in the filesystem. It also starts the specified program
+ * with the specified arguments when the znode exists and kills
+ * the program if the znode goes away.
+ */
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+
+import org.apache.zookeeper.KeeperException;
+import org.apache.zookeeper.WatchedEvent;
+import org.apache.zookeeper.Watcher;
+import org.apache.zookeeper.ZooKeeper;
+
+public class Executor
+ implements Watcher, Runnable, DataMonitor.DataMonitorListener
+{
+ String znode;
+
+ DataMonitor dm;
+
+ ZooKeeper zk;
+
+ String filename;
+
+ String exec[];
+
+ Process child;
+
+ public Executor(String hostPort, String znode, String filename,
+ String exec[]) throws KeeperException, IOException {
+ this.filename = filename;
+ this.exec = exec;
+ zk = new ZooKeeper(hostPort, 3000, this);
+ dm = new DataMonitor(zk, znode, null, this);
+ }
+
+ /**
+ * @param args
+ */
+ public static void main(String[] args) {
+ if (args.length < 4) {
+ System.err
+ .println("USAGE: Executor hostPort znode filename program [args ...]");
+ System.exit(2);
+ }
+ String hostPort = args[0];
+ String znode = args[1];
+ String filename = args[2];
+ String exec[] = new String[args.length - 3];
+ System.arraycopy(args, 3, exec, 0, exec.length);
+ try {
+ new Executor(hostPort, znode, filename, exec).run();
+ } catch (Exception e) {
+ e.printStackTrace();
+ }
+ }
+
+ /***************************************************************************
+ * We do not process any events ourselves, we just need to forward them on.
+ *
+ * @see org.apache.zookeeper.Watcher#process(org.apache.zookeeper.WatchedEvent)
+ */
+ public void process(WatchedEvent event) {
+ dm.process(event);
+ }
+
+ public void run() {
+ try {
+ synchronized (this) {
+ while (!dm.dead) {
+ wait();
+ }
+ }
+ } catch (InterruptedException e) {
+ }
+ }
+
+ public void closing(int rc) {
+ synchronized (this) {
+ notifyAll();
+ }
+ }
+
+ static class StreamWriter extends Thread {
+ OutputStream os;
+
+ InputStream is;
+
+ StreamWriter(InputStream is, OutputStream os) {
+ this.is = is;
+ this.os = os;
+ start();
+ }
+
+ public void run() {
+ byte b[] = new byte[80];
+ int rc;
+ try {
+ while ((rc = is.read(b)) > 0) {
+ os.write(b, 0, rc);
+ }
+ } catch (IOException e) {
+ }
+
+ }
+ }
+
+ public void exists(byte[] data) {
+ if (data == null) {
+ if (child != null) {
+ System.out.println("Killing process");
+ child.destroy();
+ try {
+ child.waitFor();
+ } catch (InterruptedException e) {
+ }
+ }
+ child = null;
+ } else {
+ if (child != null) {
+ System.out.println("Stopping child");
+ child.destroy();
+ try {
+ child.waitFor();
+ } catch (InterruptedException e) {
+ e.printStackTrace();
+ }
+ }
+ try {
+ FileOutputStream fos = new FileOutputStream(filename);
+ fos.write(data);
+ fos.close();
+ } catch (IOException e) {
+ e.printStackTrace();
+ }
+ try {
+ System.out.println("Starting child");
+ child = Runtime.getRuntime().exec(exec);
+ new StreamWriter(child.getInputStream(), System.out);
+ new StreamWriter(child.getErrorStream(), System.err);
+ } catch (IOException e) {
+ e.printStackTrace();
+ }
+ }
+ }
+}
+</programlisting>
+
+</example>
+
+<example id="eg_DataMonitor_java">
+ <title>DataMonitor.java</title>
+ <programlisting>
+/**
+ * A simple class that monitors the data and existence of a ZooKeeper
+ * node. It uses asynchronous ZooKeeper APIs.
+ */
+import java.util.Arrays;
+
+import org.apache.zookeeper.KeeperException;
+import org.apache.zookeeper.WatchedEvent;
+import org.apache.zookeeper.Watcher;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.AsyncCallback.StatCallback;
+import org.apache.zookeeper.KeeperException.Code;
+import org.apache.zookeeper.data.Stat;
+
+public class DataMonitor implements Watcher, StatCallback {
+
+ ZooKeeper zk;
+
+ String znode;
+
+ Watcher chainedWatcher;
+
+ boolean dead;
+
+ DataMonitorListener listener;
+
+ byte prevData[];
+
+ public DataMonitor(ZooKeeper zk, String znode, Watcher chainedWatcher,
+ DataMonitorListener listener) {
+ this.zk = zk;
+ this.znode = znode;
+ this.chainedWatcher = chainedWatcher;
+ this.listener = listener;
+ // Get things started by checking if the node exists. We are going
+ // to be completely event driven
+ zk.exists(znode, true, this, null);
+ }
+
+ /**
+ * Other classes use the DataMonitor by implementing this method
+ */
+ public interface DataMonitorListener {
+ /**
+ * The existence status of the node has changed.
+ */
+ void exists(byte data[]);
+
+ /**
+ * The ZooKeeper session is no longer valid.
+ *
+ * @param rc
+ * the ZooKeeper reason code
+ */
+ void closing(int rc);
+ }
+
+ public void process(WatchedEvent event) {
+ String path = event.getPath();
+ if (event.getType() == Event.EventType.None) {
+ // We are being told that the state of the
+ // connection has changed
+ switch (event.getState()) {
+ case SyncConnected:
+ // In this particular example we don't need to do anything
+ // here - watches are automatically re-registered with
+ // server and any watches triggered while the client was
+ // disconnected will be delivered (in order of course)
+ break;
+ case Expired:
+ // It's all over
+ dead = true;
+ listener.closing(KeeperException.Code.SessionExpired);
+ break;
+ }
+ } else {
+ if (path != null && path.equals(znode)) {
+ // Something has changed on the node, let's find out
+ zk.exists(znode, true, this, null);
+ }
+ }
+ if (chainedWatcher != null) {
+ chainedWatcher.process(event);
+ }
+ }
+
+ public void processResult(int rc, String path, Object ctx, Stat stat) {
+ boolean exists;
+ switch (rc) {
+ case Code.Ok:
+ exists = true;
+ break;
+ case Code.NoNode:
+ exists = false;
+ break;
+ case Code.SessionExpired:
+ case Code.NoAuth:
+ dead = true;
+ listener.closing(rc);
+ return;
+ default:
+ // Retry errors
+ zk.exists(znode, true, this, null);
+ return;
+ }
+
+ byte b[] = null;
+ if (exists) {
+ try {
+ b = zk.getData(znode, false, null);
+ } catch (KeeperException e) {
+ // We don't need to worry about recovering now. The watch
+ // callbacks will kick off any exception handling
+ e.printStackTrace();
+ } catch (InterruptedException e) {
+ return;
+ }
+ }
+ if ((b == null && b != prevData)
+ || (b != null && !Arrays.equals(prevData, b))) {
+ listener.exists(b);
+ prevData = b;
+ }
+ }
+}
+</programlisting>
+</example>
+</section>
+
+
+
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/recipes.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/recipes.xml b/zookeeper-docs/src/documentation/content/xdocs/recipes.xml
new file mode 100644
index 0000000..ead041b
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/recipes.xml
@@ -0,0 +1,637 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="ar_Recipes">
+ <title>ZooKeeper Recipes and Solutions</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This guide contains pseudocode and guidelines for using ZooKeeper to
+ solve common problems in Distributed Application Coordination. It
+ discusses such problems as event handlers, queues, and locks.</para>
+
+ <para>$Revision: 1.6 $ $Date: 2008/09/19 03:46:18 $</para>
+ </abstract>
+ </articleinfo>
+
+ <section id="ch_recipes">
+ <title>A Guide to Creating Higher-level Constructs with ZooKeeper</title>
+
+ <para>In this article, you'll find guidelines for using
+ ZooKeeper to implement higher order functions. All of them are conventions
+ implemented at the client and do not require special support from
+ ZooKeeper. Hopefully the community will capture these conventions in client-side libraries
+ to ease their use and to encourage standardization.</para>
+
+ <para>One of the most interesting things about ZooKeeper is that even
+ though ZooKeeper uses <emphasis>asynchronous</emphasis> notifications, you
+ can use it to build <emphasis>synchronous</emphasis> consistency
+ primitives, such as queues and locks. As you will see, this is possible
+ because ZooKeeper imposes an overall order on updates, and has mechanisms
+ to expose this ordering.</para>
+
+ <para>Note that the recipes below attempt to employ best practices. In
+ particular, they avoid polling, timers or anything else that would result
+ in a "herd effect", causing bursts of traffic and limiting
+ scalability.</para>
+
+ <para>There are many useful functions that can be imagined that aren't
+ included here - revocable read-write priority locks, as just one example.
+ And some of the constructs mentioned here - locks, in particular -
+ illustrate certain points, even though you may find other constructs, such
+ as event handles or queues, a more practical means of performing the same
+ function. In general, the examples in this section are designed to
+ stimulate thought.</para>
+
+
+ <section id="sc_outOfTheBox">
+ <title>Out of the Box Applications: Name Service, Configuration, Group
+ Membership</title>
+
+ <para>Name service and configuration are two of the primary applications
+ of ZooKeeper. These two functions are provided directly by the ZooKeeper
+ API.</para>
+
+ <para>Another function directly provided by ZooKeeper is <emphasis>group
+ membership</emphasis>. The group is represented by a node. Members of the
+ group create ephemeral nodes under the group node. Nodes of the members
+ that fail abnormally will be removed automatically when ZooKeeper detects
+ the failure.</para>
+ </section>
+
+ <section id="sc_recipes_eventHandles">
+ <title>Barriers</title>
+
+ <para>Distributed systems use <emphasis>barriers</emphasis>
+ to block processing of a set of nodes until a condition is met
+ at which time all the nodes are allowed to proceed. Barriers are
+ implemented in ZooKeeper by designating a barrier node. The
+ barrier is in place if the barrier node exists. Here's the
+ pseudo code:</para>
+
+ <orderedlist>
+ <listitem>
+ <para>Client calls the ZooKeeper API's <emphasis
+ role="bold">exists()</emphasis> function on the barrier node, with
+ <emphasis>watch</emphasis> set to true.</para>
+ </listitem>
+
+ <listitem>
+ <para>If <emphasis role="bold">exists()</emphasis> returns false, the
+ barrier is gone and the client proceeds.</para>
+ </listitem>
+
+ <listitem>
+ <para>Else, if <emphasis role="bold">exists()</emphasis> returns true,
+ the client waits for a watch event from ZooKeeper for the barrier
+ node.</para>
+ </listitem>
+
+ <listitem>
+ <para>When the watch event is triggered, the client reissues the
+ <emphasis role="bold">exists( )</emphasis> call, again waiting until
+ the barrier node is removed.</para>
+ </listitem>
+ </orderedlist>
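+ <para>The four steps above can be sketched as a loop. This is only an
+ illustration under stated assumptions: the <command>BarrierNode</command>
+ interface is a hypothetical stand-in for a call to exists() with the watch
+ flag set, not part of the ZooKeeper API, and its callback is assumed to run
+ once when the watch fires.</para>

```java
import java.util.concurrent.CountDownLatch;

/** Stdlib-only model of the barrier wait loop. */
final class BarrierWait {
    /** Hypothetical stand-in for exists(barrierPath, watch = true). */
    interface BarrierNode {
        /**
         * Returns whether the barrier node exists. If it does, onWatchEvent
         * is run once when the node next changes.
         */
        boolean existsAndWatch(Runnable onWatchEvent);
    }

    /** Blocks until the barrier node is gone; returns how many exists() calls were made. */
    static int await(BarrierNode node) {
        int calls = 0;
        while (true) {
            CountDownLatch fired = new CountDownLatch(1);
            calls++;
            if (!node.existsAndWatch(fired::countDown)) {
                return calls; // barrier is gone, the client proceeds
            }
            try {
                fired.await(); // wait for the watch event, then check again
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting on the barrier", e);
            }
        }
    }
}
```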
+
+ <section id="sc_doubleBarriers">
+ <title>Double Barriers</title>
+
+ <para>Double barriers enable clients to synchronize the beginning and
+ the end of a computation. When enough processes have joined the barrier,
+ processes start their computation and leave the barrier once they have
+ finished. This recipe shows how to use a ZooKeeper node as a
+ barrier.</para>
+
+ <para>The pseudo code in this recipe represents the barrier node as
+ <emphasis>b</emphasis>. Every client process <emphasis>p</emphasis>
+ registers with the barrier node on entry and unregisters when it is
+ ready to leave. A node registers with the barrier node via the <emphasis
+ role="bold">Enter</emphasis> procedure below and then waits until
+ <emphasis>x</emphasis> client processes have registered before proceeding with
+ the computation. (The <emphasis>x</emphasis> here is up to you to
+ determine for your system.)</para>
+
+ <informaltable colsep="0" frame="none" rowsep="0">
+ <tgroup cols="2">
+ <tbody>
+ <row>
+ <entry align="center"><emphasis
+ role="bold">Enter</emphasis></entry>
+
+ <entry align="center"><emphasis
+ role="bold">Leave</emphasis></entry>
+ </row>
+
+ <row>
+ <entry align="left"><orderedlist>
+ <listitem>
+ <para>Create a name <emphasis><emphasis>n</emphasis> =
+ <emphasis>b</emphasis>+“/”+<emphasis>p</emphasis></emphasis></para>
+ </listitem>
+
+ <listitem>
+ <para>Set watch: <emphasis
+ role="bold">exists(<emphasis>b</emphasis> + ‘‘/ready’’,
+ true)</emphasis></para>
+ </listitem>
+
+ <listitem>
+ <para>Create child: <emphasis role="bold">create(
+ <emphasis>n</emphasis>, EPHEMERAL)</emphasis></para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">L = getChildren(b,
+ false)</emphasis></para>
+ </listitem>
+
+ <listitem>
+ <para>if fewer children in L than<emphasis>
+ x</emphasis>, wait for watch event</para>
+ </listitem>
+
+ <listitem>
+ <para>else <emphasis role="bold">create(b + ‘‘/ready’’,
+ REGULAR)</emphasis></para>
+ </listitem>
+ </orderedlist></entry>
+
+ <entry><orderedlist>
+ <listitem>
+ <para><emphasis role="bold">L = getChildren(b,
+ false)</emphasis></para>
+ </listitem>
+
+ <listitem>
+ <para>if no children, exit</para>
+ </listitem>
+
+ <listitem>
+ <para>if <emphasis>p</emphasis> is only process node in
+ L, delete(n) and exit</para>
+ </listitem>
+
+ <listitem>
+ <para>if <emphasis>p</emphasis> is the lowest process
+ node in L, wait on highest process node in L</para>
+ </listitem>
+
+ <listitem>
+ <para>else <emphasis
+ role="bold">delete(<emphasis>n</emphasis>) </emphasis>if
+ still exists and wait on lowest process node in L</para>
+ </listitem>
+
+ <listitem>
+ <para>goto 1</para>
+ </listitem>
+ </orderedlist></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </informaltable>
+ <para>On entering, all processes watch on a ready node and
+ create an ephemeral node as a child of the barrier node. Each process
+ but the last enters the barrier and waits for the ready node to appear
+ at line 5. The process that creates the xth node, the last process, will
+ see x nodes in the list of children and create the ready node, waking up
+ the other processes. Note that waiting processes wake up only when it is
+ time to exit, so waiting is efficient.
+ </para>
+
+ <para>On exit, you can't use a flag such as <emphasis>ready</emphasis>
+ because you are watching for process nodes to go away. By using
+ ephemeral nodes, processes that fail after the barrier has been entered
+ do not prevent correct processes from finishing. When processes are
+ ready to leave, they need to delete their process nodes and wait for all
+ other processes to do the same.</para>
+
+ <para>Processes exit when there are no process nodes left as children of
+ <emphasis>b</emphasis>. However, as an efficiency, you can use the
+ lowest process node as the ready flag. All other processes that are
+ ready to exit watch for the lowest existing process node to go away, and
+ the owner of the lowest process watches for any other process node
+ (picking the highest for simplicity) to go away. This means that only a
+ single process wakes up on each node deletion except for the last node,
+ which wakes up everyone when it is removed.</para>
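+ <para>The decisions in the two procedures can be written as pure
+ functions over the child list. The sketch below makes several
+ assumptions that the recipe does not state: the class and enum names are
+ hypothetical, the list passed to <command>nextLeaveAction()</command>
+ contains only process nodes (the caller is assumed to filter out the
+ ready node), and "lowest" is taken to mean lowest in string order.</para>

```java
import java.util.Collections;
import java.util.List;

/** Stdlib-only model of the Enter and Leave decisions of a double barrier. */
final class DoubleBarrier {
    enum LeaveAction {
        EXIT,
        DELETE_OWN_AND_EXIT,
        WAIT_ON_HIGHEST,
        DELETE_OWN_AND_WAIT_ON_LOWEST
    }

    /** Enter, steps 5 and 6: create the ready node once x children are present. */
    static boolean shouldCreateReady(int childCount, int x) {
        return childCount >= x;
    }

    /** Leave, steps 2 to 5: decide what process "self" does for this child list. */
    static LeaveAction nextLeaveAction(List<String> processNodes, String self) {
        if (processNodes.isEmpty()) {
            return LeaveAction.EXIT;
        }
        if (processNodes.size() == 1 && processNodes.get(0).equals(self)) {
            return LeaveAction.DELETE_OWN_AND_EXIT;
        }
        if (self.equals(Collections.min(processNodes))) {
            return LeaveAction.WAIT_ON_HIGHEST;
        }
        // The caller deletes its own node only if it still exists.
        return LeaveAction.DELETE_OWN_AND_WAIT_ON_LOWEST;
    }
}
```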
+ </section>
+ </section>
+
+ <section id="sc_recipes_Queues">
+ <title>Queues</title>
+
+ <para>Distributed queues are a common data structure. To implement a
+ distributed queue in ZooKeeper, first designate a znode to hold the queue,
+ the queue node. The distributed clients put something into the queue by
+ calling create() with a pathname ending in "queue-", with the
+ <emphasis>sequence</emphasis> and <emphasis>ephemeral</emphasis> flags in
+ the create() call set to true. Because the <emphasis>sequence</emphasis>
+ flag is set, the new pathnames will have the form
+ _path-to-queue-node_/queue-X, where X is a monotonically increasing number. A
+ client that wants to remove an element from the queue calls ZooKeeper's <emphasis
+ role="bold">getChildren( )</emphasis> function, with
+ <emphasis>watch</emphasis> set to true on the queue node, and begins
+ processing nodes with the lowest number. The client does not need to issue
+ another <emphasis role="bold">getChildren( )</emphasis> until it exhausts
+ the list obtained from the first <emphasis role="bold">getChildren(
+ )</emphasis> call. If there are no children in the queue node, the
+ reader waits for a watch notification to check the queue again.</para>
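+ <para>Because the order of the list returned by getChildren( ) should not
+ be relied on, a consumer typically sorts the children by sequence number
+ before processing them. The following sketch shows one way to do that;
+ the class name <command>QueueOrder</command> is hypothetical, and the
+ child names are assumed to be "queue-" followed only by the sequence
+ number that ZooKeeper appends.</para>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Stdlib-only helper that puts queue children in FIFO order. */
final class QueueOrder {
    static final String PREFIX = "queue-";

    /** Parses the numeric suffix that follows the "queue-" prefix. */
    static long sequenceOf(String child) {
        return Long.parseLong(child.substring(PREFIX.length()));
    }

    /** Returns a sorted copy of the children, lowest sequence number first. */
    static List<String> inQueueOrder(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.<String>comparingLong(QueueOrder::sequenceOf));
        return sorted;
    }
}
```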
+
+ <note>
+ <para>There now exists a Queue implementation in ZooKeeper
+ recipes directory. This is distributed with the release --
+ src/recipes/queue directory of the release artifact.
+ </para>
+ </note>
+
+ <section id="sc_recipes_priorityQueues">
+ <title>Priority Queues</title>
+
+ <para>To implement a priority queue, you need only make two simple
+ changes to the generic <ulink url="#sc_recipes_Queues">queue
+ recipe</ulink> . First, to add to a queue, the pathname ends with
+ "queue-YY" where YY is the priority of the element with lower numbers
+ representing higher priority (just like UNIX). Second, when removing
+ from the queue, a client uses an up-to-date children list meaning that
+ the client will invalidate previously obtained children lists if a watch
+ notification triggers for the queue node.</para>
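+ <para>As a rough sketch of the first change, the head of a priority
+ queue can be chosen by comparing priority first and sequence number
+ second. The class name <command>PriorityQueueOrder</command> is
+ hypothetical, and the code assumes that each child name is "queue-",
+ then the decimal priority YY, then the 10-digit zero-padded sequence
+ number that ZooKeeper appends to sequential nodes.</para>

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Stdlib-only helper that picks the next element of a priority queue. */
final class PriorityQueueOrder {
    static final String PREFIX = "queue-";
    static final int SEQUENCE_DIGITS = 10; // width of the appended sequence number

    /** Priority is the text between the prefix and the sequence digits. */
    static int priorityOf(String child) {
        return Integer.parseInt(
                child.substring(PREFIX.length(), child.length() - SEQUENCE_DIGITS));
    }

    /** Sequence number is the last SEQUENCE_DIGITS characters. */
    static long sequenceOf(String child) {
        return Long.parseLong(child.substring(child.length() - SEQUENCE_DIGITS));
    }

    /** Returns the child to process next, or null if the queue is empty. */
    static String head(List<String> children) {
        if (children.isEmpty()) {
            return null;
        }
        Comparator<String> byPriorityThenAge =
                Comparator.<String>comparingInt(PriorityQueueOrder::priorityOf)
                        .thenComparingLong(PriorityQueueOrder::sequenceOf);
        return Collections.min(children, byPriorityThenAge);
    }
}
```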
+ </section>
+ </section>
+
+ <section id="sc_recipes_Locks">
+ <title>Locks</title>
+
+ <para>ZooKeeper can be used to implement fully distributed locks that are
+ globally synchronous, meaning that at any snapshot in time no two clients
+ think they hold the same lock. As with priority queues, first define
+ a lock node.</para>
+
+ <note>
+ <para>There now exists a Lock implementation in ZooKeeper
+ recipes directory. This is distributed with the release --
+ src/recipes/lock directory of the release artifact.
+ </para>
+ </note>
+
+ <para>Clients wishing to obtain a lock do the following:</para>
+
+ <orderedlist>
+ <listitem>
+ <para>Call <emphasis role="bold">create( )</emphasis> with a pathname
+ of "_locknode_/lock-" and the <emphasis>sequence</emphasis> and
+ <emphasis>ephemeral</emphasis> flags set.</para>
+ </listitem>
+
+ <listitem>
+ <para>Call <emphasis role="bold">getChildren( )</emphasis> on the lock
+ node <emphasis>without</emphasis> setting the watch flag (this is
+ important to avoid the herd effect).</para>
+ </listitem>
+
+ <listitem>
+ <para>If the pathname created in step <emphasis
+ role="bold">1</emphasis> has the lowest sequence number suffix, the
+ client has the lock and the client exits the protocol.</para>
+ </listitem>
+
+ <listitem>
+ <para>The client calls <emphasis role="bold">exists( )</emphasis> with
+ the watch flag set on the path in the lock directory with the next
+ lowest sequence number.</para>
+ </listitem>
+
+ <listitem>
+ <para>if <emphasis role="bold">exists( )</emphasis> returns false, go
+ to step <emphasis role="bold">2</emphasis>. Otherwise, wait for a
+ notification for the pathname from the previous step before going to
+ step <emphasis role="bold">2</emphasis>.</para>
+ </listitem>
+ </orderedlist>
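+ <para>Steps 3 and 4 above come down to one question: which child, if
+ any, should this client watch? The sketch below answers it for a given
+ child list. The class name <command>LockDecision</command> is
+ hypothetical, and "next lowest sequence number" is interpreted as the
+ highest sequence number that is still lower than the client's own.</para>

```java
import java.util.List;

/** Stdlib-only model of steps 3 and 4 of the lock protocol. */
final class LockDecision {
    /** Parses the sequence number after the last '-' in the child name. */
    static long sequenceOf(String child) {
        return Long.parseLong(child.substring(child.lastIndexOf('-') + 1));
    }

    /**
     * Returns null if "self" has the lowest sequence number (the client
     * holds the lock); otherwise returns the child to call exists() on.
     */
    static String nodeToWatch(List<String> children, String self) {
        long mine = sequenceOf(self);
        String predecessor = null;
        for (String child : children) {
            long seq = sequenceOf(child);
            if (seq < mine && (predecessor == null || seq > sequenceOf(predecessor))) {
                predecessor = child;
            }
        }
        return predecessor;
    }
}
```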
+
+ <para>The unlock protocol is very simple: clients wishing to release a
+ lock simply delete the node they created in step 1.</para>
+
+ <para>Here are a few things to notice:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>The removal of a node will only cause one client to wake up
+ since each node is watched by exactly one client. In this way, you
+ avoid the herd effect.</para>
+ </listitem>
+
+ <listitem>
+ <para>There is no polling and there are no timeouts.</para>
+ </listitem>
+
+ <listitem>
+ <para>Because of the way you implement locking, it is easy to see the
+ amount of lock contention, break locks, debug locking problems,
+ etc.</para>
+ </listitem>
+ </itemizedlist>
+
+ <section>
+ <title>Shared Locks</title>
+
+ <para>You can implement shared locks with a few changes to the lock
+ protocol:</para>
+
+ <informaltable colsep="0" frame="none" rowsep="0">
+ <tgroup cols="2">
+ <tbody>
+ <row>
+ <entry align="center"><emphasis role="bold">Obtaining a read
+ lock:</emphasis></entry>
+
+ <entry align="center"><emphasis role="bold">Obtaining a write
+ lock:</emphasis></entry>
+ </row>
+
+ <row>
+ <entry align="left"><orderedlist>
+ <listitem>
+ <para>Call <emphasis role="bold">create( )</emphasis> to
+ create a node with pathname
+ "<filename>_locknode_/read-</filename>". This is the
+ lock node used later in the protocol. Make sure to set both
+ the <emphasis>sequence</emphasis> and
+ <emphasis>ephemeral</emphasis> flags.</para>
+ </listitem>
+
+ <listitem>
+ <para>Call <emphasis role="bold">getChildren( )</emphasis>
+ on the lock node <emphasis>without</emphasis> setting the
+ <emphasis>watch</emphasis> flag - this is important, as it
+ avoids the herd effect.</para>
+ </listitem>
+
+ <listitem>
+ <para>If there are no children with a pathname starting
+ with "<filename>write-</filename>" and having a lower
+ sequence number than the node created in step <emphasis
+ role="bold">1</emphasis>, the client has the lock and can
+ exit the protocol. </para>
+ </listitem>
+
+ <listitem>
+ <para>Otherwise, call <emphasis role="bold">exists(
+ )</emphasis>, with the <emphasis>watch</emphasis> flag set, on
+ the node in the lock directory with a pathname starting with
+ "<filename>write-</filename>" having the next lowest
+ sequence number.</para>
+ </listitem>
+
+ <listitem>
+ <para>If <emphasis role="bold">exists( )</emphasis>
+ returns <emphasis>false</emphasis>, goto step <emphasis
+ role="bold">2</emphasis>.</para>
+ </listitem>
+
+ <listitem>
+ <para>Otherwise, wait for a notification for the pathname
+ from the previous step before going to step <emphasis
+ role="bold">2</emphasis></para>
+ </listitem>
+ </orderedlist></entry>
+
+ <entry><orderedlist>
+ <listitem>
+ <para>Call <emphasis role="bold">create( )</emphasis> to
+ create a node with pathname
+ "<filename>_locknode_/write-</filename>". This is the
+ lock node spoken of later in the protocol. Make sure to
+ set both <emphasis>sequence</emphasis> and
+ <emphasis>ephemeral</emphasis> flags.</para>
+ </listitem>
+
+ <listitem>
+ <para>Call <emphasis role="bold">getChildren( )
+ </emphasis> on the lock node <emphasis>without</emphasis>
+ setting the <emphasis>watch</emphasis> flag - this is
+ important, as it avoids the herd effect.</para>
+ </listitem>
+
+ <listitem>
+ <para>If there are no children with a lower sequence
+ number than the node created in step <emphasis
+ role="bold">1</emphasis>, the client has the lock and the
+ client exits the protocol.</para>
+ </listitem>
+
+ <listitem>
+ <para>Call <emphasis role="bold">exists( ),</emphasis>
+ with <emphasis>watch</emphasis> flag set, on the node with
+ the pathname that has the next lowest sequence
+ number.</para>
+ </listitem>
+
+ <listitem>
+ <para>If <emphasis role="bold">exists( )</emphasis>
+ returns <emphasis>false</emphasis>, goto step <emphasis
+ role="bold">2</emphasis>. Otherwise, wait for a
+ notification for the pathname from the previous step
+ before going to step <emphasis
+ role="bold">2</emphasis>.</para>
+ </listitem>
+ </orderedlist></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </informaltable>
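+ <para>The difference between the two columns is only in which
+ children can block a client. The following sketch expresses that
+ difference; the class and method names are hypothetical, and "next
+ lowest sequence number" is again read as the highest sequence number
+ below the client's own.</para>

```java
import java.util.List;

/** Stdlib-only model of step 3 and step 4 of both shared lock protocols. */
final class SharedLockDecision {
    /** Parses the sequence number after the last '-' in the child name. */
    static long sequenceOf(String child) {
        return Long.parseLong(child.substring(child.lastIndexOf('-') + 1));
    }

    /** Finds the closest lower-numbered child, optionally looking at write- nodes only. */
    private static String closestLower(List<String> children, String self, boolean writesOnly) {
        long mine = sequenceOf(self);
        String best = null;
        for (String child : children) {
            if (writesOnly && !child.startsWith("write-")) {
                continue;
            }
            long seq = sequenceOf(child);
            if (seq < mine && (best == null || seq > sequenceOf(best))) {
                best = child;
            }
        }
        return best;
    }

    /** A reader is blocked only by earlier writers. Null means the read lock is held. */
    static String readerShouldWatch(List<String> children, String self) {
        return closestLower(children, self, true);
    }

    /** A writer is blocked by any earlier node. Null means the write lock is held. */
    static String writerShouldWatch(List<String> children, String self) {
        return closestLower(children, self, false);
    }
}
```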
+
+ <note>
+ <para>It might appear that this recipe creates a herd effect
+ when there is a large group of clients waiting for a read
+ lock, all of which are notified more or less simultaneously
+ when the "<filename>write-</filename>" node with the lowest
+ sequence number is deleted. In fact, that's valid behavior:
+ all those waiting reader clients should be released, since
+ they have the lock. The herd effect refers to releasing a
+ "herd" when in fact only a single or a small number of
+ machines can proceed.
+ </para>
+ </note>
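The lock decision in the steps above can be sketched as a small pure function. This is only an illustration in Python, not code from ZooKeeper: the names seq and node_to_watch are hypothetical, child names are assumed to be "read-" or "write-" followed by the counter ZooKeeper appends, and the reader rule (wait only on earlier "write-" nodes) is taken from the reader column of the table. No ZooKeeper client call is made; the caller is assumed to pass in the result of getChildren( ).

```python
def seq(name):
    """Return the numeric suffix appended to a sequential node name."""
    return int(name.rsplit("-", 1)[1])


def node_to_watch(children, own, want_write):
    """Return None if `own` holds the lock, else the child to call exists( ) on.

    children   -- child names of the lock directory (assumed getChildren result)
    own        -- name of the node this client created in step 1
    want_write -- True for a write lock, False for a read lock
    """
    own_seq = seq(own)
    # Only nodes created before ours can block us.
    lower = [c for c in children if own_seq > seq(c)]
    if not want_write:
        # Assumption from the reader column: readers wait only on writers.
        lower = [c for c in lower if c.startswith("write-")]
    if not lower:
        return None  # nothing ahead of us: the client has the lock
    # Watch only the closest predecessor, which avoids the herd effect.
    return max(lower, key=seq)
```

A caller would loop: if the function returns None, the client holds the lock; otherwise it sets a watch on the returned name and repeats from step 2 when notified.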
+ </section>
+
+ <section id="sc_recoverableSharedLocks">
+ <title>Revocable Shared Locks</title>
+
+ <para>With minor modifications to the Shared Lock protocol, you can make
+ shared locks revocable:</para>
+
+ <para>In step <emphasis role="bold">1</emphasis> of both the reader
+ and writer lock protocols, call <emphasis role="bold">getData(
+ )</emphasis> with <emphasis>watch</emphasis> set, immediately after the
+ call to <emphasis role="bold">create( )</emphasis>. If the client
+ subsequently receives a notification for the node it created in step
+ <emphasis role="bold">1</emphasis>, it does another <emphasis
+ role="bold">getData( )</emphasis> on that node, with
+ <emphasis>watch</emphasis> set, and looks for the string "unlock", which
+ signals to the client that it must release the lock. This works because,
+ under this protocol, you can request that the client holding the lock
+ give it up by calling <emphasis role="bold">setData( )</emphasis> on the
+ lock node and writing "unlock" to that node.</para>
+
+ <para>Note that this protocol requires the lock holder to consent to
+ releasing the lock. Such consent is important, especially if the lock
+ holder needs to do some processing before releasing the lock. Of course
+ you can always implement <emphasis>Revocable Shared Locks with Freaking
+ Laser Beams</emphasis> by stipulating in your protocol that the revoker
+ is allowed to delete the lock node if after some length of time the lock
+ isn't deleted by the lock holder.</para>
+ </section>
+ </section>
+
+ <section id="sc_recipes_twoPhasedCommit">
+ <title>Two-phased Commit</title>
+
+ <para>A two-phase commit protocol is an algorithm that lets all clients in
+ a distributed system agree either to commit a transaction or abort.</para>
+
+ <para>In ZooKeeper, you can implement a two-phased commit by having a
+ coordinator create a transaction node, say "/app/Tx", and one child node
+ per participating site, say "/app/Tx/s_i". When the coordinator creates
+ a child node, it leaves the content undefined. Once each site involved in
+ the transaction receives the transaction from the coordinator, the site
+ reads each child node and sets a watch. Each site then processes the query
+ and votes "commit" or "abort" by writing to its respective node. Once the
+ write completes, the other sites are notified, and as soon as all sites
+ have all votes, they can decide either "abort" or "commit". Note that a
+ site can decide "abort" earlier if some site votes for "abort".</para>
+
+ <para>An interesting aspect of this implementation is that the only role
+ of the coordinator is to decide upon the group of sites, to create the
+ ZooKeeper nodes, and to propagate the transaction to the corresponding
+ sites. In fact, even propagating the transaction can be done through
+ ZooKeeper by writing it in the transaction node.</para>
+
+ <para>There are two important drawbacks of the approach described above.
+ One is the message complexity, which is O(n²). The second is the
+ impossibility of detecting failures of sites through ephemeral nodes. To
+ detect the failure of a site using ephemeral nodes, it is necessary that
+ the site create the node.</para>
+
+ <para>To solve the first problem, you can have only the coordinator
+ notified of changes to the transaction nodes, and then have it notify the
+ sites once it reaches a decision. Note that this approach is scalable,
+ but it is slower too, as it requires all communication to go through the
+ coordinator.</para>
+
+ <para>To address the second problem, you can have the coordinator
+ propagate the transaction to the sites, and have each site create its
+ own ephemeral node.</para>
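The decision rule each site applies to the votes it reads can be sketched as follows. This is a hedged illustration in Python, not part of ZooKeeper: the function name decide and the dictionary layout are assumptions, with None standing in for a child node whose content is still undefined.

```python
def decide(votes):
    """Return "abort", "commit", or None while the outcome is still open.

    votes -- mapping of site name to "commit", "abort", or None
             (None is an assumed stand-in for a node with no vote yet)
    """
    values = list(votes.values())
    # A single "abort" settles the outcome, even if other votes are missing.
    if "abort" in values:
        return "abort"
    # Commit only once every participating site has voted "commit".
    if values and all(v == "commit" for v in values):
        return "commit"
    return None  # keep waiting for watch notifications
```

A site would call this each time a watch on one of the "/app/Tx/s_i" nodes fires, after re-reading the node contents.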
+ </section>
+
+ <section id="sc_leaderElection">
+ <title>Leader Election</title>
+
+ <para>A simple way of doing leader election with ZooKeeper is to use the
+ <emphasis role="bold">SEQUENCE|EPHEMERAL</emphasis> flags when creating
+ znodes that represent "proposals" of clients. The idea is to have a znode,
+ say "/election", such that each client creates a child znode "/election/n_"
+ with both flags SEQUENCE|EPHEMERAL. With the sequence flag, ZooKeeper
+ automatically appends a sequence number that is greater than any one
+ previously appended to a child of "/election". The process that created
+ the znode with the smallest appended sequence number is the leader.
+ </para>
+
+ <para>That's not all, though. It is important to watch for failures of the
+ leader, so that a new client arises as the new leader in case the
+ current leader fails. A trivial solution is to have all application
+ processes watch the current smallest znode, and check whether they
+ are the new leader when the smallest znode goes away (note that the
+ smallest znode will go away if the leader fails because the node is
+ ephemeral). But this causes a herd effect: upon failure of the current
+ leader, all other processes receive a notification, and execute
+ getChildren on "/election" to obtain the current list of children of
+ "/election". If the number of clients is large, it causes a spike in the
+ number of operations that ZooKeeper servers have to process. To avoid the
+ herd effect, it is sufficient to watch for the next znode down in the
+ sequence of znodes. If a client receives a notification that the znode it
+ is watching is gone, then it becomes the new leader if there
+ is no smaller znode. Note that this avoids the herd effect by not having
+ all clients watch the same znode. </para>
+
+ <para>Here's the pseudo code:</para>
+
+ <para>Let ELECTION be a path of choice of the application. To volunteer to
+ be a leader: </para>
+
+ <orderedlist>
+ <listitem>
+ <para>Create znode z with path "ELECTION/n_" with both SEQUENCE and
+ EPHEMERAL flags;</para>
+ </listitem>
+
+ <listitem>
+ <para>Let C be the children of "ELECTION", and i be the sequence
+ number of z;</para>
+ </listitem>
+
+ <listitem>
+ <para>Watch for changes on "ELECTION/n_j", where j is the largest
+ sequence number such that j &lt; i and n_j is a znode in C;</para>
+ </listitem>
+ </orderedlist>
+
+ <para>Upon receiving a notification of znode deletion: </para>
+
+ <orderedlist>
+ <listitem>
+ <para>Let C be the new set of children of ELECTION; </para>
+ </listitem>
+
+ <listitem>
+ <para>If z is the smallest node in C, then execute leader
+ procedure;</para>
+ </listitem>
+
+ <listitem>
+ <para>Otherwise, watch for changes on "ELECTION/n_j", where j is the
+ largest sequence number such that j &lt; i and n_j is a znode in C;
+ </para>
+ </listitem>
+ </orderedlist>
+
+ <para>Note that the znode having no preceding znode on the list of
+ children does not imply that the creator of this znode is aware that it is
+ the current leader. Applications may consider creating a separate znode
+ to acknowledge that the leader has executed the leader procedure. </para>
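The pseudo code above can be sketched as one function that a client runs both when it volunteers and after each deletion notification. This is a minimal illustration in Python under stated assumptions: the names seq and election_step are hypothetical, child names are assumed to look like "n_" followed by the appended counter, and the list C is passed in by the caller rather than fetched from ZooKeeper.

```python
def seq(name):
    """Return the numeric suffix appended to an election znode name."""
    return int(name.rsplit("_", 1)[1])


def election_step(children, own):
    """Return ("lead", None) or ("watch", name) for the client that owns `own`.

    children -- the set C: current child names of ELECTION
    own      -- name of the znode z created by this client
    """
    i = seq(own)
    # Candidates n_j with a sequence number j smaller than i.
    smaller = [c for c in children if i > seq(c)]
    if not smaller:
        return ("lead", None)  # z is the smallest node in C
    # Watch only the largest such j, so one failure wakes a single client.
    return ("watch", max(smaller, key=seq))
```

If the watched znode disappears because its creator failed without being leader, calling the function again with the new child list simply returns the next predecessor to watch.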
+ </section>
+ </section>
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/site.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/site.xml b/zookeeper-docs/src/documentation/content/xdocs/site.xml
new file mode 100644
index 0000000..e49d92c
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/site.xml
@@ -0,0 +1,103 @@
+<?xml version="1.0"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!--
+Forrest site.xml
+
+This file contains an outline of the site's information content. It is used to:
+- Generate the website menus (though these can be overridden - see docs)
+- Provide semantic, location-independent aliases for internal 'site:' URIs, eg
+<link href="site:changes"> links to changes.html (or ../changes.html if in
+ subdir).
+- Provide aliases for external URLs in the external-refs section. Eg, <link
+ href="ext:cocoon"> links to http://xml.apache.org/cocoon/
+
+See http://forrest.apache.org/docs/linking.html for more info.
+-->
+
+<site label="ZooKeeper" href="" xmlns="http://apache.org/forrest/linkmap/1.0">
+
+ <docs label="Overview">
+ <welcome label="Welcome" href="index.html" />
+ <overview label="Overview" href="zookeeperOver.html" />
+ <started label="Getting Started" href="zookeeperStarted.html" />
+ <relnotes label="Release Notes" href="ext:relnotes" />
+ </docs>
+
+ <docs label="Developer">
+ <api label="API Docs" href="ext:api/index" />
+ <program label="Programmer's Guide" href="zookeeperProgrammers.html" />
+ <javaEx label="Java Example" href="javaExample.html" />
+ <barTutor label="Barrier and Queue Tutorial" href="zookeeperTutorial.html" />
+ <recipes label="Recipes" href="recipes.html" />
+ </docs>
+
+ <docs label="BookKeeper">
+ <bkStarted label="Getting started" href="bookkeeperStarted.html" />
+ <bkOverview label="Overview" href="bookkeeperOverview.html" />
+ <bkProgrammer label="Setup guide" href="bookkeeperConfig.html" />
+ <bkProgrammer label="Programmer's guide" href="bookkeeperProgrammer.html" />
+ </docs>
+
+ <docs label="Admin &amp; Ops">
+ <admin label="Administrator's Guide" href="zookeeperAdmin.html" />
+ <quota label="Quota Guide" href="zookeeperQuotas.html" />
+ <jmx label="JMX" href="zookeeperJMX.html" />
+ <observers label="Observers Guide" href="zookeeperObservers.html" />
+ </docs>
+
+ <docs label="Contributor">
+ <internals label="ZooKeeper Internals" href="zookeeperInternals.html" />
+ </docs>
+
+ <docs label="Miscellaneous">
+ <wiki label="Wiki" href="ext:wiki" />
+ <faq label="FAQ" href="ext:faq" />
+ <lists label="Mailing Lists" href="ext:lists" />
+ <!--<other label="Other Info" href="zookeeperOtherInfo.html" />-->
+ </docs>
+
+
+
+ <external-refs>
+ <site href="http://zookeeper.apache.org/"/>
+ <lists href="http://zookeeper.apache.org/mailing_lists.html"/>
+ <releases href="http://zookeeper.apache.org/releases.html">
+ <download href="#Download" />
+ </releases>
+ <jira href="http://zookeeper.apache.org/issue_tracking.html"/>
+ <wiki href="https://cwiki.apache.org/confluence/display/ZOOKEEPER" />
+ <faq href="https://cwiki.apache.org/confluence/display/ZOOKEEPER/FAQ" />
+ <zlib href="http://www.zlib.net/" />
+ <lzo href="http://www.oberhumer.com/opensource/lzo/" />
+ <gzip href="http://www.gzip.org/" />
+ <cygwin href="http://www.cygwin.com/" />
+ <osx href="http://www.apple.com/macosx" />
+ <relnotes href="releasenotes.html" />
+ <api href="api/">
+ <started href="overview-summary.html#overview_description" />
+ <index href="index.html" />
+ <org href="org/">
+ <apache href="apache/">
+ <zookeeper href="zookeeper/">
+ </zookeeper>
+ </apache>
+ </org>
+ </api>
+ </external-refs>
+
+</site>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/tabs.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/tabs.xml b/zookeeper-docs/src/documentation/content/xdocs/tabs.xml
new file mode 100644
index 0000000..aef7e59
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/tabs.xml
@@ -0,0 +1,36 @@
+<?xml version="1.0"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE tabs PUBLIC "-//APACHE//DTD Cocoon Documentation Tab V1.0//EN"
+ "http://forrest.apache.org/dtd/tab-cocoon-v10.dtd">
+
+<tabs software="ZooKeeper"
+ title="ZooKeeper"
+ copyright="The Apache Software Foundation"
+ xmlns:xlink="http://www.w3.org/1999/xlink">
+
+ <!-- The rules are:
+ @dir will always have /index.html added.
+ @href is not modified unless it is root-relative and obviously specifies a
+ directory (ends in '/'), in which case /index.html will be added
+ -->
+
+ <tab label="Project" href="http://zookeeper.apache.org/" />
+ <tab label="Wiki" href="https://cwiki.apache.org/confluence/display/ZOOKEEPER/" />
+ <tab label="ZooKeeper 3.4 Documentation" dir="" />
+
+</tabs>
[12/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
ZOOKEEPER-3022: MAVEN MIGRATION 3.4 - Iteration 1 - docs, it
Maven migration - first iteration (zookeeper-docs and zookeeper-it), branch 3.4
Author: Norbert Kalmar <nk...@yahoo.com>
Reviewers: Andor Molnar <an...@apache.org>
Closes #552 from nkalmar/ZOOKEEPER-3022-1_b3.4
Project: http://git-wip-us.apache.org/repos/asf/zookeeper/repo
Commit: http://git-wip-us.apache.org/repos/asf/zookeeper/commit/c1efa954
Tree: http://git-wip-us.apache.org/repos/asf/zookeeper/tree/c1efa954
Diff: http://git-wip-us.apache.org/repos/asf/zookeeper/diff/c1efa954
Branch: refs/heads/branch-3.4
Commit: c1efa954d015130a8a01be8334dcf8e7189da7fd
Parents: 4a8cceb
Author: Norbert Kalmar <nk...@yahoo.com>
Authored: Wed Jul 4 15:11:11 2018 +0200
Committer: Andor Molnar <an...@apache.org>
Committed: Wed Jul 4 15:11:11 2018 +0200
----------------------------------------------------------------------
build.xml | 2 +-
src/docs/forrest.properties | 109 -
src/docs/src/documentation/README.txt | 7 -
src/docs/src/documentation/TODO.txt | 227 ---
.../classes/CatalogManager.properties | 37 -
src/docs/src/documentation/conf/cli.xconf | 328 ---
.../content/xdocs/bookkeeperConfig.xml | 156 --
.../content/xdocs/bookkeeperOverview.xml | 419 ----
.../content/xdocs/bookkeeperProgrammer.xml | 678 -------
.../content/xdocs/bookkeeperStarted.xml | 208 --
.../content/xdocs/bookkeeperStream.xml | 331 ----
.../src/documentation/content/xdocs/index.xml | 98 -
.../documentation/content/xdocs/javaExample.xml | 663 -------
.../src/documentation/content/xdocs/recipes.xml | 637 ------
.../src/documentation/content/xdocs/site.xml | 103 -
.../src/documentation/content/xdocs/tabs.xml | 36 -
.../content/xdocs/zookeeperAdmin.xml | 1861 ------------------
.../xdocs/zookeeperHierarchicalQuorums.xml | 75 -
.../content/xdocs/zookeeperInternals.xml | 487 -----
.../content/xdocs/zookeeperJMX.xml | 236 ---
.../content/xdocs/zookeeperObservers.xml | 145 --
.../content/xdocs/zookeeperOtherInfo.xml | 46 -
.../content/xdocs/zookeeperOver.xml | 464 -----
.../content/xdocs/zookeeperProgrammers.xml | 1640 ---------------
.../content/xdocs/zookeeperQuotas.xml | 71 -
.../content/xdocs/zookeeperStarted.xml | 418 ----
.../content/xdocs/zookeeperTutorial.xml | 712 -------
.../src/documentation/resources/images/2pc.jpg | Bin 15174 -> 0 bytes
.../resources/images/bk-overview.jpg | Bin 124211 -> 0 bytes
.../documentation/resources/images/favicon.ico | Bin 766 -> 0 bytes
.../resources/images/hadoop-logo.jpg | Bin 9443 -> 0 bytes
.../resources/images/state_dia.dia | Bin 2597 -> 0 bytes
.../resources/images/state_dia.jpg | Bin 51364 -> 0 bytes
.../documentation/resources/images/zkarch.jpg | Bin 24535 -> 0 bytes
.../resources/images/zkcomponents.jpg | Bin 30831 -> 0 bytes
.../resources/images/zknamespace.jpg | Bin 35414 -> 0 bytes
.../resources/images/zkperfRW-3.2.jpg | Bin 41948 -> 0 bytes
.../documentation/resources/images/zkperfRW.jpg | Bin 161542 -> 0 bytes
.../resources/images/zkperfreliability.jpg | Bin 69825 -> 0 bytes
.../resources/images/zkservice.jpg | Bin 86790 -> 0 bytes
.../resources/images/zookeeper_small.gif | Bin 4847 -> 0 bytes
src/docs/src/documentation/skinconf.xml | 360 ----
src/docs/status.xml | 74 -
zookeeper-docs/forrest.properties | 109 +
zookeeper-docs/src/documentation/README.txt | 7 +
zookeeper-docs/src/documentation/TODO.txt | 227 +++
.../classes/CatalogManager.properties | 37 +
zookeeper-docs/src/documentation/conf/cli.xconf | 328 +++
.../content/xdocs/bookkeeperConfig.xml | 156 ++
.../content/xdocs/bookkeeperOverview.xml | 419 ++++
.../content/xdocs/bookkeeperProgrammer.xml | 678 +++++++
.../content/xdocs/bookkeeperStarted.xml | 208 ++
.../content/xdocs/bookkeeperStream.xml | 331 ++++
.../src/documentation/content/xdocs/index.xml | 98 +
.../documentation/content/xdocs/javaExample.xml | 663 +++++++
.../src/documentation/content/xdocs/recipes.xml | 637 ++++++
.../src/documentation/content/xdocs/site.xml | 103 +
.../src/documentation/content/xdocs/tabs.xml | 36 +
.../content/xdocs/zookeeperAdmin.xml | 1861 ++++++++++++++++++
.../xdocs/zookeeperHierarchicalQuorums.xml | 75 +
.../content/xdocs/zookeeperInternals.xml | 487 +++++
.../content/xdocs/zookeeperJMX.xml | 236 +++
.../content/xdocs/zookeeperObservers.xml | 145 ++
.../content/xdocs/zookeeperOtherInfo.xml | 46 +
.../content/xdocs/zookeeperOver.xml | 464 +++++
.../content/xdocs/zookeeperProgrammers.xml | 1640 +++++++++++++++
.../content/xdocs/zookeeperQuotas.xml | 71 +
.../content/xdocs/zookeeperStarted.xml | 418 ++++
.../content/xdocs/zookeeperTutorial.xml | 712 +++++++
.../src/documentation/resources/images/2pc.jpg | Bin 0 -> 15174 bytes
.../resources/images/bk-overview.jpg | Bin 0 -> 124211 bytes
.../documentation/resources/images/favicon.ico | Bin 0 -> 766 bytes
.../resources/images/hadoop-logo.jpg | Bin 0 -> 9443 bytes
.../resources/images/state_dia.dia | Bin 0 -> 2597 bytes
.../resources/images/state_dia.jpg | Bin 0 -> 51364 bytes
.../documentation/resources/images/zkarch.jpg | Bin 0 -> 24535 bytes
.../resources/images/zkcomponents.jpg | Bin 0 -> 30831 bytes
.../resources/images/zknamespace.jpg | Bin 0 -> 35414 bytes
.../resources/images/zkperfRW-3.2.jpg | Bin 0 -> 41948 bytes
.../documentation/resources/images/zkperfRW.jpg | Bin 0 -> 161542 bytes
.../resources/images/zkperfreliability.jpg | Bin 0 -> 69825 bytes
.../resources/images/zkservice.jpg | Bin 0 -> 86790 bytes
.../resources/images/zookeeper_small.gif | Bin 0 -> 4847 bytes
zookeeper-docs/src/documentation/skinconf.xml | 360 ++++
zookeeper-docs/status.xml | 74 +
zookeeper-it/.empty | 0
86 files changed, 10627 insertions(+), 10627 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/build.xml
----------------------------------------------------------------------
diff --git a/build.xml b/build.xml
index 8ffa2f3..640be98 100644
--- a/build.xml
+++ b/build.xml
@@ -133,7 +133,7 @@ xmlns:cs="antlib:com.puppycrawl.tools.checkstyle">
<property name="test.quick" value="no" />
<property name="conf.dir" value="${basedir}/conf"/>
<property name="docs.dir" value="${basedir}/docs"/>
- <property name="docs.src" value="${basedir}/src/docs"/>
+ <property name="docs.src" value="${basedir}/zookeeper-docs"/>
<property name="javadoc.link.java"
value="http://docs.oracle.com/javase/6/docs/api/" />
<property name="javadoc.packages" value="org.apache.*" />
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/forrest.properties
----------------------------------------------------------------------
diff --git a/src/docs/forrest.properties b/src/docs/forrest.properties
deleted file mode 100644
index 70cf81d..0000000
--- a/src/docs/forrest.properties
+++ /dev/null
@@ -1,109 +0,0 @@
-# Copyright 2002-2004 The Apache Software Foundation
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-##############
-# Properties used by forrest.build.xml for building the website
-# These are the defaults, un-comment them if you need to change them.
-##############
-
-# Prints out a summary of Forrest settings for this project
-#forrest.echo=true
-
-# Project name (used to name .war file)
-#project.name=my-project
-
-# Specifies name of Forrest skin to use
-#project.skin=tigris
-#project.skin=pelt
-
-# comma separated list, file:// is supported
-#forrest.skins.descriptors=http://forrest.apache.org/skins/skins.xml,file:///c:/myskins/skins.xml
-
-##############
-# behavioural properties
-#project.menu-scheme=tab_attributes
-#project.menu-scheme=directories
-
-##############
-# layout properties
-
-# Properties that can be set to override the default locations
-#
-# Parent properties must be set. This usually means uncommenting
-# project.content-dir if any other property using it is uncommented
-
-#project.status=status.xml
-#project.content-dir=src/documentation
-project.configfile=${project.home}/src/documentation/conf/cli.xconf
-#project.raw-content-dir=${project.content-dir}/content
-#project.conf-dir=${project.content-dir}/conf
-#project.sitemap-dir=${project.content-dir}
-#project.xdocs-dir=${project.content-dir}/content/xdocs
-#project.resources-dir=${project.content-dir}/resources
-#project.stylesheets-dir=${project.resources-dir}/stylesheets
-#project.images-dir=${project.resources-dir}/images
-#project.schema-dir=${project.resources-dir}/schema
-#project.skins-dir=${project.content-dir}/skins
-#project.skinconf=${project.content-dir}/skinconf.xml
-#project.lib-dir=${project.content-dir}/lib
-#project.classes-dir=${project.content-dir}/classes
-#project.translations-dir=${project.content-dir}/translations
-
-##############
-# validation properties
-
-# This set of properties determine if validation is performed
-# Values are inherited unless overridden.
-# e.g. if forrest.validate=false then all others are false unless set to true.
-forrest.validate=true
-forrest.validate.xdocs=${forrest.validate}
-forrest.validate.skinconf=${forrest.validate}
-forrest.validate.stylesheets=${forrest.validate}
-forrest.validate.skins=${forrest.validate}
-forrest.validate.skins.stylesheets=${forrest.validate.skins}
-
-# Make Forrest work with JDK6
-forrest.validate.sitemap=false
-
-# *.failonerror=(true|false) - stop when an XML file is invalid
-forrest.validate.failonerror=true
-
-# *.excludes=(pattern) - comma-separated list of path patterns to not validate
-# e.g.
-#forrest.validate.xdocs.excludes=samples/subdir/**, samples/faq.xml
-#forrest.validate.xdocs.excludes=
-
-
-##############
-# General Forrest properties
-
-# The URL to start crawling from
-#project.start-uri=linkmap.html
-# Set logging level for messages printed to the console
-# (DEBUG, INFO, WARN, ERROR, FATAL_ERROR)
-#project.debuglevel=ERROR
-# Max memory to allocate to Java
-#forrest.maxmemory=64m
-# Any other arguments to pass to the JVM. For example, to run on an X-less
-# server, set to -Djava.awt.headless=true
-#forrest.jvmargs=
-# The bugtracking URL - the issue number will be appended
-#project.bugtracking-url=http://issues.apache.org/bugzilla/show_bug.cgi?id=
-#project.bugtracking-url=http://issues.apache.org/jira/browse/
-# The issues list as rss
-#project.issues-rss-url=
-#I18n Property only works for the "forrest run" target.
-#project.i18n=true
-
-project.required.plugins=org.apache.forrest.plugin.output.pdf,org.apache.forrest.plugin.input.simplifiedDocbook
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/README.txt
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/README.txt b/src/docs/src/documentation/README.txt
deleted file mode 100644
index 9bc261b..0000000
--- a/src/docs/src/documentation/README.txt
+++ /dev/null
@@ -1,7 +0,0 @@
-This is the base documentation directory.
-
-skinconf.xml # This file customizes Forrest for your project. In it, you
- # tell forrest the project name, logo, copyright info, etc
-
-sitemap.xmap # Optional. This sitemap is consulted before all core sitemaps.
- # See http://forrest.apache.org/docs/project-sitemap.html
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/TODO.txt
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/TODO.txt b/src/docs/src/documentation/TODO.txt
deleted file mode 100644
index 84e7dfa..0000000
--- a/src/docs/src/documentation/TODO.txt
+++ /dev/null
@@ -1,227 +0,0 @@
-This is a running list of todo documentation items. Feel free
-to add to the list or take on an item as you wish (in the form
-of a JIRA patch of course).
--------------------------------------------------------------
-
-recipes.xml:110:
-[maybe an illustration would be nice for each recipe?]
-
-recipes.xml:167:
-"wait for each watch event". [how do you wait?]
-
-recipes.xml:457:
-<remark>[tbd: myabe helpful to indicate which step this refers to?]</remark>
-
-zookeeperAdmin.xml:77:
-because requires a majority <remark>[tbd: why?]</remark>, it is best to use...
-
-zookeeperAdmin.xml:112:
- <screen>$yinst -i jdk-1.6.0.00_3 -br test <remark>[y! prop - replace with open equiv]</remark></screen>
-
-zookeeperAdmin.xml:99:
-- use a maximum heap size of 3GB for a 4GB machine. <remark>[tbd: where would they do this? Environment variable, etc?]</remark>
-
-zookeeperAdmin.xml:120
-<screen>$ yinst install -nostart zookeeper_server <remark>[Y! prop - replace with open eq]</remark></screen>
-
-zookeeperAdmin.xml:171:
-In Java, you can run the following command to execute simple operations:<remark> [tbd: also, maybe give some of those simple operations?]
-
-zookeeperAdmin.xml:194:
-Running either program gives you a shell in which to execute simple file-system-like operations. <remark>[tbd: again, sample
- operations?]
-
-zookeeperAdmin.xml:252:
-If servers use different configuration files,
-care must be taken to ensure that the list of servers in all of the
-standard form, with legal values, etc]</remark>
-
-zookeeperAdmin.xml:408:
-(Note: The system property has no zookeeper
-prefix, and the configuration variable name is different from
-the system property. Yes - it's not consistent, and it's
-annoying.<remark> [tbd: is there any explanation for
-this?]</remark>)
-
-zookeeperAdmin.xml:445: When the election algorithm is
- "0" a UDP port with the same port number as the port listed in
- the <emphasis role="bold">server.num</emphasis> option will be
- used. <remark>[tbd: should that be <emphasis
- role="bold">server.id</emphasis>? Also, why isn't server.id
- documented anywhere?]</remark>
-
-zookeeperAdmin.xml:481: The default to this option is yes, which
- means that a leader will accept client connections.
- <remark>[tbd: how do you specifiy which server is the
- leader?]</remark>
-
-zookeeperAdmin.xml:495 When the server
- starts up, it determines which server it is by looking for the
- file <filename>myid</filename> in the data directory.<remark>
- [tdb: should we mention somewhere about creating this file,
- myid, in the setup procedure?]</remark>
-
-zookeeperAdmin.xml:508: [tbd: is the next sentence explanation an of what the
- election port or is it a description of a special case?]
- </remark>If you want to test multiple servers on a single
- machine, the individual choices of electionPort for each
- server can be defined in each server's config files using the
- line electionPort=xxxx to avoid clashes.
-
-zookeeperAdmin.xml:524: If followers fall too far behind a
- leader, they will be dropped. <remark>[tbd: is this a correct
- rewording: if followers fall beyond this limit, they are
- dropped?]</remark>
-
-zookeeperAdmin.xml:551: ZooKeeper will not require updates
- to be synced to the media. <remark>[tbd: useful because...,
- dangerous because...]</remark>
-
-zookeeperAdmin.xml:580: Skips ACL checks. <remark>[tbd: when? where?]</remark>
-
-zookeeperAdmin.xml:649: <remark>[tbd: Patrick, Ben, et al: I believe the Message Broker
- team does perform routine monitoring of Zookeeper. But I might be
- wrong. To your knowledge, is there any monitoring of a Zookeeper
- deployment that will a Zookeeper sys admin will want to do, outside of
- Yahoo?]</remark>
-
-zookeeperAdmin.xml:755: Also,
- the server lists in each Zookeeper server configuration file
- should be consistent with one another. <remark>[tbd: I'm assuming
- this last part is true. Is it?]</remark>
-
-zookeeperAdmin.xml:812: For best results, take note of the following list of good
- Zookeeper practices. <remark>[tbd: I just threw this section in. Do we
- have list that is is different from the "things to avoid"? If not, I can
- easily remove this section.]</remark>
-
-
-zookeeperOver.xml:162: Ephemeral nodes are useful when you
- want to implement <remark>[tbd]</remark>.
-
-zookeeperOver.xml:174: And if the
- connection between the client and one of the Zoo Keeper servers is
- broken, the client will receive a local notification. These can be used
- to <remark>[tbd]</remark>
-
-zookeeperOver.xml:215: <para>For more information on these (guarantees), and how they can be used, see
- <remark>[tbd]</remark></para>
-
-zookeeperOver.xml:294: <para><xref linkend="fg_zkComponents" /> shows the high-level components
- of the ZooKeeper service. With the exception of the request processor,
- <remark>[tbd: where does the request processor live?]</remark>
-
-zookeeperOver.xml:298: <para><xref linkend="fg_zkComponents" /> shows the high-level components
- of the ZooKeeper service. With the exception of the request processor,
- each of
- the servers that make up the ZooKeeper service replicates its own copy
- of each of components. <remark>[tbd: I changed the wording in this
- sentence from the white paper. Can someone please make sure it is still
- correct?]</remark>
-
-zookeeperOver.xml:342: The programming interface to ZooKeeper is deliberately simple.
- With it, however, you can implement higher order operations, such as
- synchronizations primitives, group membership, ownership, etc. Some
- distributed applications have used it to: <remark>[tbd: add uses from
- white paper and video presentation.]</remark>
-
-
-zookeeperProgrammers.xml:94: <listitem>
- <para><xref linkend="ch_programStructureWithExample" />
- <remark>[tbd]</remark></para>
- </listitem>
-
-zookeeperProgrammers.xml:115: Also,
- the <ulink url="#ch_programStructureWithExample">Simple Programmming
- Example</ulink> <remark>[tbd]</remark> is helpful for understand the basic
- structure of a ZooKeeper client application.
-
-zookeeperProgrammers.xml:142: The following characters are not
- allowed because <remark>[tbd:
- do we need reasons?]</remark>
-
-zookeeperProgrammers.xml:172: If
- the version it supplies doesn't match the actual version of the data,
- the update will fail. (This behavior can be overridden. For more
- information see... )<remark>[tbd... reference here to the section
- describing the special version number -1]</remark>
-
-zookeeperProgrammers.xml:197: More information about watches can be
- found in the section
- <ulink url="recipes.html#sc_recipes_Locks">
- Zookeeper Watches</ulink>.
- <remark>[tbd: fix this link] [tbd: Ben there is note from to emphasize
- that "it is queued". What is "it" and is what we have here
- sufficient?]</remark></para>
-
-zookeeperProgrammers.xml:335: it will send the session id as a part of the connection handshake.
- As a security measure, the server creates a password for the session id
- that any ZooKeeper server can validate. <remark>[tbd: note from Ben:
- "perhaps capability is a better word." need clarification on that.]
- </remark>
-
-zookeeperProgrammers.xml:601: <ulink
- url="recipes.html#sc_recipes_Locks">Locks</ulink>
- <remark>[tbd:...]</remark> in <ulink
- url="recipes.html">Zookeeper Recipes</ulink>.
- <remark>[tbd:..]</remark>).</para>
-
-zookeeperProgrammers.xml:766: <para>See INSTALL for general information about running
- <emphasis role="bold">configure</emphasis>. <remark>[tbd: what
- is INSTALL? a directory? a file?]</remark></para>
-
-
-
-zookeeperProgrammers.xml:813: <para>To verify that the node's been created:</para>
-
- <para>You should see a list of nodes that are children of the root node
- "/".</para><remark>[tbd: document all the cli commands (I think this is ben's comment)
-
-zookeeperProgrammers.xml:838: <para>Refer to <xref linkend="ch_programStructureWithExample"/> for examples of usage in Java and C.
- <remark>[tbd]</remark></para>
-
-zookeeperProgrammers.xml:847: <remark>[tbd: This is a new section. The below
- is just placeholder. Eventually, a subsection on each of those operations, with a little
- bit of illustrative code for each op.] </remark>
-
-zookeeperProgrammers.xml:915: Program Structure, with Simple Example</title>
-
-zookeeperProgrammers.xml:999: <term>ZooKeeper Whitepaper <remark>[tbd: find url]</remark></term>
-
-zookeeperProgrammers.xml:1008: <term>API Reference <remark>[tbd: find url]</remark></term>
-
-zookeeperProgrammers.xml:1062: [tbd]</remark></term><listitem>
- <para>Any other good sources anyone can think of...</para>
- </listitem>
-
-zookeeperStarted.xml:73: <para>[tbd: should we start w/ a word here about where to get the source,
- exactly what to download, how to unpack it, and where to put it? Also,
- does the user need to be in sudo, or can they be under their regular
- login?]</para>
-
-zookeeperStarted.xml:84: <para>This should generate a JAR file called zookeeper.jar. To start
- ZooKeeper, compile and run zookeeper.jar. <emphasis>[tbd, some more
- instruction here. Perhaps a command line? Are these two steps or
- one?]</emphasis></para>
-
-zookeeperStarted.xml:139: <para>ZooKeeper logs messages using log4j -- more detail available in
- the <ulink url="zookeeperProgrammers.html#Logging">Logging</ulink>
- section of the Programmer's Guide.<remark revision="include_tbd">[tbd:
- real reference needed]</remark>
-
-zookeeperStarted.xml:201: The C bindings exist in two variants: single
- threaded and multi-threaded. These differ only in how the messaging loop
- is done. <remark>[tbd: what is the messaging loop? Do we talk about it
- anywhere? is this too much info for a getting started guide?]</remark>
-
-zookeeperStarted.xml:217: The entry <emphasis
- role="bold">syncLimit</emphasis> limits how far out of date a server can
- be from a leader. [TBD: someone please verify that the previous is
- true.]
-
-zookeeperStarted.xml:232: These are the "electionPort" numbers of the servers (as opposed to
- clientPorts), that is ports for <remark>[tbd: feedback needed: what are
- these ports, exactly?]
-
-zookeeperStarted.xml:258: <remark>[tbd: what is the other config param?
- (I believe two are mentioned above.)]</remark>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/classes/CatalogManager.properties
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/classes/CatalogManager.properties b/src/docs/src/documentation/classes/CatalogManager.properties
deleted file mode 100644
index ac060b9..0000000
--- a/src/docs/src/documentation/classes/CatalogManager.properties
+++ /dev/null
@@ -1,37 +0,0 @@
-# Copyright 2002-2004 The Apache Software Foundation
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-#=======================================================================
-# CatalogManager.properties
-#
-# This is the default properties file for Apache Forrest.
-# This facilitates local configuration of application-specific catalogs.
-#
-# See the Apache Forrest documentation:
-# http://forrest.apache.org/docs/your-project.html
-# http://forrest.apache.org/docs/validation.html
-
-# verbosity ... level of messages for status/debug
-# See forrest/src/core/context/WEB-INF/cocoon.xconf
-
-# catalogs ... list of additional catalogs to load
-# (Note that Apache Forrest will automatically load its own default catalog
-# from src/core/context/resources/schema/catalog.xcat)
-# use full pathnames
-# pathname separator is always semi-colon (;) regardless of operating system
-# directory separator is always slash (/) regardless of operating system
-#
-#catalogs=/home/me/forrest/my-site/src/documentation/resources/schema/catalog.xcat
-catalogs=
-
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/conf/cli.xconf
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/conf/cli.xconf b/src/docs/src/documentation/conf/cli.xconf
deleted file mode 100644
index c671340..0000000
--- a/src/docs/src/documentation/conf/cli.xconf
+++ /dev/null
@@ -1,328 +0,0 @@
-<?xml version="1.0"?>
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<!--+
- | This is the Apache Cocoon command line configuration file.
- | Here you give the command line interface details of where
- | to find various aspects of your Cocoon installation.
- |
- | If you wish, you can also use this file to specify the URIs
- | that you wish to generate.
- |
- | The current configuration information in this file is for
- | building the Cocoon documentation. Therefore, all links here
- | are relative to the build context dir, which, in the build.xml
- | file, is set to ${build.context}
- |
- | Options:
- | verbose: increase amount of information presented
- | to standard output (default: false)
- | follow-links: whether linked pages should also be
- | generated (default: true)
- | precompile-only: precompile sitemaps and XSP pages, but
- | do not generate any pages (default: false)
- | confirm-extensions: check the mime type for the generated page
- | and adjust filename and links extensions
- | to match the mime type
- | (e.g. text/html->.html)
- |
- | Note: Whilst using an xconf file to configure the Cocoon
- | Command Line gives access to more features, the use of
- | command line parameters is more stable, as there are
- | currently plans to improve the xconf format to allow
- | greater flexibility. If you require a stable and
- | consistent method for accessing the CLI, it is recommended
- | that you use the command line parameters to configure
- | the CLI. See documentation at:
- | http://cocoon.apache.org/2.1/userdocs/offline/
- | http://wiki.apache.org/cocoon/CommandLine
- |
- +-->
-
-<cocoon verbose="true"
- follow-links="true"
- precompile-only="false"
- confirm-extensions="false">
-
- <!--+
- | The context directory is usually the webapp directory
- | containing the sitemap.xmap file.
- |
- | The config file is the cocoon.xconf file.
- |
- | The work directory is used by Cocoon to store temporary
- | files and cache files.
- |
- | The destination directory is where generated pages will
- | be written (assuming the 'simple' mapper is used, see
- | below)
- +-->
- <context-dir>.</context-dir>
- <config-file>WEB-INF/cocoon.xconf</config-file>
- <work-dir>../tmp/cocoon-work</work-dir>
- <dest-dir>../site</dest-dir>
-
- <!--+
- | A checksum file can be used to store checksums for pages
- | as they are generated. When the site is next generated,
- | files will not be written if their checksum has not changed.
- | This means that it will be easier to detect which files
- | need to be uploaded to a server, using the timestamp.
- |
- | The default path is relative to the core webapp directory.
- | An absolute path can be used.
- +-->
- <!-- <checksums-uri>build/work/checksums</checksums-uri>-->
-
- <!--+
- | Broken link reporting options:
- | Report into a text file, one link per line:
- | <broken-links type="text" report="filename"/>
- | Report into an XML file:
- | <broken-links type="xml" report="filename"/>
- | Ignore broken links (default):
- | <broken-links type="none"/>
- |
- | Two attributes to this node specify whether a page should
- | be generated when an error has occurred. 'generate' specifies
- | whether a page should be generated (default: true) and
- | extension specifies an extension that should be appended
- | to the generated page's filename (default: none)
- |
- | Using this, a quick scan through the destination directory
- | will show broken links, by their filename extension.
- +-->
- <broken-links type="xml"
- file="../brokenlinks.xml"
- generate="false"
- extension=".error"
- show-referrers="true"/>
-
- <!--+
- | Load classes at startup. This is necessary for generating
- | from sites that use SQL databases and JDBC.
- | The <load-class> element can be repeated if multiple classes
- | are needed.
- +-->
- <!--
- <load-class>org.firebirdsql.jdbc.Driver</load-class>
- -->
-
- <!--+
- | Configures logging.
- | The 'log-kit' parameter specifies the location of the log kit
- | configuration file (usually called logkit.xconf).
- |
- | Logger specifies the logging category (for all logging prior
- | to other Cocoon logging categories taking over)
- |
- | Available log levels are:
- | DEBUG: prints all level of log messages.
- | INFO: prints all level of log messages except DEBUG
- | ones.
- | WARN: prints all level of log messages except DEBUG
- | and INFO ones.
- | ERROR: prints all level of log messages except DEBUG,
- | INFO and WARN ones.
- | FATAL_ERROR: prints only log messages of this level
- +-->
- <!-- <logging log-kit="WEB-INF/logkit.xconf" logger="cli" level="ERROR" /> -->
-
- <!--+
- | Specifies the filename to be appended to URIs that
- | refer to a directory (i.e. end with a forward slash).
- +-->
- <default-filename>index.html</default-filename>
-
- <!--+
- | Specifies a user agent string to the sitemap when
- | generating the site.
- |
- | A generic term for a web browser is "user agent". Any
- | user agent, when connecting to a web server, will provide
- | a string to identify itself (e.g. as Internet Explorer or
- | Mozilla). It is possible to have Cocoon serve different
- | content depending upon the user agent string provided by
- | the browser. If your site does this, then you may want to
- | use this <user-agent> entry to provide a 'fake' user agent
- | to Cocoon, so that it generates the correct version of your
- | site.
- |
- | For most sites, this can be ignored.
- +-->
- <!--
- <user-agent>Cocoon Command Line Environment 2.1</user-agent>
- -->
-
- <!--+
- | Specifies an accept string to the sitemap when generating
- | the site.
- | User agents can specify to an HTTP server what types of content
- | (by mime-type) they are able to receive. E.g. a browser may be
- | able to handle jpegs, but not pngs. The HTTP accept header
- | allows the server to take the browser's capabilities into account,
- | and only send back content that it can handle.
- |
- | For most sites, this can be ignored.
- +-->
-
- <accept>*/*</accept>
-
- <!--+
- | Specifies which URIs should be included or excluded, according
- | to wildcard patterns.
- |
- | These includes/excludes are only relevant when you are following
- | links. A link URI must match an include pattern (if one is given)
- | and not match an exclude pattern, if it is to be followed by
- | Cocoon. It can be useful, for example, where there are links in
- | your site to pages that are not generated by Cocoon, such as
- | references to api-documentation.
- |
- | By default, all URIs are included. If both include and exclude
- | patterns are specified, a URI is first checked against the
- | include patterns, and then against the exclude patterns.
- |
- | Multiple patterns can be given, using multiple include or exclude
- | nodes.
- |
- | The order of the elements is not significant, as only the first
- | successful match of each category is used.
- |
- | Currently, only the complete source URI can be matched (including
- | any URI prefix). Future plans include destination URI matching
- | and regexp matching. If you have requirements for these, contact
- | dev@cocoon.apache.org.
- +-->
-
- <exclude pattern="**/"/>
- <exclude pattern="**apidocs**"/>
- <exclude pattern="api/**"/>
-
- <!-- ZOOKEEPER-2364 - we build our own release notes separately -->
- <exclude pattern="releasenotes.**"/>
-
-<!--
- This is a workaround for FOR-284 "link rewriting broken when
- linking to xml source views which contain site: links".
- See the explanation there and in declare-broken-site-links.xsl
--->
- <exclude pattern="site:**"/>
- <exclude pattern="ext:**"/>
- <exclude pattern="lm:**"/>
- <exclude pattern="**/site:**"/>
- <exclude pattern="**/ext:**"/>
- <exclude pattern="**/lm:**"/>
-
- <!-- Exclude tokens used in URLs to ASF mirrors (interpreted by a CGI) -->
- <exclude pattern="[preferred]/**"/>
- <exclude pattern="[location]"/>
-
- <!-- <include-links extension=".html"/>-->
-
- <!--+
- | <uri> nodes specify the URIs that should be generated, and
- | where required, what should be done with the generated pages.
- | They describe the way the URI of the generated file is created
- | from the source page's URI. There are three ways that a generated
- | file URI can be created: append, replace and insert.
- |
- | The "type" attribute specifies one of (append|replace|insert):
- |
- | append:
- | Append the generated page's URI to the end of the source URI:
- |
- | <uri type="append" src-prefix="documents/" src="index.html"
- | dest="build/dest/"/>
- |
- | This means that
- | (1) the "documents/index.html" page is generated
- | (2) the file will be written to "build/dest/documents/index.html"
- |
- | replace:
- | Completely ignore the generated page's URI - just
- | use the destination URI:
- |
- | <uri type="replace" src-prefix="documents/" src="index.html"
- | dest="build/dest/docs.html"/>
- |
- | This means that
- | (1) the "documents/index.html" page is generated
- | (2) the result is written to "build/dest/docs.html"
- | (3) this works only for "single" pages - and not when links
- | are followed
- |
- | insert:
- | Insert generated page's URI into the destination
- | URI at the point marked with a * (example uses fictional
- | zip protocol)
- |
- | <uri type="insert" src-prefix="documents/" src="index.html"
- | dest="zip://*.zip/page.html"/>
- |
- | This means that
- | (1)
- |
- | In any of these scenarios, if the dest attribute is omitted,
- | the value provided globally using the <dest-dir> node will
- | be used instead.
- +-->
- <!--
- <uri type="replace"
- src-prefix="samples/"
- src="hello-world/hello.html"
- dest="build/dest/hello-world.html"/>
- -->
-
- <!--+
- | <uri> nodes can be grouped together in a <uris> node. This
- | enables a group of URIs to share properties. The following
- | properties can be set for a group of URIs:
- | * follow-links: should pages be crawled for links
- | * confirm-extensions: should file extensions be checked
- | for the correct mime type
- | * src-prefix: all source URIs should be
- | pre-pended with this prefix before
- | generation. The prefix is not
- | included when calculating the
- | destination URI
- | * dest: the base destination URI to be
- | shared by all pages in this group
- | * type: the method to be used to calculate
- | the destination URI. See above
- | section on <uri> node for details.
- |
- | Each <uris> node can have a name attribute. When a name
- | attribute has been specified, the -n switch on the command
- | line can be used to tell Cocoon to only process the URIs
- | within this URI group. When no -n switch is given, all
- | <uris> nodes are processed. Thus, one xconf file can be
- | used to manage multiple sites.
- +-->
- <!--
- <uris name="mirrors" follow-links="false">
- <uri type="append" src="mirrors.html"/>
- </uris>
- -->
-
- <!--+
- | File containing URIs (plain text, one per line).
- +-->
- <!--
- <uri-file>uris.txt</uri-file>
- -->
-</cocoon>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/bookkeeperConfig.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/bookkeeperConfig.xml b/src/docs/src/documentation/content/xdocs/bookkeeperConfig.xml
deleted file mode 100644
index 7a80949..0000000
--- a/src/docs/src/documentation/content/xdocs/bookkeeperConfig.xml
+++ /dev/null
@@ -1,156 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_Admin">
- <title>BookKeeper Administrator's Guide</title>
-
- <subtitle>Setup Guide</subtitle>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.
- </para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.
- </para>
- </legalnotice>
-
- <abstract>
- <para>This document contains information about deploying, administering
- and maintaining BookKeeper. It also discusses best practices and common
- problems.
- </para>
- <para> As BookKeeper is still a prototype, this article is likely to change
- significantly over time.
- </para>
- </abstract>
- </articleinfo>
-
- <section id="bk_deployment">
- <title>Deployment</title>
-
- <para>This section contains information about deploying BookKeeper and
- covers these topics:</para>
-
- <itemizedlist>
- <listitem>
- <para><xref linkend="bk_sysReq" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="bk_runningBookies" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="bk_zkMetadata" /></para>
- </listitem>
- </itemizedlist>
-
- <para> The first section tells you how many machines you need. The second explains how to bootstrap bookies
- (BookKeeper storage servers). The third section explains how we use ZooKeeper and our requirements with
- respect to ZooKeeper.
- </para>
-
- <section id="bk_sysReq">
- <title>System requirements</title>
- <para> A typical BookKeeper installation comprises a set of bookies and a set of ZooKeeper replicas. The exact number of bookies
- depends on the quorum mode, desired throughput, and number of clients using this installation simultaneously. The minimum number of
- bookies is three for self-verifying (stores a message authentication code along with each entry) and four for generic (does not
- store a message authentication code with each entry), and there is no upper limit on the number of bookies. Increasing the number of
- bookies, in fact, enables higher throughput.
- </para>
-
- <para> For performance, we require each server to have at least two disks. It is possible to run a bookie with a single disk, but
- performance will be significantly lower in this case.
- </para>
-
- <para> For ZooKeeper, there is no constraint with respect to the number of replicas. Having a single machine running ZooKeeper
- in standalone mode is sufficient for BookKeeper. For resilience purposes, it might be a good idea to run ZooKeeper in quorum
- mode with multiple servers. Please refer to the ZooKeeper documentation for details on how to configure ZooKeeper with multiple
- replicas.
- </para>
- </section>
-
- <section id="bk_runningBookies">
- <title>Running bookies</title>
- <para>
- To run a bookie, we execute the following command:
- </para>
-
- <para><computeroutput>
- java -cp .:./zookeeper-&lt;version&gt;-bookkeeper.jar:./zookeeper-&lt;version&gt;.jar\
- :../log4j/apache-log4j-1.2.15/log4j-1.2.15.jar -Dlog4j.configuration=log4j.properties\
- org.apache.bookkeeper.proto.BookieServer 3181 127.0.0.1:2181 /path_to_log_device/\
- /path_to_ledger_device/
- </computeroutput></para>
-
- <para>
- The parameters are:
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- Port number that the bookie listens on;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Comma separated list of ZooKeeper servers with a hostname:port format;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Path for Log Device (stores bookie write-ahead log);
- </para>
- </listitem>
-
- <listitem>
- <para>
- Path for Ledger Device (stores ledger entries);
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- Ideally, <computeroutput>/path_to_log_device/ </computeroutput> and <computeroutput>/path_to_ledger_device/ </computeroutput> are each
- on a different device.
- </para>
- </section>
-
- <section id="bk_zkMetadata">
- <title>ZooKeeper Metadata</title>
- <para>
- BookKeeper requires a ZooKeeper installation to store metadata. The list of ZooKeeper
- servers must be passed as a parameter to the constructor of the BookKeeper class (<computeroutput>
- org.apache.bookkeeper.client.BookKeeper</computeroutput>).
- To set up ZooKeeper, please check the <ulink url="index.html">
- ZooKeeper documentation</ulink>.
- </para>
- </section>
- </section>
-</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/bookkeeperOverview.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/bookkeeperOverview.xml b/src/docs/src/documentation/content/xdocs/bookkeeperOverview.xml
deleted file mode 100644
index cdc1878..0000000
--- a/src/docs/src/documentation/content/xdocs/bookkeeperOverview.xml
+++ /dev/null
@@ -1,419 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_GettStartedGuide">
- <title>BookKeeper overview</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This guide contains detailed information about using BookKeeper
- for logging. It discusses the basic operations BookKeeper supports,
- and how to create logs and perform basic read and write operations on these
- logs.</para>
- </abstract>
- </articleinfo>
- <section id="bk_Overview">
- <title>BookKeeper overview</title>
-
- <section id="bk_Intro">
- <title>BookKeeper introduction</title>
- <para>
- BookKeeper is a replicated service to reliably log streams of records. In BookKeeper,
- servers are "bookies", log streams are "ledgers", and each unit of a log (aka record) is a
- "ledger entry". BookKeeper is designed to be reliable; bookies, the servers that store
- ledgers, can crash, corrupt data, discard data, but as long as there are enough bookies
- behaving correctly the service as a whole behaves correctly.
- </para>
-
- <para>
- The initial motivation for BookKeeper comes from the namenode of HDFS. Namenodes have to
- log operations in a reliable fashion so that recovery is possible in the case of crashes.
- We have found the applications for BookKeeper extend far beyond HDFS, however. Essentially,
- any application that requires append-only storage can replace its implementation with
- BookKeeper. BookKeeper has the advantage of scaling throughput with the number of servers.
- </para>
-
- <para>
- At a high level, a BookKeeper client receives entries from a client application and stores them on
- sets of bookies, and there are a few advantages in having such a service:
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- We can use hardware that is optimized for such a service. We currently believe that such a
- system has to be optimized only for disk I/O;
- </para>
- </listitem>
-
- <listitem>
- <para>
- We can have a pool of servers implementing such a log system and share it among a number of servers;
- </para>
- </listitem>
-
- <listitem>
- <para>
- We can have a higher degree of replication with such a pool, which makes sense if the hardware it requires is cheaper than the hardware the application uses.
- </para>
- </listitem>
- </itemizedlist>
-
- </section>
-
- <section id="bk_moreDetail">
- <title>In slightly more detail...</title>
-
- <para> BookKeeper implements highly available logs, and it has been designed with write-ahead logging in mind. Besides high availability
- due to the replicated nature of the service, it provides high throughput due to striping. As we write entries in a subset of bookies of an
- ensemble and rotate writes across available quorums, we are able to increase throughput with the number of servers for both reads and writes.
- Scalability is a property that is possible to achieve in this case due to the use of quorums. Other replication techniques, such as
- state-machine replication, do not enable such a property.
- </para>
-
- <para> An application first creates a ledger before writing to bookies through a local BookKeeper client instance.
- Upon creating a ledger, a BookKeeper client writes metadata about the ledger to ZooKeeper. Each ledger currently
- has a single writer. This writer has to execute a close ledger operation before any other client can read from it.
- If the writer of a ledger does not close a ledger properly because, for example, it has crashed before having the
- opportunity of closing the ledger, then the next client that tries to open a ledger executes a procedure to recover
- it. As closing a ledger consists essentially of writing the last entry written to a ledger to ZooKeeper, the recovery
- procedure simply finds the last entry written correctly and writes it to ZooKeeper.
- </para>
-
- <para>
- Note that currently this recovery procedure is executed automatically upon trying to open a ledger and no explicit action is necessary.
- Although two clients may try to recover a ledger concurrently, only one will succeed, the first one that is able to create the close znode
- for the ledger.
- </para>
- </section>
-
- <section id="bk_basicComponents">
- <title>BookKeeper elements and concepts</title>
- <para>
- BookKeeper uses four basic elements:
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- <emphasis role="bold">Ledger</emphasis>: A ledger is a sequence of entries, and each entry is a sequence of bytes. Entries are
- written sequentially to a ledger and at most once. Consequently, ledgers have append-only semantics;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <emphasis role="bold">BookKeeper client</emphasis>: A client runs along with a BookKeeper application, and it enables applications
- to execute operations on ledgers, such as creating a ledger and writing to it;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <emphasis role="bold">Bookie</emphasis>: A bookie is a BookKeeper storage server. Bookies store the content of ledgers. For any given
- ledger L, we call an <emphasis>ensemble</emphasis> the group of bookies storing the content of L. For performance, we store on
- each bookie of an ensemble only a fragment of a ledger. That is, we stripe when writing entries to a ledger such that
- each entry is written to a sub-group of bookies of the ensemble.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <emphasis role="bold">Metadata storage service</emphasis>: BookKeeper requires a metadata storage service to store information related
- to ledgers and available bookies. We currently use ZooKeeper for such a task.
- </para>
- </listitem>
- </itemizedlist>
- </section>
-
- <section id="bk_initialDesign">
- <title>BookKeeper initial design</title>
- <para>
- A set of bookies implements BookKeeper, and we use a quorum-based protocol to replicate data across the bookies.
- There are basically two operations to an existing ledger: read and append. Here is the complete API list
- (more detail <ulink url="bookkeeperProgrammer.html">
- here</ulink>):
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- Create ledger: creates a new empty ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Open ledger: opens an existing ledger for reading;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Add entry: adds a record to a ledger either synchronously or asynchronously;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Read entries: reads a sequence of entries from a ledger either synchronously or asynchronously.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- There is only a single client that can write to a ledger. Once that ledger is closed or the client fails,
- no more entries can be added. (We take advantage of this behavior to provide our strong guarantees.)
- There will not be gaps in the ledger. Fingers get broken, people get roughed up or end up in prison when
- books are manipulated, so there is no deleting or changing of entries.
- </para>
-
- <figure>
- <title>BookKeeper Overview</title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/bk-overview.jpg" width="3in" depth="3in" contentwidth="3in" contentdepth="3in" scalefit="0"/>
- </imageobject>
- </mediaobject>
- </figure>
-
- <para>
- A simple use of BookKeeper is to implement a write-ahead transaction log. A server maintains an in-memory data structure
- (with periodic snapshots, for example) and logs changes to that structure before it applies the change. The application
- server creates a ledger at startup and stores the ledger id and password in a well-known place (ZooKeeper maybe). When
- it needs to make a change, the server adds an entry with the change information to a ledger and applies the change when
- BookKeeper adds the entry successfully. The server can even use asyncAddEntry to queue up many changes for high change
- throughput. BookKeeper meticulously logs the changes in order and calls the completion functions in order.
- </para>
-
- <para>
- When the application server dies, a backup server will come online, get the last snapshot and then it will open the
- ledger of the old server and read all the entries from the time the snapshot was taken. (Since it doesn't know the
- last entry number it will use MAX_INTEGER). Once all the entries have been processed, it will close the ledger and
- start a new one for its use.
- </para>
-
- <para>
- A client library takes care of communicating with bookies and managing entry numbers. An entry has the following fields:
- </para>
-
- <table frame='all'><title>Entry fields</title>
- <tgroup cols='3' align='left' colsep='1' rowsep='1'>
- <colspec colname='Field'/>
- <colspec colname='Type'/>
- <colspec colname='Description'/>
- <thead>
- <row>
- <entry>Field</entry>
- <entry>Type</entry>
- <entry>Description</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>Ledger number</entry>
- <entry>long</entry>
- <entry>The id of the ledger of this entry</entry>
- </row>
- <row>
- <entry>Entry number</entry>
- <entry>long</entry>
- <entry>The id of this entry</entry>
- </row>
- <row>
- <entry>last confirmed (<emphasis>LC</emphasis>)</entry>
- <entry>long</entry>
- <entry>id of the last recorded entry</entry>
- </row>
- <row>
- <entry>data</entry>
- <entry>byte[]</entry>
- <entry>the entry data (supplied by application)</entry>
- </row>
- <row>
- <entry>authentication code</entry>
- <entry>byte[]</entry>
- <entry>Message authentication code that includes all other fields of the entry</entry>
- </row>
-
- </tbody>
- </tgroup>
- </table>
-
- <para>
- The client library generates a ledger entry. None of the fields are modified by the bookies and only the first three
- fields are interpreted by the bookies.
- </para>
-
- <para>
- To add to a ledger, the client generates the entry above using the ledger number. The entry number will be one more
- than the last entry generated. The <emphasis>LC</emphasis> field contains the last entry that has been successfully recorded by BookKeeper.
- If the client writes entries one at a time, <emphasis>LC</emphasis> is the last entry id. But, if the client is using asyncAddEntry, there
- may be many entries in flight. An entry is considered recorded when both of the following conditions are met:
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- the entry has been accepted by a quorum of bookies
- </para>
- </listitem>
-
- <listitem>
- <para>
- all entries with a lower entry id have been accepted by a quorum of bookies
- </para>
- </listitem>
- </itemizedlist>
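The two conditions above can be sketched as a small computation over the set of entry ids that a quorum of bookies has accepted. This is only an illustration: the class name `LastConfirmedSketch`, the convention that entry ids start at 0, and the use of -1 for "nothing recorded yet" are assumptions, not part of the described client library.

```java
import java.util.Set;

/** Hypothetical helper that derives LC from the ids accepted by a quorum. */
final class LastConfirmedSketch {
    /**
     * Returns the highest entry id such that it and every lower id (starting
     * at 0) is in quorumAcked, or -1 if entry 0 has not been accepted yet.
     */
    static long compute(Set<Long> quorumAcked) {
        long lc = -1;
        // Advance while the next consecutive entry has been accepted.
        while (quorumAcked.contains(lc + 1)) {
            lc++;
        }
        return lc;
    }
}
```

With entries 0, 1, and 3 accepted, the result is 1: entry 3 has reached a quorum but is not yet recorded, because entry 2 is still in flight.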
-
- <para>
- <emphasis>LC</emphasis> seems mysterious right now, but it is too early to explain how we use it; just smile and move on.
- </para>
-
- <para>
- Once all the other fields have been filled in, the client generates an authentication code with all of the previous fields.
- The entry is then sent to a quorum of bookies to be recorded. Any failures will result in the entry being sent to a new
- quorum of bookies.
- </para>
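A minimal sketch of computing an authentication code over the other entry fields follows. The choice of HMAC-SHA1, the field encoding (three big-endian longs followed by the data), and the name `EntryMacSketch` are assumptions for illustration; the text does not specify the algorithm or the wire format.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: authenticates all other fields of an entry. */
final class EntryMacSketch {
    /** key must be non-empty; how it is derived (e.g. from a ledger password) is left open. */
    static byte[] compute(byte[] key, long ledgerId, long entryId, long lastConfirmed, byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            // Assumed encoding: ledger number, entry number, LC, then the data.
            ByteBuffer header = ByteBuffer.allocate(3 * Long.BYTES);
            header.putLong(ledgerId).putLong(entryId).putLong(lastConfirmed);
            mac.update(header.array());
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }
}
```

Because every field feeds into the code, a reader that recomputes it can detect an entry whose ledger number, entry number, LC, or data has been altered.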
-
- <para>
- To read, the client library initially contacts a bookie and starts requesting entries. If an entry is missing or
- invalid (a bad MAC for example), the client will make a request to a different bookie. By using quorum writes,
- as long as enough bookies are up we are guaranteed to eventually be able to read an entry.
- </para>
-
- </section>
-
- <section id="bk_metadata">
- <title>BookKeeper metadata management</title>
-
- <para>
- Some metadata needs to be made available to BookKeeper clients:
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- The available bookies;
- </para>
- </listitem>
-
- <listitem>
- <para>
- The list of ledgers;
- </para>
- </listitem>
-
- <listitem>
- <para>
- The list of bookies that have been used for a given ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- The last entry of a ledger;
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- We maintain this information in ZooKeeper. Bookies use ephemeral nodes to indicate their availability. Clients
- use znodes to track ledger creation and deletion and also to know the end of the ledger and the bookies that
- were used to store the ledger. Bookies also watch the ledger list so that they can clean up ledgers that get deleted.
- </para>
-
- </section>
-
- <section id="bk_closingOut">
- <title>Closing out ledgers</title>
-
- <para>
- The process of closing out the ledger and finding the last entry is difficult due to the durability guarantees of BookKeeper:
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- If an entry has been successfully recorded, it must be readable.
- </para>
- </listitem>
-
- <listitem>
- <para>
- If an entry is read once, it must always be available to be read.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- If the ledger was closed gracefully, ZooKeeper will have the last entry and everything will work well. But, if the
- BookKeeper client that was writing the ledger dies, there is some recovery that needs to take place.
- </para>
-
- <para>
- The problematic entries are the ones at the end of the ledger. There can be entries in flight when a BookKeeper client
- dies. If the entry only gets to one bookie, the entry should not be readable since the entry will disappear if that bookie
- fails. If the entry is only on one bookie, that doesn't mean that the entry has not been recorded successfully; the other
- bookies that recorded the entry might have failed.
- </para>
-
- <para>
- The trick to making everything work is to have a correct idea of a last entry. We do it in roughly three steps:
- </para>
- <orderedlist>
- <listitem>
- <para>
- Find the entry with the highest last recorded entry, <emphasis>LC</emphasis>;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Find the highest consecutively recorded entry, <emphasis>LR</emphasis>;
- </para>
- </listitem>
-
- <listitem>
- <para>
- Make sure that all entries between <emphasis>LC</emphasis> and <emphasis>LR</emphasis> are on a quorum of bookies;
- </para>
- </listitem>
-
- </orderedlist>
- </section>
- </section>
-</article>
\ No newline at end of file
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/bookkeeperProgrammer.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/bookkeeperProgrammer.xml b/src/docs/src/documentation/content/xdocs/bookkeeperProgrammer.xml
deleted file mode 100644
index 5f330e1..0000000
--- a/src/docs/src/documentation/content/xdocs/bookkeeperProgrammer.xml
+++ /dev/null
@@ -1,678 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_GettStartedGuide">
- <title>BookKeeper Getting Started Guide</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This guide contains detailed information about using BookKeeper
- for logging. It discusses the basic operations BookKeeper supports,
- and how to create logs and perform basic read and write operations on these
- logs.</para>
- </abstract>
- </articleinfo>
- <section id="bk_GettingStarted">
- <title>Programming with BookKeeper</title>
-
- <itemizedlist>
- <listitem>
- <para><xref linkend="bk_instance" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="bk_createLedger" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="bk_writeLedger" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="bk_closeLedger" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="bk_openLedger" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="bk_readLedger" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="bk_deleteLedger" /></para>
- </listitem>
-
- </itemizedlist>
-
- <section id="bk_instance">
- <title> Instantiating BookKeeper.</title>
- <para>
- The first step to use BookKeeper is to instantiate a BookKeeper object:
- </para>
- <para>
- <computeroutput>
- org.apache.bookkeeper.client.BookKeeper
- </computeroutput>
- </para>
-
- <para>
- There are three BookKeeper constructors:
- </para>
-
- <para>
- <computeroutput>
- public BookKeeper(String servers)
- throws KeeperException, IOException
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>servers</computeroutput> is a comma-separated list of ZooKeeper servers.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- <computeroutput>
- public BookKeeper(ZooKeeper zk)
- throws InterruptedException, KeeperException
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>zk</computeroutput> is a ZooKeeper object. This constructor is useful when
- the application is also using ZooKeeper and wants to have a single instance of ZooKeeper.
- </para>
- </listitem>
- </itemizedlist>
-
-
- <para>
- <computeroutput>
- public BookKeeper(ZooKeeper zk, ClientSocketChannelFactory channelFactory)
- throws InterruptedException, KeeperException
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>zk</computeroutput> is a ZooKeeper object. This constructor is useful when
- the application is also using ZooKeeper and wants to have a single instance of ZooKeeper.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>channelFactory</computeroutput> is a Netty channel factory object
- (<computeroutput>org.jboss.netty.channel.socket</computeroutput>).
- </para>
- </listitem>
- </itemizedlist>
-
-
-
- </section>
-
- <section id="bk_createLedger">
- <title> Creating a ledger. </title>
-
- <para> Before writing entries to BookKeeper, it is necessary to create a ledger.
- With the current BookKeeper API, it is possible to create a ledger either synchronously
- or asynchronously. The following methods belong
- to <computeroutput>org.apache.bookkeeper.client.BookKeeper</computeroutput>.
- </para>
-
- <para>
- <emphasis role="bold">Synchronous call:</emphasis>
- </para>
-
- <para>
- <computeroutput>
- public LedgerHandle createLedger(int ensSize, int qSize, DigestType type, byte passwd[])
- throws KeeperException, InterruptedException,
- IOException, BKException
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>ensSize</computeroutput> is the number of bookies (ensemble size);
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>qSize</computeroutput> is the write quorum size;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>type</computeroutput> is the type of digest used with entries: either MAC or CRC32.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>passwd</computeroutput> is a password that authorizes the client to write to the
- ledger being created.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- All further operations on a ledger are invoked through the <computeroutput>LedgerHandle</computeroutput>
- object returned.
- </para>
-
- <para>
- As a convenience, we provide a <computeroutput>createLedger</computeroutput> with default parameters (3,2,VERIFIABLE),
- and the only two input parameters it requires are a digest type and a password.
- </para>
-
- <para>
- <emphasis role="bold">Asynchronous call:</emphasis>
- </para>
-
- <para>
- <computeroutput>
- public void asyncCreateLedger(int ensSize,
- int qSize,
- DigestType type,
- byte passwd[],
- CreateCallback cb,
- Object ctx
- )
- </computeroutput>
- </para>
-
- <para>
- The parameters are the same as those of the synchronous version, with the
- exception of <computeroutput>cb</computeroutput> and <computeroutput>ctx</computeroutput>. <computeroutput>CreateCallback</computeroutput>
- is an interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
- a class implementing it has to implement a method called <computeroutput>createComplete</computeroutput>
- that has the following signature:
- </para>
-
- <para>
- <computeroutput>
- void createComplete(int rc, LedgerHandle lh, Object ctx);
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKException</computeroutput> for a list);
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>lh</computeroutput> is a <computeroutput>LedgerHandle</computeroutput> object to manipulate a ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>ctx</computeroutput> is a control object for accountability purposes. It can be essentially any object the application is happy with.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- The <computeroutput>ctx</computeroutput> object passed as a parameter to the call to create a ledger
- is the same one returned in the callback.
- </para>
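The round trip of the <computeroutput>ctx</computeroutput> object can be illustrated without a running BookKeeper. The sketch below uses a local stand-in interface (`CreateCallbackSketch`) and a fake create call that completes immediately; both are hypothetical and are not the real BookKeeper types. The handle is typed as `Object`, and a return code of 0 for success is an assumption.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Local stand-in for the callback interface described above; not the real BookKeeper type. */
interface CreateCallbackSketch {
    void createComplete(int rc, Object lh, Object ctx);
}

final class CtxRoundTripSketch {
    /** Fake create call: reports success right away and hands ctx back unchanged. */
    static void fakeAsyncCreateLedger(CreateCallbackSketch cb, Object ctx) {
        Object fakeHandle = new Object(); // placeholder for a LedgerHandle
        cb.createComplete(0, fakeHandle, ctx); // 0 is assumed to mean success
    }

    /** Issues one request tagged with requestTag and returns the tag seen by the callback. */
    static Object tagSeenByCallback(Object requestTag) {
        AtomicReference<Object> seen = new AtomicReference<>();
        fakeAsyncCreateLedger((rc, lh, ctx) -> seen.set(ctx), requestTag);
        return seen.get();
    }
}
```

An application with several requests in flight can pass a distinct tag per request and use it in the callback to tell the completions apart.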
- </section>
-
- <section id="bk_writeLedger">
- <title> Adding entries to a ledger. </title>
- <para>
- Once we have a ledger handle <computeroutput>lh</computeroutput> obtained through a call to create a ledger, we
- can start writing entries. As with creating ledgers, we can write both synchronously and
- asynchronously. The following methods belong
- to <computeroutput>org.apache.bookkeeper.client.LedgerHandle</computeroutput>.
- </para>
-
- <para>
- <emphasis role="bold">Synchronous call:</emphasis>
- </para>
-
- <para>
- <computeroutput>
- public long addEntry(byte[] data)
- throws InterruptedException
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>data</computeroutput> is a byte array;
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- A call to <computeroutput>addEntry</computeroutput> returns the status of the operation (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list).
- </para>
-
- <para>
- <emphasis role="bold">Asynchronous call:</emphasis>
- </para>
-
- <para>
- <computeroutput>
- public void asyncAddEntry(byte[] data, AddCallback cb, Object ctx)
- </computeroutput>
- </para>
-
- <para>
- It also takes a byte array as the sequence of bytes to be stored as an entry. Additionally, it takes
- a callback object <computeroutput>cb</computeroutput> and a control object <computeroutput>ctx</computeroutput>. The callback object must implement
- the <computeroutput>AddCallback</computeroutput> interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
- a class implementing it has to implement a method called <computeroutput>addComplete</computeroutput>
- that has the following signature:
- </para>
-
- <para>
- <computeroutput>
- void addComplete(int rc, LedgerHandle lh, long entryId, Object ctx);
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list);
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>lh</computeroutput> is a <computeroutput>LedgerHandle</computeroutput> object to manipulate a ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>entryId</computeroutput> is the identifier of the entry associated with this request;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>ctx</computeroutput> is a control object used for accountability purposes. It can be any object the application is happy with.
- </para>
- </listitem>
- </itemizedlist>
- </section>
-
- <section id="bk_closeLedger">
- <title> Closing a ledger. </title>
- <para>
- Once a client is done writing, it closes the ledger. The following methods belong
- to <computeroutput>org.apache.bookkeeper.client.LedgerHandle</computeroutput>.
- </para>
- <para>
- <emphasis role="bold">Synchronous close:</emphasis>
- </para>
-
- <para>
- <computeroutput>
- public void close()
- throws InterruptedException
- </computeroutput>
- </para>
-
- <para>
- It takes no input parameters.
- </para>
-
- <para>
- <emphasis role="bold">Asynchronous close:</emphasis>
- </para>
- <para>
- <computeroutput>
- public void asyncClose(CloseCallback cb, Object ctx)
- throws InterruptedException
- </computeroutput>
- </para>
-
- <para>
- It takes a callback object <computeroutput>cb</computeroutput> and a control object <computeroutput>ctx</computeroutput>. The callback object must implement
- the <computeroutput>CloseCallback</computeroutput> interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
- a class implementing it has to implement a method called <computeroutput>closeComplete</computeroutput>
- that has the following signature:
- </para>
-
- <para>
- <computeroutput>
- void closeComplete(int rc, LedgerHandle lh, Object ctx)
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list);
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>lh</computeroutput> is a <computeroutput>LedgerHandle</computeroutput> object to manipulate a ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>ctx</computeroutput> is a control object used for accountability purposes.
- </para>
- </listitem>
- </itemizedlist>
-
- </section>
-
- <section id="bk_openLedger">
- <title> Opening a ledger. </title>
- <para>
- To read from a ledger, a client must open it first. The following methods belong
- to <computeroutput>org.apache.bookkeeper.client.BookKeeper</computeroutput>.
- </para>
-
- <para>
- <emphasis role="bold">Synchronous open:</emphasis>
- </para>
-
- <para>
- <computeroutput>
- public LedgerHandle openLedger(long lId, DigestType type, byte passwd[])
- throws InterruptedException, BKException
- </computeroutput>
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>lId</computeroutput> is the ledger identifier;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>type</computeroutput> is the type of digest used with entries: either MAC or CRC32.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>passwd</computeroutput> is a password to access the ledger (used only in the case of <computeroutput>VERIFIABLE</computeroutput> ledgers);
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- <emphasis role="bold">Asynchronous open:</emphasis>
- </para>
- <para>
- <computeroutput>
- public void asyncOpenLedger(long lId, DigestType type, byte passwd[], OpenCallback cb, Object ctx)
- </computeroutput>
- </para>
-
- <para>
- It also takes a ledger identifier, a digest type, and a password. Additionally, it takes a callback object
- <computeroutput>cb</computeroutput> and a control object <computeroutput>ctx</computeroutput>. The callback object must implement
- the <computeroutput>OpenCallback</computeroutput> interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
- a class implementing it has to implement a method called <computeroutput>openComplete</computeroutput>
- that has the following signature:
- </para>
-
- <para>
- <computeroutput>
- public void openComplete(int rc, LedgerHandle lh, Object ctx)
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list);
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>lh</computeroutput> is a <computeroutput>LedgerHandle</computeroutput> object to manipulate a ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>ctx</computeroutput> is a control object used for accountability purposes.
- </para>
- </listitem>
- </itemizedlist>
- </section>
-
- <section id="bk_readLedger">
- <title> Reading from a ledger </title>
- <para>
- Read calls may request one or more consecutive entries. The following methods belong
- to <computeroutput>org.apache.bookkeeper.client.LedgerHandle</computeroutput>.
- </para>
-
- <para>
- <emphasis role="bold">Synchronous read:</emphasis>
- </para>
-
- <para>
- <computeroutput>
- public Enumeration<LedgerEntry> readEntries(long firstEntry, long lastEntry)
- throws InterruptedException, BKException
- </computeroutput>
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>firstEntry</computeroutput> is the identifier of the first entry in the sequence of entries to read;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>lastEntry</computeroutput> is the identifier of the last entry in the sequence of entries to read.
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- <emphasis role="bold">Asynchronous read:</emphasis>
- </para>
- <para>
- <computeroutput>
- public void asyncReadEntries(long firstEntry,
- long lastEntry, ReadCallback cb, Object ctx)
- throws BKException, InterruptedException
- </computeroutput>
- </para>
-
- <para>
- It also takes first and last entry identifiers. Additionally, it takes a callback object
- <computeroutput>cb</computeroutput> and a control object <computeroutput>ctx</computeroutput>. The callback object must implement
- the <computeroutput>ReadCallback</computeroutput> interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
- a class implementing it has to implement a method called <computeroutput>readComplete</computeroutput>
- that has the following signature:
- </para>
-
- <para>
- <computeroutput>
- void readComplete(int rc, LedgerHandle lh, Enumeration<LedgerEntry> seq, Object ctx)
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list);
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>lh</computeroutput> is a <computeroutput>LedgerHandle</computeroutput> object to manipulate a ledger;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>seq</computeroutput> is an <computeroutput>Enumeration<LedgerEntry></computeroutput> object containing the list of entries requested;
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>ctx</computeroutput> is a control object used for accountability purposes.
- </para>
- </listitem>
- </itemizedlist>
- </section>
-
- <section id="bk_deleteLedger">
- <title> Deleting a ledger </title>
- <para>
- Once a client is done with a ledger and is sure that nobody will ever need to read from it again, they can delete the ledger.
- The following methods belong to <computeroutput>org.apache.bookkeeper.client.BookKeeper</computeroutput>.
- </para>
-
- <para>
- <emphasis role="bold">Synchronous delete:</emphasis>
- </para>
-
- <para>
- <computeroutput>
- public void deleteLedger(long lId) throws InterruptedException, BKException
- </computeroutput>
- </para>
-
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>lId</computeroutput> is the ledger identifier;
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- <emphasis role="bold">Asynchronous delete:</emphasis>
- </para>
- <para>
- <computeroutput>
- public void asyncDeleteLedger(long lId, DeleteCallback cb, Object ctx)
- </computeroutput>
- </para>
-
- <para>
- It takes a ledger identifier. Additionally, it takes a callback object
- <computeroutput>cb</computeroutput> and a control object <computeroutput>ctx</computeroutput>. The callback object must implement
- the <computeroutput>DeleteCallback</computeroutput> interface in <computeroutput>org.apache.bookkeeper.client.AsyncCallback</computeroutput>, and
- a class implementing it has to implement a method called <computeroutput>deleteComplete</computeroutput>
- that has the following signature:
- </para>
-
- <para>
- <computeroutput>
- void deleteComplete(int rc, Object ctx)
- </computeroutput>
- </para>
-
- <para>
- where:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <computeroutput>rc</computeroutput> is a return code (please refer to <computeroutput>org.apache.bookkeeper.client.BKDefs</computeroutput> for a list);
- </para>
- </listitem>
-
- <listitem>
- <para>
- <computeroutput>ctx</computeroutput> is a control object used for accountability purposes.
- </para>
- </listitem>
- </itemizedlist>
- </section>
- </section>
-</article>
\ No newline at end of file
[08/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml b/src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml
deleted file mode 100644
index 8fbd679..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml
+++ /dev/null
@@ -1,1640 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_programmersGuide">
- <title>ZooKeeper Programmer's Guide</title>
-
- <subtitle>Developing Distributed Applications that use ZooKeeper</subtitle>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This guide contains detailed information about creating
- distributed applications that use ZooKeeper. It discusses the basic
- operations ZooKeeper supports, and how these can be used to build
- higher-level abstractions. It contains solutions to common tasks, a
- troubleshooting guide, and links to other information.</para>
-
- <para>$Revision: 1.14 $ $Date: 2008/09/19 05:31:45 $</para>
- </abstract>
- </articleinfo>
-
- <section id="_introduction">
- <title>Introduction</title>
-
- <para>This document is a guide for developers wishing to create
- distributed applications that take advantage of ZooKeeper's coordination
- services. It contains conceptual and practical information.</para>
-
- <para>The first four sections of this guide present higher level
- discussions of various ZooKeeper concepts. These are necessary both for an
- understanding of how ZooKeeper works and for knowing how to work with it. It does
- not contain source code, but it does assume a familiarity with the
- problems associated with distributed computing. The sections in this first
- group are:</para>
-
- <itemizedlist>
- <listitem>
- <para><xref linkend="ch_zkDataModel" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="ch_zkSessions" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="ch_zkWatches" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="ch_zkGuarantees" /></para>
- </listitem>
- </itemizedlist>
-
- <para>The next four sections provide practical programming
- information. These are:</para>
-
- <itemizedlist>
- <listitem>
- <para><xref linkend="ch_guideToZkOperations" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="ch_bindings" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="ch_programStructureWithExample" />
- <emphasis>[tbd]</emphasis></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="ch_gotchas" /></para>
- </listitem>
- </itemizedlist>
-
- <para>The book concludes with an <ulink
- url="#apx_linksToOtherInfo">appendix</ulink> containing links to other
- useful, ZooKeeper-related information.</para>
-
- <para>Most of the information in this document is written to be accessible as
- stand-alone reference material. However, before starting your first
- ZooKeeper application, you should probably at least read the chapters on
- the <ulink url="#ch_zkDataModel">ZooKeeper Data Model</ulink> and <ulink
- url="#ch_guideToZkOperations">ZooKeeper Basic Operations</ulink>. Also,
- the <ulink url="#ch_programStructureWithExample">Simple Programming
- Example</ulink> <emphasis>[tbd]</emphasis> is helpful for understanding the basic
- structure of a ZooKeeper client application.</para>
- </section>
-
- <section id="ch_zkDataModel">
- <title>The ZooKeeper Data Model</title>
-
- <para>ZooKeeper has a hierarchical name space, much like a distributed file
- system. The only difference is that each node in the namespace can have
- data associated with it as well as children. It is like having a file
- system that allows a file to also be a directory. Paths to nodes are
- always expressed as canonical, absolute, slash-separated paths; there are
- no relative references. Any Unicode character can be used in a path subject
- to the following constraints:</para>
-
- <itemizedlist>
- <listitem>
- <para>The null character (\u0000) cannot be part of a path name. (This
- causes problems with the C binding.)</para>
- </listitem>
-
- <listitem>
- <para>The following characters can't be used because they don't
- display well, or render in confusing ways: \u0001 - \u0019 and \u007F
- - \u009F.</para>
- </listitem>
-
- <listitem>
- <para>The following characters are not allowed: \ud800 - \uF8FF,
- \uFFF0 - \uFFFF.</para>
- </listitem>
-
- <listitem>
- <para>The "." character can be used as part of another name, but "."
- and ".." cannot alone be used to indicate a node along a path,
- because ZooKeeper doesn't use relative paths. The following would be
- invalid: "/a/b/./c" or "/a/b/../c".</para>
- </listitem>
-
- <listitem>
- <para>The token "zookeeper" is reserved.</para>
- </listitem>
- </itemizedlist>
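-
- <para>The constraints above can be sketched as a small stand-alone
- validator. This is an illustrative check written for this guide, not
- ZooKeeper's own path validation code: the class name
- <emphasis>ZnodePathCheck</emphasis> and its methods are hypothetical, the
- character ranges are copied from the list as printed, and the reserved
- "zookeeper" token is not enforced.</para>
-
- <programlisting><![CDATA[
```java
final class ZnodePathCheck {

    /** Returns null when the path passes the rules listed above, otherwise a reason. */
    static String check(String path) {
        if (path == null || path.isEmpty()) {
            return "path is empty";
        }
        if (path.charAt(0) != '/') {
            return "path must be absolute (start with /)";
        }
        if (path.equals("/")) {
            return null; // the root itself is valid
        }
        // Split into node names; limit -1 keeps a trailing empty name so "/a/" is rejected.
        for (String name : path.substring(1).split("/", -1)) {
            if (name.isEmpty()) {
                return "empty node name";
            }
            if (name.equals(".") || name.equals("..")) {
                return "relative path elements are not allowed";
            }
            for (int i = 0; i < name.length(); i++) {
                char c = name.charAt(i);
                if (isForbidden(c)) {
                    return "forbidden character 0x" + Integer.toHexString(c);
                }
            }
        }
        return null;
    }

    /** Character ranges excluded by the list above, written as UTF-16 code units. */
    static boolean isForbidden(char c) {
        return c == 0x0000
                || (c >= 0x0001 && c <= 0x0019)
                || (c >= 0x007F && c <= 0x009F)
                || (c >= 0xD800 && c <= 0xF8FF)
                || (c >= 0xFFF0 && c <= 0xFFFF);
    }
}
```
- ]]></programlisting>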
-
- <section id="sc_zkDataModel_znodes">
- <title>ZNodes</title>
-
- <para>Every node in a ZooKeeper tree is referred to as a
- <emphasis>znode</emphasis>. Znodes maintain a stat structure that
- includes version numbers for data changes and ACL changes. The stat
- structure also has timestamps. The version number, together with the
- timestamp, allows ZooKeeper to validate the cache and to coordinate
- updates. Each time a znode's data changes, the version number increases.
- For instance, whenever a client retrieves data, it also receives the
- version of the data. And when a client performs an update or a delete,
- it must supply the version of the data of the znode it is changing. If
- the version it supplies doesn't match the actual version of the data,
- the update will fail. (This behavior can be overridden. For more
- information see... )<emphasis>[tbd...]</emphasis></para>
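-
- <para>The version check described above behaves like a compare-and-set.
- The following in-memory sketch models only that rule; it is not the
- ZooKeeper client API, and the class <emphasis>VersionedZnode</emphasis> is
- hypothetical. It assumes, as the Java binding does for its version
- arguments, that a version of -1 matches any version.</para>
-
- <programlisting><![CDATA[
```java
final class VersionedZnode {
    private byte[] data = new byte[0];
    private int version = 0;

    synchronized int version() {
        return version;
    }

    synchronized byte[] getData() {
        return data.clone();
    }

    /**
     * Replaces the data only if expectedVersion matches the current version.
     * A value of -1 skips the check. Returns true when the update was applied.
     */
    synchronized boolean setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            return false; // the data changed since the caller last read it
        }
        data = newData.clone();
        version++; // every successful data change increases the version
        return true;
    }
}
```
- ]]></programlisting>
-
- <para>A client that loses such a race would typically re-read the data
- and its version, then retry the update.</para>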
-
- <note>
- <para>In distributed application engineering, the word
- <emphasis>node</emphasis> can refer to a generic host machine, a
- server, a member of an ensemble, a client process, etc. In the ZooKeeper
- documentation, <emphasis>znodes</emphasis> refer to the data nodes.
- <emphasis>Servers</emphasis> refer to machines that make up the
- ZooKeeper service; <emphasis>quorum peers</emphasis> refer to the
- servers that make up an ensemble; client refers to any host or process
- which uses a ZooKeeper service.</para>
- </note>
-
- <para> A znode is the main abstraction a programmer needs to be aware of. Znodes have
- several characteristics that are worth mentioning here.</para>
-
- <section id="sc_zkDataMode_watches">
- <title>Watches</title>
-
- <para>Clients can set watches on znodes. Changes to that znode trigger
- the watch and then clear the watch. When a watch triggers, ZooKeeper
- sends the client a notification. More information about watches can be
- found in the section
- <ulink url="#ch_zkWatches">ZooKeeper Watches</ulink>.</para>
- </section>
-
- <section>
- <title>Data Access</title>
-
- <para>The data stored at each znode in a namespace is read and written
- atomically. Reads get all the data bytes associated with a znode and a
- write replaces all the data. Each node has an Access Control List
- (ACL) that restricts who can do what.</para>
-
- <para>ZooKeeper was not designed to be a general database or large
- object store. Instead, it manages coordination data. This data can
- come in the form of configuration, status information, rendezvous, etc.
- A common property of the various forms of coordination data is that
- they are relatively small: measured in kilobytes.
- The ZooKeeper client and the server implementations have sanity checks
- to ensure that znodes have less than 1 MB of data, but the data should
- be much less than that on average. Operating on relatively large data
- sizes will cause some operations to take much more time than others and
- will affect the latencies of some operations because of the extra time
- needed to move more data over the network and onto storage media. If
- large data storage is needed, the usual pattern for dealing with such
- data is to store it on a bulk storage system, such as NFS or HDFS, and
- store pointers to the storage locations in ZooKeeper.</para>
- </section>
-
- <section>
- <title>Ephemeral Nodes</title>
-
- <para>ZooKeeper also has the notion of ephemeral nodes. These znodes
- exist as long as the session that created the znode is active. When
- the session ends, the znode is deleted. Because of this behavior,
- ephemeral znodes are not allowed to have children.</para>
- </section>
-
- <section>
- <title>Sequence Nodes -- Unique Naming</title>
-
- <para>When creating a znode you can also request that
- ZooKeeper append a monotonically increasing counter to the end
- of the path. This counter is unique to the parent znode. The
- counter has a format of %010d -- that is, 10 digits with 0
- (zero) padding (the counter is formatted in this way to
- simplify sorting), e.g. "&lt;path&gt;0000000001". See
- <ulink url="recipes.html#sc_recipes_Queues">Queue
- Recipe</ulink> for an example use of this feature. Note: the
- counter used to store the next sequence number is a signed int
- (4 bytes) maintained by the parent node; the counter will
- overflow when incremented beyond 2147483647 (resulting in a
- name "&lt;path&gt;-2147483648").</para>
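-
- <para>The naming rule and the overflow can be reproduced with plain
- string formatting. This sketch only illustrates the %010d format
- described above; the class and method names are hypothetical and the
- path prefix is made up for the example.</para>
-
- <programlisting><![CDATA[
```java
import java.util.Locale;

final class SequenceNames {

    /** Appends the counter to the requested path using the %010d format. */
    static String sequentialName(String pathPrefix, int counter) {
        return pathPrefix + String.format(Locale.ENGLISH, "%010d", counter);
    }

    public static void main(String[] args) {
        // prints "/queue/item-0000000001"
        System.out.println(sequentialName("/queue/item-", 1));
        // A signed 32-bit counter wraps after 2147483647;
        // prints "/queue/item--2147483648"
        System.out.println(sequentialName("/queue/item-", Integer.MAX_VALUE + 1));
    }
}
```
- ]]></programlisting>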
- </section>
- </section>
-
- <section id="sc_timeInZk">
- <title>Time in ZooKeeper</title>
-
- <para>ZooKeeper tracks time in multiple ways:</para>
-
- <itemizedlist>
- <listitem>
- <para><emphasis role="bold">Zxid</emphasis></para>
-
- <para>Every change to the ZooKeeper state receives a stamp in the
- form of a <emphasis>zxid</emphasis> (ZooKeeper Transaction Id).
- This exposes the total ordering of all changes to ZooKeeper. Each
- change will have a unique zxid and if zxid1 is smaller than zxid2
- then zxid1 happened before zxid2.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">Version numbers</emphasis></para>
-
- <para>Every change to a node will cause an increase to one of the
- version numbers of that node. The three version numbers are version
- (number of changes to the data of a znode), cversion (number of
- changes to the children of a znode), and aversion (number of changes
- to the ACL of a znode).</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">Ticks</emphasis></para>
-
- <para>When using multi-server ZooKeeper, servers use ticks to define
- timing of events such as status uploads, session timeouts,
- connection timeouts between peers, etc. The tick time is only
- indirectly exposed through the minimum session timeout (2 times the
- tick time); if a client requests a session timeout less than the
- minimum session timeout, the server will tell the client that the
- session timeout is actually the minimum session timeout.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">Real time</emphasis></para>
-
- <para>ZooKeeper doesn't use real time, or clock time, at all except
- to put timestamps into the stat structure on znode creation and
- znode modification.</para>
- </listitem>
- </itemizedlist>
- </section>
-
- <section id="sc_zkStatStructure">
- <title>ZooKeeper Stat Structure</title>
-
- <para>The Stat structure for each znode in ZooKeeper is made up of the
- following fields:</para>
-
- <itemizedlist>
- <listitem>
- <para><emphasis role="bold">czxid</emphasis></para>
-
- <para>The zxid of the change that caused this znode to be
- created.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">mzxid</emphasis></para>
-
- <para>The zxid of the change that last modified this znode.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">pzxid</emphasis></para>
-
- <para>The zxid of the change that last modified children of this znode.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">ctime</emphasis></para>
-
- <para>The time in milliseconds from epoch when this znode was
- created.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">mtime</emphasis></para>
-
- <para>The time in milliseconds from epoch when this znode was last
- modified.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">version</emphasis></para>
-
- <para>The number of changes to the data of this znode.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">cversion</emphasis></para>
-
- <para>The number of changes to the children of this znode.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">aversion</emphasis></para>
-
- <para>The number of changes to the ACL of this znode.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">ephemeralOwner</emphasis></para>
-
- <para>The session id of the owner of this znode if the znode is an
- ephemeral node. If it is not an ephemeral node, it will be
- zero.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">dataLength</emphasis></para>
-
- <para>The length of the data field of this znode.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">numChildren</emphasis></para>
-
- <para>The number of children of this znode.</para>
- </listitem>
-
- </itemizedlist>
- </section>
- </section>
-
- <section id="ch_zkSessions">
- <title>ZooKeeper Sessions</title>
-
- <para>A ZooKeeper client establishes a session with the ZooKeeper
- service by creating a handle to the service using a language
- binding. Once created, the handle starts off in the CONNECTING state
- and the client library tries to connect to one of the servers that
- make up the ZooKeeper service, at which point it switches to the
- CONNECTED state. During normal operation the handle will be in one of
- these two states. If an unrecoverable error occurs, such as session
- expiration or authentication failure, or if the application explicitly
- closes the handle, the handle will move to the CLOSED state.
- The following figure shows the possible state transitions of a
- ZooKeeper client:</para>
-
- <mediaobject id="fg_states" >
- <imageobject>
- <imagedata fileref="images/state_dia.jpg"/>
- </imageobject>
- </mediaobject>
-
- <para>To create a client session the application code must provide
- a connection string containing a comma separated list of host:port pairs,
- each corresponding to a ZooKeeper server (e.g. "127.0.0.1:4545" or
- "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"). The ZooKeeper
- client library will pick an arbitrary server and try to connect to
- it. If this connection fails, or if the client becomes
- disconnected from the server for any reason, the client will
- automatically try the next server in the list, until a connection
- is (re-)established.</para>
-
- <para> <emphasis role="bold">Added in 3.2.0</emphasis>: An
- optional "chroot" suffix may also be appended to the connection
- string. This will run the client commands while interpreting all
- paths relative to this root (similar to the unix chroot
- command). If used the example would look like:
- "127.0.0.1:4545/app/a" or
- "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002/app/a" where the
- client would be rooted at "/app/a" and all paths would be relative
- to this root - ie getting/setting/etc... "/foo/bar" would result
- in operations being run on "/app/a/foo/bar" (from the server
- perspective). This feature is particularly useful in multi-tenant
- environments where each user of a particular ZooKeeper service
- could be rooted differently. This makes re-use much simpler as
- each user can code his/her application as if it were rooted at
- "/", while actual location (say /app/a) could be determined at
- deployment time.</para>
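-
- <para>The way a chroot suffix changes the paths seen by the server can be
- sketched as follows. This is a simplified model written for this guide,
- not the client library's own connection string parser; the class
- <emphasis>ConnectStringSketch</emphasis> is hypothetical and performs no
- validation of hosts, ports, or paths.</para>
-
- <programlisting><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

final class ConnectStringSketch {
    final List<String> servers = new ArrayList<>();
    final String chroot; // empty string when no chroot suffix was given

    ConnectStringSketch(String connectString) {
        // Everything from the first '/' onward is the chroot suffix.
        int slash = connectString.indexOf('/');
        String hostPart = slash < 0 ? connectString : connectString.substring(0, slash);
        String suffix = slash < 0 ? "" : connectString.substring(slash);
        // A bare "/" suffix is treated the same as having no chroot.
        chroot = suffix.equals("/") ? "" : suffix;
        for (String server : hostPart.split(",")) {
            servers.add(server.trim());
        }
    }

    /** Maps a path used by the application to the path the server operates on. */
    String toServerPath(String clientPath) {
        if (chroot.isEmpty()) {
            return clientPath;
        }
        return clientPath.equals("/") ? chroot : chroot + clientPath;
    }
}
```
- ]]></programlisting>
-
- <para>With the connection string
- "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002/app/a", the sketch maps
- "/foo/bar" to "/app/a/foo/bar".</para>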
-
- <para>When a client gets a handle to the ZooKeeper service,
- ZooKeeper creates a ZooKeeper session, represented as a 64-bit
- number, that it assigns to the client. If the client connects to a
- different ZooKeeper server, it will send the session id as a part
- of the connection handshake. As a security measure, the server
- creates a password for the session id that any ZooKeeper server
- can validate. The password is sent to the client with the session
- id when the client establishes the session. The client sends this
- password with the session id whenever it reestablishes the session
- with a new server.</para>
-
- <para>One of the parameters to the ZooKeeper client library call
- to create a ZooKeeper session is the session timeout in
- milliseconds. The client sends a requested timeout, and the server
- responds with the timeout that it can give the client. The current
- implementation requires that the timeout be a minimum of 2 times
- the tickTime (as set in the server configuration) and a maximum of
- 20 times the tickTime. The ZooKeeper client API allows access to
- the negotiated timeout.</para>
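-
- <para>The negotiation amounts to clamping the requested value between the
- two bounds. A minimal sketch, assuming only the 2x and 20x tickTime
- limits stated above; the class and method names are hypothetical and
- this is not the server's actual code:</para>
-
- <programlisting><![CDATA[
```java
final class SessionTimeouts {

    /** Returns the timeout the server would grant for a requested timeout. */
    static int negotiate(int requestedMs, int tickTimeMs) {
        int min = 2 * tickTimeMs;
        int max = 20 * tickTimeMs;
        return Math.max(min, Math.min(max, requestedMs));
    }

    public static void main(String[] args) {
        int tickTimeMs = 2000;
        System.out.println(negotiate(1000, tickTimeMs));   // prints 4000 (raised to the minimum)
        System.out.println(negotiate(30000, tickTimeMs));  // prints 30000 (within bounds)
        System.out.println(negotiate(100000, tickTimeMs)); // prints 40000 (lowered to the maximum)
    }
}
```
- ]]></programlisting>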
-
- <para>When a client (session) becomes partitioned from the ZK
- serving cluster it will begin searching the list of servers that
- were specified during session creation. Eventually, when
- connectivity between the client and at least one of the servers is
- re-established, the session will either again transition to the
- "connected" state (if reconnected within the session timeout
- value) or it will transition to the "expired" state (if
- reconnected after the session timeout). It is not advisable to
- create a new session object (a new ZooKeeper.class or zookeeper
- handle in the C binding) upon disconnection. The ZK client library
- will handle reconnection for you. In particular we have heuristics
- built into the client library to handle things like the "herd
- effect". Only create a new session when you are notified of session
- expiration (mandatory).</para>
-
- <para>Session expiration is managed by the ZooKeeper cluster
- itself, not by the client. When the ZK client establishes a
- session with the cluster it provides the "timeout" value detailed
- above. This value is used by the cluster to determine when the
- client's session expires. Expiration happens when the cluster
- does not hear from the client within the specified session timeout
- period (i.e. no heartbeat). At session expiration the cluster will
- delete all ephemeral nodes owned by that session and
- immediately notify all connected clients of the change (anyone
- watching those znodes). At this point the client of the expired
- session is still disconnected from the cluster; it will not be
- notified of the session expiration until/unless it is able to
- re-establish a connection to the cluster. The client will stay in
- the disconnected state until the TCP connection is re-established with
- the cluster, at which point the watcher of the expired session
- will receive the "session expired" notification.</para>
-
- <para>Example state transitions for an expired session as seen by
- the expired session's watcher:</para>
-
- <orderedlist>
- <listitem><para>'connected' : session is established and client
- is communicating with cluster (client/server communication is
- operating properly)</para></listitem>
- <listitem><para>.... client is partitioned from the
- cluster</para></listitem>
- <listitem><para>'disconnected' : client has lost connectivity
- with the cluster</para></listitem>
- <listitem><para>.... time elapses, after 'timeout' period the
- cluster expires the session, nothing is seen by client as it is
- disconnected from cluster</para></listitem>
- <listitem><para>.... time elapses, the client regains network
- level connectivity with the cluster</para></listitem>
- <listitem><para>'expired' : eventually the client reconnects to
- the cluster, it is then notified of the
- expiration</para></listitem>
- </orderedlist>
-
- <para>Another parameter to the ZooKeeper session establishment
- call is the default watcher. Watchers are notified when any state
- change occurs in the client, for example when the client loses
- connectivity to the server or when the client's session expires.
- This watcher should consider the
- initial state to be disconnected (i.e. before any state change
- events are sent to the watcher by the client lib). In the case of
- a new connection, the first event sent to the watcher is typically
- the session connection event.</para>
-
- <para>The session is kept alive by requests sent by the client. If
- the session is idle for a period of time that would time out the
- session, the client will send a PING request to keep the session
- alive. This PING request not only allows the ZooKeeper server to
- know that the client is still active, but it also allows the
- client to verify that its connection to the ZooKeeper server is
- still active. The timing of the PING is conservative enough to
- ensure reasonable time to detect a dead connection and reconnect
- to a new server.</para>
-
- <para>
- Once a connection to the server is successfully established
- (connected) there are basically two cases where the client lib generates
- a connection loss (a result code in the C binding, an exception in Java;
- see the API documentation for binding-specific details) when either a
- synchronous or asynchronous operation is performed:
- </para>
-
- <orderedlist>
- <listitem><para>The application calls an operation on a session that is no
- longer alive/valid</para></listitem>
- <listitem><para>The ZooKeeper client disconnects from a server when there
- are pending operations to that server, i.e., there is a pending asynchronous call.
- </para></listitem>
- </orderedlist>
-
- <para> <emphasis role="bold">Added in 3.2.0 -- SessionMovedException</emphasis>. There is an internal
- exception that is generally not seen by clients called the SessionMovedException.
- This exception occurs because a request was received on a connection for a session
- which has been reestablished on a different server. The normal cause of this error is
- a client that sends a request to a server, but the network packet gets delayed, so
- the client times out and connects to a new server. When the delayed packet arrives at
- the first server, the old server detects that the session has moved, and closes the
- client connection. Clients normally do not see this error since they do not read
- from those old connections. (Old connections are usually closed.) One situation in which this
- condition can be seen is when two clients try to reestablish the same connection using
- a saved session id and password. One of the clients will reestablish the connection
- and the second client will be disconnected (causing the pair to attempt to re-establish
- their connection/session indefinitely).</para>
-
- </section>
-
- <section id="ch_zkWatches">
- <title>ZooKeeper Watches</title>
-
- <para>All of the read operations in ZooKeeper - <emphasis
- role="bold">getData()</emphasis>, <emphasis
- role="bold">getChildren()</emphasis>, and <emphasis
- role="bold">exists()</emphasis> - have the option of setting a watch as a
- side effect. Here is ZooKeeper's definition of a watch: a watch event is a
- one-time trigger, sent to the client that set the watch, which occurs when
- the data for which the watch was set changes. There are three key points
- to consider in this definition of a watch:</para>
-
- <itemizedlist>
- <listitem>
- <para><emphasis role="bold">One-time trigger</emphasis></para>
-
- <para>One watch event will be sent to the client when the data has changed.
- For example, if a client does a getData("/znode1", true) and later the
- data for /znode1 is changed or deleted, the client will get a watch
- event for /znode1. If /znode1 changes again, no watch event will be
- sent unless the client has done another read that sets a new
- watch.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">Sent to the client</emphasis></para>
-
- <para>This implies that an event is on the way to the client, but may
- not reach the client before the successful return code to the change
- operation reaches the client that initiated the change. Watches are
- sent asynchronously to watchers. ZooKeeper provides an ordering
- guarantee: a client will never see a change for which it has set a
- watch until it first sees the watch event. Network delays or other
- factors may cause different clients to see watches and return codes
- from updates at different times. The key point is that everything seen
- by the different clients will have a consistent order.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">The data for which the watch was
- set</emphasis></para>
-
- <para>This refers to the different ways a node can change. It
- helps to think of ZooKeeper as maintaining two lists of
- watches: data watches and child watches. getData() and
- exists() set data watches. getChildren() sets child
- watches. Alternatively, it may help to think of watches being
- set according to the kind of data returned. getData() and
- exists() return information about the data of the node,
- whereas getChildren() returns a list of children. Thus,
- setData() will trigger data watches for the znode being set
- (assuming the set is successful). A successful create() will
- trigger a data watch for the znode being created and a child
- watch for the parent znode. A successful delete() will trigger
- both a data watch and a child watch (since there can be no
- more children) for a znode being deleted as well as a child
- watch for the parent znode.</para>
- </listitem>
- </itemizedlist>
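-
- <para>The one-time nature of a watch can be modelled in a few lines. The
- sketch below is an in-memory illustration only: it covers data watches
- on a single server, ignores sessions and ordering, and the class
- <emphasis>OneShotWatches</emphasis> is hypothetical rather than part of
- any ZooKeeper binding.</para>
-
- <programlisting><![CDATA[
```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OneShotWatches {
    private final Map<String, List<Runnable>> dataWatches = new HashMap<>();

    /** Registers a data watch, as a read with the watch flag set would. */
    void watch(String path, Runnable watcher) {
        dataWatches.computeIfAbsent(path, k -> new ArrayList<>()).add(watcher);
    }

    /**
     * Called when the data at path changes. Fires every watch registered
     * for the path and clears them; returns how many watches fired.
     */
    int dataChanged(String path) {
        List<Runnable> fired = dataWatches.remove(path);
        if (fired == null) {
            return 0; // nobody re-registered, so nobody is notified
        }
        for (Runnable watcher : fired) {
            watcher.run();
        }
        return fired.size();
    }
}
```
- ]]></programlisting>
-
- <para>In this model a second change to the same path notifies nobody
- unless a watcher has registered again in between.</para>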
-
- <para>Watches are maintained locally at the ZooKeeper server to which the
- client is connected. This allows watches to be lightweight to set,
- maintain, and dispatch. When a client connects to a new server, the watch
- will be triggered for any session events. Watches will not be received
- while disconnected from a server. When a client reconnects, any previously
- registered watches will be reregistered and triggered if needed. In
- general this all occurs transparently. There is one case where a watch
- may be missed: a watch for the existence of a znode not yet created will
- be missed if the znode is created and deleted while disconnected.</para>
-
- <section id="sc_WatchSemantics">
- <title>Semantics of Watches</title>
-
- <para> We can set watches with the three calls that read the state of
- ZooKeeper: exists, getData, and getChildren. The following list details
- the events that a watch can trigger and the calls that enable them:
- </para>
-
- <itemizedlist>
- <listitem>
- <para><emphasis role="bold">Created event:</emphasis></para>
- <para>Enabled with a call to exists.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">Deleted event:</emphasis></para>
- <para>Enabled with a call to exists, getData, and getChildren.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">Changed event:</emphasis></para>
- <para>Enabled with a call to exists and getData.</para>
- </listitem>
-
- <listitem>
- <para><emphasis role="bold">Child event:</emphasis></para>
- <para>Enabled with a call to getChildren.</para>
- </listitem>
- </itemizedlist>
- </section>
-
- <section id="sc_WatchGuarantees">
- <title>What ZooKeeper Guarantees about Watches</title>
-
- <para>With regard to watches, ZooKeeper maintains these
- guarantees:</para>
-
- <itemizedlist>
- <listitem>
- <para>Watches are ordered with respect to other events, other
- watches, and asynchronous replies. The ZooKeeper client libraries
- ensure that everything is dispatched in order.</para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>A client will see a watch event for a znode it is watching
- before seeing the new data that corresponds to that znode.</para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>The order of watch events from ZooKeeper corresponds to the
- order of the updates as seen by the ZooKeeper service.</para>
- </listitem>
- </itemizedlist>
- </section>
-
- <section id="sc_WatchRememberThese">
- <title>Things to Remember about Watches</title>
-
- <itemizedlist>
- <listitem>
- <para>Watches are one time triggers; if you get a watch event and
- you want to get notified of future changes, you must set another
- watch.</para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>Because watches are one time triggers and there is latency
- between getting the event and sending a new request to get a watch
- you cannot reliably see every change that happens to a node in
- ZooKeeper. Be prepared to handle the case where the znode changes
- multiple times between getting the event and setting the watch
- again. (You may not care, but at least realize it may
- happen.)</para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>A watch object, or function/context pair, will only be
- triggered once for a given notification. For example, if the same
- watch object is registered for an exists and a getData call for the
- same file and that file is then deleted, the watch object would
- only be invoked once with the deletion notification for the file.
- </para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>When you disconnect from a server (for example, when the
- server fails), you will not get any watches until the connection
- is reestablished. For this reason session events are sent to all
- outstanding watch handlers. Use session events to go into a safe
- mode: you will not be receiving events while disconnected, so your
- process should act conservatively in that mode.</para>
- </listitem>
- </itemizedlist>
- </section>
- </section>
-
- <section id="sc_ZooKeeperAccessControl">
- <title>ZooKeeper access control using ACLs</title>
-
- <para>ZooKeeper uses ACLs to control access to its znodes (the
- data nodes of a ZooKeeper data tree). The ACL implementation is
- quite similar to UNIX file access permissions: it employs
- permission bits to allow/disallow various operations against a
- node and the scope to which the bits apply. Unlike standard UNIX
- permissions, a ZooKeeper node is not limited by the three standard
- scopes for user (owner of the file), group, and world
- (other). ZooKeeper does not have a notion of an owner of a
- znode. Instead, an ACL specifies sets of ids and permissions that
- are associated with those ids.</para>
-
- <para>Note also that an ACL pertains only to a specific znode. In
- particular it does not apply to children. For example, if
- <emphasis>/app</emphasis> is only readable by ip:172.16.16.1 and
- <emphasis>/app/status</emphasis> is world readable, anyone will
- be able to read <emphasis>/app/status</emphasis>; ACLs are not
- recursive.</para>
-
- <para>ZooKeeper supports pluggable authentication schemes. Ids are
- specified using the form <emphasis>scheme:id</emphasis>,
- where <emphasis>scheme</emphasis> is the authentication scheme
- that the id corresponds to. For
- example, <emphasis>ip:172.16.16.1</emphasis> is an id for a
- host with the address <emphasis>172.16.16.1</emphasis>.</para>
-
- <para>When a client connects to ZooKeeper and authenticates
- itself, ZooKeeper associates all the ids that correspond to a
- client with the client's connection. These ids are checked against
- the ACLs of znodes when a client tries to access a node. ACLs are
- made up of pairs of <emphasis>(scheme:expression,
- perms)</emphasis>. The format of
- the <emphasis>expression</emphasis> is specific to the scheme. For
- example, the pair <emphasis>(ip:19.22.0.0/16, READ)</emphasis>
- gives the <emphasis>READ</emphasis> permission to any clients with
- an IP address that starts with 19.22.</para>
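-
- <para>Matching a client address against an
- <emphasis>ip</emphasis> expression such as 19.22.0.0/16 compares only
- the most significant bits. The following sketch shows the idea for IPv4
- dotted-quad addresses; it is not ZooKeeper's authentication provider,
- and the class <emphasis>IpAclMatch</emphasis> and its methods are
- hypothetical.</para>
-
- <programlisting><![CDATA[
```java
final class IpAclMatch {

    /** Returns true when clientIp falls inside the addr/bits expression. */
    static boolean matches(String expression, String clientIp) {
        String[] parts = expression.split("/", 2);
        int bits = parts.length == 2 ? Integer.parseInt(parts[1]) : 32;
        if (bits < 0 || bits > 32) {
            throw new IllegalArgumentException("bits must be 0..32: " + bits);
        }
        // Java masks shift distances to 5 bits, so a shift by 32 must be special-cased.
        int mask = bits == 0 ? 0 : -1 << (32 - bits);
        return (toInt(parts[0]) & mask) == (toInt(clientIp) & mask);
    }

    /** Packs a dotted-quad IPv4 address into a 32-bit integer. */
    static int toInt(String dottedQuad) {
        String[] octets = dottedQuad.split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + dottedQuad);
        }
        int value = 0;
        for (String octet : octets) {
            int n = Integer.parseInt(octet);
            if (n < 0 || n > 255) {
                throw new IllegalArgumentException("octet out of range: " + octet);
            }
            value = (value << 8) | n;
        }
        return value;
    }
}
```
- ]]></programlisting>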
-
- <section id="sc_ACLPermissions">
- <title>ACL Permissions</title>
-
- <para>ZooKeeper supports the following permissions:</para>
-
- <itemizedlist>
- <listitem><para><emphasis role="bold">CREATE</emphasis>: you can create a child node</para></listitem>
- <listitem><para><emphasis role="bold">READ</emphasis>: you can get data from a node and list its children.</para></listitem>
- <listitem><para><emphasis role="bold">WRITE</emphasis>: you can set data for a node</para></listitem>
- <listitem><para><emphasis role="bold">DELETE</emphasis>: you can delete a child node</para></listitem>
- <listitem><para><emphasis role="bold">ADMIN</emphasis>: you can set permissions</para></listitem>
- </itemizedlist>
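-
- <para>Permissions combine as bit flags. The sketch below uses the bit
- values that the Java binding is understood to define for these
- permissions (READ as the lowest bit, then WRITE, CREATE, DELETE, ADMIN);
- treat the exact values as an assumption. The class
- <emphasis>AclPermsSketch</emphasis> and its
- <emphasis>allows</emphasis> helper are hypothetical.</para>
-
- <programlisting><![CDATA[
```java
final class AclPermsSketch {
    static final int READ = 1 << 0;
    static final int WRITE = 1 << 1;
    static final int CREATE = 1 << 2;
    static final int DELETE = 1 << 3;
    static final int ADMIN = 1 << 4;
    static final int ALL = READ | WRITE | CREATE | DELETE | ADMIN;

    /** Returns true when every requested permission bit is present in granted. */
    static boolean allows(int granted, int requested) {
        return (granted & requested) == requested;
    }

    public static void main(String[] args) {
        // A producer that may add requests under a node but never remove them.
        int producer = READ | CREATE;
        System.out.println(allows(producer, CREATE)); // prints true
        System.out.println(allows(producer, DELETE)); // prints false
    }
}
```
- ]]></programlisting>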
-
- <para>The <emphasis>CREATE</emphasis>
- and <emphasis>DELETE</emphasis> permissions have been broken out
- of the <emphasis>WRITE</emphasis> permission for finer grained
- access controls. The cases for <emphasis>CREATE</emphasis>
- and <emphasis>DELETE</emphasis> are the following:</para>
-
- <para>You want A to be able to do a set on a ZooKeeper node, but
- not be able to <emphasis>CREATE</emphasis>
- or <emphasis>DELETE</emphasis> children.</para>
-
- <para><emphasis>CREATE</emphasis>
- without <emphasis>DELETE</emphasis>: clients create requests by
- creating ZooKeeper nodes in a parent directory. You want all
- clients to be able to add, but only the request processor can
- delete. (This is kind of like the APPEND permission for
- files.)</para>
-
- <para>Also, the <emphasis>ADMIN</emphasis> permission is there
- since ZooKeeper doesn’t have a notion of file owner. In some
- sense the <emphasis>ADMIN</emphasis> permission designates the
- entity as the owner. ZooKeeper doesn’t support the LOOKUP
- permission (execute permission bit on directories to allow you
- to LOOKUP even though you can't list the directory). Everyone
- implicitly has LOOKUP permission. This allows you to stat a
- node, but nothing more. (The problem is, if you want to call
- zoo_exists() on a node that doesn't exist, there is no
- permission to check.)</para>
-
- <section id="sc_BuiltinACLSchemes">
- <title>Builtin ACL Schemes</title>
-
- <para>ZooKeeper has the following built-in schemes:</para>
-
- <itemizedlist>
- <listitem><para><emphasis role="bold">world</emphasis> has a
- single id, <emphasis>anyone</emphasis>, that represents
- anyone.</para></listitem>
-
- <listitem><para><emphasis role="bold">auth</emphasis> doesn't
- use any id, represents any authenticated
- user.</para></listitem>
-
- <listitem><para><emphasis role="bold">digest</emphasis> uses
- a <emphasis>username:password</emphasis> string to generate a
- SHA1 hash which is then used as an ACL ID
- identity. Authentication is done by sending
- the <emphasis>username:password</emphasis> in clear text. When
- used in the ACL the expression will be
- the <emphasis>username:base64</emphasis>
- encoded <emphasis>SHA1</emphasis>
- password <emphasis>digest</emphasis>.</para>
- </listitem>
-
- <listitem><para><emphasis role="bold">ip</emphasis> uses the
- client host IP as an ACL ID identity. The ACL expression is of
- the form <emphasis>addr/bits</emphasis> where the most
- significant <emphasis>bits</emphasis>
- of <emphasis>addr</emphasis> are matched against the most
- significant <emphasis>bits</emphasis> of the client host
- IP.</para></listitem>
-
- </itemizedlist>
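-
- <para>The <emphasis>digest</emphasis> expression described above can be
- computed with the JDK alone. This sketch assumes the expression is the
- user name, a colon, and the base64 encoding of the SHA1 hash of the
- whole <emphasis>username:password</emphasis> string; the class
- <emphasis>DigestAclId</emphasis> is hypothetical and the UTF-8 encoding
- of the input is an assumption.</para>
-
- <programlisting><![CDATA[
```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class DigestAclId {

    /** Builds "username:base64(SHA1(username:password))" from "username:password". */
    static String generate(String userColonPassword) {
        String user = userColonPassword.split(":", 2)[0];
        try {
            byte[] sha1 = MessageDigest.getInstance("SHA-1")
                    .digest(userColonPassword.getBytes(StandardCharsets.UTF_8));
            return user + ":" + Base64.getEncoder().encodeToString(sha1);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```
- ]]></programlisting>
-
- <para>Because the password itself is sent in clear text during
- authentication, the hash only protects what is stored in the ACL.</para>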
- </section>
-
- <section>
- <title>ZooKeeper C client API</title>
-
- <para>The following constants are provided by the ZooKeeper C
- library:</para>
-
- <itemizedlist>
- <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_READ; //can read node’s value and list its children</para></listitem>
- <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_WRITE;// can set the node’s value</para></listitem>
- <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_CREATE; //can create children</para></listitem>
- <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_DELETE;// can delete children</para></listitem>
- <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_ADMIN; //can execute set_acl()</para></listitem>
- <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_ALL;// all of the above flags OR’d together</para></listitem>
- </itemizedlist>
-
- <para>The following are the standard ACL IDs:</para>
-
- <itemizedlist>
- <listitem><para><emphasis>struct</emphasis> Id ZOO_ANYONE_ID_UNSAFE; //(‘world’,’anyone’)</para></listitem>
- <listitem><para><emphasis>struct</emphasis> Id ZOO_AUTH_IDS;// (‘auth’,’’)</para></listitem>
- </itemizedlist>
-
- <para>The empty identity string of ZOO_AUTH_IDS should be interpreted as “the identity of the creator”.</para>
-
- <para>The ZooKeeper client comes with three standard ACLs:</para>
-
- <itemizedlist>
- <listitem><para><emphasis>struct</emphasis> ACL_vector ZOO_OPEN_ACL_UNSAFE; //(ZOO_PERM_ALL,ZOO_ANYONE_ID_UNSAFE)</para></listitem>
- <listitem><para><emphasis>struct</emphasis> ACL_vector ZOO_READ_ACL_UNSAFE;// (ZOO_PERM_READ, ZOO_ANYONE_ID_UNSAFE)</para></listitem>
- <listitem><para><emphasis>struct</emphasis> ACL_vector ZOO_CREATOR_ALL_ACL; //(ZOO_PERM_ALL,ZOO_AUTH_IDS)</para></listitem>
- </itemizedlist>
-
- <para>ZOO_OPEN_ACL_UNSAFE is a completely open, free-for-all
- ACL: any application can execute any operation on the node and
- can create, list and delete its children.
- ZOO_READ_ACL_UNSAFE grants read-only access to any
- application. ZOO_CREATOR_ALL_ACL grants all permissions to the
- creator of the node. The creator must have been authenticated by
- the server (for example, using the “<emphasis>digest</emphasis>”
- scheme) before it can create nodes with this ACL.</para>
-
- <para>The following ZooKeeper operations deal with ACLs:</para>
-
- <itemizedlist><listitem>
- <para><emphasis>int</emphasis> <emphasis>zoo_add_auth</emphasis>
- (zhandle_t *zh,<emphasis>const</emphasis> <emphasis>char</emphasis>*
- scheme,<emphasis>const</emphasis> <emphasis>char</emphasis>*
- cert, <emphasis>int</emphasis> certLen, void_completion_t
- completion, <emphasis>const</emphasis> <emphasis>void</emphasis>
- *data);</para>
- </listitem></itemizedlist>
-
- <para>The application uses the zoo_add_auth function to
- authenticate itself to the server. The function can be called
- multiple times if the application wants to authenticate using
- different schemes and/or identities.</para>
-
- <itemizedlist><listitem>
- <para><emphasis>int</emphasis> <emphasis>zoo_create</emphasis>
- (zhandle_t *zh, <emphasis>const</emphasis> <emphasis>char</emphasis>
- *path, <emphasis>const</emphasis> <emphasis>char</emphasis>
- *value,<emphasis>int</emphasis>
- valuelen, <emphasis>const</emphasis> <emphasis>struct</emphasis>
- ACL_vector *acl, <emphasis>int</emphasis>
- flags,<emphasis>char</emphasis>
- *realpath, <emphasis>int</emphasis>
- max_realpath_len);</para>
- </listitem></itemizedlist>
-
- <para>The zoo_create(...) operation creates a new node. The acl
- parameter is a list of ACLs associated with the node. The parent
- node must have the CREATE permission bit set.</para>
-
- <itemizedlist><listitem>
- <para><emphasis>int</emphasis> <emphasis>zoo_get_acl</emphasis>
- (zhandle_t *zh, <emphasis>const</emphasis> <emphasis>char</emphasis>
- *path,<emphasis>struct</emphasis> ACL_vector
- *acl, <emphasis>struct</emphasis> Stat *stat);</para>
- </listitem></itemizedlist>
-
- <para>This operation returns a node’s ACL info.</para>
-
- <itemizedlist><listitem>
- <para><emphasis>int</emphasis> <emphasis>zoo_set_acl</emphasis>
- (zhandle_t *zh, <emphasis>const</emphasis> <emphasis>char</emphasis>
- *path, <emphasis>int</emphasis>
- version,<emphasis>const</emphasis> <emphasis>struct</emphasis>
- ACL_vector *acl);</para>
- </listitem></itemizedlist>
-
- <para>This function replaces the node’s ACL list with a new one. The
- node must have the ADMIN permission set.</para>
-
- <para>Here is sample code that makes use of the above APIs to
- authenticate itself using the “<emphasis>foo</emphasis>” scheme
- and create an ephemeral node “/xyz” with create-only
- permissions.</para>
-
- <note><para>This is a very simple example which is intended to show
- how to interact with ZooKeeper ACLs
- specifically. See <filename>.../trunk/src/c/src/cli.c</filename>
- for an example of a C client implementation.</para>
- </note>
-
- <programlisting>
-#include <errno.h>
-#include <stdio.h>   /* fprintf */
-#include <stdlib.h>  /* free */
-#include <string.h>
-
-#include "zookeeper.h"
-
-static zhandle_t *zh;
-
-/**
- * In this example this method gets the cert for your
- * environment -- you must provide
- */
-char *foo_get_cert_once(char* id) { return 0; }
-
-/** Watcher function -- empty for this example, not something you should
- * do in real code */
-void watcher(zhandle_t *zzh, int type, int state, const char *path,
-             void *watcherCtx) {}
-
-int main(int argc, char **argv) {
-  char buffer[512];
-  char p[2048];
-  char *cert=0;
-  char appId[64];
-
-  strcpy(appId, "example.foo_test");
-  cert = foo_get_cert_once(appId);
-  if(cert!=0) {
-    fprintf(stderr,
-            "Certificate for appid [%s] is [%s]\n",appId,cert);
-    strncpy(p,cert, sizeof(p)-1);
-    p[sizeof(p)-1] = '\0'; /* strncpy does not terminate on truncation */
-    free(cert);
-  } else {
-    fprintf(stderr, "Certificate for appid [%s] not found\n",appId);
-    strcpy(p, "dummy");
-  }
-
-  zoo_set_debug_level(ZOO_LOG_LEVEL_DEBUG);
-
-  zh = zookeeper_init("localhost:3181", watcher, 10000, 0, 0, 0);
-  if (!zh) {
-    return errno;
-  }
-  if(zoo_add_auth(zh,"foo",p,strlen(p),0,0)!=ZOK) {
-    zookeeper_close(zh);
-    return 2;
-  }
-
-  /* An ACL that grants only CREATE to the authenticated identity */
-  struct ACL CREATE_ONLY_ACL[] = {{ZOO_PERM_CREATE, ZOO_AUTH_IDS}};
-  struct ACL_vector CREATE_ONLY = {1, CREATE_ONLY_ACL};
-  int rc = zoo_create(zh,"/xyz","value", 5, &CREATE_ONLY, ZOO_EPHEMERAL,
-                      buffer, sizeof(buffer)-1);
-  if (rc) {
-    fprintf(stderr, "Error %d at line %d\n", rc, __LINE__);
-  }
-
-  /** this operation will fail with a ZNOAUTH error */
-  int buflen= sizeof(buffer);
-  struct Stat stat;
-  rc = zoo_get(zh, "/xyz", 0, buffer, &buflen, &stat);
-  if (rc) {
-    fprintf(stderr, "Error %d at line %d\n", rc, __LINE__);
-  }
-
-  zookeeper_close(zh);
-  return 0;
-}
- </programlisting>
- </section>
- </section>
- </section>
-
- <section id="sc_ZooKeeperPluggableAuthentication">
- <title>Pluggable ZooKeeper authentication</title>
-
- <para>ZooKeeper runs in a variety of different environments with
- various different authentication schemes, so it has a completely
- pluggable authentication framework. Even the builtin authentication
- schemes use the pluggable authentication framework.</para>
-
- <para>To understand how the authentication framework works, first you must
- understand the two main authentication operations. The framework
- first must authenticate the client. This is usually done as soon as
- the client connects to a server and consists of validating information
- sent from or gathered about a client and associating it with the connection.
- The second operation handled by the framework is finding the entries in an
- ACL that correspond to the client. ACL entries are <<emphasis>idspec,
- permissions</emphasis>> pairs. The <emphasis>idspec</emphasis> may be
- a simple string match against the authentication information associated
- with the connection or it may be an expression that is evaluated against that
- information. It is up to the implementation of the authentication plugin
- to do the match. Here is the interface that an authentication plugin must
- implement:</para>
-
- <programlisting>
-public interface AuthenticationProvider {
- String getScheme();
- KeeperException.Code handleAuthentication(ServerCnxn cnxn, byte authData[]);
- boolean isValid(String id);
- boolean matches(String id, String aclExpr);
- boolean isAuthenticated();
-}
- </programlisting>
-
- <para>The first method <emphasis>getScheme</emphasis> returns the string
- that identifies the plugin. Because we support multiple methods of authentication,
- an authentication credential or an <emphasis>idspec</emphasis> will always be
- prefixed with <emphasis>scheme:</emphasis>. The ZooKeeper server uses the scheme
- returned by the authentication plugin to determine which ids the scheme
- applies to.</para>
-
- <para><emphasis>handleAuthentication</emphasis> is called when a client
- sends authentication information to be associated with a connection. The
- client specifies the scheme to which the information corresponds. The
- ZooKeeper server passes the information to the authentication plugin whose
- <emphasis>getScheme</emphasis> matches the scheme passed by the client. The
- implementor of <emphasis>handleAuthentication</emphasis> will usually return
- an error if it determines that the information is bad, or it will associate information
- with the connection using <emphasis>cnxn.getAuthInfo().add(new Id(getScheme(), data))</emphasis>.
- </para>
-
- <para>The authentication plugin is involved in both setting and using ACLs. When an
- ACL is set for a znode, the ZooKeeper server will pass the id part of the entry to
- the <emphasis>isValid(String id)</emphasis> method. It is up to the plugin to verify
- that the id has a correct form. For example, <emphasis>ip:172.16.0.0/16</emphasis>
- is a valid id, but <emphasis>ip:host.com</emphasis> is not. If the new ACL includes
- an "auth" entry, <emphasis>isAuthenticated</emphasis> is used to see if the
- authentication information for this scheme that is associated with the connection
- should be added to the ACL. Some schemes
- should not be included in auth. For example, the IP address of the client is not
- considered as an id that should be added to the ACL if auth is specified.</para>
-
- <para>ZooKeeper invokes
- <emphasis>matches(String id, String aclExpr)</emphasis> when checking an ACL. It
- needs to match authentication information of the client against the relevant ACL
- entries. To find the entries which apply to the client, the ZooKeeper server will
- find the scheme of each entry and if there is authentication information
- from that client for that scheme, <emphasis>matches(String id, String aclExpr)</emphasis>
- will be called with <emphasis>id</emphasis> set to the authentication information
- that was previously added to the connection by <emphasis>handleAuthentication</emphasis> and
- <emphasis>aclExpr</emphasis> set to the id of the ACL entry. The authentication plugin
- uses its own logic and matching scheme to determine if <emphasis>id</emphasis> is included
- in <emphasis>aclExpr</emphasis>.
- </para>
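-
- <para>To make the matching step concrete, here is a minimal sketch of how a
- <emphasis>matches</emphasis>-style check could compare a client address
- against an <emphasis>addr/bits</emphasis> expression such as the
- <emphasis>ip</emphasis> scheme uses. The class name is hypothetical and the
- logic is an illustration under those assumptions, not the built-in plugin's
- implementation.</para>
-
- <programlisting>
-import java.net.InetAddress;
-import java.net.UnknownHostException;
-
-/** Hypothetical helper, for illustration only. */
-public class IpPrefixMatch {
-
-    /** True if the most significant bits of aclExpr's addr equal those of id. */
-    static boolean matches(String id, String aclExpr)
-            throws UnknownHostException {
-        String[] parts = aclExpr.split("/", 2);
-        // Literal IP strings are parsed directly, without a DNS lookup.
-        byte[] acl = InetAddress.getByName(parts[0]).getAddress();
-        byte[] client = InetAddress.getByName(id).getAddress();
-        if (acl.length != client.length) {
-            return false; // IPv4 versus IPv6 mismatch
-        }
-        int bits = parts.length == 2
-                ? Integer.parseInt(parts[1]) : acl.length * 8;
-        if (bits < 0 || bits > acl.length * 8) {
-            return false;
-        }
-        // Compare whole bytes first, then the remaining high-order bits.
-        for (int i = 0; i < acl.length && bits > 0; i++, bits -= 8) {
-            int mask = bits >= 8 ? 0xFF : (0xFF << (8 - bits)) & 0xFF;
-            if ((acl[i] & mask) != (client[i] & mask)) {
-                return false;
-            }
-        }
-        return true;
-    }
-
-    public static void main(String[] args) throws UnknownHostException {
-        System.out.println(matches("172.16.5.9", "172.16.0.0/16")); // true
-        System.out.println(matches("172.17.5.9", "172.16.0.0/16")); // false
-    }
-}
- </programlisting>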
-
- <para>There are two built-in authentication plugins: <emphasis>ip</emphasis> and
- <emphasis>digest</emphasis>. Additional plugins can be added using system properties. At
- startup the ZooKeeper server will look for system properties that start with
- "zookeeper.authProvider." and interpret the value of those properties as the class name
- of an authentication plugin. These properties can be set using the
- <emphasis>-Dzookeeper.authProvider.X=com.f.MyAuth</emphasis> option or by adding entries such as
- the following in the server configuration file:</para>
-
- <programlisting>
-authProvider.1=com.f.MyAuth
-authProvider.2=com.f.MyAuth2
- </programlisting>
-
- <para>Care should be taken to ensure that the suffix on the property is unique. If there are
- duplicates such as <emphasis>-Dzookeeper.authProvider.X=com.f.MyAuth -Dzookeeper.authProvider.X=com.f.MyAuth2</emphasis>,
- only one will be used. Also all servers must have the same plugins defined, otherwise clients using
- the authentication schemes provided by the plugins will have problems connecting to some servers.
- </para>
- </section>
-
- <section id="ch_zkGuarantees">
- <title>Consistency Guarantees</title>
-
- <para>ZooKeeper is a high performance, scalable service. Both read and
- write operations are designed to be fast, though reads are faster than
- writes. The reason for this is that in the case of reads, ZooKeeper can
- serve older data, which in turn is due to ZooKeeper's consistency
- guarantees:</para>
-
- <variablelist>
- <varlistentry>
- <term>Sequential Consistency</term>
-
- <listitem>
- <para>Updates from a client will be applied in the order that they
- were sent.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Atomicity</term>
-
- <listitem>
- <para>Updates either succeed or fail -- there are no partial
- results.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Single System Image</term>
-
- <listitem>
- <para>A client will see the same view of the service regardless of
- the server that it connects to.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Reliability</term>
-
- <listitem>
- <para>Once an update has been applied, it will persist from that
- time forward until a client overwrites the update. This guarantee
- has two corollaries:</para>
-
- <orderedlist>
- <listitem>
- <para>If a client gets a successful return code, the update will
- have been applied. On some failures (communication errors,
- timeouts, etc) the client will not know if the update has
- applied or not. We take steps to minimize the failures, but the
- guarantee is only present with successful return codes.
- (This is called the <emphasis>monotonicity condition</emphasis> in Paxos.)</para>
- </listitem>
-
- <listitem>
- <para>Any updates that are seen by the client, through a read
- request or successful update, will never be rolled back when
- recovering from server failures.</para>
- </listitem>
- </orderedlist>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Timeliness</term>
-
- <listitem>
- <para>The client's view of the system is guaranteed to be up-to-date
- within a certain time bound (on the order of tens of seconds).
- Either system changes will be seen by a client within this bound, or
- the client will detect a service outage.</para>
- </listitem>
- </varlistentry>
- </variablelist>
-
- <para>Using these consistency guarantees it is easy to build higher level
- functions such as leader election, barriers, queues, and read/write
- revocable locks solely at the ZooKeeper client (no additions needed to
- ZooKeeper). See <ulink url="recipes.html">Recipes and Solutions</ulink>
- for more details.</para>
-
- <note>
- <para>Sometimes developers mistakenly assume one other guarantee that
- ZooKeeper does <emphasis>not</emphasis> in fact make. This is:</para>
-
- <variablelist>
- <varlistentry>
- <term>Simultaneously Consistent Cross-Client Views</term>
-
- <listitem>
- <para>ZooKeeper does not guarantee that at every instant in
- time, two different clients will have identical views of
- ZooKeeper data. Due to factors like network delays, one client
- may perform an update before another client gets notified of the
- change. Consider the scenario of two clients, A and B. If client
- A sets the value of a znode /a from 0 to 1, then tells client B
- to read /a, client B may read the old value of 0, depending on
- which server it is connected to. If it
- is important that Client A and Client B read the same value,
- Client B should call the <emphasis
- role="bold">sync()</emphasis> method from the ZooKeeper API
- before it performs its read.</para>
-
- <para>So, ZooKeeper by itself doesn't guarantee that changes occur
- synchronously across all servers, but ZooKeeper
- primitives can be used to construct higher level functions that
- provide useful client synchronization. (For more information,
- see the <ulink
- url="recipes.html">ZooKeeper Recipes</ulink>.)</para>
- </listitem>
- </varlistentry>
- </variablelist>
- </note>
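-
- <para>The sync-before-read advice in the note can be sketched in Java as
- follows. The class and method names are hypothetical, and the sketch assumes
- an already connected <emphasis>ZooKeeper</emphasis> handle. It ignores the
- result code of the sync for brevity, which real code should check.</para>
-
- <programlisting>
-import java.util.concurrent.CountDownLatch;
-
-import org.apache.zookeeper.AsyncCallback;
-import org.apache.zookeeper.KeeperException;
-import org.apache.zookeeper.ZooKeeper;
-
-/** Hypothetical helper, for illustration only. */
-public class SyncThenRead {
-
-    /** Asks the connected server to catch up, then reads the znode. */
-    static byte[] syncedRead(ZooKeeper zk, String path)
-            throws KeeperException, InterruptedException {
-        final CountDownLatch synced = new CountDownLatch(1);
-        zk.sync(path, new AsyncCallback.VoidCallback() {
-            public void processResult(int rc, String p, Object ctx) {
-                // rc is ignored in this sketch; check it in real code.
-                synced.countDown();
-            }
-        }, null);
-        synced.await();
-        return zk.getData(path, false, null);
-    }
-}
- </programlisting>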
- </section>
-
- <section id="ch_bindings">
- <title>Bindings</title>
-
- <para>The ZooKeeper client libraries come in two languages: Java and C.
- The following sections describe these.</para>
-
- <section>
- <title>Java Binding</title>
-
- <para>There are two packages that make up the ZooKeeper Java binding:
- <emphasis role="bold">org.apache.zookeeper</emphasis> and <emphasis
- role="bold">org.apache.zookeeper.data</emphasis>. The rest of the
- packages that make up ZooKeeper are used internally or are part of the
- server implementation. The <emphasis
- role="bold">org.apache.zookeeper.data</emphasis> package is made up of
- generated classes that are used simply as containers.</para>
-
- <para>The main class used by a ZooKeeper Java client is the <emphasis
- role="bold">ZooKeeper</emphasis> class. Its two constructors differ only
- by an optional session id and password. ZooKeeper supports session
- recovery across instances of a process. A Java program may save its
- session id and password to stable storage, restart, and recover the
- session that was used by the earlier instance of the program.</para>
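-
- <para>A minimal sketch of the second constructor follows. It assumes the
- earlier instance saved the values returned by
- <emphasis>getSessionId()</emphasis> and
- <emphasis>getSessionPasswd()</emphasis> after it had connected. The class
- name, method name and the 10 second timeout are hypothetical choices.</para>
-
- <programlisting>
-import java.io.IOException;
-
-import org.apache.zookeeper.Watcher;
-import org.apache.zookeeper.ZooKeeper;
-
-/** Hypothetical helper, for illustration only. */
-public class SessionResume {
-
-    /** Reattaches to a session saved by an earlier instance of the program. */
-    static ZooKeeper resume(String connectString, Watcher watcher,
-                            long savedSessionId, byte[] savedPasswd)
-            throws IOException {
-        // Same arguments as the first constructor, plus the saved id and password.
-        return new ZooKeeper(connectString, 10000, watcher,
-                             savedSessionId, savedPasswd);
-    }
-}
- </programlisting>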
-
- <para>When a ZooKeeper object is created, two threads are created as
- well: an IO thread and an event thread. All IO happens on the IO thread
- (using Java NIO). All event callbacks happen on the event thread.
- Session maintenance such as reconnecting to ZooKeeper servers and
- maintaining heartbeat is done on the IO thread. Responses for
- synchronous methods are also processed in the IO thread. All responses
- to asynchronous methods and watch events are processed on the event
- thread. There are a few things to notice that result from this
- design:</para>
-
- <itemizedlist>
- <listitem>
- <para>All completions for asynchronous calls and watcher callbacks
- will be made in order, one at a time. The caller can do any
- processing they wish, but no other callbacks will be processed
- during that time.</para>
- </listitem>
-
- <listitem>
- <para>Callbacks do not block the processing of the IO thread or the
- processing of the synchronous calls.</para>
- </listitem>
-
- <listitem>
- <para>Synchronous calls may not return in the correct order. For
- example, assume a client does the following processing: issues an
- asynchronous read of node <emphasis role="bold">/a</emphasis> with
- <emphasis>watch</emphasis> set to true, and then in the completion
- callback of the read it does a synchronous read of <emphasis
- role="bold">/a</emphasis>. (Maybe not good practice, but not illegal
- either, and it makes for a simple example.)</para>
-
- <para>Note that if there is a change to <emphasis
- role="bold">/a</emphasis> between the asynchronous read and the
- synchronous read, the client library will receive the watch event
- saying <emphasis role="bold">/a</emphasis> changed before the
- response for the synchronous read, but because the completion
- callback is blocking the event queue, the synchronous read will
- return with the new value of <emphasis role="bold">/a</emphasis>
- before the watch event is processed.</para>
- </listitem>
- </itemizedlist>
-
- <para>Finally, the rules associated with shutdown are straightforward:
- once a ZooKeeper object is closed or receives a fatal event
- (SESSION_EXPIRED or AUTH_FAILED), the ZooKeeper object becomes invalid.
- On a close, the two threads shut down and any further access to the ZooKeeper
- handle is undefined behavior and should be avoided.</para>
- </section>
-
- <section>
- <title>C Binding</title>
-
- <para>The C binding has a single-threaded and a multi-threaded library.
- The multi-threaded library is easiest to use and is most similar to the
- Java API. This library will create an IO thread and an event dispatch
- thread for handling connection maintenance and callbacks. The
- single-threaded library allows ZooKeeper to be used in event driven
- applications by exposing the event loop used in the multi-threaded
- library.</para>
-
- <para>The package includes two shared libraries: zookeeper_st and
- zookeeper_mt. The former only provides the asynchronous APIs and
- callbacks for integrating into the application's event loop. The only
- reason this library exists is to support the platforms where a
- <emphasis>pthread</emphasis> library is not available or is unstable
- (e.g. FreeBSD 4.x). In all other cases, application developers should
- link with zookeeper_mt, as it includes support for both Sync and Async
- API.</para>
-
- <section>
- <title>Installation</title>
-
- <para>If you're building the client from a check-out from the Apache
- repository, follow the steps outlined below. If you're building from a
- project source package downloaded from apache, skip to step <emphasis
- role="bold">3</emphasis>.</para>
-
- <orderedlist>
- <listitem>
- <para>Run <command>ant compile_jute</command> from the ZooKeeper
- top level directory (<filename>.../trunk</filename>).
- This will create a directory named "generated" under
- <filename>.../trunk/src/c</filename>.</para>
- </listitem>
-
- <listitem>
- <para>Change directory to <filename>.../trunk/src/c</filename>
- and run <command>autoreconf -if</command> to bootstrap <emphasis
- role="bold">autoconf</emphasis>, <emphasis
- role="bold">automake</emphasis> and <emphasis
- role="bold">libtool</emphasis>. Make sure you have <emphasis
- role="bold">autoconf version 2.59</emphasis> or greater installed.
- Skip to step <emphasis role="bold">4</emphasis>.</para>
- </listitem>
-
- <listitem>
- <para>If you are building from a project source package,
- unzip/untar the source tarball and cd to the
- <filename>zookeeper-x.x.x/src/c</filename> directory.</para>
- </listitem>
-
- <listitem>
- <para>Run <command>./configure <your-options></command> to
- generate the makefile. Here are some of the options the <emphasis
- role="bold">configure</emphasis> utility supports that can be
- useful in this step:</para>
-
- <itemizedlist>
- <listitem>
- <para><command>--enable-debug</command></para>
-
- <para>Enables optimization and enables debug info compiler
- options. (Disabled by default.)</para>
- </listitem>
-
- <listitem>
- <para><command>--without-syncapi </command></para>
-
- <para>Disables Sync API support; zookeeper_mt library won't be
- built. (Enabled by default.)</para>
- </listitem>
-
- <listitem>
- <para><command>--disable-static </command></para>
-
- <para>Do not build static libraries. (Enabled by
- default.)</para>
- </listitem>
-
- <listitem>
- <para><command>--disable-shared</command></para>
-
- <para>Do not build shared libraries. (Enabled by
- default.)</para>
- </listitem>
- </itemizedlist>
-
- <note>
- <para>See INSTALL for general information about running
- <emphasis role="bold">configure</emphasis>.</para>
- </note>
- </listitem>
-
- <listitem>
- <para>Run <command>make</command> or <command>make
- install</command> to build the libraries and install them.</para>
- </listitem>
-
- <listitem>
- <para>To generate doxygen documentation for the ZooKeeper API, run
- <command>make doxygen-doc</command>. All documentation will be
- placed in a new subfolder named docs. By default, this command
- only generates HTML. For information on other document formats,
- run <command>./configure --help</command>.</para>
- </listitem>
- </orderedlist>
- </section>
-
- <section>
- <title>Building Your Own C Client</title>
-
- <para>In order to be able to use the ZooKeeper API in your application
- you have to remember to</para>
-
- <orderedlist>
- <listitem>
- <para>Include the ZooKeeper header: #include
- <zookeeper/zookeeper.h></para>
- </listitem>
-
- <listitem>
- <para>If you are building a multithreaded client, compile with the
- -DTHREADED compiler flag to enable the multi-threaded version of
- the library, and then link against the
- <emphasis>zookeeper_mt</emphasis> library. If you are building a
- single-threaded client, do not compile with -DTHREADED, and be
- sure to link against the <emphasis>zookeeper_st</emphasis>
- library.</para>
- </listitem>
- </orderedlist>
-
- <note><para>
- See <filename>.../trunk/src/c/src/cli.c</filename>
- for an example of a C client implementation.</para>
- </note>
- </section>
- </section>
- </section>
-
- <section id="ch_guideToZkOperations">
- <title>Building Blocks: A Guide to ZooKeeper Operations</title>
-
- <para>This section surveys all the operations a developer can perform
- against a ZooKeeper server. It is lower level information than the earlier
- concepts chapters in this manual, but higher level than the ZooKeeper API
- Reference. It covers these topics:</para>
-
- <itemizedlist>
- <listitem>
- <para><xref linkend="sc_connectingToZk" /></para>
- </listitem>
- </itemizedlist>
-
- <section id="sc_errorsZk">
- <title>Handling Errors</title>
-
- <para>Both the Java and C client bindings may report errors. The Java client binding does so by throwing KeeperException; calling code() on the exception returns the specific error code. The C client binding returns an error code as defined in the enum ZOO_ERRORS. API callbacks indicate the result code for both language bindings. See the API documentation (javadoc for Java, doxygen for C) for full details on the possible errors and their meaning.</para>
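-
- <para>As a small sketch of the Java side, the following method reads a znode
- and treats a missing node as a normal case by inspecting the code carried by
- the exception. The class name, method name and fallback parameter are
- hypothetical; only one error code is handled here and all others are
- rethrown.</para>
-
- <programlisting>
-import org.apache.zookeeper.KeeperException;
-import org.apache.zookeeper.ZooKeeper;
-
-/** Hypothetical helper, for illustration only. */
-public class ReadOrDefault {
-
-    /** Returns the znode data, or fallback if the znode does not exist. */
-    static byte[] readOrDefault(ZooKeeper zk, String path, byte[] fallback)
-            throws KeeperException, InterruptedException {
-        try {
-            return zk.getData(path, false, null);
-        } catch (KeeperException e) {
-            if (e.code() == KeeperException.Code.NONODE) {
-                return fallback;
-            }
-            throw e; // any other error is left to the caller
-        }
-    }
-}
- </programlisting>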
- </section>
-
- <section id="sc_connectingToZk">
- <title>Connecting to ZooKeeper</title>
-
- <para></para>
- </section>
-
- <section id="sc_readOps">
- <title>Read Operations</title>
-
- <para></para>
- </section>
-
- <section id="sc_writeOps">
- <title>Write Operations</title>
-
- <para></para>
- </section>
-
- <section id="sc_handlingWatches">
- <title>Handling Watches</title>
-
- <para></para>
- </section>
-
- <section id="sc_miscOps">
- <title>Miscellaneous ZooKeeper Operations</title>
- <para></para>
- </section>
-
-
- </section>
-
- <section id="ch_programStructureWithExample">
- <title>Program Structure, with Simple Example</title>
-
- <para><emphasis>[tbd]</emphasis></para>
- </section>
-
- <section id="ch_gotchas">
- <title>Gotchas: Common Problems and Troubleshooting</title>
-
- <para>So now you know ZooKeeper. It's fast, simple, your application
- works, but wait ... something's wrong. Here are some pitfalls that
- ZooKeeper users fall into:</para>
-
- <orderedlist>
- <listitem>
- <para>If you are using watches, you must look for the connected watch
- event. When a ZooKeeper client disconnects from a server, you will
- not receive notification of changes until reconnected. If you are
- watching for a znode to come into existence, you will miss the event
- if the znode is created and deleted while you are disconnected.</para>
- </listitem>
-
- <listitem>
- <para>You must test ZooKeeper server failures. The ZooKeeper service
- can survive failures as long as a majority of servers are active. The
- question to ask is: can your application handle it? In the real world
- a client's connection to ZooKeeper can break. (ZooKeeper server
- failures and network partitions are common reasons for connection
- loss.) The ZooKeeper client library takes care of recovering your
- connection and letting you know what happened, but you must make sure
- that you recover your state and any outstanding requests that failed.
- Find out if you got it right in the test lab, not in production - test
- with a ZooKeeper service made up of several servers and subject
- them to reboots.</para>
- </listitem>
-
- <listitem>
- <para>The list of ZooKeeper servers used by the client must match the
- list of ZooKeeper servers that each ZooKeeper server has. Things can
- work, although not optimally, if the client list is a subset of the
- real list of ZooKeeper servers, but not if the client lists ZooKeeper
- servers not in the ZooKeeper cluster.</para>
- </listitem>
-
- <listitem>
- <para>Be careful where you put that transaction log. The most
- performance-critical part of ZooKeeper is the transaction log.
- ZooKeeper must sync transactions to media before it returns a
- response. A dedicated transaction log device is key to consistent good
- performance. Putting the log on a busy device will adversely affect
- performance. If you only have one storage device, put trace files on
- NFS and increase the snapshotCount; it doesn't eliminate the problem,
- but it can mitigate it.</para>
- </listitem>
-
- <listitem>
- <para>Set your Java max heap size correctly. It is very important to
- <emphasis>avoid swapping.</emphasis> Going to disk unnecessarily will
- almost certainly degrade your performance unacceptably. Remember, in
- ZooKeeper, everything is ordered, so if one request hits the disk, all
- other queued requests hit the disk.</para>
-
- <para>To avoid swapping, try to set the heapsize to the amount of
- physical memory you have, minus the amount needed by the OS and cache.
- The best way to determine an optimal heap size for your configurations
- is to <emphasis>run load tests</emphasis>. If for some reason you
- can't, be conservative in your estimates and choose a number well
- below the limit that would cause your machine to swap. For example, on
- a 4G machine, a 3G heap is a conservative estimate to start
- with.</para>
- </listitem>
- </orderedlist>
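-
- <para>For the first pitfall, a watcher can track connection state changes
- explicitly. The sketch below is one possible shape, with a hypothetical
- class name; what the application re-reads or re-registers after
- reconnecting is left as comments because it depends on the
- application.</para>
-
- <programlisting>
-import org.apache.zookeeper.WatchedEvent;
-import org.apache.zookeeper.Watcher;
-
-/** Hypothetical watcher, for illustration only. */
-public class ConnectionAwareWatcher implements Watcher {
-
-    private volatile boolean connected = false;
-
-    public boolean isConnected() {
-        return connected;
-    }
-
-    public void process(WatchedEvent event) {
-        // Connection state changes arrive with event type None.
-        if (event.getType() != Event.EventType.None) {
-            return; // a znode event; handle it elsewhere
-        }
-        switch (event.getState()) {
-        case SyncConnected:
-            connected = true;
-            // Re-read watched znodes here: changes made while
-            // disconnected may have been missed.
-            break;
-        case Disconnected:
-            connected = false;
-            break;
-        case Expired:
-            connected = false;
-            // The session is gone; a new ZooKeeper object is needed.
-            break;
-        default:
-            break;
-        }
-    }
-}
- </programlisting>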
- </section>
-
- <appendix id="apx_linksToOtherInfo">
- <title>Links to Other Information</title>
-
- <para>Outside the formal documentation, there are several other sources of
- information for ZooKeeper developers.</para>
-
- <variablelist>
- <varlistentry>
- <term>ZooKeeper Whitepaper <emphasis>[tbd: find url]</emphasis></term>
-
- <listitem>
- <para>The definitive discussion of ZooKeeper design and performance,
- by Yahoo! Research</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>API Reference <emphasis>[tbd: find url]</emphasis></term>
-
- <listitem>
- <para>The complete reference to the ZooKeeper API</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><ulink
- url="http://us.dl1.yimg.com/download.yahoo.com/dl/ydn/zookeeper.m4v">ZooKeeper
- Talk at the Hadoop Summit 2008</ulink></term>
-
- <listitem>
- <para>A video introduction to ZooKeeper, by Benjamin Reed of Yahoo!
- Research</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><ulink
- url="https://cwiki.apache.org/confluence/display/ZOOKEEPER/Tutorial">Barrier and
- Queue Tutorial</ulink></term>
-
- <listitem>
- <para>The excellent Java tutorial by Flavio Junqueira, implementing
- simple barriers and producer-consumer queues using ZooKeeper.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><ulink
- url="https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeperArticles">ZooKeeper
- - A Reliable, Scalable Distributed Coordination System</ulink></term>
-
- <listitem>
- <para>An article by Todd Hoff (07/15/2008)</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><ulink url="recipes.html">ZooKeeper Recipes</ulink></term>
-
- <listitem>
- <para>Pseudo-level discussion of the implementation of various
- synchronization solutions with ZooKeeper: Event Handles, Queues,
- Locks, and Two-phase Commits.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><emphasis>[tbd]</emphasis></term>
-
- <listitem>
- <para>Any other good sources anyone can think of...</para>
- </listitem>
- </varlistentry>
- </variablelist>
- </appendix>
-</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperQuotas.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperQuotas.xml b/src/docs/src/documentation/content/xdocs/zookeeperQuotas.xml
deleted file mode 100644
index 7668e6a..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperQuotas.xml
+++ /dev/null
@@ -1,71 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
- <!--
- Copyright 2002-2004 The Apache Software Foundation Licensed under the
- Apache License, Version 2.0 (the "License"); you may not use this file
- except in compliance with the License. You may obtain a copy of the
- License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
- by applicable law or agreed to in writing, software distributed under
- the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
- CONDITIONS OF ANY KIND, either express or implied. See the License for
- the specific language governing permissions and limitations under the
- License.
- -->
- <!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
- "http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_Quota">
- <title>ZooKeeper Quotas Guide</title>
- <subtitle>A Guide to Deployment and Administration</subtitle>
- <articleinfo>
- <legalnotice>
- <para>
- Licensed under the Apache License, Version 2.0 (the "License"); you
- may not use this file except in compliance with the License. You may
- obtain a copy of the License at
- <ulink url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0
- </ulink>
- .
- </para>
- <para>Unless required by applicable law or agreed to in
- writing, software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
- express or implied. See the License for the specific language
- governing permissions and limitations under the License.</para>
- </legalnotice>
- <abstract>
- <para>This document contains information about deploying,
- administering and maintaining ZooKeeper. It also discusses best
- practices and common problems.</para>
- </abstract>
- </articleinfo>
- <section id="zookeeper_quotas">
- <title>Quotas</title>
- <para> ZooKeeper has both namespace and bytes quotas. You can use the ZooKeeperMain class to set up quotas.
- ZooKeeper prints <emphasis>WARN</emphasis> messages if users exceed the quota assigned to them. The messages
- are printed in the ZooKeeper log.
- </para>
- <para><computeroutput>$ bin/zkCli.sh -server host:port</computeroutput></para>
- <para> The above command opens a command-line shell from which you can work with quotas.</para>
- <section>
- <title>Setting Quotas</title>
- <para>You can use
- <emphasis>setquota</emphasis> to set a quota on a ZooKeeper node. The quota type is selected with
- -n (for namespace)
- or -b (for bytes). </para>
- <para> ZooKeeper quotas are stored in ZooKeeper itself in /zookeeper/quota. To prevent other people from
- changing the quotas, set the ACL for /zookeeper/quota such that only admins are able to read and write to it.
- </para>
- </section>
- <section>
- <title>Listing Quotas</title>
- <para> You can use
- <emphasis>listquota</emphasis> to list a quota on a ZooKeeper node.
- </para>
- </section>
- <section>
- <title> Deleting Quotas</title>
- <para> You can use
- <emphasis>delquota</emphasis> to delete a quota on a ZooKeeper node.
- </para>
- </section>
- </section>
- </article>
[07/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperStarted.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperStarted.xml b/src/docs/src/documentation/content/xdocs/zookeeperStarted.xml
deleted file mode 100644
index 70c227f..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperStarted.xml
+++ /dev/null
@@ -1,418 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_GettStartedGuide">
- <title>ZooKeeper Getting Started Guide</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This guide contains detailed information about creating
- distributed applications that use ZooKeeper. It discusses the basic
- operations ZooKeeper supports, and how these can be used to build
- higher-level abstractions. It contains solutions to common tasks, a
- troubleshooting guide, and links to other information.</para>
- </abstract>
- </articleinfo>
-
- <section id="ch_GettingStarted">
- <title>Getting Started: Coordinating Distributed Applications with
- ZooKeeper</title>
-
- <para>This document contains information to get you started quickly with
- ZooKeeper. It is aimed primarily at developers hoping to try it out, and
- contains simple installation instructions for a single ZooKeeper server, a
- few commands to verify that it is running, and a simple programming
- example. Finally, as a convenience, there are a few sections regarding
- more complicated installations, for example running replicated
- deployments, and optimizing the transaction log. However, for complete
- instructions for commercial deployments, please refer to the <ulink
- url="zookeeperAdmin.html">ZooKeeper
- Administrator's Guide</ulink>.</para>
-
- <section id="sc_Prerequisites">
- <title>Pre-requisites</title>
-
- <para>See <ulink url="zookeeperAdmin.html#sc_systemReq">
- System Requirements</ulink> in the Admin guide.</para>
- </section>
-
- <section id="sc_Download">
- <title>Download</title>
-
- <para>To get a ZooKeeper distribution, download a recent
- <ulink url="http://zookeeper.apache.org/releases.html">
- stable</ulink> release from one of the Apache Download
- Mirrors.</para>
- </section>
-
- <section id="sc_InstallingSingleMode">
- <title>Standalone Operation</title>
-
- <para>Setting up a ZooKeeper server in standalone mode is
- straightforward. The server is contained in a single JAR file,
- so installation consists of creating a configuration.</para>
-
- <para>Once you've downloaded a stable ZooKeeper release, unpack
- it and cd to the root.</para>
-
- <para>To start ZooKeeper you need a configuration file. Here is a sample;
- create it in <emphasis role="bold">conf/zoo.cfg</emphasis>:</para>
-
-<programlisting>
-tickTime=2000
-dataDir=/var/lib/zookeeper
-clientPort=2181
-</programlisting>
-
- <para>This file can be called anything, but for the sake of this
- discussion call
- it <emphasis role="bold">conf/zoo.cfg</emphasis>. Change the
- value of <emphasis role="bold">dataDir</emphasis> to specify an
- existing (empty to start with) directory. Here are the meanings
- for each of the fields:</para>
-
- <variablelist>
- <varlistentry>
- <term><emphasis role="bold">tickTime</emphasis></term>
-
- <listitem>
- <para>the basic time unit in milliseconds used by ZooKeeper. It is
- used to do heartbeats and the minimum session timeout will be
- twice the tickTime.</para>
- </listitem>
- </varlistentry>
- </variablelist>
-
- <variablelist>
- <varlistentry>
- <term><emphasis role="bold">dataDir</emphasis></term>
-
- <listitem>
- <para>the location to store the in-memory database snapshots and,
- unless specified otherwise, the transaction log of updates to the
- database.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><emphasis role="bold">clientPort</emphasis></term>
-
- <listitem>
- <para>the port to listen for client connections</para>
- </listitem>
- </varlistentry>
- </variablelist>
-
- <para>Now that you have created the configuration file, you can start
- ZooKeeper:</para>
-
- <programlisting>bin/zkServer.sh start</programlisting>
-
- <para>ZooKeeper logs messages using log4j -- more detail
- available in the
- <ulink url="zookeeperProgrammers.html#Logging">Logging</ulink>
- section of the Programmer's Guide. You will see log messages
- coming to the console (default) and/or a log file depending on
- the log4j configuration.</para>
-
- <para>The steps outlined here run ZooKeeper in standalone mode. There is
- no replication, so if the ZooKeeper process fails, the service will go down.
- This is fine for most development situations, but to run ZooKeeper in
- replicated mode, please see <ulink
- url="#sc_RunningReplicatedZooKeeper">Running Replicated
- ZooKeeper</ulink>.</para>
- </section>
-
- <section id="sc_FileManagement">
- <title>Managing ZooKeeper Storage</title>
- <para>For long running production systems ZooKeeper storage must
- be managed externally (dataDir and logs). See the section on
- <ulink
- url="zookeeperAdmin.html#sc_maintenance">maintenance</ulink> for
- more details.</para>
- </section>
-
- <section id="sc_ConnectingToZooKeeper">
- <title>Connecting to ZooKeeper</title>
-
- <programlisting>$ bin/zkCli.sh -server 127.0.0.1:2181</programlisting>
-
- <para>This lets you perform simple, file-like operations.</para>
-
- <para>Once you have connected, you should see something like:
- </para>
- <programlisting>
-<![CDATA[
-Connecting to localhost:2181
-log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
-log4j:WARN Please initialize the log4j system properly.
-Welcome to ZooKeeper!
-JLine support is enabled
-[zkshell: 0]
-]]> </programlisting>
- <para>
- From the shell, type <command>help</command> to get a listing of commands that can be executed from the client, as in:
- </para>
- <programlisting>
-<![CDATA[
-[zkshell: 0] help
-ZooKeeper host:port cmd args
- get path [watch]
- ls path [watch]
- set path data [version]
- delquota [-n|-b] path
- quit
- printwatches on|off
- create path data acl
- stat path [watch]
- listquota path
- history
- setAcl path acl
- getAcl path
- sync path
- redo cmdno
- addauth scheme auth
- delete path [version]
- setquota -n|-b val path
-
-]]> </programlisting>
- <para>From here, you can try a few simple commands to get a feel for this simple command line interface. First, start by issuing the list command, as
- in <command>ls</command>, yielding:
- </para>
- <programlisting>
-<![CDATA[
-[zkshell: 8] ls /
-[zookeeper]
-]]> </programlisting>
- <para>Next, create a new znode by running <command>create /zk_test my_data</command>. This creates a new znode and associates the string "my_data" with the node.
- You should see:</para>
- <programlisting>
-<![CDATA[
-[zkshell: 9] create /zk_test my_data
-Created /zk_test
-]]> </programlisting>
- <para> Issue another <command>ls /</command> command to see what the directory looks like:
- </para>
- <programlisting>
-<![CDATA[
-[zkshell: 11] ls /
-[zookeeper, zk_test]
-
-]]> </programlisting><para>
- Notice that the zk_test znode has now been created.
- </para>
- <para>Next, verify that the data was associated with the znode by running the <command>get</command> command, as in:
- </para>
- <programlisting>
-<![CDATA[
-[zkshell: 12] get /zk_test
-my_data
-cZxid = 5
-ctime = Fri Jun 05 13:57:06 PDT 2009
-mZxid = 5
-mtime = Fri Jun 05 13:57:06 PDT 2009
-pZxid = 5
-cversion = 0
-dataVersion = 0
-aclVersion = 0
-ephemeralOwner = 0
-dataLength = 7
-numChildren = 0
-]]> </programlisting>
- <para>We can change the data associated with zk_test by issuing the <command>set</command> command, as in:
- </para>
- <programlisting>
-<![CDATA[
-[zkshell: 14] set /zk_test junk
-cZxid = 5
-ctime = Fri Jun 05 13:57:06 PDT 2009
-mZxid = 6
-mtime = Fri Jun 05 14:01:52 PDT 2009
-pZxid = 5
-cversion = 0
-dataVersion = 1
-aclVersion = 0
-ephemeralOwner = 0
-dataLength = 4
-numChildren = 0
-[zkshell: 15] get /zk_test
-junk
-cZxid = 5
-ctime = Fri Jun 05 13:57:06 PDT 2009
-mZxid = 6
-mtime = Fri Jun 05 14:01:52 PDT 2009
-pZxid = 5
-cversion = 0
-dataVersion = 1
-aclVersion = 0
-ephemeralOwner = 0
-dataLength = 4
-numChildren = 0
-]]> </programlisting>
- <para>
- Notice that we ran <command>get</command> after setting the data, and it did, indeed, change.</para>
- <para>Finally, let's <command>delete</command> the node by issuing:
- </para>
- <programlisting>
-<![CDATA[
-[zkshell: 16] delete /zk_test
-[zkshell: 17] ls /
-[zookeeper]
-[zkshell: 18]
-]]></programlisting>
- <para>That's it for now. To explore more, continue with the rest of this document and see the <ulink url="zookeeperProgrammers.html">Programmer's Guide</ulink>. </para>
- </section>
-
- <section id="sc_ProgrammingToZooKeeper">
- <title>Programming to ZooKeeper</title>
-
- <para>ZooKeeper has Java bindings and C bindings. They are
- functionally equivalent. The C bindings exist in two variants: single
- threaded and multi-threaded. These differ only in how the messaging loop
- is done. For more information, see the <ulink
- url="zookeeperProgrammers.html#ch_programStructureWithExample">Programming
- Examples in the ZooKeeper Programmer's Guide</ulink> for
- sample code using the different APIs.</para>
- </section>
-
- <section id="sc_RunningReplicatedZooKeeper">
- <title>Running Replicated ZooKeeper</title>
-
- <para>Running ZooKeeper in standalone mode is convenient for evaluation,
- some development, and testing. But in production, you should run
- ZooKeeper in replicated mode. A replicated group of servers in the same
- application is called a <emphasis>quorum</emphasis>, and in replicated
- mode, all servers in the quorum have copies of the same configuration
- file.</para>
- <note>
- <para>
- For replicated mode, a minimum of three servers is required,
- and it is strongly recommended that you have an odd number of
- servers. If you only have two servers, then you are in a
- situation where if one of them fails, there are not enough
- machines to form a majority quorum. Two servers is inherently
- <emphasis role="bold">less</emphasis>
- stable than a single server, because there are two single
- points of failure.
- </para>
- </note>
- <para>
- The required
- <emphasis role="bold">conf/zoo.cfg</emphasis>
- file for replicated mode is similar to the one used in standalone
- mode, but with a few differences. Here is an example:
- </para>
-
-<programlisting>
-tickTime=2000
-dataDir=/var/lib/zookeeper
-clientPort=2181
-initLimit=5
-syncLimit=2
-server.1=zoo1:2888:3888
-server.2=zoo2:2888:3888
-server.3=zoo3:2888:3888
-</programlisting>
-
- <para>The new entry, <emphasis role="bold">initLimit</emphasis>, is the
- timeout ZooKeeper uses to limit the length of time the ZooKeeper
- servers in the quorum have to connect to a leader. The entry <emphasis
- role="bold">syncLimit</emphasis> limits how far out of date a server can
- be from a leader.</para>
-
- <para>With both of these timeouts, you specify the unit of time using
- <emphasis role="bold">tickTime</emphasis>. In this example, the timeout
- for initLimit is 5 ticks at 2000 milliseconds a tick, or 10
- seconds.</para>
-
- <para>The entries of the form <emphasis>server.X</emphasis> list the
- servers that make up the ZooKeeper service. When the server starts up,
- it knows which server it is by looking for the file
- <emphasis>myid</emphasis> in the data directory. That file
- contains the server number, in ASCII.</para>
-
- <para>Finally, note the two port numbers after each server
- name: "2888" and "3888". Peers use the former port to connect
- to other peers. Such a connection is necessary so that peers
- can communicate, for example, to agree upon the order of
- updates. More specifically, a ZooKeeper server uses this port
- to connect followers to the leader. When a new leader arises, a
- follower opens a TCP connection to the leader using this
- port. Because the default leader election also uses TCP, we
- currently require another port for leader election. This is the
- second port in the server entry.
- </para>
-
- <note>
- <para>If you want to test multiple servers on a single
- machine, specify the servername
- as <emphasis>localhost</emphasis> with unique quorum &amp;
- leader election ports (i.e. 2888:3888, 2889:3889, 2890:3890 in
- the example above) for each server.X in that server's config
- file. Of course separate <emphasis>dataDir</emphasis>s and
- distinct <emphasis>clientPort</emphasis>s are also necessary
- (in the above replicated example, running on a
- single <emphasis>localhost</emphasis>, you would still have
- three config files).</para>
- <para>Please be aware that setting up multiple servers on a single
- machine will not create any redundancy. If something were to
- happen which caused the machine to die, all of the ZooKeeper
- servers would be offline. Full redundancy requires that each
- server have its own machine. It must be a completely separate
- physical server. Multiple virtual machines on the same physical
- host are still vulnerable to the complete failure of that host.</para>
- </note>
- </section>
-
- <section>
- <title>Other Optimizations</title>
-
- <para>There are a couple of other configuration parameters that can
- greatly increase performance:</para>
-
- <itemizedlist>
- <listitem>
- <para>To get low latencies on updates it is important to
- have a dedicated transaction log directory. By default
- transaction logs are put in the same directory as the data
- snapshots and <emphasis>myid</emphasis> file. The dataLogDir
- parameter indicates a different directory to use for the
- transaction logs.</para>
- </listitem>
-
- <listitem>
- <para><emphasis>[tbd: what is the other config param?]</emphasis></para>
- </listitem>
- </itemizedlist>
- </section>
- </section>
-</article>
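Editorial sketch (not part of the commit): the replicated conf/zoo.cfg sample in the zookeeperStarted.xml diff above expresses initLimit and syncLimit in ticks, and lists quorum members as server.X entries. The Java below is a minimal illustration of how those values combine, assuming the file can be read as an ordinary java.util.Properties document. The class ZooCfgSketch and its methods are hypothetical names introduced here; they are not ZooKeeper APIs.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper (not a ZooKeeper class) for reading a zoo.cfg-style file. */
final class ZooCfgSketch {

    /** The replicated sample configuration quoted in the guide. */
    static final String SAMPLE =
        "tickTime=2000\n"
        + "dataDir=/var/lib/zookeeper\n"
        + "clientPort=2181\n"
        + "initLimit=5\n"
        + "syncLimit=2\n"
        + "server.1=zoo1:2888:3888\n"
        + "server.2=zoo2:2888:3888\n"
        + "server.3=zoo3:2888:3888\n";

    /** Parses key=value lines; assumes the Properties format is sufficient. */
    static Properties parse(String text) {
        Properties cfg = new Properties();
        try {
            cfg.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cfg;
    }

    /** Converts a limit given in ticks (initLimit, syncLimit) to milliseconds. */
    static long limitMillis(Properties cfg, String key) {
        long tickTime = Long.parseLong(cfg.getProperty("tickTime"));
        long ticks = Long.parseLong(cfg.getProperty(key));
        return tickTime * ticks;
    }

    /** Maps each id X of a "server.X" entry to its "host:port:port" value. */
    static Map<Integer, String> servers(Properties cfg) {
        Map<Integer, String> result = new TreeMap<>();
        String prefix = "server.";
        for (String name : cfg.stringPropertyNames()) {
            if (name.startsWith(prefix)) {
                int id = Integer.parseInt(name.substring(prefix.length()));
                result.put(id, cfg.getProperty(name));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties cfg = parse(SAMPLE);
        System.out.println("initLimit = " + limitMillis(cfg, "initLimit") + " ms");
        System.out.println("syncLimit = " + limitMillis(cfg, "syncLimit") + " ms");
        servers(cfg).forEach((id, addr) ->
            System.out.println("server." + id + " -> " + addr));
    }
}
```

With the sample values, initLimit works out to 5 x 2000 = 10000 ms, the 10 seconds stated in the guide, and syncLimit to 2 x 2000 = 4000 ms. The id X in each server.X key is the number a server is expected to find in its myid file.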
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/content/xdocs/zookeeperTutorial.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/content/xdocs/zookeeperTutorial.xml b/src/docs/src/documentation/content/xdocs/zookeeperTutorial.xml
deleted file mode 100644
index 77cca8f..0000000
--- a/src/docs/src/documentation/content/xdocs/zookeeperTutorial.xml
+++ /dev/null
@@ -1,712 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="ar_Tutorial">
- <title>Programming with ZooKeeper - A basic tutorial</title>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This article contains sample Java code for simple implementations of barriers
- and producer-consumer queues.</para>
-
- </abstract>
- </articleinfo>
-
- <section id="ch_Introduction">
- <title>Introduction</title>
-
- <para>In this tutorial, we show simple implementations of barriers and
- producer-consumer queues using ZooKeeper. We call the respective classes Barrier and Queue.
- These examples assume that you have at least one ZooKeeper server running.</para>
-
- <para>Both primitives use the following common excerpt of code:</para>
-
- <programlisting>
- static ZooKeeper zk = null;
- static Integer mutex;
-
- String root;
-
- SyncPrimitive(String address) {
- if(zk == null){
- try {
- System.out.println("Starting ZK:");
- zk = new ZooKeeper(address, 3000, this);
- mutex = new Integer(-1);
- System.out.println("Finished starting ZK: " + zk);
- } catch (IOException e) {
- System.out.println(e.toString());
- zk = null;
- }
- }
- }
-
- synchronized public void process(WatchedEvent event) {
- synchronized (mutex) {
- mutex.notify();
- }
- }
-</programlisting>
-
-<para>Both classes extend SyncPrimitive. In this way, we execute steps that are
-common to all primitives in the constructor of SyncPrimitive. To keep the examples
-simple, we create a ZooKeeper object the first time we instantiate either a barrier
-object or a queue object, and we declare a static variable that is a reference
-to this object. The subsequent instances of Barrier and Queue check whether a
-ZooKeeper object exists. Alternatively, we could have the application create a
-ZooKeeper object and pass it to the constructor of Barrier and Queue.</para>
-<para>
-We use the process() method to process notifications triggered due to watches.
-In the following discussion, we present code that sets watches. A watch is an internal
-structure that enables ZooKeeper to notify a client of a change to a node. For example,
-if a client is waiting for other clients to leave a barrier, then it can set a watch and
-wait for modifications to a particular node, which can indicate that it is the end of the wait.
-This point becomes clear once we go over the examples.
-</para>
-</section>
-
- <section id="sc_barriers"><title>Barriers</title>
-
- <para>
- A barrier is a primitive that enables a group of processes to synchronize the
- beginning and the end of a computation. The general idea of this implementation
- is to have a barrier node that serves the purpose of being a parent for individual
- process nodes. Suppose that we call the barrier node "/b1". Each process "p" then
- creates a node "/b1/p". Once enough processes have created their corresponding
- nodes, joined processes can start the computation.
- </para>
-
- <para>In this example, each process instantiates a Barrier object, and its constructor takes as parameters:</para>
-
- <itemizedlist><listitem><para>the address of a ZooKeeper server (e.g., "zoo1.foo.com:2181")</para></listitem>
-<listitem><para>the path of the barrier node on ZooKeeper (e.g., "/b1")</para></listitem>
-<listitem><para>the size of the group of processes</para></listitem>
-</itemizedlist>
-
-<para>The constructor of Barrier passes the address of the ZooKeeper server to the
-constructor of the parent class. The parent class creates a ZooKeeper instance if
-one does not exist. The constructor of Barrier then creates a
-barrier node on ZooKeeper, which is the parent node of all process nodes and
-which we call root (<emphasis role="bold">Note:</emphasis> This is not the ZooKeeper root "/").</para>
-
-<programlisting>
- /**
- * Barrier constructor
- *
- * @param address
- * @param root
- * @param size
- */
- Barrier(String address, String root, int size) {
- super(address);
- this.root = root;
- this.size = size;
-
- // Create barrier node
- if (zk != null) {
- try {
- Stat s = zk.exists(root, false);
- if (s == null) {
- zk.create(root, new byte[0], Ids.OPEN_ACL_UNSAFE,
- CreateMode.PERSISTENT);
- }
- } catch (KeeperException e) {
- System.out
- .println("Keeper exception when instantiating barrier: "
- + e.toString());
- } catch (InterruptedException e) {
- System.out.println("Interrupted exception");
- }
- }
-
- // My node name
- try {
- name = new String(InetAddress.getLocalHost().getCanonicalHostName().toString());
- } catch (UnknownHostException e) {
- System.out.println(e.toString());
- }
-
- }
-</programlisting>
-<para>
-To enter the barrier, a process calls enter(). The process creates a node under
-the root to represent it, using its host name to form the node name. It then waits
-until enough processes have entered the barrier. A process does this by checking
-the number of children the root node has with "getChildren()", and waiting for
-notifications in the case it does not have enough. To receive a notification when
-there is a change to the root node, a process has to set a watch, and does it
-through the call to "getChildren()". In the code, we have that "getChildren()"
-has two parameters. The first one states the node to read from, and the second is
-a boolean flag that enables the process to set a watch. In the code the flag is true.
-</para>
-
-<programlisting>
- /**
- * Join barrier
- *
- * @return
- * @throws KeeperException
- * @throws InterruptedException
- */
-
- boolean enter() throws KeeperException, InterruptedException{
- zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
- CreateMode.EPHEMERAL);
- while (true) {
- synchronized (mutex) {
- List<String> list = zk.getChildren(root, true);
-
- if (list.size() < size) {
- mutex.wait();
- } else {
- return true;
- }
- }
- }
- }
-</programlisting>
-<para>
-Note that enter() throws both KeeperException and InterruptedException, so it is
-the responsibility of the application to catch and handle such exceptions.</para>
-
-<para>
-Once the computation is finished, a process calls leave() to leave the barrier.
-First it deletes its corresponding node, and then it gets the children of the root
-node. If there is at least one child, then it waits for a notification (note
-that the second parameter of the call to getChildren() is true, meaning that
-ZooKeeper has to set a watch on the root node). Upon reception of a notification,
-it checks once more whether the root node has any child.</para>
-
-<programlisting>
- /**
- * Wait until all reach barrier
- *
- * @return
- * @throws KeeperException
- * @throws InterruptedException
- */
-
- boolean leave() throws KeeperException, InterruptedException{
- zk.delete(root + "/" + name, 0);
- while (true) {
- synchronized (mutex) {
- List<String> list = zk.getChildren(root, true);
- if (list.size() > 0) {
- mutex.wait();
- } else {
- return true;
- }
- }
- }
- }
- }
-</programlisting>
-</section>
-<section id="sc_producerConsumerQueues"><title>Producer-Consumer Queues</title>
-<para>
-A producer-consumer queue is a distributed data structure that a group of processes
-use to generate and consume items. Producer processes create new elements and add
-them to the queue. Consumer processes remove elements from the list, and process them.
-In this implementation, the elements are simple integers. The queue is represented
-by a root node, and to add an element to the queue, a producer process creates a new node,
-a child of the root node.
-</para>
-
-<para>
-The following excerpt of code corresponds to the constructor of the object. As
-with Barrier objects, it first calls the constructor of the parent class, SyncPrimitive,
-which creates a ZooKeeper object if one doesn't exist. It then verifies if the root
-node of the queue exists, and creates it if it doesn't.
-</para>
-<programlisting>
- /**
- * Constructor of producer-consumer queue
- *
- * @param address
- * @param name
- */
- Queue(String address, String name) {
- super(address);
- this.root = name;
- // Create ZK node name
- if (zk != null) {
- try {
- Stat s = zk.exists(root, false);
- if (s == null) {
- zk.create(root, new byte[0], Ids.OPEN_ACL_UNSAFE,
- CreateMode.PERSISTENT);
- }
- } catch (KeeperException e) {
- System.out
- .println("Keeper exception when instantiating queue: "
- + e.toString());
- } catch (InterruptedException e) {
- System.out.println("Interrupted exception");
- }
- }
- }
-</programlisting>
-
-<para>
-A producer process calls "produce()" to add an element to the queue, and passes
-an integer as an argument. To add an element to the queue, the method creates a
-new node using "create()", and uses the PERSISTENT_SEQUENTIAL create mode to instruct ZooKeeper to
-append the value of the sequence counter associated with the root node. In this way,
-we impose a total order on the elements of the queue, thus guaranteeing that the
-oldest element of the queue is the next one consumed.
-</para>
-
-<programlisting>
- /**
- * Add element to the queue.
- *
- * @param i
- * @return
- */
-
- boolean produce(int i) throws KeeperException, InterruptedException{
- ByteBuffer b = ByteBuffer.allocate(4);
- byte[] value;
-
- // Add child with value i
- b.putInt(i);
- value = b.array();
- zk.create(root + "/element", value, Ids.OPEN_ACL_UNSAFE,
- CreateMode.PERSISTENT_SEQUENTIAL);
-
- return true;
- }
-</programlisting>
-<para>
-To consume an element, a consumer process obtains the children of the root node,
-reads the node with smallest counter value, and returns the element. Note that
-if there is a conflict, then one of the two contending processes won't be able to
-delete the node and the delete operation will throw an exception.</para>
-
-<para>
-A call to getChildren() returns the list of children in no guaranteed order, and
-even lexicographic order does not necessarily follow the numerical order of the counter
-values, so we need to decide which element is the smallest. To decide which one has
-the smallest counter value, we traverse the list, and remove the prefix "element"
-from each one.</para>
-
-<programlisting>
- /**
- * Remove first element from the queue.
- *
- * @return
- * @throws KeeperException
- * @throws InterruptedException
- */
- int consume() throws KeeperException, InterruptedException{
- int retvalue = -1;
- Stat stat = null;
-
- // Get the first element available
- while (true) {
- synchronized (mutex) {
- List<String> list = zk.getChildren(root, true);
- if (list.size() == 0) {
- System.out.println("Going to wait");
- mutex.wait();
- } else {
- Integer min = new Integer(list.get(0).substring(7));
- String minNode = list.get(0);
- for(String s : list){
- Integer tempValue = new Integer(s.substring(7));
- // Track the full child name; the zero-padded counter cannot be rebuilt from min
- if(tempValue < min) { min = tempValue; minNode = s; }
- }
- System.out.println("Temporary value: " + root + "/" + minNode);
- byte[] b = zk.getData(root + "/" + minNode, false, stat);
- zk.delete(root + "/" + minNode, 0);
- ByteBuffer buffer = ByteBuffer.wrap(b);
- retvalue = buffer.getInt();
-
- return retvalue;
- }
- }
- }
- }
- }
-</programlisting>
-
-</section>
-
-<section>
-<title>Complete example</title>
-<para>
-In the following section you can find a complete command line application to demonstrate the above mentioned
-recipes. Use the following command to run it.
-</para>
-<programlisting>
-ZOOBINDIR="[path_to_distro]/bin"
-. "$ZOOBINDIR"/zkEnv.sh
-java SyncPrimitive [Test Type] [ZK server] [No of elements] [Client type]
-</programlisting>
-
-<section>
-<title>Queue test</title>
-<para>Start a producer to create 100 elements</para>
-<programlisting>
-java SyncPrimitive qTest localhost 100 p
-</programlisting>
-
-<para>Start a consumer to consume 100 elements</para>
-<programlisting>
-java SyncPrimitive qTest localhost 100 c
-</programlisting>
-</section>
-
-<section>
-<title>Barrier test</title>
-<para>Start a barrier with 2 participants (start it once for each participant you'd like to enter)</para>
-<programlisting>
-java SyncPrimitive bTest localhost 2
-</programlisting>
-</section>
-
-<section id="sc_sourceListing"><title>Source Listing</title>
-<example id="eg_SyncPrimitive_java">
-<title>SyncPrimitive.java</title>
-<programlisting>
-import java.io.IOException;
-import java.net.InetAddress;
-import java.net.UnknownHostException;
-import java.nio.ByteBuffer;
-import java.util.List;
-import java.util.Random;
-
-import org.apache.zookeeper.CreateMode;
-import org.apache.zookeeper.KeeperException;
-import org.apache.zookeeper.WatchedEvent;
-import org.apache.zookeeper.Watcher;
-import org.apache.zookeeper.ZooKeeper;
-import org.apache.zookeeper.ZooDefs.Ids;
-import org.apache.zookeeper.data.Stat;
-
-public class SyncPrimitive implements Watcher {
-
- static ZooKeeper zk = null;
- static Integer mutex;
-
- String root;
-
- SyncPrimitive(String address) {
- if(zk == null){
- try {
- System.out.println("Starting ZK:");
- zk = new ZooKeeper(address, 3000, this);
- mutex = new Integer(-1);
- System.out.println("Finished starting ZK: " + zk);
- } catch (IOException e) {
- System.out.println(e.toString());
- zk = null;
- }
- }
- //else mutex = new Integer(-1);
- }
-
- synchronized public void process(WatchedEvent event) {
- synchronized (mutex) {
- //System.out.println("Process: " + event.getType());
- mutex.notify();
- }
- }
-
- /**
- * Barrier
- */
- static public class Barrier extends SyncPrimitive {
- int size;
- String name;
-
- /**
- * Barrier constructor
- *
- * @param address
- * @param root
- * @param size
- */
- Barrier(String address, String root, int size) {
- super(address);
- this.root = root;
- this.size = size;
-
- // Create barrier node
- if (zk != null) {
- try {
- Stat s = zk.exists(root, false);
- if (s == null) {
- zk.create(root, new byte[0], Ids.OPEN_ACL_UNSAFE,
- CreateMode.PERSISTENT);
- }
- } catch (KeeperException e) {
- System.out
- .println("Keeper exception when instantiating barrier: "
- + e.toString());
- } catch (InterruptedException e) {
- System.out.println("Interrupted exception");
- }
- }
-
- // My node name
- try {
- name = new String(InetAddress.getLocalHost().getCanonicalHostName().toString());
- } catch (UnknownHostException e) {
- System.out.println(e.toString());
- }
-
- }
-
- /**
- * Join barrier
- *
- * @return
- * @throws KeeperException
- * @throws InterruptedException
- */
-
- boolean enter() throws KeeperException, InterruptedException{
- zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
- CreateMode.EPHEMERAL_SEQUENTIAL);
- while (true) {
- synchronized (mutex) {
- List<String> list = zk.getChildren(root, true);
-
- if (list.size() < size) {
- mutex.wait();
- } else {
- return true;
- }
- }
- }
- }
-
- /**
- * Leave barrier and wait until all participants have left
- *
- * @return
- * @throws KeeperException
- * @throws InterruptedException
- */
-
- boolean leave() throws KeeperException, InterruptedException{
- zk.delete(root + "/" + name, 0);
- while (true) {
- synchronized (mutex) {
- List<String> list = zk.getChildren(root, true);
- if (list.size() > 0) {
- mutex.wait();
- } else {
- return true;
- }
- }
- }
- }
- }
-
- /**
- * Producer-Consumer queue
- */
- static public class Queue extends SyncPrimitive {
-
- /**
- * Constructor of producer-consumer queue
- *
- * @param address
- * @param name
- */
- Queue(String address, String name) {
- super(address);
- this.root = name;
- // Create ZK node name
- if (zk != null) {
- try {
- Stat s = zk.exists(root, false);
- if (s == null) {
- zk.create(root, new byte[0], Ids.OPEN_ACL_UNSAFE,
- CreateMode.PERSISTENT);
- }
- } catch (KeeperException e) {
- System.out
- .println("Keeper exception when instantiating queue: "
- + e.toString());
- } catch (InterruptedException e) {
- System.out.println("Interrupted exception");
- }
- }
- }
-
- /**
- * Add element to the queue.
- *
- * @param i
- * @return
- */
-
- boolean produce(int i) throws KeeperException, InterruptedException{
- ByteBuffer b = ByteBuffer.allocate(4);
- byte[] value;
-
- // Add child with value i
- b.putInt(i);
- value = b.array();
- zk.create(root + "/element", value, Ids.OPEN_ACL_UNSAFE,
- CreateMode.PERSISTENT_SEQUENTIAL);
-
- return true;
- }
-
-
- /**
- * Remove first element from the queue.
- *
- * @return
- * @throws KeeperException
- * @throws InterruptedException
- */
- int consume() throws KeeperException, InterruptedException{
- int retvalue = -1;
- Stat stat = null;
-
- // Get the first element available
- while (true) {
- synchronized (mutex) {
- List<String> list = zk.getChildren(root, true);
- if (list.size() == 0) {
- System.out.println("Going to wait");
- mutex.wait();
- } else {
- Integer min = new Integer(list.get(0).substring(7));
- String minNode = list.get(0);
- for(String s : list){
- Integer tempValue = new Integer(s.substring(7));
- //System.out.println("Temporary value: " + tempValue);
- if(tempValue < min) {
- min = tempValue;
- minNode = s;
- }
- }
- System.out.println("Temporary value: " + root + "/" + minNode);
- byte[] b = zk.getData(root + "/" + minNode,
- false, stat);
- zk.delete(root + "/" + minNode, 0);
- ByteBuffer buffer = ByteBuffer.wrap(b);
- retvalue = buffer.getInt();
-
- return retvalue;
- }
- }
- }
- }
- }
-
- public static void main(String args[]) {
- if (args[0].equals("qTest"))
- queueTest(args);
- else
- barrierTest(args);
-
- }
-
- public static void queueTest(String args[]) {
- Queue q = new Queue(args[1], "/app1");
-
- System.out.println("Input: " + args[1]);
- int i;
- Integer max = new Integer(args[2]);
-
- if (args[3].equals("p")) {
- System.out.println("Producer");
- for (i = 0; i < max; i++)
- try{
- q.produce(10 + i);
- } catch (KeeperException e){
-
- } catch (InterruptedException e){
-
- }
- } else {
- System.out.println("Consumer");
-
- for (i = 0; i < max; i++) {
- try{
- int r = q.consume();
- System.out.println("Item: " + r);
- } catch (KeeperException e){
- i--;
- } catch (InterruptedException e){
-
- }
- }
- }
- }
-
- public static void barrierTest(String args[]) {
- Barrier b = new Barrier(args[1], "/b1", new Integer(args[2]));
- try{
- boolean flag = b.enter();
- System.out.println("Entered barrier: " + args[2]);
- if(!flag) System.out.println("Error when entering the barrier");
- } catch (KeeperException e){
-
- } catch (InterruptedException e){
-
- }
-
- // Generate random integer
- Random rand = new Random();
- int r = rand.nextInt(100);
- // Loop for rand iterations
- for (int i = 0; i < r; i++) {
- try {
- Thread.sleep(100);
- } catch (InterruptedException e) {
-
- }
- }
- try{
- b.leave();
- } catch (KeeperException e){
-
- } catch (InterruptedException e){
-
- }
- System.out.println("Left barrier");
- }
-}
-</programlisting></example>
-</section>
-</section>
-
-</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/2pc.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/2pc.jpg b/src/docs/src/documentation/resources/images/2pc.jpg
deleted file mode 100755
index fe4488f..0000000
Binary files a/src/docs/src/documentation/resources/images/2pc.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/bk-overview.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/bk-overview.jpg b/src/docs/src/documentation/resources/images/bk-overview.jpg
deleted file mode 100644
index 6e12fb4..0000000
Binary files a/src/docs/src/documentation/resources/images/bk-overview.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/favicon.ico
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/favicon.ico b/src/docs/src/documentation/resources/images/favicon.ico
deleted file mode 100644
index 161bcf7..0000000
Binary files a/src/docs/src/documentation/resources/images/favicon.ico and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/hadoop-logo.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/hadoop-logo.jpg b/src/docs/src/documentation/resources/images/hadoop-logo.jpg
deleted file mode 100644
index 809525d..0000000
Binary files a/src/docs/src/documentation/resources/images/hadoop-logo.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/state_dia.dia
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/state_dia.dia b/src/docs/src/documentation/resources/images/state_dia.dia
deleted file mode 100755
index 4a58a00..0000000
Binary files a/src/docs/src/documentation/resources/images/state_dia.dia and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/state_dia.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/state_dia.jpg b/src/docs/src/documentation/resources/images/state_dia.jpg
deleted file mode 100755
index b6f4a8b..0000000
Binary files a/src/docs/src/documentation/resources/images/state_dia.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/zkarch.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/zkarch.jpg b/src/docs/src/documentation/resources/images/zkarch.jpg
deleted file mode 100644
index a0e5fcc..0000000
Binary files a/src/docs/src/documentation/resources/images/zkarch.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/zkcomponents.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/zkcomponents.jpg b/src/docs/src/documentation/resources/images/zkcomponents.jpg
deleted file mode 100644
index 7690578..0000000
Binary files a/src/docs/src/documentation/resources/images/zkcomponents.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/zknamespace.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/zknamespace.jpg b/src/docs/src/documentation/resources/images/zknamespace.jpg
deleted file mode 100644
index 05534bc..0000000
Binary files a/src/docs/src/documentation/resources/images/zknamespace.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/zkperfRW-3.2.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/zkperfRW-3.2.jpg b/src/docs/src/documentation/resources/images/zkperfRW-3.2.jpg
deleted file mode 100644
index 594b50b..0000000
Binary files a/src/docs/src/documentation/resources/images/zkperfRW-3.2.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/zkperfRW.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/zkperfRW.jpg b/src/docs/src/documentation/resources/images/zkperfRW.jpg
deleted file mode 100644
index ad3019f..0000000
Binary files a/src/docs/src/documentation/resources/images/zkperfRW.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/zkperfreliability.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/zkperfreliability.jpg b/src/docs/src/documentation/resources/images/zkperfreliability.jpg
deleted file mode 100644
index 232bba8..0000000
Binary files a/src/docs/src/documentation/resources/images/zkperfreliability.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/zkservice.jpg
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/zkservice.jpg b/src/docs/src/documentation/resources/images/zkservice.jpg
deleted file mode 100644
index 1ec9154..0000000
Binary files a/src/docs/src/documentation/resources/images/zkservice.jpg and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/resources/images/zookeeper_small.gif
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/resources/images/zookeeper_small.gif b/src/docs/src/documentation/resources/images/zookeeper_small.gif
deleted file mode 100644
index 4e8014f..0000000
Binary files a/src/docs/src/documentation/resources/images/zookeeper_small.gif and /dev/null differ
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/src/documentation/skinconf.xml
----------------------------------------------------------------------
diff --git a/src/docs/src/documentation/skinconf.xml b/src/docs/src/documentation/skinconf.xml
deleted file mode 100644
index 43f3a49..0000000
--- a/src/docs/src/documentation/skinconf.xml
+++ /dev/null
@@ -1,360 +0,0 @@
-<?xml version="1.0"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!--
-Skin configuration file. This file contains details of your project,
-which will be used to configure the chosen Forrest skin.
--->
-
-<!DOCTYPE skinconfig PUBLIC "-//APACHE//DTD Skin Configuration V0.6-3//EN" "http://forrest.apache.org/dtd/skinconfig-v06-3.dtd">
-<skinconfig>
- <!-- To enable lucene search add provider="lucene" (default is google).
- Add box-location="alt" to move the search box to an alternate location
- (if the skin supports it) and box-location="all" to show it in all
- available locations on the page. Remove the <search> element to show
- no search box. @domain will enable sitesearch for the specific domain with google.
- In other words google will search the @domain for the query string.
-
- -->
- <search name="ZooKeeper" domain="zookeeper.apache.org" provider="google"/>
-
- <!-- Disable the print link? If enabled, invalid HTML 4.0.1 -->
- <disable-print-link>true</disable-print-link>
- <!-- Disable the PDF link? -->
- <disable-pdf-link>false</disable-pdf-link>
- <!-- Disable the POD link? -->
- <disable-pod-link>true</disable-pod-link>
- <!-- Disable the Text link? FIXME: NOT YET IMPLEMENTED. -->
- <disable-txt-link>true</disable-txt-link>
- <!-- Disable the xml source link? -->
- <!-- The xml source link makes it possible to access the xml rendition
- of the source from the html page, and to have it generated statically.
- This can be used to enable other sites and services to reuse the
- xml format for their uses. Keep this disabled if you don't want other
- sites to easily reuse your pages.-->
- <disable-xml-link>true</disable-xml-link>
-
- <!-- Disable navigation icons on all external links? -->
- <disable-external-link-image>true</disable-external-link-image>
-
- <!-- Disable w3c compliance links?
- Use e.g. align="center" to move the compliance link logos to
- an alternate location (if the skin supports it); the default is left. -->
- <disable-compliance-links>true</disable-compliance-links>
-
- <!-- Render mailto: links unrecognisable by spam harvesters? -->
- <obfuscate-mail-links>false</obfuscate-mail-links>
-
- <!-- Disable the javascript facility to change the font size -->
- <disable-font-script>true</disable-font-script>
-
- <!-- project logo -->
- <project-name>ZooKeeper</project-name>
- <project-description>ZooKeeper: distributed coordination</project-description>
- <project-url>http://zookeeper.apache.org/</project-url>
- <project-logo>images/zookeeper_small.gif</project-logo>
-
- <!-- group logo -->
- <group-name>Hadoop</group-name>
- <group-description>Apache Hadoop</group-description>
- <group-url>http://hadoop.apache.org/</group-url>
- <group-logo>images/hadoop-logo.jpg</group-logo>
-
- <!-- optional host logo (e.g. sourceforge logo)
- default skin: renders it at the bottom-left corner -->
- <host-url></host-url>
- <host-logo></host-logo>
-
- <!-- relative url of a favicon file, normally favicon.ico -->
- <favicon-url>images/favicon.ico</favicon-url>
-
- <!-- The following are used to construct a copyright statement -->
- <year></year>
- <vendor>The Apache Software Foundation.</vendor>
- <copyright-link>http://www.apache.org/licenses/</copyright-link>
-
- <!-- Some skins use this to form a 'breadcrumb trail' of links.
- Use location="alt" to move the trail to an alternate location
- (if the skin supports it).
- Omit the location attribute to display the trail in the default location.
- Use location="none" to not display the trail (if the skin supports it).
- For some skins just set the attributes to blank.
- -->
- <trail>
- <link1 name="Apache" href="http://www.apache.org/"/>
- <link2 name="ZooKeeper" href="http://zookeeper.apache.org/"/>
- <link3 name="ZooKeeper" href="http://zookeeper.apache.org/"/>
- </trail>
-
- <!-- Configure the TOC, i.e. the Table of Contents.
- @max-depth
- how many "section" levels need to be included in the
- generated Table of Contents (TOC).
- @min-sections
- Minimum required to create a TOC.
- @location ("page","menu","page,menu", "none")
- Where to show the TOC.
- -->
- <toc max-depth="2" min-sections="1" location="page"/>
-
- <!-- Heading types can be clean|underlined|boxed -->
- <headings type="clean"/>
-
- <!-- The optional feedback element will be used to construct a
- feedback link in the footer with the page pathname appended:
- <a href="@href">{@to}</a>
- <feedback to="webmaster@foo.com"
- href="mailto:webmaster@foo.com?subject=Feedback " >
- Send feedback about the website to:
- </feedback>
- -->
- <!--
- extra-css - here you can define custom css-elements that are
- a. overriding the fallback elements or
- b. adding the css definition from new elements that you may have
- used in your documentation.
- -->
- <extra-css>
- <!--Example of b.
- To define the css definition of a new element that you may have used
- in the class attribute of a <p> node.
- e.g. <p class="quote"/>
- -->
- p.quote {
- margin-left: 2em;
- padding: .5em;
- background-color: #f0f0f0;
- font-family: monospace;
- }
-
- pre.code {
- margin-left: 0em;
- padding: 0.5em;
- background-color: #f0f0f0;
- font-family: monospace;
- }
-
-<!-- patricks
- .code {
- font-family: "Courier New", Courier, monospace;
- font-size: 110%;
- }
--->
-
- </extra-css>
-
- <colors>
- <!-- These values are used for the generated CSS files. -->
-
- <!-- Krysalis -->
-<!--
- <color name="header" value="#FFFFFF"/>
-
- <color name="tab-selected" value="#a5b6c6" link="#000000" vlink="#000000" hlink="#000000"/>
- <color name="tab-unselected" value="#F7F7F7" link="#000000" vlink="#000000" hlink="#000000"/>
- <color name="subtab-selected" value="#a5b6c6" link="#000000" vlink="#000000" hlink="#000000"/>
- <color name="subtab-unselected" value="#a5b6c6" link="#000000" vlink="#000000" hlink="#000000"/>
-
- <color name="heading" value="#a5b6c6"/>
- <color name="subheading" value="#CFDCED"/>
-
- <color name="navstrip" value="#CFDCED" font="#000000" link="#000000" vlink="#000000" hlink="#000000"/>
- <color name="toolbox" value="#a5b6c6"/>
- <color name="border" value="#a5b6c6"/>
-
- <color name="menu" value="#F7F7F7" link="#000000" vlink="#000000" hlink="#000000"/>
- <color name="dialog" value="#F7F7F7"/>
-
- <color name="body" value="#ffffff" link="#0F3660" vlink="#009999" hlink="#000066"/>
-
- <color name="table" value="#a5b6c6"/>
- <color name="table-cell" value="#ffffff"/>
- <color name="highlight" value="#ffff00"/>
- <color name="fixme" value="#cc6600"/>
- <color name="note" value="#006699"/>
- <color name="warning" value="#990000"/>
- <color name="code" value="#a5b6c6"/>
-
- <color name="footer" value="#a5b6c6"/>
--->
-
- <!-- Forrest -->
-<!--
- <color name="header" value="#294563"/>
-
- <color name="tab-selected" value="#4a6d8c" link="#0F3660" vlink="#0F3660" hlink="#000066"/>
- <color name="tab-unselected" value="#b5c7e7" link="#0F3660" vlink="#0F3660" hlink="#000066"/>
- <color name="subtab-selected" value="#4a6d8c" link="#0F3660" vlink="#0F3660" hlink="#000066"/>
- <color name="subtab-unselected" value="#4a6d8c" link="#0F3660" vlink="#0F3660" hlink="#000066"/>
-
- <color name="heading" value="#294563"/>
- <color name="subheading" value="#4a6d8c"/>
-
- <color name="navstrip" value="#cedfef" font="#0F3660" link="#0F3660" vlink="#0F3660" hlink="#000066"/>
- <color name="toolbox" value="#4a6d8c"/>
- <color name="border" value="#294563"/>
-
- <color name="menu" value="#4a6d8c" font="#cedfef" link="#ffffff" vlink="#ffffff" hlink="#ffcf00"/>
- <color name="dialog" value="#4a6d8c"/>
-
- <color name="body" value="#ffffff" link="#0F3660" vlink="#009999" hlink="#000066"/>
-
- <color name="table" value="#7099C5"/>
- <color name="table-cell" value="#f0f0ff"/>
- <color name="highlight" value="#ffff00"/>
- <color name="fixme" value="#cc6600"/>
- <color name="note" value="#006699"/>
- <color name="warning" value="#990000"/>
- <color name="code" value="#CFDCED"/>
-
- <color name="footer" value="#cedfef"/>
--->
-
- <!-- Collabnet -->
-<!--
- <color name="header" value="#003366"/>
-
- <color name="tab-selected" value="#dddddd" link="#555555" vlink="#555555" hlink="#555555"/>
- <color name="tab-unselected" value="#999999" link="#ffffff" vlink="#ffffff" hlink="#ffffff"/>
- <color name="subtab-selected" value="#cccccc" link="#000000" vlink="#000000" hlink="#000000"/>
- <color name="subtab-unselected" value="#cccccc" link="#555555" vlink="#555555" hlink="#555555"/>
-
- <color name="heading" value="#003366"/>
- <color name="subheading" value="#888888"/>
-
- <color name="navstrip" value="#dddddd" font="#555555"/>
- <color name="toolbox" value="#dddddd" font="#555555"/>
- <color name="border" value="#999999"/>
-
- <color name="menu" value="#ffffff"/>
- <color name="dialog" value="#eeeeee"/>
-
- <color name="body" value="#ffffff"/>
-
- <color name="table" value="#ccc"/>
- <color name="table-cell" value="#ffffff"/>
- <color name="highlight" value="#ffff00"/>
- <color name="fixme" value="#cc6600"/>
- <color name="note" value="#006699"/>
- <color name="warning" value="#990000"/>
- <color name="code" value="#003366"/>
-
- <color name="footer" value="#ffffff"/>
--->
- <!-- Lenya using pelt-->
-<!--
- <color name="header" value="#ffffff"/>
-
- <color name="tab-selected" value="#4C6C8F" link="#ffffff" vlink="#ffffff" hlink="#ffffff"/>
- <color name="tab-unselected" value="#E5E4D9" link="#000000" vlink="#000000" hlink="#000000"/>
- <color name="subtab-selected" value="#000000" link="#000000" vlink="#000000" hlink="#000000"/>
- <color name="subtab-unselected" value="#E5E4D9" link="#000000" vlink="#000000" hlink="#000000"/>
-
- <color name="heading" value="#E5E4D9"/>
- <color name="subheading" value="#000000"/>
- <color name="published" value="#4C6C8F" font="#FFFFFF"/>
- <color name="feedback" value="#4C6C8F" font="#FFFFFF" align="center"/>
- <color name="navstrip" value="#E5E4D9" font="#000000"/>
-
- <color name="toolbox" value="#CFDCED" font="#000000"/>
-
- <color name="border" value="#999999"/>
- <color name="menu" value="#4C6C8F" font="#ffffff" link="#ffffff" vlink="#ffffff" hlink="#ffffff" current="#FFCC33" />
- <color name="menuheading" value="#cfdced" font="#000000" />
- <color name="searchbox" value="#E5E4D9" font="#000000"/>
-
- <color name="dialog" value="#CFDCED"/>
- <color name="body" value="#ffffff" />
-
- <color name="table" value="#ccc"/>
- <color name="table-cell" value="#ffffff"/>
- <color name="highlight" value="#ffff00"/>
- <color name="fixme" value="#cc6600"/>
- <color name="note" value="#006699"/>
- <color name="warning" value="#990000"/>
- <color name="code" value="#003366"/>
-
- <color name="footer" value="#E5E4D9"/>
--->
- </colors>
-
- <!-- Settings specific to PDF output. -->
- <pdf>
- <!--
- Supported page sizes are a0, a1, a2, a3, a4, a5, executive,
- folio, legal, ledger, letter, quarto, tabloid (default letter).
- Supported page orientations are portrait, landscape (default
- portrait).
- Supported text alignments are left, right, justify (default left).
- -->
- <page size="letter" orientation="portrait" text-align="left"/>
-
- <!--
- Margins can be specified for top, bottom, inner, and outer
- edges. If double-sided="false", the inner edge is always left
- and the outer is always right. If double-sided="true", the
- inner edge will be left on odd pages, right on even pages,
- the outer edge vice versa.
- Specified below are the default settings.
- -->
- <margins double-sided="false">
- <top>1in</top>
- <bottom>1in</bottom>
- <inner>1.25in</inner>
- <outer>1in</outer>
- </margins>
-
- <!--
- Print the URL text next to all links going outside the file
- -->
- <show-external-urls>false</show-external-urls>
-
- <!--
- Disable the copyright footer on each page of the PDF.
- A footer is composed for each page. By default, a "credit" with role=pdf
- will be used, as explained below. Otherwise a copyright statement
- will be generated. This latter can be disabled.
- -->
- <disable-copyright-footer>false</disable-copyright-footer>
- </pdf>
-
- <!-- Credits are typically rendered as a set of small clickable
- images in the page footer.
- Use box-location="alt" to move the credit to an alternate location
- (if the skin supports it).
- -->
- <credits>
- <credit box-location="alt">
- <name>Built with Apache Forrest</name>
- <url>http://forrest.apache.org/</url>
- <image>images/built-with-forrest-button.png</image>
- <width>88</width>
- <height>31</height>
- </credit>
- <!-- A credit with @role="pdf" will be used to compose a footer
- for each page in the PDF, using either "name" or "url" or both.
- -->
- <!--
- <credit role="pdf">
- <name>Built with Apache Forrest</name>
- <url>http://forrest.apache.org/</url>
- </credit>
- -->
- </credits>
-
-</skinconfig>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/src/docs/status.xml
----------------------------------------------------------------------
diff --git a/src/docs/status.xml b/src/docs/status.xml
deleted file mode 100644
index 3ac3fda..0000000
--- a/src/docs/status.xml
+++ /dev/null
@@ -1,74 +0,0 @@
-<?xml version="1.0"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<status>
-
- <developers>
- <person name="Joe Bloggs" email="joe@joescompany.org" id="JB" />
- <!-- Add more people here -->
- </developers>
-
- <changes>
- <!-- Add new releases here -->
- <release version="0.1" date="unreleased">
- <!-- Some action types have associated images. By default, images are
- defined for 'add', 'fix', 'remove', 'update' and 'hack'. If you add
- src/documentation/resources/images/<foo>.jpg images, these will
- automatically be used for entries of type <foo>. -->
-
- <action dev="JB" type="add" context="admin">
- Initial Import
- </action>
- <!-- Sample action:
- <action dev="JB" type="fix" due-to="Joe Contributor"
- due-to-email="joec@apache.org" fixes-bug="123">
- Fixed a bug in the Foo class.
- </action>
- -->
- </release>
- </changes>
-
- <todo>
- <actions priority="high">
- <action context="docs" dev="JB">
- Customize this template project with your project's details. This
- TODO list is generated from 'status.xml'.
- </action>
- <action context="docs" dev="JB">
- Add lots of content. XML content goes in
- <code>src/documentation/content/xdocs</code>, or wherever the
- <code>${project.xdocs-dir}</code> property (set in
- <code>forrest.properties</code>) points.
- </action>
- <action context="feedback" dev="JB">
- Mail <link
- href="mailto:forrest-dev@xml.apache.org">forrest-dev@xml.apache.org</link>
- with feedback.
- </action>
- </actions>
- <!-- Add todo items. @context is an arbitrary string. Eg:
- <actions priority="high">
- <action context="code" dev="SN">
- </action>
- </actions>
- <actions priority="medium">
- <action context="docs" dev="open">
- </action>
- </actions>
- -->
- </todo>
-
-</status>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/forrest.properties
----------------------------------------------------------------------
diff --git a/zookeeper-docs/forrest.properties b/zookeeper-docs/forrest.properties
new file mode 100644
index 0000000..70cf81d
--- /dev/null
+++ b/zookeeper-docs/forrest.properties
@@ -0,0 +1,109 @@
+# Copyright 2002-2004 The Apache Software Foundation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+##############
+# Properties used by forrest.build.xml for building the website
+# These are the defaults, un-comment them if you need to change them.
+##############
+
+# Prints out a summary of Forrest settings for this project
+#forrest.echo=true
+
+# Project name (used to name .war file)
+#project.name=my-project
+
+# Specifies name of Forrest skin to use
+#project.skin=tigris
+#project.skin=pelt
+
+# comma separated list, file:// is supported
+#forrest.skins.descriptors=http://forrest.apache.org/skins/skins.xml,file:///c:/myskins/skins.xml
+
+##############
+# behavioural properties
+#project.menu-scheme=tab_attributes
+#project.menu-scheme=directories
+
+##############
+# layout properties
+
+# Properties that can be set to override the default locations
+#
+# Parent properties must be set. This usually means uncommenting
+# project.content-dir if any other property using it is uncommented
+
+#project.status=status.xml
+#project.content-dir=src/documentation
+project.configfile=${project.home}/src/documentation/conf/cli.xconf
+#project.raw-content-dir=${project.content-dir}/content
+#project.conf-dir=${project.content-dir}/conf
+#project.sitemap-dir=${project.content-dir}
+#project.xdocs-dir=${project.content-dir}/content/xdocs
+#project.resources-dir=${project.content-dir}/resources
+#project.stylesheets-dir=${project.resources-dir}/stylesheets
+#project.images-dir=${project.resources-dir}/images
+#project.schema-dir=${project.resources-dir}/schema
+#project.skins-dir=${project.content-dir}/skins
+#project.skinconf=${project.content-dir}/skinconf.xml
+#project.lib-dir=${project.content-dir}/lib
+#project.classes-dir=${project.content-dir}/classes
+#project.translations-dir=${project.content-dir}/translations
+
+##############
+# validation properties
+
+# This set of properties determine if validation is performed
+# Values are inherited unless overridden.
+# e.g. if forrest.validate=false then all others are false unless set to true.
+forrest.validate=true
+forrest.validate.xdocs=${forrest.validate}
+forrest.validate.skinconf=${forrest.validate}
+forrest.validate.stylesheets=${forrest.validate}
+forrest.validate.skins=${forrest.validate}
+forrest.validate.skins.stylesheets=${forrest.validate.skins}
+
+# Make Forrest work with JDK6
+forrest.validate.sitemap=false
+
+# *.failonerror=(true|false) - stop when an XML file is invalid
+forrest.validate.failonerror=true
+
+# *.excludes=(pattern) - comma-separated list of path patterns to not validate
+# e.g.
+#forrest.validate.xdocs.excludes=samples/subdir/**, samples/faq.xml
+#forrest.validate.xdocs.excludes=
+
+
+##############
+# General Forrest properties
+
+# The URL to start crawling from
+#project.start-uri=linkmap.html
+# Set logging level for messages printed to the console
+# (DEBUG, INFO, WARN, ERROR, FATAL_ERROR)
+#project.debuglevel=ERROR
+# Max memory to allocate to Java
+#forrest.maxmemory=64m
+# Any other arguments to pass to the JVM. For example, to run on an X-less
+# server, set to -Djava.awt.headless=true
+#forrest.jvmargs=
+# The bugtracking URL - the issue number will be appended
+#project.bugtracking-url=http://issues.apache.org/bugzilla/show_bug.cgi?id=
+#project.bugtracking-url=http://issues.apache.org/jira/browse/
+# The issues list as rss
+#project.issues-rss-url=
+#I18n Property only works for the "forrest run" target.
+#project.i18n=true
+
+project.required.plugins=org.apache.forrest.plugin.output.pdf,org.apache.forrest.plugin.input.simplifiedDocbook
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/README.txt
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/README.txt b/zookeeper-docs/src/documentation/README.txt
new file mode 100644
index 0000000..9bc261b
--- /dev/null
+++ b/zookeeper-docs/src/documentation/README.txt
@@ -0,0 +1,7 @@
+This is the base documentation directory.
+
+skinconf.xml # This file customizes Forrest for your project. In it, you
+ # tell forrest the project name, logo, copyright info, etc
+
+sitemap.xmap # Optional. This sitemap is consulted before all core sitemaps.
+ # See http://forrest.apache.org/docs/project-sitemap.html
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/TODO.txt
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/TODO.txt b/zookeeper-docs/src/documentation/TODO.txt
new file mode 100644
index 0000000..84e7dfa
--- /dev/null
+++ b/zookeeper-docs/src/documentation/TODO.txt
@@ -0,0 +1,227 @@
+This is a running list of todo documentation items. Feel free
+to add to the list or take on an item as you wish (in the form
+of a JIRA patch of course).
+-------------------------------------------------------------
+
+recipes.xml:110:
+[maybe an illustration would be nice for each recipe?]
+
+recipes.xml:167:
+"wait for each watch event". [how do you wait?]
+
+recipes.xml:457:
+<remark>[tbd: maybe helpful to indicate which step this refers to?]</remark>
+
+zookeeperAdmin.xml:77:
+because requires a majority <remark>[tbd: why?]</remark>, it is best to use...
+
+zookeeperAdmin.xml:112:
+ <screen>$yinst -i jdk-1.6.0.00_3 -br test <remark>[y! prop - replace with open equiv]</remark></screen>
+
+zookeeperAdmin.xml:99:
+- use a maximum heap size of 3GB for a 4GB machine. <remark>[tbd: where would they do this? Environment variable, etc?]</remark>
+
+zookeeperAdmin.xml:120
+<screen>$ yinst install -nostart zookeeper_server <remark>[Y! prop - replace with open eq]</remark></screen>
+
+zookeeperAdmin.xml:171:
+In Java, you can run the following command to execute simple operations:<remark> [tbd: also, maybe give some of those simple operations?]
+
+zookeeperAdmin.xml:194:
+Running either program gives you a shell in which to execute simple file-system-like operations. <remark>[tbd: again, sample
+ operations?]
+
+zookeeperAdmin.xml:252:
+If servers use different configuration files,
+care must be taken to ensure that the list of servers in all of the
+standard form, with legal values, etc]</remark>
+
+zookeeperAdmin.xml:408:
+(Note: The system property has no zookeeper
+prefix, and the configuration variable name is different from
+the system property. Yes - it's not consistent, and it's
+annoying.<remark> [tbd: is there any explanation for
+this?]</remark>)
+
+zookeeperAdmin.xml:445: When the election algorithm is
+ "0" a UDP port with the same port number as the port listed in
+ the <emphasis role="bold">server.num</emphasis> option will be
+ used. <remark>[tbd: should that be <emphasis
+ role="bold">server.id</emphasis>? Also, why isn't server.id
+ documented anywhere?]</remark>
+
+zookeeperAdmin.xml:481: The default to this option is yes, which
+ means that a leader will accept client connections.
+ <remark>[tbd: how do you specify which server is the
+ leader?]</remark>
+
+zookeeperAdmin.xml:495 When the server
+ starts up, it determines which server it is by looking for the
+ file <filename>myid</filename> in the data directory.<remark>
+ [tbd: should we mention somewhere about creating this file,
+ myid, in the setup procedure?]</remark>
+
+zookeeperAdmin.xml:508: [tbd: is the next sentence an explanation of what the
+ election port is, or is it a description of a special case?]
+ </remark>If you want to test multiple servers on a single
+ machine, the individual choices of electionPort for each
+ server can be defined in each server's config files using the
+ line electionPort=xxxx to avoid clashes.
+
+zookeeperAdmin.xml:524: If followers fall too far behind a
+ leader, they will be dropped. <remark>[tbd: is this a correct
+ rewording: if followers fall beyond this limit, they are
+ dropped?]</remark>
+
+zookeeperAdmin.xml:551: ZooKeeper will not require updates
+ to be synced to the media. <remark>[tbd: useful because...,
+ dangerous because...]</remark>
+
+zookeeperAdmin.xml:580: Skips ACL checks. <remark>[tbd: when? where?]</remark>
+
+zookeeperAdmin.xml:649: <remark>[tbd: Patrick, Ben, et al: I believe the Message Broker
+ team does perform routine monitoring of Zookeeper. But I might be
+ wrong. To your knowledge, is there any monitoring of a Zookeeper
+ deployment that a Zookeeper sys admin will want to do, outside of
+ Yahoo?]</remark>
+
+zookeeperAdmin.xml:755: Also,
+ the server lists in each Zookeeper server configuration file
+ should be consistent with one another. <remark>[tbd: I'm assuming
+ this last part is true. Is it?]</remark>
+
+zookeeperAdmin.xml:812: For best results, take note of the following list of good
+ Zookeeper practices. <remark>[tbd: I just threw this section in. Do we
+ have a list that is different from the "things to avoid"? If not, I can
+ easily remove this section.]</remark>
+
+
+zookeeperOver.xml:162: Ephemeral nodes are useful when you
+ want to implement <remark>[tbd]</remark>.
+
+zookeeperOver.xml:174: And if the
+ connection between the client and one of the Zoo Keeper servers is
+ broken, the client will receive a local notification. These can be used
+ to <remark>[tbd]</remark>
+
+zookeeperOver.xml:215: <para>For more information on these (guarantees), and how they can be used, see
+ <remark>[tbd]</remark></para>
+
+zookeeperOver.xml:294: <para><xref linkend="fg_zkComponents" /> shows the high-level components
+ of the ZooKeeper service. With the exception of the request processor,
+ <remark>[tbd: where does the request processor live?]</remark>
+
+zookeeperOver.xml:298: <para><xref linkend="fg_zkComponents" /> shows the high-level components
+ of the ZooKeeper service. With the exception of the request processor,
+ each of
+ the servers that make up the ZooKeeper service replicates its own copy
+ of each of components. <remark>[tbd: I changed the wording in this
+ sentence from the white paper. Can someone please make sure it is still
+ correct?]</remark>
+
+zookeeperOver.xml:342: The programming interface to ZooKeeper is deliberately simple.
+ With it, however, you can implement higher order operations, such as
+ synchronizations primitives, group membership, ownership, etc. Some
+ distributed applications have used it to: <remark>[tbd: add uses from
+ white paper and video presentation.]</remark>
+
+
+zookeeperProgrammers.xml:94: <listitem>
+ <para><xref linkend="ch_programStructureWithExample" />
+ <remark>[tbd]</remark></para>
+ </listitem>
+
+zookeeperProgrammers.xml:115: Also,
+ the <ulink url="#ch_programStructureWithExample">Simple Programmming
+ Example</ulink> <remark>[tbd]</remark> is helpful for understand the basic
+ structure of a ZooKeeper client application.
+
+zookeeperProgrammers.xml:142: The following characters are not
+ allowed because <remark>[tbd:
+ do we need reasons?]</remark>
+
+zookeeperProgrammers.xml:172: If
+ the version it supplies doesn't match the actual version of the data,
+ the update will fail. (This behavior can be overridden. For more
+ information see... )<remark>[tbd... reference here to the section
+ describing the special version number -1]</remark>
+
+zookeeperProgrammers.xml:197: More information about watches can be
+ found in the section
+ <ulink url="recipes.html#sc_recipes_Locks">
+ Zookeeper Watches</ulink>.
+ <remark>[tbd: fix this link] [tbd: Ben there is note from to emphasize
+ that "it is queued". What is "it" and is what we have here
+ sufficient?]</remark></para>
+
+zookeeperProgrammers.xml:335: it will send the session id as a part of the connection handshake.
+ As a security measure, the server creates a password for the session id
+ that any ZooKeeper server can validate. <remark>[tbd: note from Ben:
+ "perhaps capability is a better word." need clarification on that.]
+ </remark>
+
+zookeeperProgrammers.xml:601: <ulink
+ url="recipes.html#sc_recipes_Locks">Locks</ulink>
+ <remark>[tbd:...]</remark> in <ulink
+ url="recipes.html">Zookeeper Recipes</ulink>.
+ <remark>[tbd:..]</remark>).</para>
+
+zookeeperProgrammers.xml:766: <para>See INSTALL for general information about running
+ <emphasis role="bold">configure</emphasis>. <remark>[tbd: what
+ is INSTALL? a directory? a file?]</remark></para>
+
+
+
+zookeeperProgrammers.xml:813: <para>To verify that the node's been created:</para>
+
+ <para>You should see a list of node who are children of the root node
+ "/".</para><remark>[tbd: document all the cli commands (I think this is ben's comment)
+
+zookeeperProgrammers.xml:838: <para>Refer to <xref linkend="ch_programStructureWithExample"/>for examples of usage in Java and C.
+ <remark>[tbd]</remark></para>
+
+zookeeperProgrammers.xml 847: <remark>[tbd: This is a new section. The below
+ is just placeholder. Eventually, a subsection on each of those operations, with a little
+ bit of illustrative code for each op.] </remark>
+
+zookeeperProgrammers.xml:915: Program Structure, with Simple Example</title>
+
+zookeeperProgrammers.xml:999: <term>ZooKeeper Whitepaper <remark>[tbd: find url]</remark></term>
+
+zookeeperProgrammers.xml:1008: <term>API Reference <remark>[tbd: find url]</remark></term>
+
+zookeeperProgrammers.xml:1062: [tbd]</remark></term><listitem>
+ <para>Any other good sources anyone can think of...</para>
+ </listitem>
+
+zookeeperStarted.xml:73: <para>[tbd: should we start w/ a word here about where to get the source,
+ exactly what to download, how to unpack it, and where to put it? Also,
+ does the user need to be in sudo, or can they be under their regular
+ login?]</para>
+
+zookeeperStarted.xml:84: <para>This should generate a JAR file called zookeeper.jar. To start
+ Zookeeper, compile and run zookeeper.jar. <emphasis>[tbd, some more
+ instruction here. Perhaps a command line? Are these two steps or
+ one?]</emphasis></para>
+
+zookeeperStarted.xml:139: <para>ZooKeeper logs messages using log4j -- more detail available in
+ the <ulink url="zookeeperProgrammers.html#Logging">Logging</ulink>
+ section of the Programmer's Guide.<remark revision="include_tbd">[tbd:
+ real reference needed]</remark>
+
+zookeeperStarted.xml:201: The C bindings exist in two variants: single
+ threaded and multi-threaded. These differ only in how the messaging loop
+ is done. <remark>[tbd: what is the messaging loop? Do we talk about it
+ anywhere? is this too much info for a getting started guide?]</remark>
+
+zookeeperStarted.xml:217: The entry <emphasis
+ role="bold">syncLimit</emphasis> limits how far out of date a server can
+ be from a leader. [TBD: someone please verify that the previous is
+ true.]
+
+zookeeperStarted.xml:232: These are the "electionPort" numbers of the servers (as opposed to
+ clientPorts), that is ports for <remark>[tbd: feedback need: what are
+ these ports, exactly?]
+
+zookeeperStarted.xml:258: <remark>[tbd: what is the other config param?
+ (I believe two are mentioned above.)]</remark>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/classes/CatalogManager.properties
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/classes/CatalogManager.properties b/zookeeper-docs/src/documentation/classes/CatalogManager.properties
new file mode 100644
index 0000000..ac060b9
--- /dev/null
+++ b/zookeeper-docs/src/documentation/classes/CatalogManager.properties
@@ -0,0 +1,37 @@
+# Copyright 2002-2004 The Apache Software Foundation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+#=======================================================================
+# CatalogManager.properties
+#
+# This is the default properties file for Apache Forrest.
+# This facilitates local configuration of application-specific catalogs.
+#
+# See the Apache Forrest documentation:
+# http://forrest.apache.org/docs/your-project.html
+# http://forrest.apache.org/docs/validation.html
+
+# verbosity ... level of messages for status/debug
+# See forrest/src/core/context/WEB-INF/cocoon.xconf
+
+# catalogs ... list of additional catalogs to load
+# (Note that Apache Forrest will automatically load its own default catalog
+# from src/core/context/resources/schema/catalog.xcat)
+# use full pathnames
+# pathname separator is always semi-colon (;) regardless of operating system
+# directory separator is always slash (/) regardless of operating system
+#
+#catalogs=/home/me/forrest/my-site/src/documentation/resources/schema/catalog.xcat
+catalogs=
+
[03/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperInternals.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperInternals.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperInternals.xml
new file mode 100644
index 0000000..4954123
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperInternals.xml
@@ -0,0 +1,487 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="ar_ZooKeeperInternals">
+ <title>ZooKeeper Internals</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This article contains topics which discuss the inner workings of
+ ZooKeeper. So far, that's logging and atomic broadcast. </para>
+
+ </abstract>
+ </articleinfo>
+
+ <section id="ch_Introduction">
+ <title>Introduction</title>
+
+ <para>This document contains information on the inner workings of ZooKeeper.
+ So far, it discusses these topics:
+ </para>
+
+<itemizedlist>
+<listitem><para><xref linkend="sc_atomicBroadcast"/></para></listitem>
+<listitem><para><xref linkend="sc_logging"/></para></listitem>
+</itemizedlist>
+
+</section>
+
+<section id="sc_atomicBroadcast">
+<title>Atomic Broadcast</title>
+
+<para>
+At the heart of ZooKeeper is an atomic messaging system that keeps all of the servers in sync.</para>
+
+<section id="sc_guaranteesPropertiesDefinitions"><title>Guarantees, Properties, and Definitions</title>
+<para>
+The specific guarantees provided by the messaging system used by ZooKeeper are the following:</para>
+
+<variablelist>
+
+<varlistentry><term><emphasis >Reliable delivery</emphasis></term>
+<listitem><para>If a message, m, is delivered
+by one server, it will be eventually delivered by all servers.</para></listitem></varlistentry>
+
+<varlistentry><term><emphasis >Total order</emphasis></term>
+<listitem><para>If a message a is
+delivered before message b by one server, a will be delivered before b by all
+servers. If a and b are delivered messages, either a will be delivered before b
+or b will be delivered before a.</para></listitem></varlistentry>
+
+<varlistentry><term><emphasis >Causal order</emphasis> </term>
+
+<listitem><para>
+If a message b is sent after a message a has been delivered by the sender of b,
+a must be ordered before b. If a sender sends c after sending b, c must be ordered after b.
+</para></listitem></varlistentry>
+
+</variablelist>
+
+
+<para>
+The ZooKeeper messaging system also needs to be efficient, reliable, and easy to
+implement and maintain. We make heavy use of messaging, so we need the system to
+be able to handle thousands of requests per second. Although we can require at
+least k+1 correct servers to send new messages, we must be able to recover from
+correlated failures such as power outages. When we implemented the system we had
+little time and few engineering resources, so we needed a protocol that is
+accessible to engineers and is easy to implement. We found that our protocol
+satisfied all of these goals.
+
+</para>
+
+<para>
+Our protocol assumes that we can construct point-to-point FIFO channels between
+the servers. While similar services usually assume message delivery that can
+lose or reorder messages, our assumption of FIFO channels is very practical
+given that we use TCP for communication. Specifically, we rely on the following properties of TCP:</para>
+
+<variablelist>
+
+<varlistentry>
+<term><emphasis >Ordered delivery</emphasis></term>
+<listitem><para>Data is delivered in the same order it is sent and a message m is
+delivered only after all messages sent before m have been delivered.
+(The corollary to this is that if message m is lost all messages after m will be lost.)</para></listitem></varlistentry>
+
+<varlistentry><term><emphasis >No message after close</emphasis></term>
+<listitem><para>Once a FIFO channel is closed, no messages will be received from it.</para></listitem></varlistentry>
+
+</variablelist>
+
+<para>
+FLP proved that consensus cannot be achieved in asynchronous distributed systems
+if failures are possible. To ensure we achieve consensus in the presence of failures
+we use timeouts. However, we rely on timeouts for liveness, not for correctness. So,
+if timeouts stop working (clocks malfunction for example) the messaging system may
+hang, but it will not violate its guarantees.</para>
+
+<para>When describing the ZooKeeper messaging protocol we will talk of packets,
+proposals, and messages:</para>
+<variablelist>
+<varlistentry><term><emphasis >Packet</emphasis></term>
+<listitem><para>a sequence of bytes sent through a FIFO channel</para></listitem></varlistentry><varlistentry>
+
+<term><emphasis >Proposal</emphasis></term>
+<listitem><para>a unit of agreement. Proposals are agreed upon by exchanging packets
+with a quorum of ZooKeeper servers. Most proposals contain messages, however the
+NEW_LEADER proposal is an example of a proposal that does not correspond to a message.</para></listitem>
+</varlistentry><varlistentry>
+
+<term><emphasis >Message</emphasis></term>
+<listitem><para>a sequence of bytes to be atomically broadcast to all ZooKeeper
+servers. A message is put into a proposal and agreed upon before it is delivered.</para></listitem>
+</varlistentry>
+
+</variablelist>
+
+<para>
+As stated above, ZooKeeper guarantees a total order of messages, and it also
+guarantees a total order of proposals. ZooKeeper exposes the total ordering using
+a ZooKeeper transaction id (<emphasis>zxid</emphasis>). Each proposal is stamped with a zxid when
+it is proposed, and the zxid exactly reflects the total ordering. Proposals are sent to all
+ZooKeeper servers and committed when a quorum of them acknowledge the proposal.
+If a proposal contains a message, the message will be delivered when the proposal
+is committed. Acknowledgement means the server has recorded the proposal to persistent storage.
+Our quorums have the requirement that any pair of quorums must have at least one server
+in common. We ensure this by requiring that all quorums have size (<emphasis>n/2+1</emphasis>) where
+n is the number of servers that make up a ZooKeeper service.
+</para>
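Editor's note: the quorum-size rule above can be checked with a small sketch. This is illustrative Python, not code from the ZooKeeper source; the function names are hypothetical, and `n/2+1` is read as integer division.

```python
from itertools import combinations


def majority_quorum_size(n: int) -> int:
    """Quorum size n/2+1 (integer division) for an ensemble of n servers."""
    return n // 2 + 1


def all_quorums_intersect(n: int) -> bool:
    """Brute-force check that any two quorums of that size share a server."""
    size = majority_quorum_size(n)
    quorums = [set(q) for q in combinations(range(n), size)]
    return all(a & b for a in quorums for b in quorums)
```

For a five-server ensemble this gives quorums of three servers, and any two such quorums overlap because 3 + 3 exceeds 5.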
+
+<para>
+The zxid has two parts: the epoch and a counter. In our implementation the zxid
+is a 64-bit number. We use the high order 32-bits for the epoch and the low order
+32-bits for the counter. Because it has two parts, the zxid can be represented both as a
+number and as a pair of integers, (<emphasis>epoch, count</emphasis>). The epoch number represents a
+change in leadership. Each time a new leader comes into power it will have its
+own epoch number. We have a simple algorithm to assign a unique zxid to a proposal:
+the leader simply increments the zxid to obtain a unique zxid for each proposal.
+<emphasis>Leadership activation will ensure that only one leader uses a given epoch, so our
+simple algorithm guarantees that every proposal will have a unique id.</emphasis>
+</para>
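Editor's note: a minimal sketch of the zxid layout described above, assuming only what the text states (epoch in the high-order 32 bits, counter in the low-order 32 bits). The helper names are hypothetical and do not come from the ZooKeeper code base.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch: int, counter: int) -> int:
    """Pack (epoch, counter) into a single 64-bit zxid."""
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def split_zxid(zxid: int) -> tuple:
    """Unpack a zxid into its (epoch, counter) pair."""
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK


def next_zxid(zxid: int) -> int:
    """Within one epoch, the leader increments the zxid for each proposal."""
    return zxid + 1
```

Because the epoch occupies the high-order bits, comparing two zxids as plain integers orders them first by epoch and then by counter, which matches the total order the text describes.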
+
+<para>
+ZooKeeper messaging consists of two phases:</para>
+
+<variablelist>
+<varlistentry><term><emphasis >Leader activation</emphasis></term>
+<listitem><para>In this phase a leader establishes the correct state of the system
+and gets ready to start making proposals.</para></listitem>
+</varlistentry>
+
+<varlistentry><term><emphasis >Active messaging</emphasis></term>
+<listitem><para>In this phase a leader accepts messages to propose and coordinates message delivery.</para></listitem>
+</varlistentry>
+</variablelist>
+
+<para>
+ZooKeeper is a holistic protocol. We do not focus on individual proposals, rather
+look at the stream of proposals as a whole. Our strict ordering allows us to do this
+efficiently and greatly simplifies our protocol. Leadership activation embodies
+this holistic concept. A leader becomes active only when a quorum of followers
+(the leader counts as a follower as well; you can always vote for yourself) has synced
+up with the leader, so that they have the same state. This state consists of all of the
+proposals that the leader believes have been committed and the proposal to follow
+the leader, the NEW_LEADER proposal. (Hopefully you are thinking to
+yourself, <emphasis>Does the set of proposals that the leader believes have been committed
+include all the proposals that really have been committed?</emphasis> The answer is <emphasis>yes</emphasis>.
+Below, we make clear why.)
+</para>
+
+</section>
+
+<section id="sc_leaderElection">
+
+<title>Leader Activation</title>
+<para>
+Leader activation includes leader election. We currently have two leader election
+algorithms in ZooKeeper: LeaderElection and FastLeaderElection (AuthFastLeaderElection
+is a variant of FastLeaderElection that uses UDP and allows servers to perform a simple
+form of authentication to avoid IP spoofing). ZooKeeper messaging doesn't care about the
+exact method of electing a leader as long as the following holds:
+</para>
+
+<itemizedlist>
+
+<listitem><para>The leader has seen the highest zxid of all the followers.</para></listitem>
+<listitem><para>A quorum of servers have committed to following the leader.</para></listitem>
+
+</itemizedlist>
+
+<para>
+Of these two requirements only the first, the highest zxid among the followers,
+needs to hold for correct operation. The second requirement, a quorum of followers,
+just needs to hold with high probability. We are going to recheck the second requirement,
+so if a failure happens during or after the leader election and quorum is lost,
+we will recover by abandoning leader activation and running another election.
+</para>
+
+<para>
+After leader election a single server will be designated as a leader and start
+waiting for followers to connect. The rest of the servers will try to connect to
+the leader. The leader will sync up with followers by sending any proposals they
+are missing, or if a follower is missing too many proposals, it will send a full
+snapshot of the state to the follower.
+</para>
+
+<para>
+There is a corner case in which a follower that has proposals, U, not seen
+by a leader arrives. Proposals are seen in order, so the proposals of U will have zxids
+higher than zxids seen by the leader. The follower must have arrived after the
+leader election, otherwise the follower would have been elected leader given that
+it has seen a higher zxid. Since committed proposals must be seen by a quorum of
+servers, and a quorum of servers that elected the leader did not see U, the proposals
+of U have not been committed, so they can be discarded. When the follower connects
+to the leader, the leader will tell the follower to discard U.
+</para>
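Editor's note: the corner case above can be pictured with a simplified sketch. It assumes the follower's log is a list of zxids in order and that the leader knows the highest zxid it has seen; the function name and data shapes are hypothetical, and the real synchronization exchange is more involved.

```python
def discard_unseen_proposals(follower_zxids, leader_highest_zxid):
    """Split a follower's log into kept entries and the set U to discard.

    Entries with a zxid above the highest zxid seen by the leader were
    never seen by the electing quorum, so they cannot have been committed.
    """
    kept = [z for z in follower_zxids if z <= leader_highest_zxid]
    discarded = [z for z in follower_zxids if z > leader_highest_zxid]
    return kept, discarded
```

For example, a follower holding zxids `(1, 1)`, `(1, 2)`, and `(1, 3)` that connects to a leader whose highest zxid is `(1, 2)` would keep the first two and discard the third.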
+
+<para>
+A new leader establishes a zxid to start using for new proposals by getting the
+epoch, e, of the highest zxid it has seen and setting the next zxid to use to be
+(e+1, 0). After the leader syncs with a follower, it will propose a NEW_LEADER
+proposal. Once the NEW_LEADER proposal has been committed, the leader will activate
+and start receiving and issuing proposals.
+</para>
+
+<para>
+It all sounds complicated but here are the basic rules of operation during leader
+activation:
+</para>
+
+<itemizedlist>
+<listitem><para>A follower will ACK the NEW_LEADER proposal after it has synced with the leader.</para></listitem>
+<listitem><para>A follower will only ACK a NEW_LEADER proposal with a given zxid from a single server.</para></listitem>
+<listitem><para>A new leader will COMMIT the NEW_LEADER proposal when a quorum of followers have ACKed it.</para></listitem>
+<listitem><para>A follower will commit any state it received from the leader when the NEW_LEADER proposal is COMMITTED.</para></listitem>
+<listitem><para>A new leader will not accept new proposals until the NEW_LEADER proposal has been COMMITTED.</para></listitem>
+</itemizedlist>
+
+<para>
+If leader election terminates erroneously, we don't have a problem: the
+NEW_LEADER proposal will not be committed because the leader will not have quorum.
+When this happens, the leader and any remaining followers will time out and go back
+to leader election.
+</para>
+
+</section>
+
+<section id="sc_activeMessaging">
+<title>Active Messaging</title>
+<para>
+Leader Activation does all the heavy lifting. Once the leader is coronated he can
+start blasting out proposals. As long as he remains the leader no other leader can
+emerge since no other leader will be able to get a quorum of followers. If a new
+leader does emerge,
+it means that the leader has lost quorum, and the new leader will clean up any
+mess left over during her leadership activation.
+</para>
+
+<para>ZooKeeper messaging operates similarly to a classic two-phase commit.</para>
+
+<mediaobject id="fg_2phaseCommit" >
+ <imageobject>
+ <imagedata fileref="images/2pc.jpg"/>
+ </imageobject>
+</mediaobject>
+
+<para>
+All communication channels are FIFO, so everything is done in order. Specifically
+the following operating constraints are observed:</para>
+
+<itemizedlist>
+
+<listitem><para>The leader sends proposals to all followers using
+the same order. Moreover, this order follows the order in which requests have been
+received. Because we use FIFO channels this means that followers also receive proposals in order.
+</para></listitem>
+
+<listitem><para>Followers process messages in the order they are received. This
+means that messages will be ACKed in order and the leader will receive ACKs from
+followers in order, due to the FIFO channels. It also means that if message m
+has been written to non-volatile storage, all messages that were proposed before
+m have been written to non-volatile storage.</para></listitem>
+
+<listitem><para>The leader will issue a COMMIT to all followers as soon as a
+quorum of followers have ACKed a message. Since messages are ACKed in order,
+COMMITs will be sent by the leader and received by the followers in order.</para></listitem>
+
+<listitem><para>COMMITs are processed in order. Followers deliver a proposal's
+message when that proposal is committed.</para></listitem>
+
+</itemizedlist>
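Editor's note: the constraints listed above can be sketched as a small bookkeeping class on the leader side. This is an illustrative model under stated assumptions, not ZooKeeper's implementation: the class and method names are hypothetical, the leader's own vote is counted as an ACK, and commits are released strictly in proposal order.

```python
class LeaderCommitTracker:
    """Tracks ACKs per proposal and commits proposals in proposal order."""

    def __init__(self, num_servers: int):
        self.quorum = num_servers // 2 + 1
        self.pending = []    # zxids in the order they were proposed
        self.acks = {}       # zxid -> set of server ids that ACKed
        self.committed = []  # zxids in commit order

    def propose(self, zxid: int, leader_id: str) -> None:
        # The leader counts as a follower, so it ACKs its own proposal.
        self.pending.append(zxid)
        self.acks[zxid] = {leader_id}

    def ack(self, zxid: int, server_id: str) -> list:
        """Record an ACK and return the zxids that become committed."""
        self.acks[zxid].add(server_id)
        newly_committed = []
        # Only the oldest pending proposal may commit, which keeps COMMITs ordered.
        while self.pending and len(self.acks[self.pending[0]]) >= self.quorum:
            head = self.pending.pop(0)
            self.committed.append(head)
            newly_committed.append(head)
        return newly_committed
```

With FIFO channels a single follower always ACKs in order, so the head-of-queue check mainly matters when ACKs from different followers interleave.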
+
+</section>
+
+<section id="sc_summary">
+<title>Summary</title>
+<para>So there you go. Why does it work? Specifically, why does the set of proposals
+believed by a new leader always contain any proposal that has actually been committed?
+First, all proposals have a unique zxid, so unlike other protocols, we never have
+to worry about two different values being proposed for the same zxid; followers
+(a leader is also a follower) see and record proposals in order; proposals are
+committed in order; there is only one active leader at a time since followers only
+follow a single leader at a time; a new leader has seen all committed proposals
+from the previous epoch since it has seen the highest zxid from a quorum of servers;
+any uncommitted proposals from a previous epoch seen by a new leader will be committed
+by that leader before it becomes active.</para></section>
+
+<section id="sc_comparisons"><title>Comparisons</title>
+<para>
+Isn't this just Multi-Paxos? No, Multi-Paxos requires some way of assuring that
+there is only a single coordinator. We do not count on such assurances. Instead
+we use the leader activation to recover from leadership change or old leaders
+believing they are still active.
+</para>
+
+<para>
+Isn't this just Paxos? Your active messaging phase looks just like phase 2 of Paxos?
+Actually, to us active messaging looks just like 2 phase commit without the need to
+handle aborts. Active messaging is different from both in the sense that it has
+cross proposal ordering requirements. If we do not maintain strict FIFO ordering of
+all packets, it all falls apart. Also, our leader activation phase is different from
+both of them. In particular, our use of epochs allows us to skip blocks of uncommitted
+proposals and to not worry about duplicate proposals for a given zxid.
+</para>
+
+</section>
+
+</section>
+
+<section id="sc_quorum">
+<title>Quorums</title>
+
+<para>
+Atomic broadcast and leader election use the notion of quorum to guarantee a consistent
+view of the system. By default, ZooKeeper uses majority quorums, which means that every
+vote that takes place in one of these protocols requires a majority of the servers. One example is
+acknowledging a leader proposal: the leader can only commit once it receives an
+acknowledgement from a quorum of servers.
+</para>
+
+<para>
+If we extract the properties that we really need from our use of majorities, we have that we only
+need to guarantee that groups of processes used to validate an operation by voting (e.g., acknowledging
+a leader proposal) pairwise intersect in at least one server. Using majorities guarantees such a property.
+However, there are other ways of constructing quorums different from majorities. For example, we can assign
+weights to the votes of servers, and say that the votes of some servers are more important. To obtain a quorum,
+we get enough votes so that the sum of weights of all votes is larger than half of the total sum of all weights.
+</para>
+
+<para>
+A different construction that uses weights and is useful in wide-area deployments (co-locations) is a hierarchical
+one. With this construction, we split the servers into disjoint groups and assign weights to processes. To form
+a quorum, we have to get a hold of enough servers from a majority of groups G, such that for each group g in G,
+the sum of votes from g is larger than half of the sum of weights in g. Interestingly, this construction enables
+smaller quorums. If we have, for example, 9 servers, we split them into 3 groups, and assign a weight of 1 to each
+server, then we are able to form quorums of size 4. Note that two subsets of processes composed each of a majority
+of servers from each of a majority of groups necessarily have a non-empty intersection. It is reasonable to expect
+that a majority of co-locations will have a majority of servers available with high probability.
+</para>
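+
+<para>
+The three constructions above can be sketched in a few lines of Java. This is only an illustration
+under stated assumptions: the class name QuorumSketch, its method names, and the use of integer
+group ids are hypothetical and are not ZooKeeper's own quorum verifier code. The last method
+reproduces the example of 9 servers in 3 groups with a weight of 1 each.
+</para>
+<programlisting><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: the class and method names are hypothetical
// and are not ZooKeeper's own quorum verifier implementation.
class QuorumSketch {

    // Majority quorum: strictly more than half of all servers voted.
    static boolean majority(Set<Long> votes, int totalServers) {
        return 2 * votes.size() > totalServers;
    }

    // Weighted quorum: the voters' weights sum to more than half of
    // the total weight of all servers.
    static boolean weighted(Set<Long> votes, Map<Long, Long> weights) {
        long total = 0;
        long voted = 0;
        for (Map.Entry<Long, Long> e : weights.entrySet()) {
            total += e.getValue();
            if (votes.contains(e.getKey())) {
                voted += e.getValue();
            }
        }
        return 2 * voted > total;
    }

    // Hierarchical quorum: in a majority of groups, the voters hold more
    // than half of that group's weight. groupOf maps server id to group id.
    static boolean hierarchical(Set<Long> votes, Map<Long, Long> weights,
                                Map<Long, Integer> groupOf) {
        Map<Integer, Long> groupTotal = new HashMap<>();
        Map<Integer, Long> groupVoted = new HashMap<>();
        for (Map.Entry<Long, Integer> e : groupOf.entrySet()) {
            long w = weights.getOrDefault(e.getKey(), 0L);
            groupTotal.merge(e.getValue(), w, Long::sum);
            if (votes.contains(e.getKey())) {
                groupVoted.merge(e.getValue(), w, Long::sum);
            }
        }
        int satisfied = 0;
        for (Map.Entry<Integer, Long> e : groupTotal.entrySet()) {
            long voted = groupVoted.getOrDefault(e.getKey(), 0L);
            if (2 * voted > e.getValue()) {
                satisfied++;
            }
        }
        return 2 * satisfied > groupTotal.size();
    }

    // The example from the text: servers 1..9, three groups of three,
    // weight 1 each.
    static boolean nineServerExample(Set<Long> votes) {
        Map<Long, Long> weights = new HashMap<>();
        Map<Long, Integer> groupOf = new HashMap<>();
        for (long id = 1; id <= 9; id++) {
            weights.put(id, 1L);
            groupOf.put(id, (int) ((id - 1) / 3));
        }
        return hierarchical(votes, weights, groupOf);
    }
}
```
+]]></programlisting>
+<para>
+With this sketch, the votes of servers 1, 2, 4 and 5 form a hierarchical quorum of size 4 (two servers
+from each of two groups), whereas the votes of servers 1, 2, 3 and 4 do not, because only one group
+contributes a majority. Neither set would be a majority quorum of 9 servers.
+</para>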
+
+<para>
+With ZooKeeper, we provide a user with the ability of configuring servers to use majority quorums, weights, or a
+hierarchy of groups.
+</para>
+</section>
+
+<section id="sc_logging">
+
+<title>Logging</title>
+<para>
+ZooKeeper uses
+<ulink url="http://www.slf4j.org/index.html">slf4j</ulink> as an abstraction layer for logging.
+<ulink url="http://logging.apache.org/log4j">log4j</ulink> in version 1.2 is chosen as the final logging implementation for now.
+For better embedding support, it is planned in the future to leave the decision of choosing the final logging implementation to the end user.
+Therefore, always use the slf4j api to write log statements in the code, but configure log4j for how to log at runtime.
+Note that slf4j has no FATAL level; messages formerly logged at FATAL level have been moved to ERROR level.
+For information on configuring log4j for
+ZooKeeper, see the <ulink url="zookeeperAdmin.html#sc_logging">Logging</ulink> section
+of the <ulink url="zookeeperAdmin.html">ZooKeeper Administrator's Guide.</ulink>
+
+</para>
+
+<section id="sc_developerGuidelines"><title>Developer Guidelines</title>
+
+<para>Please follow the
+<ulink url="http://www.slf4j.org/manual.html">slf4j manual</ulink> when creating log statements within code.
+Also read the
+<ulink url="http://www.slf4j.org/faq.html#logging_performance">FAQ on performance</ulink>
+before writing log statements. Patch reviewers will look for the following:</para>
+<section id="sc_rightLevel"><title>Logging at the Right Level</title>
+<para>
+There are several levels of logging in slf4j.
+It's important to pick the right one. In order from highest to lowest severity:</para>
+<orderedlist>
+ <listitem><para>ERROR level designates error events that might still allow the application to continue running.</para></listitem>
+ <listitem><para>WARN level designates potentially harmful situations.</para></listitem>
+ <listitem><para>INFO level designates informational messages that highlight the progress of the application at coarse-grained level.</para></listitem>
+ <listitem><para>DEBUG level designates fine-grained informational events that are most useful to debug an application.</para></listitem>
+ <listitem><para>TRACE level designates finer-grained informational events than DEBUG.</para></listitem>
+</orderedlist>
+
+<para>
+ZooKeeper is typically run in production such that log messages of INFO level
+severity and higher (more severe) are output to the log.</para>
+
+
+</section>
+
+<section id="sc_slf4jIdioms"><title>Use of Standard slf4j Idioms</title>
+
+<para><emphasis>Static Message Logging</emphasis></para>
+<programlisting>
+LOG.debug("process completed successfully!");
+</programlisting>
+
+<para>
+However, when parameterized messages are required, use formatting anchors.
+</para>
+
+<programlisting>
+LOG.debug("got {} messages in {} minutes", count, time);
+</programlisting>
+
+
+<para><emphasis>Naming</emphasis></para>
+
+<para>
+Loggers should be named after the class in which they are used.
+</para>
+
+<programlisting>
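+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class Foo {
+    // one static logger per class, named after the class itself
+    private static final Logger LOG = LoggerFactory.getLogger(Foo.class);
+    ....
+    public Foo() {
+        LOG.info("constructing Foo");
+    }
+}
+</programlisting>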
+
+<para><emphasis>Exception handling</emphasis></para>
+<programlisting>
+try {
+ // code
+} catch (XYZException e) {
+ // do this
+ LOG.error("Something bad happened", e);
+ // don't do this (generally)
+ // LOG.error(e);
+ // why? because "don't do" case hides the stack trace
+
+ // continue process here as you need... recover or (re)throw
+}
+</programlisting>
+</section>
+</section>
+
+</section>
+
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperJMX.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperJMX.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperJMX.xml
new file mode 100644
index 0000000..f0ea636
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperJMX.xml
@@ -0,0 +1,236 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_zookeeperjmx">
+ <title>ZooKeeper JMX</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>ZooKeeper support for JMX</para>
+ </abstract>
+ </articleinfo>
+
+ <section id="ch_jmx">
+ <title>JMX</title>
+ <para>Apache ZooKeeper has extensive support for JMX, allowing you
+ to view and manage a ZooKeeper serving ensemble.</para>
+
+ <para>This document assumes that you have basic knowledge of
+ JMX. See the <ulink
+ url="http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/">
+ Sun JMX Technology</ulink> page to get started with JMX.
+ </para>
+
+ <para>See the <ulink
+ url="http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html">
+ JMX Management Guide</ulink> for details on setting up local and
+ remote management of VM instances. By default the included
+ <emphasis>zkServer.sh</emphasis> supports only local management -
+ review the linked document to enable support for remote management
+ (beyond the scope of this document).
+ </para>
+
+ </section>
+
+ <section id="ch_starting">
+ <title>Starting ZooKeeper with JMX enabled</title>
+
+ <para>The class
+ <emphasis>org.apache.zookeeper.server.quorum.QuorumPeerMain</emphasis>
+ will start a JMX manageable ZooKeeper server. This class
+ registers the proper MBeans during initialization to support JMX
+ monitoring and management of the
+ instance. See <emphasis>bin/zkServer.sh</emphasis> for one
+ example of starting ZooKeeper using QuorumPeerMain.</para>
+ </section>
+
+ <section id="ch_console">
+ <title>Run a JMX console</title>
+
+ <para>There are a number of JMX consoles available which can connect
+ to the running server. For this example we will use Sun's
+ <emphasis>jconsole</emphasis>.</para>
+
+ <para>The Java JDK ships with a simple JMX console
+ named <ulink url="http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html">jconsole</ulink>
+ which can be used to connect to ZooKeeper and inspect a running
+ server. Once you've started ZooKeeper using QuorumPeerMain,
+ start <emphasis>jconsole</emphasis>, which typically resides in
+ <emphasis>JDK_HOME/bin/jconsole</emphasis>.</para>
+
+ <para>When the "new connection" window is displayed, either connect
+ to the local process (if jconsole was started on the same host as the server) or
+ use the remote process connection.</para>
+
+ <para>By default the "overview" tab for the VM is displayed (this
+ is a great way to get insight into the VM, by the way). Select
+ the "MBeans" tab.</para>
+
+ <para>You should now see <emphasis>org.apache.ZooKeeperService</emphasis>
+ on the left hand side. Expand this item and, depending on how you've
+ started the server, you will be able to monitor and manage various
+ service-related features.</para>
+
+ <para>Also note that ZooKeeper will register log4j MBeans as
+ well. In the same section along the left hand side you will see
+ "log4j". Expand that to manage log4j through JMX. Of particular
+ interest is the ability to dynamically change the logging levels
+ used by editing the appender and root thresholds. Log4j MBean
+ registration can be disabled by passing
+ <emphasis>-Dzookeeper.jmx.log4j.disable=true</emphasis> to the JVM
+ when starting ZooKeeper.
+ </para>
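+
+ <para>The same MBeans can also be listed programmatically. The following
+ is a minimal sketch, assuming the code runs inside the JVM that hosts the
+ server (for a separate process you would obtain an MBeanServerConnection
+ from a JMX connector instead). The class name ZkMBeanLister and its
+ methods are hypothetical helpers, not part of ZooKeeper; only the domain
+ name <emphasis>org.apache.ZooKeeperService</emphasis> is taken from the
+ text above.</para>
+ <programlisting><![CDATA[
```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper for illustration; not part of ZooKeeper.
class ZkMBeanLister {
    // JMX domain under which the ZooKeeper MBeans appear in jconsole.
    static final String DOMAIN = "org.apache.ZooKeeperService";

    // Returns the object names of all MBeans registered in the domain.
    static Set<ObjectName> list(MBeanServerConnection conn)
            throws IOException, MalformedObjectNameException {
        ObjectName pattern = new ObjectName(DOMAIN + ":*");
        return conn.queryNames(pattern, null);
    }

    // Convenience wrapper that queries the MBean server of the current JVM.
    static Set<ObjectName> listLocal() {
        try {
            return list(ManagementFactory.getPlatformMBeanServer());
        } catch (IOException | MalformedObjectNameException e) {
            throw new IllegalStateException("could not query MBeans", e);
        }
    }
}
```
+ ]]></programlisting>
+ <para>In a JVM that does not host a ZooKeeper server the returned set is
+ simply empty; otherwise it holds the object names described in the
+ reference tables of this document.</para>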
+
+ </section>
+
+ <section id="ch_reference">
+ <title>ZooKeeper MBean Reference</title>
+
+ <para>This table details JMX for a server participating in a
+ replicated ZooKeeper ensemble (i.e., not standalone). This is the
+ typical case for a production environment.</para>
+
+ <table>
+ <title>MBeans, their names and description</title>
+
+ <tgroup cols='3'>
+ <thead>
+ <row>
+ <entry>MBean</entry>
+ <entry>MBean Object Name</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>Quorum</entry>
+ <entry>ReplicatedServer_id&lt;#&gt;</entry>
+ <entry>Represents the Quorum, or Ensemble - parent of all
+ cluster members. Note that the object name includes the
+ "myid" of the server (name suffix) that your JMX agent has
+ connected to.</entry>
+ </row>
+ <row>
+ <entry>LocalPeer|RemotePeer</entry>
+ <entry>replica.&lt;#&gt;</entry>
+ <entry>Represents a local or remote peer (i.e., a server
+ participating in the ensemble). Note that the object name
+ includes the "myid" of the server (name suffix).</entry>
+ </row>
+ <row>
+ <entry>LeaderElection</entry>
+ <entry>LeaderElection</entry>
+ <entry>Represents a ZooKeeper cluster leader election which is
+ in progress. Provides information about the election, such as
+ when it started.</entry>
+ </row>
+ <row>
+ <entry>Leader</entry>
+ <entry>Leader</entry>
+ <entry>Indicates that the parent replica is the leader and
+ provides attributes/operations for that server. Note that
+ Leader is a subclass of ZooKeeperServer, so it provides
+ all of the information normally associated with a
+ ZooKeeperServer node.</entry>
+ </row>
+ <row>
+ <entry>Follower</entry>
+ <entry>Follower</entry>
+ <entry>Indicates that the parent replica is a follower and
+ provides attributes/operations for that server. Note that
+ Follower is a subclass of ZooKeeperServer, so it provides
+ all of the information normally associated with a
+ ZooKeeperServer node.</entry>
+ </row>
+ <row>
+ <entry>DataTree</entry>
+ <entry>InMemoryDataTree</entry>
+ <entry>Statistics on the in memory znode database, also
+ operations to access finer (and more computationally
+ intensive) statistics on the data (such as ephemeral
+ count). InMemoryDataTrees are children of ZooKeeperServer
+ nodes.</entry>
+ </row>
+ <row>
+ <entry>ServerCnxn</entry>
+ <entry>&lt;session_id&gt;</entry>
+ <entry>Statistics on each client connection, also
+ operations on those connections (such as
+ termination). Note the object name is the session id of
+ the connection in hex form.</entry>
+ </row>
+ </tbody></tgroup></table>
+
+ <para>This table details JMX for a standalone server. Typically
+ standalone is only used in development situations.</para>
+
+ <table>
+ <title>MBeans, their names and description</title>
+
+ <tgroup cols='3'>
+ <thead>
+ <row>
+ <entry>MBean</entry>
+ <entry>MBean Object Name</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>ZooKeeperServer</entry>
+ <entry>StandaloneServer_port&lt;#&gt;</entry>
+ <entry>Statistics on the running server, also operations
+ to reset these attributes. Note that the object name
+ includes the client port of the server (name
+ suffix).</entry>
+ </row>
+ <row>
+ <entry>DataTree</entry>
+ <entry>InMemoryDataTree</entry>
+ <entry>Statistics on the in memory znode database, also
+ operations to access finer (and more computationally
+ intensive) statistics on the data (such as ephemeral
+ count).</entry>
+ </row>
+ <row>
+ <entry>ServerCnxn</entry>
+ <entry>&lt;session_id&gt;</entry>
+ <entry>Statistics on each client connection, also
+ operations on those connections (such as
+ termination). Note the object name is the session id of
+ the connection in hex form.</entry>
+ </row>
+ </tbody></tgroup></table>
+
+ </section>
+
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperObservers.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperObservers.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperObservers.xml
new file mode 100644
index 0000000..3955f3d
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperObservers.xml
@@ -0,0 +1,145 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_GettStartedGuide">
+ <title>ZooKeeper Observers</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License"); you
+ may not use this file except in compliance with the License. You may
+ obtain a copy of the License
+ at <ulink url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ License for the specific language governing permissions and limitations
+ under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This guide contains information about using non-voting servers, or
+ observers, in your ZooKeeper ensembles.</para>
+ </abstract>
+ </articleinfo>
+
+ <section id="ch_Introduction">
+ <title>Observers: Scaling ZooKeeper Without Hurting Write Performance
+ </title>
+ <para>
+ Although ZooKeeper performs very well by having clients connect directly
+ to voting members of the ensemble, this architecture makes it hard to
+ scale out to huge numbers of clients. The problem is that as we add more
+ voting members, the write performance drops. This is due to the fact that
+ a write operation requires the agreement of (in general) more than half the
+ nodes in an ensemble and therefore the cost of a vote can increase
+ significantly as more voters are added.
+ </para>
+ <para>
+ We have introduced a new type of ZooKeeper node called
+ an <emphasis>Observer</emphasis> which helps address this problem and
+ further improves ZooKeeper's scalability. Observers are non-voting members
+ of an ensemble which only hear the results of votes, not the agreement
+ protocol that leads up to them. Other than this simple distinction,
+ Observers function exactly the same as Followers - clients may connect to
+ them and send read and write requests to them. Observers forward these
+ requests to the Leader like Followers do, but they then simply wait to
+ hear the result of the vote. Because of this, we can increase the number
+ of Observers as much as we like without harming the performance of votes.
+ </para>
+ <para>
+ Observers have other advantages. Because they do not vote, they are not a
+ critical part of the ZooKeeper ensemble. Therefore they can fail, or be
+ disconnected from the cluster, without harming the availability of the
+ ZooKeeper service. The benefit to the user is that Observers may connect
+ over less reliable network links than Followers. In fact, Observers may be
+ used to talk to a ZooKeeper server from another data center. Clients of
+ the Observer will see fast reads, as all reads are served locally, and
+ writes result in minimal network traffic as the number of messages
+ required in the absence of the vote protocol is smaller.
+ </para>
+ </section>
+ <section id="sc_UsingObservers">
+ <title>How to use Observers</title>
+ <para>Setting up a ZooKeeper ensemble that uses Observers is very simple,
+ and requires just two changes to your config files. Firstly, in the config
+ file of every node that is to be an Observer, you must place this line:
+ </para>
+ <programlisting>
+ peerType=observer
+ </programlisting>
+
+ <para>
+ This line tells ZooKeeper that the server is to be an Observer. Secondly,
+ in every server config file, you must add :observer to the server
+ definition line of each Observer. For example:
+ </para>
+
+ <programlisting>
+ server.1=localhost:2181:3181:observer
+ </programlisting>
+
+ <para>
+ This tells every other server that server.1 is an Observer, and that they
+ should not expect it to vote. This is all the configuration you need to do
+ to add an Observer to your ZooKeeper cluster. Now you can connect to it as
+ though it were an ordinary Follower. Try it out, by running:</para>
+ <programlisting>
+ $ bin/zkCli.sh -server localhost:2181
+ </programlisting>
+ <para>
+ where localhost:2181 is the hostname and port number of the Observer as
+ specified in every config file. You should see a command line prompt
+ through which you can issue commands like <emphasis>ls</emphasis> to query
+ the ZooKeeper service.
+ </para>
+ </section>
+
+ <section id="ch_UseCases">
+ <title>Example use cases</title>
+ <para>
+ Two example use cases for Observers are listed below. In fact, wherever
+ you wish to scale the number of clients of your ZooKeeper ensemble, or
+ where you wish to insulate the critical part of an ensemble from the load
+ of dealing with client requests, Observers are a good architectural
+ choice.
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para> As a datacenter bridge: Forming a ZK ensemble between two
+ datacenters is a problematic endeavour as the high variance in latency
+ between the datacenters could lead to false positive failure detection
+ and partitioning. However, if the ensemble runs entirely in one
+ datacenter, and the second datacenter runs only Observers, partitions
+ aren't problematic as the ensemble remains connected. Clients of the
+ Observers may still see and issue proposals.</para>
+ </listitem>
+ <listitem>
+ <para>As a link to a message bus: Some companies have expressed an
+ interest in using ZK as a component of a persistent reliable message
+ bus. Observers would give a natural integration point for this work: a
+ plug-in mechanism could be used to attach the stream of proposals an
+ Observer sees to a publish-subscribe system, again without loading the
+ core ensemble.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </section>
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperOtherInfo.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperOtherInfo.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperOtherInfo.xml
new file mode 100644
index 0000000..a2445b1
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperOtherInfo.xml
@@ -0,0 +1,46 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_OtherInfo">
+ <title>ZooKeeper</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para> currently empty </para>
+ </abstract>
+ </articleinfo>
+
+ <section id="ch_placeholder">
+ <title>Other Info</title>
+ <para> currently empty </para>
+ </section>
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperOver.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperOver.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperOver.xml
new file mode 100644
index 0000000..7a0444c
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperOver.xml
@@ -0,0 +1,464 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_Overview">
+ <title>ZooKeeper</title>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This document contains overview information about ZooKeeper. It
+ discusses design goals, key concepts, implementation, and
+ performance.</para>
+ </abstract>
+ </articleinfo>
+
+ <section id="ch_DesignOverview">
+ <title>ZooKeeper: A Distributed Coordination Service for Distributed
+ Applications</title>
+
+ <para>ZooKeeper is a distributed, open-source coordination service for
+ distributed applications. It exposes a simple set of primitives that
+ distributed applications can build upon to implement higher level services
+ for synchronization, configuration maintenance, and groups and naming. It
+ is designed to be easy to program to, and uses a data model styled after
+ the familiar directory tree structure of file systems. It runs in Java and
+ has bindings for both Java and C.</para>
+
+ <para>Coordination services are notoriously hard to get right. They are
+ especially prone to errors such as race conditions and deadlock. The
+ motivation behind ZooKeeper is to relieve distributed applications of the
+ responsibility of implementing coordination services from scratch.</para>
+
+ <section id="sc_designGoals">
+ <title>Design Goals</title>
+
+ <para><emphasis role="bold">ZooKeeper is simple.</emphasis> ZooKeeper
+ allows distributed processes to coordinate with each other through a
+ shared hierarchical namespace which is organized similarly to a standard
+ file system. The name space consists of data registers - called znodes,
+ in ZooKeeper parlance - and these are similar to files and directories.
+ Unlike a typical file system, which is designed for storage, ZooKeeper
+ data is kept in-memory, which means ZooKeeper can achieve high
+ throughput and low latency numbers.</para>
+
+ <para>The ZooKeeper implementation puts a premium on high performance,
+ highly available, strictly ordered access. The performance aspects of
+ ZooKeeper mean it can be used in large, distributed systems. The
+ reliability aspects keep it from being a single point of failure. The
+ strict ordering means that sophisticated synchronization primitives can
+ be implemented at the client.</para>
+
+ <para><emphasis role="bold">ZooKeeper is replicated.</emphasis> Like the
+ distributed processes it coordinates, ZooKeeper itself is intended to be
+ replicated over a set of hosts called an ensemble.</para>
+
+ <figure>
+ <title>ZooKeeper Service</title>
+
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="images/zkservice.jpg" />
+ </imageobject>
+ </mediaobject>
+ </figure>
+
+ <para>The servers that make up the ZooKeeper service must all know about
+ each other. They maintain an in-memory image of state, along with
+ transaction logs and snapshots in a persistent store. As long as a
+ majority of the servers are available, the ZooKeeper service will be
+ available.</para>
+
+ <para>Clients connect to a single ZooKeeper server. The client maintains
+ a TCP connection through which it sends requests, gets responses, gets
+ watch events, and sends heart beats. If the TCP connection to the server
+ breaks, the client will connect to a different server.</para>
+
+ <para><emphasis role="bold">ZooKeeper is ordered.</emphasis> ZooKeeper
+ stamps each update with a number that reflects the order of all
+ ZooKeeper transactions. Subsequent operations can use the order to
+ implement higher-level abstractions, such as synchronization
+ primitives.</para>
+
+ <para><emphasis role="bold">ZooKeeper is fast.</emphasis> It is
+ especially fast in "read-dominant" workloads. ZooKeeper applications run
+ on thousands of machines, and it performs best where reads are more
+ common than writes, at ratios of around 10:1.</para>
+ </section>
+
+ <section id="sc_dataModelNameSpace">
+ <title>Data model and the hierarchical namespace</title>
+
+ <para>The name space provided by ZooKeeper is much like that of a
+ standard file system. A name is a sequence of path elements separated by
+ a slash (/). Every node in ZooKeeper's name space is identified by a
+ path.</para>
+
+ <figure>
+ <title>ZooKeeper's Hierarchical Namespace</title>
+
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="images/zknamespace.jpg" />
+ </imageobject>
+ </mediaobject>
+ </figure>
+ </section>
+
+ <section>
+ <title>Nodes and ephemeral nodes</title>
+
+ <para>Unlike in standard file systems, each node in a ZooKeeper
+ namespace can have data associated with it as well as children. It is
+ like having a file-system that allows a file to also be a directory.
+ (ZooKeeper was designed to store coordination data: status information,
+ configuration, location information, etc., so the data stored at each
+ node is usually small, in the byte to kilobyte range.) We use the term
+ <emphasis>znode</emphasis> to make it clear that we are talking about
+ ZooKeeper data nodes.</para>
+
+ <para>Znodes maintain a stat structure that includes version numbers for
+ data changes, ACL changes, and timestamps, to allow cache validations
+ and coordinated updates. Each time a znode's data changes, the version
+ number increases. For instance, whenever a client retrieves data it also
+ receives the version of the data.</para>
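+
+ <para>One way to picture how version numbers support coordinated updates
+ is as a compare-and-set: a writer states the version it last read, and
+ the write is applied only if that version is still current. The sketch
+ below is a small in-memory model written for this guide, not the
+ ZooKeeper client API; the class VersionedNode and its methods are
+ hypothetical, and treating -1 as "any version" is an assumption of the
+ model.</para>
+ <programlisting><![CDATA[
```java
// In-memory model for illustration only; hypothetical, not the ZooKeeper API.
class VersionedNode {
    private byte[] data;
    private int version = 0; // incremented on every successful data change

    VersionedNode(byte[] initial) {
        data = initial.clone();
    }

    synchronized int getVersion() {
        return version;
    }

    synchronized byte[] getData() {
        return data.clone();
    }

    // Replaces the data only if expectedVersion matches the current
    // version; passing -1 skips the check. Returns true on success.
    synchronized boolean setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            return false; // someone else changed the node in the meantime
        }
        data = newData.clone();
        version++;
        return true;
    }
}
```
+ ]]></programlisting>
+ <para>A writer that reads version 0, then attempts an update after another
+ writer has already moved the node to version 1, gets a failure and can
+ re-read the data before retrying.</para>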
+
+ <para>The data stored at each znode in a namespace is read and written
+ atomically. Reads get all the data bytes associated with a znode and a
+ write replaces all the data. Each node has an Access Control List (ACL)
+ that restricts who can do what.</para>
+
+ <para>ZooKeeper also has the notion of ephemeral nodes. These znodes
+ exist as long as the session that created the znode is active. When the
+ session ends the znode is deleted. Ephemeral nodes are useful when you
+ want to implement <emphasis>[tbd]</emphasis>.</para>
+ </section>
+
+ <section>
+ <title>Conditional updates and watches</title>
+
+ <para>ZooKeeper supports the concept of <emphasis>watches</emphasis>.
+ Clients can set a watch on a znode. A watch will be triggered and
+ removed when the znode changes. When a watch is triggered the client
+ receives a packet saying that the znode has changed. And if the
+ connection between the client and one of the Zoo Keeper servers is
+ broken, the client will receive a local notification. These can be used
+ to <emphasis>[tbd]</emphasis>.</para>
+ </section>
+
+ <section>
+ <title>Guarantees</title>
+
+ <para>ZooKeeper is very fast and very simple. Since its goal, though, is
+ to be a basis for the construction of more complicated services, such as
+ synchronization, it provides a set of guarantees. These are:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>Sequential Consistency - Updates from a client will be applied
+ in the order that they were sent.</para>
+ </listitem>
+
+ <listitem>
+ <para>Atomicity - Updates either succeed or fail. No partial
+ results.</para>
+ </listitem>
+
+ <listitem>
+ <para>Single System Image - A client will see the same view of the
+ service regardless of the server that it connects to.</para>
+ </listitem>
+
+ <listitem>
+ <para>Reliability - Once an update has been applied, it will persist
+ from that time forward until a client overwrites the update.</para>
+ </listitem>
+
+ <listitem>
+ <para>Timeliness - The client's view of the system is guaranteed to
+ be up-to-date within a certain time bound.</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>For more information on these, and how they can be used, see
+ <emphasis>[tbd]</emphasis></para>
+ </section>
+
+ <section>
+ <title>Simple API</title>
+
+ <para>One of the design goals of ZooKeeper is to provide a very simple
+ programming interface. As a result, it supports only these
+ operations:</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>create</term>
+
+ <listitem>
+ <para>creates a node at a location in the tree</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>delete</term>
+
+ <listitem>
+ <para>deletes a node</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>exists</term>
+
+ <listitem>
+ <para>tests if a node exists at a location</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>get data</term>
+
+ <listitem>
+ <para>reads the data from a node</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>set data</term>
+
+ <listitem>
+ <para>writes data to a node</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>get children</term>
+
+ <listitem>
+ <para>retrieves a list of children of a node</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>sync</term>
+
+ <listitem>
+ <para>waits for data to be propagated</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>For a more in-depth discussion on these, and how they can be used
+ to implement higher level operations, please refer to
+ <emphasis>[tbd]</emphasis></para>
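+
+ <para>As a rough illustration, the operations above might map onto the
+ Java client as in the following sketch. The class name
+ <computeroutput>ApiTour</computeroutput>, the connection string
+ <computeroutput>127.0.0.1:2181</computeroutput>, the 10 second session
+ timeout, and the path <computeroutput>/api_tour</computeroutput> are
+ assumptions made for this example, which needs a ZooKeeper server
+ reachable at that address. The sync operation is left out for
+ brevity.</para>
+
+ <programlisting><![CDATA[
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.concurrent.CountDownLatch;
+
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.Watcher.Event.KeeperState;
+import org.apache.zookeeper.ZooDefs;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.data.Stat;
+
+public class ApiTour {
+    public static void main(String[] args) throws Exception {
+        // Block until the session reaches the connected state.
+        CountDownLatch connected = new CountDownLatch(1);
+        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 10000, event -> {
+            if (event.getState() == KeeperState.SyncConnected) {
+                connected.countDown();
+            }
+        });
+        connected.await();
+        try {
+            String path = "/api_tour"; // hypothetical znode for this sketch
+
+            // create: add a znode holding a few bytes of data
+            zk.create(path, "v1".getBytes(StandardCharsets.UTF_8),
+                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
+
+            // exists: returns the Stat, or null if there is no such znode
+            Stat stat = zk.exists(path, false);
+
+            // get data: the read also fills in the current Stat
+            byte[] data = zk.getData(path, false, stat);
+            System.out.println(new String(data, StandardCharsets.UTF_8)
+                    + " at version " + stat.getVersion());
+
+            // set data: succeeds only if the supplied version still matches
+            stat = zk.setData(path, "v2".getBytes(StandardCharsets.UTF_8),
+                    stat.getVersion());
+
+            // get children: a freshly created znode has none
+            List<String> children = zk.getChildren(path, false);
+            System.out.println("children: " + children);
+
+            // delete: also conditional on the data version
+            zk.delete(path, stat.getVersion());
+        } finally {
+            zk.close();
+        }
+    }
+}
+]]></programlisting>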
+ </section>
+
+ <section>
+ <title>Implementation</title>
+
+ <para><xref linkend="fg_zkComponents" /> shows the high-level components
+ of the ZooKeeper service. With the exception of the request processor,
+ each of the servers that make up the ZooKeeper service replicates its
+ own copy of each of the components.</para>
+
+ <figure id="fg_zkComponents">
+ <title>ZooKeeper Components</title>
+
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="images/zkcomponents.jpg" />
+ </imageobject>
+ </mediaobject>
+ </figure>
+
+ <para>The replicated database is an in-memory database containing the
+ entire data tree. Updates are logged to disk for recoverability, and
+ writes are serialized to disk before they are applied to the in-memory
+ database.</para>
+
+ <para>Every ZooKeeper server services clients. Clients connect to
+ exactly one server to submit requests. Read requests are serviced from
+ the local replica of each server database. Requests that change the
+ state of the service, write requests, are processed by an agreement
+ protocol.</para>
+
+ <para>As part of the agreement protocol all write requests from clients
+ are forwarded to a single server, called the
+ <emphasis>leader</emphasis>. The rest of the ZooKeeper servers, called
+ <emphasis>followers</emphasis>, receive message proposals from the
+ leader and agree upon message delivery. The messaging layer takes care
+ of replacing leaders on failures and syncing followers with
+ leaders.</para>
+
+ <para>ZooKeeper uses a custom atomic messaging protocol. Since the
+ messaging layer is atomic, ZooKeeper can guarantee that the local
+ replicas never diverge. When the leader receives a write request, it
+ calculates what the state of the system is when the write is to be
+ applied and transforms this into a transaction that captures this new
+ state.</para>
+ </section>
+
+ <section>
+ <title>Uses</title>
+
+ <para>The programming interface to ZooKeeper is deliberately simple.
+ With it, however, you can implement higher order operations, such as
+ synchronization primitives, group membership, ownership, etc. Some
+ distributed applications have used it to: <emphasis>[tbd: add uses from
+ white paper and video presentation.]</emphasis> For more information, see
+ <emphasis>[tbd]</emphasis></para>
+ </section>
+
+ <section>
+ <title>Performance</title>
+
+ <para>ZooKeeper is designed to be highly performant. But is it? The
+ results from ZooKeeper's development team at Yahoo! Research indicate
+ that it is. (See <xref linkend="fg_zkPerfRW" />.) It performs especially
+ well in applications where reads outnumber writes, since writes
+ involve synchronizing the state of all servers. (Reads outnumbering
+ writes is typically the case for a coordination service.)</para>
+
+ <figure id="fg_zkPerfRW">
+ <title>ZooKeeper Throughput as the Read-Write Ratio Varies</title>
+
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="images/zkperfRW-3.2.jpg" />
+ </imageobject>
+ </mediaobject>
+ </figure>
+ <para>The figure <xref linkend="fg_zkPerfRW"/> is a throughput
+ graph of ZooKeeper release 3.2 running on servers with dual 2GHz
+ Xeon and two SATA 15K RPM drives. One drive was used as a
+ dedicated ZooKeeper log device. The snapshots were written to
+ the OS drive. Write requests were 1K writes and the reads were
+ 1K reads. "Servers" indicate the size of the ZooKeeper
+ ensemble, the number of servers that make up the
+ service. Approximately 30 other servers were used to simulate
+ the clients. The ZooKeeper ensemble was configured such that
+ leaders do not allow connections from clients.</para>
+
+ <note><para>In version 3.2 r/w performance improved by ~2x
+ compared to the <ulink
+ url="http://zookeeper.apache.org/docs/r3.1.1/zookeeperOver.html#Performance">previous
+ 3.1 release</ulink>.</para></note>
+
+ <para>Benchmarks also indicate that it is reliable. <xref
+ linkend="fg_zkPerfReliability" /> shows how a deployment responds to
+ various failures. The events marked in the figure are the
+ following:</para>
+
+ <orderedlist>
+ <listitem>
+ <para>Failure and recovery of a follower</para>
+ </listitem>
+
+ <listitem>
+ <para>Failure and recovery of a different follower</para>
+ </listitem>
+
+ <listitem>
+ <para>Failure of the leader</para>
+ </listitem>
+
+ <listitem>
+ <para>Failure and recovery of two followers</para>
+ </listitem>
+
+ <listitem>
+ <para>Failure of another leader</para>
+ </listitem>
+ </orderedlist>
+ </section>
+
+ <section>
+ <title>Reliability</title>
+
+ <para>To show the behavior of the system over time as
+ failures are injected we ran a ZooKeeper service made up of
+ 7 machines. We ran the same saturation benchmark as before,
+ but this time we kept the write percentage at a constant
+ 30%, which is a conservative ratio of our expected
+ workloads.
+ </para>
+ <figure id="fg_zkPerfReliability">
+ <title>Reliability in the Presence of Errors</title>
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="images/zkperfreliability.jpg" />
+ </imageobject>
+ </mediaobject>
+ </figure>
+
+ <para>There are a few important observations from this graph. First, if
+ followers fail and recover quickly, then ZooKeeper is able to sustain a
+ high throughput despite the failure. Second, and maybe more importantly,
+ the leader election algorithm allows the system to recover fast enough
+ to prevent throughput from dropping substantially. In our observations,
+ ZooKeeper takes less than 200ms to elect a new leader. Third, as
+ followers recover, ZooKeeper is able to raise throughput again once they
+ start processing requests.</para>
+ </section>
+
+ <section>
+ <title>The ZooKeeper Project</title>
+
+ <para>ZooKeeper has been
+ <ulink url="https://cwiki.apache.org/confluence/display/ZOOKEEPER/PoweredBy">
+ successfully used
+ </ulink>
+ in many industrial applications. It is used at Yahoo! as the
+ coordination and failure recovery service for Yahoo! Message
+ Broker, which is a highly scalable publish-subscribe system
+ managing thousands of topics for replication and data
+ delivery. It is used by the Fetching Service for Yahoo!
+ crawler, where it also manages failure recovery. A number of
+ Yahoo! advertising systems also use ZooKeeper to implement
+ reliable services.
+ </para>
+
+ <para>All users and developers are encouraged to join the
+ community and contribute their expertise. See the
+ <ulink url="http://zookeeper.apache.org/">
+ ZooKeeper Project on Apache
+ </ulink>
+ for more information.
+ </para>
+ </section>
+ </section>
+</article>
[02/12] zookeeper git commit: ZOOKEEPER-3022: MAVEN MIGRATION 3.4 -
Iteration 1 - docs, it
Posted by an...@apache.org.
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperProgrammers.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperProgrammers.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperProgrammers.xml
new file mode 100644
index 0000000..8fbd679
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperProgrammers.xml
@@ -0,0 +1,1640 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_programmersGuide">
+ <title>ZooKeeper Programmer's Guide</title>
+
+ <subtitle>Developing Distributed Applications that use ZooKeeper</subtitle>
+
+ <articleinfo>
+ <legalnotice>
+ <para>Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License. You may
+ obtain a copy of the License at <ulink
+ url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
+
+ <para>Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an "AS IS"
+ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing permissions
+ and limitations under the License.</para>
+ </legalnotice>
+
+ <abstract>
+ <para>This guide contains detailed information about creating
+ distributed applications that use ZooKeeper. It discusses the basic
+ operations ZooKeeper supports, and how these can be used to build
+ higher-level abstractions. It contains solutions to common tasks, a
+ troubleshooting guide, and links to other information.</para>
+
+ <para>$Revision: 1.14 $ $Date: 2008/09/19 05:31:45 $</para>
+ </abstract>
+ </articleinfo>
+
+ <section id="_introduction">
+ <title>Introduction</title>
+
+ <para>This document is a guide for developers wishing to create
+ distributed applications that take advantage of ZooKeeper's coordination
+ services. It contains conceptual and practical information.</para>
+
+ <para>The first four sections of this guide present higher level
+ discussions of various ZooKeeper concepts. These are necessary for
+ understanding both how ZooKeeper works and how to work with it. These
+ sections are conceptual rather than code-oriented, but they do assume a
+ familiarity with the problems associated with distributed computing. The
+ sections in this first group are:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><xref linkend="ch_zkDataModel" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="ch_zkSessions" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="ch_zkWatches" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="ch_zkGuarantees" /></para>
+ </listitem>
+ </itemizedlist>
+
+ <para>The next four sections provide practical programming
+ information. These are:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><xref linkend="ch_guideToZkOperations" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="ch_bindings" /></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="ch_programStructureWithExample" />
+ <emphasis>[tbd]</emphasis></para>
+ </listitem>
+
+ <listitem>
+ <para><xref linkend="ch_gotchas" /></para>
+ </listitem>
+ </itemizedlist>
+
+ <para>The book concludes with an <ulink
+ url="#apx_linksToOtherInfo">appendix</ulink> containing links to other
+ useful, ZooKeeper-related information.</para>
+
+ <para>Most of the information in this document is written to be accessible as
+ stand-alone reference material. However, before starting your first
+ ZooKeeper application, you should probably at least read the chapters on
+ the <ulink url="#ch_zkDataModel">ZooKeeper Data Model</ulink> and <ulink
+ url="#ch_guideToZkOperations">ZooKeeper Basic Operations</ulink>. Also,
+ the <ulink url="#ch_programStructureWithExample">Simple Programming
+ Example</ulink> <emphasis>[tbd]</emphasis> is helpful for understanding the basic
+ structure of a ZooKeeper client application.</para>
+ </section>
+
+ <section id="ch_zkDataModel">
+ <title>The ZooKeeper Data Model</title>
+
+ <para>ZooKeeper has a hierarchical namespace, much like a distributed file
+ system. The only difference is that each node in the namespace can have
+ data associated with it as well as children. It is like having a file
+ system that allows a file to also be a directory. Paths to nodes are
+ always expressed as canonical, absolute, slash-separated paths; there are
+ no relative references. Any Unicode character can be used in a path subject
+ to the following constraints:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>The null character (\u0000) cannot be part of a path name. (This
+ causes problems with the C binding.)</para>
+ </listitem>
+
+ <listitem>
+ <para>The following characters can't be used because they don't
+ display well, or render in confusing ways: \u0001 - \u001F and \u007F
+ - \u009F.</para>
+ </listitem>
+
+ <listitem>
+ <para>The following characters are not allowed: \ud800 - \uF8FF,
+ \uFFF0 - \uFFFF.</para>
+ </listitem>
+
+ <listitem>
+ <para>The "." character can be used as part of another name, but "."
+ and ".." cannot alone be used to indicate a node along a path,
+ because ZooKeeper doesn't use relative paths. The following would be
+ invalid: "/a/b/./c" or "/a/b/../c".</para>
+ </listitem>
+
+ <listitem>
+ <para>The token "zookeeper" is reserved.</para>
+ </listitem>
+ </itemizedlist>
+
+ <section id="sc_zkDataModel_znodes">
+ <title>ZNodes</title>
+
+ <para>Every node in a ZooKeeper tree is referred to as a
+ <emphasis>znode</emphasis>. Znodes maintain a stat structure that
+ includes version numbers for data changes and ACL changes. The stat
+ structure also has timestamps. The version number, together with the
+ timestamp, allows ZooKeeper to validate the cache and to coordinate
+ updates. Each time a znode's data changes, the version number increases.
+ For instance, whenever a client retrieves data, it also receives the
+ version of the data. And when a client performs an update or a delete,
+ it must supply the version of the data of the znode it is changing. If
+ the version it supplies doesn't match the actual version of the data,
+ the update will fail. (This behavior can be overridden. For more
+ information see... )<emphasis>[tbd...]</emphasis></para>
+
+ <note>
+ <para>In distributed application engineering, the word
+ <emphasis>node</emphasis> can refer to a generic host machine, a
+ server, a member of an ensemble, a client process, etc. In the ZooKeeper
+ documentation, <emphasis>znodes</emphasis> refer to the data nodes.
+ <emphasis>Servers</emphasis> refer to machines that make up the
+ ZooKeeper service; <emphasis>quorum peers</emphasis> refer to the
+ servers that make up an ensemble; client refers to any host or process
+ which uses a ZooKeeper service.</para>
+ </note>
+
+ <para> A znode is the main abstraction a programmer needs to be aware of. Znodes have
+ several characteristics that are worth mentioning here.</para>
+
+ <section id="sc_zkDataMode_watches">
+ <title>Watches</title>
+
+ <para>Clients can set watches on znodes. Changes to that znode trigger
+ the watch and then clear the watch. When a watch triggers, ZooKeeper
+ sends the client a notification. More information about watches can be
+ found in the section
+ <ulink url="#ch_zkWatches">ZooKeeper Watches</ulink>.</para>
+ </section>
+
+ <section>
+ <title>Data Access</title>
+
+ <para>The data stored at each znode in a namespace is read and written
+ atomically. Reads get all the data bytes associated with a znode and a
+ write replaces all the data. Each node has an Access Control List
+ (ACL) that restricts who can do what.</para>
+
+ <para>ZooKeeper was not designed to be a general database or large
+ object store. Instead, it manages coordination data. This data can
+ come in the form of configuration, status information, rendezvous, etc.
+ A common property of the various forms of coordination data is that
+ they are relatively small: measured in kilobytes.
+ The ZooKeeper client and the server implementations have sanity checks
+ to ensure that znodes have less than 1M of data, but the data should
+ be much less than that on average. Operating on relatively large data
+ sizes will cause some operations to take much more time than others and
+ will affect the latencies of some operations because of the extra time
+ needed to move more data over the network and onto storage media. If
+ large data storage is needed, the usual pattern of dealing with such
+ data is to store it on a bulk storage system, such as NFS or HDFS, and
+ store pointers to the storage locations in ZooKeeper.</para>
+ </section>
+
+ <section>
+ <title>Ephemeral Nodes</title>
+
+ <para>ZooKeeper also has the notion of ephemeral nodes. These znodes
+ exist as long as the session that created the znode is active. When
+ the session ends the znode is deleted. Because of this behavior
+ ephemeral znodes are not allowed to have children.</para>
+ </section>
+
+ <section>
+ <title>Sequence Nodes -- Unique Naming</title>
+
+ <para>When creating a znode you can also request that
+ ZooKeeper append a monotonically increasing counter to the end
+ of the path. This counter is unique to the parent znode. The
+ counter has a format of %010d -- that is, 10 digits with 0
+ (zero) padding (the counter is formatted in this way to
+ simplify sorting), for example "&lt;path&gt;0000000001". See
+ <ulink url="recipes.html#sc_recipes_Queues">Queue
+ Recipe</ulink> for an example use of this feature. Note: the
+ counter used to store the next sequence number is a signed int
+ (4 bytes) maintained by the parent node, so the counter will
+ overflow when incremented beyond 2147483647 (resulting in a
+ name "&lt;path&gt;-2147483648").</para>
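+
+ <para>The naming and overflow behavior described above can be
+ reproduced with plain Java formatting. The following is only a sketch
+ of the arithmetic, not the server's code; the class name
+ <computeroutput>SequenceName</computeroutput> and the prefix
+ <computeroutput>/queue/item-</computeroutput> are made up for the
+ example.</para>
+
+ <programlisting><![CDATA[
+import java.util.Locale;
+
+public class SequenceName {
+    // Appends the counter to the requested path using the %010d format.
+    static String sequentialName(String prefix, int counter) {
+        return prefix + String.format(Locale.ROOT, "%010d", counter);
+    }
+
+    public static void main(String[] args) {
+        // prints /queue/item-0000000001
+        System.out.println(sequentialName("/queue/item-", 1));
+
+        int counter = Integer.MAX_VALUE; // 2147483647, the largest signed int
+        // prints /queue/item-2147483647
+        System.out.println(sequentialName("/queue/item-", counter));
+
+        counter++; // signed 32-bit overflow wraps to -2147483648
+        // prints /queue/item--2147483648
+        System.out.println(sequentialName("/queue/item-", counter));
+    }
+}
+]]></programlisting>
+
+ <para>Because the names are zero padded to a fixed width, sorting them
+ as strings gives the same order as sorting by counter value, up to the
+ point where the counter overflows.</para>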
+ </section>
+ </section>
+
+ <section id="sc_timeInZk">
+ <title>Time in ZooKeeper</title>
+
+ <para>ZooKeeper tracks time in multiple ways:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><emphasis role="bold">Zxid</emphasis></para>
+
+ <para>Every change to the ZooKeeper state receives a stamp in the
+ form of a <emphasis>zxid</emphasis> (ZooKeeper Transaction Id).
+ This exposes the total ordering of all changes to ZooKeeper. Each
+ change will have a unique zxid and if zxid1 is smaller than zxid2
+ then zxid1 happened before zxid2.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">Version numbers</emphasis></para>
+
+ <para>Every change to a node will cause an increase to one of the
+ version numbers of that node. The three version numbers are version
+ (number of changes to the data of a znode), cversion (number of
+ changes to the children of a znode), and aversion (number of changes
+ to the ACL of a znode).</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">Ticks</emphasis></para>
+
+ <para>When using multi-server ZooKeeper, servers use ticks to define
+ timing of events such as status uploads, session timeouts,
+ connection timeouts between peers, etc. The tick time is only
+ indirectly exposed through the minimum session timeout (2 times the
+ tick time); if a client requests a session timeout less than the
+ minimum session timeout, the server will tell the client that the
+ session timeout is actually the minimum session timeout.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">Real time</emphasis></para>
+
+ <para>ZooKeeper doesn't use real time, or clock time, at all except
+ to put timestamps into the stat structure on znode creation and
+ znode modification.</para>
+ </listitem>
+ </itemizedlist>
+ </section>
+
+ <section id="sc_zkStatStructure">
+ <title>ZooKeeper Stat Structure</title>
+
+ <para>The Stat structure for each znode in ZooKeeper is made up of the
+ following fields:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><emphasis role="bold">czxid</emphasis></para>
+
+ <para>The zxid of the change that caused this znode to be
+ created.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">mzxid</emphasis></para>
+
+ <para>The zxid of the change that last modified this znode.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">pzxid</emphasis></para>
+
+ <para>The zxid of the change that last modified children of this znode.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">ctime</emphasis></para>
+
+ <para>The time in milliseconds from epoch when this znode was
+ created.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">mtime</emphasis></para>
+
+ <para>The time in milliseconds from epoch when this znode was last
+ modified.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">version</emphasis></para>
+
+ <para>The number of changes to the data of this znode.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">cversion</emphasis></para>
+
+ <para>The number of changes to the children of this znode.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">aversion</emphasis></para>
+
+ <para>The number of changes to the ACL of this znode.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">ephemeralOwner</emphasis></para>
+
+ <para>The session id of the owner of this znode if the znode is an
+ ephemeral node. If it is not an ephemeral node, it will be
+ zero.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">dataLength</emphasis></para>
+
+ <para>The length of the data field of this znode.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">numChildren</emphasis></para>
+
+ <para>The number of children of this znode.</para>
+ </listitem>
+
+ </itemizedlist>
+ </section>
+ </section>
+
+ <section id="ch_zkSessions">
+ <title>ZooKeeper Sessions</title>
+
+ <para>A ZooKeeper client establishes a session with the ZooKeeper
+ service by creating a handle to the service using a language
+ binding. Once created, the handle starts off in the CONNECTING state
+ and the client library tries to connect to one of the servers that
+ make up the ZooKeeper service, at which point it switches to the
+ CONNECTED state. During normal operation the handle will be in one of these
+ two states. If an unrecoverable error occurs, such as session
+ expiration or authentication failure, or if the application explicitly
+ closes the handle, the handle will move to the CLOSED state.
+ The following figure shows the possible state transitions of a
+ ZooKeeper client:</para>
+
+ <mediaobject id="fg_states" >
+ <imageobject>
+ <imagedata fileref="images/state_dia.jpg"/>
+ </imageobject>
+ </mediaobject>
+
+ <para>To create a client session the application code must provide
+ a connection string containing a comma-separated list of host:port pairs,
+ each corresponding to a ZooKeeper server (e.g. "127.0.0.1:4545" or
+ "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"). The ZooKeeper
+ client library will pick an arbitrary server and try to connect to
+ it. If this connection fails, or if the client becomes
+ disconnected from the server for any reason, the client will
+ automatically try the next server in the list, until a connection
+ is (re-)established.</para>
+
+ <para> <emphasis role="bold">Added in 3.2.0</emphasis>: An
+ optional "chroot" suffix may also be appended to the connection
+ string. This will run the client commands while interpreting all
+ paths relative to this root (similar to the unix chroot
+ command). If used the example would look like:
+ "127.0.0.1:4545/app/a" or
+ "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002/app/a" where the
+ client would be rooted at "/app/a" and all paths would be relative
+ to this root; that is, getting or setting "/foo/bar" would result
+ in operations being run on "/app/a/foo/bar" (from the server
+ perspective). This feature is particularly useful in multi-tenant
+ environments where each user of a particular ZooKeeper service
+ could be rooted differently. This makes re-use much simpler as
+ each user can code their application as if it were rooted at
+ "/", while the actual location (say /app/a) could be determined at
+ deployment time.</para>
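+
+ <para>The path mapping implied by a chroot suffix can be sketched in a
+ few lines of Java. This is an illustration of the rule described above
+ and not the client library's implementation; the class
+ <computeroutput>ChrootSketch</computeroutput> and its methods are
+ hypothetical.</para>
+
+ <programlisting><![CDATA[
+public class ChrootSketch {
+    // Splits "host:port,host:port/chroot" into { hosts, chroot }.
+    // The chroot is the empty string when no suffix is present.
+    static String[] split(String connectString) {
+        int slash = connectString.indexOf('/');
+        if (slash < 0) {
+            return new String[] { connectString, "" };
+        }
+        return new String[] {
+            connectString.substring(0, slash),
+            connectString.substring(slash)
+        };
+    }
+
+    // Maps a path as written by the application to the path the server sees.
+    static String serverPath(String chroot, String clientPath) {
+        if (chroot.isEmpty()) {
+            return clientPath;
+        }
+        return clientPath.equals("/") ? chroot : chroot + clientPath;
+    }
+
+    public static void main(String[] args) {
+        String[] parts =
+            split("127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002/app/a");
+        // prints 127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002
+        System.out.println(parts[0]);
+        // prints /app/a/foo/bar
+        System.out.println(serverPath(parts[1], "/foo/bar"));
+    }
+}
+]]></programlisting>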
+
+ <para>When a client gets a handle to the ZooKeeper service,
+ ZooKeeper creates a ZooKeeper session, represented as a 64-bit
+ number, that it assigns to the client. If the client connects to a
+ different ZooKeeper server, it will send the session id as a part
+ of the connection handshake. As a security measure, the server
+ creates a password for the session id that any ZooKeeper server
+ can validate. The password is sent to the client with the session
+ id when the client establishes the session. The client sends this
+ password with the session id whenever it reestablishes the session
+ with a new server.</para>
+
+ <para>One of the parameters to the ZooKeeper client library call
+ to create a ZooKeeper session is the session timeout in
+ milliseconds. The client sends a requested timeout, and the server
+ responds with the timeout that it can give the client. The current
+ implementation requires that the timeout be a minimum of 2 times
+ the tickTime (as set in the server configuration) and a maximum of
+ 20 times the tickTime. The ZooKeeper client API allows access to
+ the negotiated timeout.</para>
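+
+ <para>The bounds above amount to clamping the requested value. The
+ following sketch shows that arithmetic under the assumption that the
+ server simply clamps to the range of 2 to 20 ticks; the class
+ <computeroutput>TimeoutSketch</computeroutput>, its method, and the
+ tickTime of 2000 milliseconds are assumptions for illustration.</para>
+
+ <programlisting><![CDATA[
+public class TimeoutSketch {
+    // Clamps the requested session timeout to [2 * tickTime, 20 * tickTime].
+    static int negotiate(int requestedMs, int tickTimeMs) {
+        int min = 2 * tickTimeMs;
+        int max = 20 * tickTimeMs;
+        return Math.max(min, Math.min(max, requestedMs));
+    }
+
+    public static void main(String[] args) {
+        int tickTime = 2000; // assumed server tickTime in milliseconds
+        System.out.println(negotiate(1000, tickTime));  // prints 4000
+        System.out.println(negotiate(30000, tickTime)); // prints 30000
+        System.out.println(negotiate(60000, tickTime)); // prints 40000
+    }
+}
+]]></programlisting>
+
+ <para>An application should therefore read the negotiated timeout back
+ from the client API rather than assume that its requested value was
+ granted.</para>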
+
+ <para>When a client (session) becomes partitioned from the ZK
+ serving cluster it will begin searching the list of servers that
+ were specified during session creation. Eventually, when
+ connectivity between the client and at least one of the servers is
+ re-established, the session will either again transition to the
+ "connected" state (if reconnected within the session timeout
+ value) or it will transition to the "expired" state (if
+ reconnected after the session timeout). It is not advisable to
+ create a new session object (a new ZooKeeper.class or zookeeper
+ handle in the C binding) on disconnection. The ZK client library
+ will handle reconnection for you. In particular, we have heuristics
+ built into the client library to handle things like the "herd effect".
+ Only create a new session when you are notified of session
+ expiration (at which point it is mandatory).</para>
+
+ <para>Session expiration is managed by the ZooKeeper cluster
+ itself, not by the client. When the ZK client establishes a
+ session with the cluster it provides a "timeout" value detailed
+ above. This value is used by the cluster to determine when the
+ client's session expires. Expiration happens when the cluster
+ does not hear from the client within the specified session timeout
+ period (i.e. no heartbeat). At session expiration the cluster will
+ delete any/all ephemeral nodes owned by that session and
+ immediately notify any/all connected clients of the change (anyone
+ watching those znodes). At this point the client of the expired
+ session is still disconnected from the cluster; it will not be
+ notified of the session expiration until/unless it is able to
+ re-establish a connection to the cluster. The client will stay in
+ disconnected state until the TCP connection is re-established with
+ the cluster, at which point the watcher of the expired session
+ will receive the "session expired" notification.</para>
+
+ <para>Example state transitions for an expired session as seen by
+ the expired session's watcher:</para>
+
+ <orderedlist>
+ <listitem><para>'connected' : session is established and client
+ is communicating with cluster (client/server communication is
+ operating properly)</para></listitem>
+ <listitem><para>.... client is partitioned from the
+ cluster</para></listitem>
+ <listitem><para>'disconnected' : client has lost connectivity
+ with the cluster</para></listitem>
+ <listitem><para>.... time elapses, after 'timeout' period the
+ cluster expires the session, nothing is seen by client as it is
+ disconnected from cluster</para></listitem>
+ <listitem><para>.... time elapses, the client regains network
+ level connectivity with the cluster</para></listitem>
+ <listitem><para>'expired' : eventually the client reconnects to
+ the cluster, it is then notified of the
+ expiration</para></listitem>
+ </orderedlist>
+
+ <para>Another parameter to the ZooKeeper session establishment
+ call is the default watcher. Watchers are notified when any state
+ change occurs in the client. For example, the client will be notified
+ if it loses connectivity to the server or if the
+ client's session expires. This watcher should consider the
+ initial state to be disconnected (i.e. before any state change
+ events are sent to the watcher by the client lib). In the case of
+ a new connection, the first event sent to the watcher is typically
+ the session connection event.</para>
+
+ <para>The session is kept alive by requests sent by the client. If
+ the session is idle for a period of time that would time out the
+ session, the client will send a PING request to keep the session
+ alive. This PING request not only allows the ZooKeeper server to
+ know that the client is still active, but it also allows the
+ client to verify that its connection to the ZooKeeper server is
+ still active. The timing of the PING is conservative enough to
+ ensure reasonable time to detect a dead connection and reconnect
+ to a new server.</para>
+
+ <para>
+ Once a connection to the server is successfully established
+ (connected) there are basically two cases where the client lib generates
+ connectionloss (the result code in the C binding, an exception in Java -- see
+ the API documentation for binding-specific details) when either a synchronous or
+ asynchronous operation is performed and one of the following holds:
+ </para>
+
+ <orderedlist>
+ <listitem><para>The application calls an operation on a session that is no
+ longer alive/valid</para></listitem>
+ <listitem><para>The ZooKeeper client disconnects from a server when there
+ are pending operations to that server, i.e., there is a pending asynchronous call.
+ </para></listitem>
+ </orderedlist>
+
+ <para> <emphasis role="bold">Added in 3.2.0 -- SessionMovedException</emphasis>. SessionMovedException
+ is an internal exception that is generally not seen by clients.
+ This exception occurs because a request was received on a connection for a session
+ which has been reestablished on a different server. The normal cause of this error is
+ a client that sends a request to a server, but the network packet gets delayed, so
+ the client times out and connects to a new server. When the delayed packet arrives at
+ the first server, the old server detects that the session has moved, and closes the
+ client connection. Clients normally do not see this error since they do not read
+ from those old connections. (Old connections are usually closed.) One situation in which this
+ condition can be seen is when two clients try to reestablish the same connection using
+ a saved session id and password. One of the clients will reestablish the connection
+ and the second client will be disconnected (causing the pair to attempt to reestablish
+ their connections/sessions indefinitely).</para>
+
+ </section>
+
+ <section id="ch_zkWatches">
+ <title>ZooKeeper Watches</title>
+
+ <para>All of the read operations in ZooKeeper - <emphasis
+ role="bold">getData()</emphasis>, <emphasis
+ role="bold">getChildren()</emphasis>, and <emphasis
+ role="bold">exists()</emphasis> - have the option of setting a watch as a
+ side effect. Here is ZooKeeper's definition of a watch: a watch event is a
+ one-time trigger, sent to the client that set the watch, which occurs when
+ the data for which the watch was set changes. There are three key points
+ to consider in this definition of a watch:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><emphasis role="bold">One-time trigger</emphasis></para>
+
+ <para>One watch event will be sent to the client when the data has changed.
+ For example, if a client does a getData("/znode1", true) and later the
+ data for /znode1 is changed or deleted, the client will get a watch
+ event for /znode1. If /znode1 changes again, no watch event will be
+ sent unless the client has done another read that sets a new
+ watch.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">Sent to the client</emphasis></para>
+
+ <para>This implies that an event is on the way to the client, but may
+ not reach the client before the successful return code to the change
+ operation reaches the client that initiated the change. Watches are
+ sent asynchronously to watchers. ZooKeeper provides an ordering
+ guarantee: a client will never see a change for which it has set a
+ watch until it first sees the watch event. Network delays or other
+ factors may cause different clients to see watches and return codes
+ from updates at different times. The key point is that everything seen
+ by the different clients will have a consistent order.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">The data for which the watch was
+ set</emphasis></para>
+
+ <para>This refers to the different ways a node can change. It
+ helps to think of ZooKeeper as maintaining two lists of
+ watches: data watches and child watches. getData() and
+ exists() set data watches. getChildren() sets child
+ watches. Alternatively, it may help to think of watches being
+ set according to the kind of data returned. getData() and
+ exists() return information about the data of the node,
+ whereas getChildren() returns a list of children. Thus,
+ setData() will trigger data watches for the znode being set
+ (assuming the set is successful). A successful create() will
+ trigger a data watch for the znode being created and a child
+ watch for the parent znode. A successful delete() will trigger
+ both a data watch and a child watch (since there can be no
+ more children) for a znode being deleted as well as a child
+ watch for the parent znode.</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>Watches are maintained locally at the ZooKeeper server to which the
+ client is connected. This allows watches to be lightweight to set,
+ maintain, and dispatch. When a client connects to a new server, the watch
+ will be triggered for any session events. Watches will not be received
+ while disconnected from a server. When a client reconnects, any previously
+ registered watches will be reregistered and triggered if needed. In
+ general this all occurs transparently. There is one case where a watch
+ may be missed: a watch for the existence of a znode not yet created will
+ be missed if the znode is created and deleted while disconnected.</para>
+
+ <section id="sc_WatchSemantics">
+ <title>Semantics of Watches</title>
+
+ <para> We can set watches with the three calls that read the state of
+ ZooKeeper: exists, getData, and getChildren. The following list details
+ the events that a watch can trigger and the calls that enable them:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para><emphasis role="bold">Created event:</emphasis></para>
+ <para>Enabled with a call to exists.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">Deleted event:</emphasis></para>
+ <para>Enabled with a call to exists, getData, and getChildren.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">Changed event:</emphasis></para>
+ <para>Enabled with a call to exists and getData.</para>
+ </listitem>
+
+ <listitem>
+ <para><emphasis role="bold">Child event:</emphasis></para>
+ <para>Enabled with a call to getChildren.</para>
+ </listitem>
+ </itemizedlist>
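+
+ <para>The following is a minimal sketch, in plain Java, of the trigger
+ rules described above. It is a toy in-memory model rather than ZooKeeper
+ code: the class <emphasis>OneShotWatchModel</emphasis>, its methods, and
+ the lowercase event names are hypothetical and exist only for
+ illustration. It assumes that a fired watch is removed, which is what
+ makes it a one-time trigger.</para>
+
+ <programlisting>
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.TreeSet;
+
+/** Toy model of one-time data and child watches; not ZooKeeper code. */
+final class OneShotWatchModel {
+    private final Map<String, Set<String>> dataWatches = new HashMap<>();
+    private final Map<String, Set<String>> childWatches = new HashMap<>();
+    /** Events delivered so far, as "client:event:path" strings. */
+    final List<String> delivered = new ArrayList<>();
+
+    /** Models getData() or exists() with a watch: registers a data watch. */
+    void watchData(String client, String path) {
+        dataWatches.computeIfAbsent(path, p -> new TreeSet<>()).add(client);
+    }
+
+    /** Models getChildren() with a watch: registers a child watch. */
+    void watchChildren(String client, String path) {
+        childWatches.computeIfAbsent(path, p -> new TreeSet<>()).add(client);
+    }
+
+    /** Successful setData(): fires data watches on the znode. */
+    void setData(String path) {
+        fire(dataWatches, path, "changed");
+    }
+
+    /** Successful create(): data watch on the znode, child watch on its parent. */
+    void create(String path) {
+        fire(dataWatches, path, "created");
+        fire(childWatches, parentOf(path), "child");
+    }
+
+    /** Successful delete(): each watcher of the znode is notified once. */
+    void delete(String path) {
+        Set<String> clients = new TreeSet<>();
+        Set<String> data = dataWatches.remove(path);
+        Set<String> child = childWatches.remove(path);
+        if (data != null) clients.addAll(data);
+        if (child != null) clients.addAll(child);
+        for (String client : clients) {
+            delivered.add(client + ":deleted:" + path);
+        }
+        fire(childWatches, parentOf(path), "child");
+    }
+
+    private void fire(Map<String, Set<String>> watches, String path, String event) {
+        Set<String> clients = watches.remove(path); // removal makes the watch one-time
+        if (clients != null) {
+            for (String client : clients) {
+                delivered.add(client + ":" + event + ":" + path);
+            }
+        }
+    }
+
+    private static String parentOf(String path) {
+        int slash = path.lastIndexOf('/');
+        return slash <= 0 ? "/" : path.substring(0, slash);
+    }
+
+    public static void main(String[] args) {
+        OneShotWatchModel model = new OneShotWatchModel();
+        model.watchData("A", "/znode1");
+        model.setData("/znode1"); // delivers A:changed:/znode1
+        model.setData("/znode1"); // delivers nothing: the watch was consumed
+        System.out.println(model.delivered); // prints [A:changed:/znode1]
+    }
+}
+ </programlisting>
+
+ <para>In this model, a client that wants to hear about the second change
+ has to call watchData() again after the first event, which mirrors the
+ need to do another read that sets a new watch.</para>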
+ </section>
+
+ <section id="sc_WatchGuarantees">
+ <title>What ZooKeeper Guarantees about Watches</title>
+
+ <para>With regard to watches, ZooKeeper maintains these
+ guarantees:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>Watches are ordered with respect to other events, other
+ watches, and asynchronous replies. The ZooKeeper client libraries
+ ensure that everything is dispatched in order.</para>
+ </listitem>
+ </itemizedlist>
+
+ <itemizedlist>
+ <listitem>
+ <para>A client will see a watch event for a znode it is watching
+ before seeing the new data that corresponds to that znode.</para>
+ </listitem>
+ </itemizedlist>
+
+ <itemizedlist>
+ <listitem>
+ <para>The order of watch events from ZooKeeper corresponds to the
+ order of the updates as seen by the ZooKeeper service.</para>
+ </listitem>
+ </itemizedlist>
+ </section>
+
+ <section id="sc_WatchRememberThese">
+ <title>Things to Remember about Watches</title>
+
+ <itemizedlist>
+ <listitem>
+ <para>Watches are one-time triggers; if you get a watch event and
+ you want to get notified of future changes, you must set another
+ watch.</para>
+ </listitem>
+ </itemizedlist>
+
+ <itemizedlist>
+ <listitem>
+ <para>Because watches are one-time triggers and there is latency
+ between getting the event and sending a new request to get a watch,
+ you cannot reliably see every change that happens to a node in
+ ZooKeeper. Be prepared to handle the case where the znode changes
+ multiple times between getting the event and setting the watch
+ again. (You may not care, but at least realize it may
+ happen.)</para>
+ </listitem>
+ </itemizedlist>
+
+ <itemizedlist>
+ <listitem>
+ <para>A watch object, or function/context pair, will only be
+ triggered once for a given notification. For example, if the same
+ watch object is registered for an exists and a getData call for the
+ same file and that file is then deleted, the watch object would
+ only be invoked once with the deletion notification for the file.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <itemizedlist>
+ <listitem>
+ <para>When you disconnect from a server (for example, when the
+ server fails), you will not get any watches until the connection
+ is reestablished. For this reason session events are sent to all
+ outstanding watch handlers. Use session events to go into a safe
+ mode: you will not be receiving events while disconnected, so your
+ process should act conservatively in that mode.</para>
+ </listitem>
+ </itemizedlist>
+ </section>
+ </section>
+
+ <section id="sc_ZooKeeperAccessControl">
+ <title>ZooKeeper access control using ACLs</title>
+
+ <para>ZooKeeper uses ACLs to control access to its znodes (the
+ data nodes of a ZooKeeper data tree). The ACL implementation is
+ quite similar to UNIX file access permissions: it employs
+ permission bits to allow/disallow various operations against a
+ node and the scope to which the bits apply. Unlike standard UNIX
+ permissions, a ZooKeeper node is not limited by the three standard
+ scopes for user (owner of the file), group, and world
+ (other). ZooKeeper does not have a notion of an owner of a
+ znode. Instead, an ACL specifies sets of ids and permissions that
+ are associated with those ids.</para>
+
+ <para>Note also that an ACL pertains only to a specific znode. In
+ particular it does not apply to children. For example, if
+ <emphasis>/app</emphasis> is only readable by ip:172.16.16.1 and
+ <emphasis>/app/status</emphasis> is world readable, anyone will
+ be able to read <emphasis>/app/status</emphasis>; ACLs are not
+ recursive.</para>
+
+ <para>ZooKeeper supports pluggable authentication schemes. Ids are
+ specified using the form <emphasis>scheme:id</emphasis>,
+ where <emphasis>scheme</emphasis> is the authentication scheme
+ that the id corresponds to. For
+ example, <emphasis>ip:172.16.16.1</emphasis> is an id for a
+ host with the address <emphasis>172.16.16.1</emphasis>.</para>
+
+ <para>When a client connects to ZooKeeper and authenticates
+ itself, ZooKeeper associates all the ids that correspond to a
+ client with the client's connection. These ids are checked against
+ the ACLs of znodes when a client tries to access a node. ACLs are
+ made up of pairs of <emphasis>(scheme:expression,
+ perms)</emphasis>. The format of
+ the <emphasis>expression</emphasis> is specific to the scheme. For
+ example, the pair <emphasis>(ip:19.22.0.0/16, READ)</emphasis>
+ gives the <emphasis>READ</emphasis> permission to any clients with
+ an IP address that starts with 19.22.</para>
+
+ <section id="sc_ACLPermissions">
+ <title>ACL Permissions</title>
+
+ <para>ZooKeeper supports the following permissions:</para>
+
+ <itemizedlist>
+ <listitem><para><emphasis role="bold">CREATE</emphasis>: you can create a child node.</para></listitem>
+ <listitem><para><emphasis role="bold">READ</emphasis>: you can get data from a node and list its children.</para></listitem>
+ <listitem><para><emphasis role="bold">WRITE</emphasis>: you can set data for a node.</para></listitem>
+ <listitem><para><emphasis role="bold">DELETE</emphasis>: you can delete a child node.</para></listitem>
+ <listitem><para><emphasis role="bold">ADMIN</emphasis>: you can set permissions.</para></listitem>
+ </itemizedlist>
+
+ <para>The <emphasis>CREATE</emphasis>
+ and <emphasis>DELETE</emphasis> permissions have been broken out
+ of the <emphasis>WRITE</emphasis> permission for finer grained
+ access controls. The cases for <emphasis>CREATE</emphasis>
+ and <emphasis>DELETE</emphasis> are the following:</para>
+
+ <para>You want A to be able to do a set on a ZooKeeper node, but
+ not be able to <emphasis>CREATE</emphasis>
+ or <emphasis>DELETE</emphasis> children.</para>
+
+ <para><emphasis>CREATE</emphasis>
+ without <emphasis>DELETE</emphasis>: clients create requests by
+ creating ZooKeeper nodes in a parent directory. You want all
+ clients to be able to add, but only the request processor to be able to
+ delete. (This is kind of like the APPEND permission for
+ files.)</para>
+
+ <para>Also, the <emphasis>ADMIN</emphasis> permission is there
+ since ZooKeeper doesn’t have a notion of file owner. In some
+ sense the <emphasis>ADMIN</emphasis> permission designates the
+ entity as the owner. ZooKeeper doesn’t support the LOOKUP
+ permission (execute permission bit on directories to allow you
+ to LOOKUP even though you can't list the directory). Everyone
+ implicitly has LOOKUP permission. This allows you to stat a
+ node, but nothing more. (The problem is, if you want to call
+ zoo_exists() on a node that doesn't exist, there is no
+ permission to check.)</para>
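+
+ <para>As a rough illustration of how these permissions combine, the
+ sketch below treats them as bit flags. The class
+ <emphasis>PermsSketch</emphasis> and its <emphasis>allows</emphasis>
+ helper are hypothetical; the bit values are assumed to match the
+ constants in <emphasis>org.apache.zookeeper.ZooDefs.Perms</emphasis>, so
+ check that class before relying on them.</para>
+
+ <programlisting>
+/** Sketch of ACL permission bits; not part of the ZooKeeper API. */
+final class PermsSketch {
+    // Assumed to mirror org.apache.zookeeper.ZooDefs.Perms.
+    static final int READ = 1 << 0;
+    static final int WRITE = 1 << 1;
+    static final int CREATE = 1 << 2;
+    static final int DELETE = 1 << 3;
+    static final int ADMIN = 1 << 4;
+    static final int ALL = READ | WRITE | CREATE | DELETE | ADMIN;
+
+    /** True if every bit in required is present in granted. */
+    static boolean allows(int granted, int required) {
+        return (granted & required) == required;
+    }
+
+    public static void main(String[] args) {
+        // CREATE without DELETE: clients may add requests but not remove them.
+        int requestQueuePerms = READ | CREATE;
+        System.out.println(allows(requestQueuePerms, CREATE)); // prints true
+        System.out.println(allows(requestQueuePerms, DELETE)); // prints false
+    }
+}
+ </programlisting>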
+
+ <section id="sc_BuiltinACLSchemes">
+ <title>Builtin ACL Schemes</title>
+
+ <para>ZooKeeper has the following built-in schemes:</para>
+
+ <itemizedlist>
+ <listitem><para><emphasis role="bold">world</emphasis> has a
+ single id, <emphasis>anyone</emphasis>, that represents
+ anyone.</para></listitem>
+
+ <listitem><para><emphasis role="bold">auth</emphasis> doesn't
+ use any id, represents any authenticated
+ user.</para></listitem>
+
+ <listitem><para><emphasis role="bold">digest</emphasis> uses
+ a <emphasis>username:password</emphasis> string to generate
+ a SHA1 digest which is then used as an ACL ID
+ identity. Authentication is done by sending
+ the <emphasis>username:password</emphasis> in clear text. When
+ used in the ACL the expression will be
+ the <emphasis>username:base64</emphasis>
+ encoded <emphasis>SHA1</emphasis>
+ password <emphasis>digest</emphasis>.</para>
+ </listitem>
+
+ <listitem><para><emphasis role="bold">ip</emphasis> uses the
+ client host IP as an ACL ID identity. The ACL expression is of
+ the form <emphasis>addr/bits</emphasis> where the most
+ significant <emphasis>bits</emphasis>
+ of <emphasis>addr</emphasis> are matched against the most
+ significant <emphasis>bits</emphasis> of the client host
+ IP.</para></listitem>
+
+ </itemizedlist>
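+
+ <para>The ACL expression for the <emphasis>digest</emphasis> scheme
+ can be computed with a few lines of standard Java. This is a minimal
+ sketch, not the server's own code: the class
+ <emphasis>DigestIdSketch</emphasis> is hypothetical, and encoding the
+ string as UTF-8 is an assumption.</para>
+
+ <programlisting>
+import java.nio.charset.StandardCharsets;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.util.Base64;
+
+/** Sketch: builds "username:base64(SHA1(username:password))". */
+final class DigestIdSketch {
+    static String digestId(String userPassword) {
+        int colon = userPassword.indexOf(':');
+        if (colon < 0) {
+            throw new IllegalArgumentException("expected username:password");
+        }
+        String user = userPassword.substring(0, colon);
+        try {
+            // Assumption: the string is hashed as UTF-8 bytes.
+            byte[] sha1 = MessageDigest.getInstance("SHA-1")
+                    .digest(userPassword.getBytes(StandardCharsets.UTF_8));
+            return user + ":" + Base64.getEncoder().encodeToString(sha1);
+        } catch (NoSuchAlgorithmException e) {
+            throw new IllegalStateException("SHA-1 not available", e);
+        }
+    }
+
+    public static void main(String[] args) {
+        // "alice" and "secret" are made-up credentials.
+        System.out.println(digestId("alice:secret"));
+    }
+}
+ </programlisting>
+
+ <para>The resulting string is what would appear after
+ <emphasis>digest:</emphasis> in an ACL entry, while the client still
+ authenticates with the clear text
+ <emphasis>username:password</emphasis>.</para>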
+ </section>
+
+ <section>
+ <title>ZooKeeper C client API</title>
+
+ <para>The following constants are provided by the ZooKeeper C
+ library:</para>
+
+ <itemizedlist>
+ <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_READ; // can read node’s value and list its children</para></listitem>
+ <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_WRITE; // can set the node’s value</para></listitem>
+ <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_CREATE; // can create children</para></listitem>
+ <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_DELETE; // can delete children</para></listitem>
+ <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_ADMIN; // can execute set_acl()</para></listitem>
+ <listitem><para><emphasis>const</emphasis> <emphasis>int</emphasis> ZOO_PERM_ALL; // all of the above flags OR’d together</para></listitem>
+ </itemizedlist>
+
+ <para>The following are the standard ACL IDs:</para>
+
+ <itemizedlist>
+ <listitem><para><emphasis>struct</emphasis> Id ZOO_ANYONE_ID_UNSAFE; // ('world','anyone')</para></listitem>
+ <listitem><para><emphasis>struct</emphasis> Id ZOO_AUTH_IDS; // ('auth','')</para></listitem>
+ </itemizedlist>
+
+ <para>The empty identity string of ZOO_AUTH_IDS should be interpreted as “the identity of the creator”.</para>
+
+ <para>ZooKeeper client comes with three standard ACLs:</para>
+
+ <itemizedlist>
+ <listitem><para><emphasis>struct</emphasis> ACL_vector ZOO_OPEN_ACL_UNSAFE; // (ZOO_PERM_ALL, ZOO_ANYONE_ID_UNSAFE)</para></listitem>
+ <listitem><para><emphasis>struct</emphasis> ACL_vector ZOO_READ_ACL_UNSAFE; // (ZOO_PERM_READ, ZOO_ANYONE_ID_UNSAFE)</para></listitem>
+ <listitem><para><emphasis>struct</emphasis> ACL_vector ZOO_CREATOR_ALL_ACL; // (ZOO_PERM_ALL, ZOO_AUTH_IDS)</para></listitem>
+ </itemizedlist>
+
+ <para>ZOO_OPEN_ACL_UNSAFE is a completely open, free-for-all
+ ACL: any application can execute any operation on the node and
+ can create, list and delete its children.
+ ZOO_READ_ACL_UNSAFE grants read-only access to any
+ application. ZOO_CREATOR_ALL_ACL grants all permissions to the
+ creator of the node. The creator must have been authenticated by
+ the server (for example, using the “<emphasis>digest</emphasis>”
+ scheme) before it can create nodes with this ACL.</para>
+
+ <para>The following ZooKeeper operations deal with ACLs:</para>
+
+ <itemizedlist><listitem>
+ <para><emphasis>int</emphasis> <emphasis>zoo_add_auth</emphasis>
+ (zhandle_t *zh,<emphasis>const</emphasis> <emphasis>char</emphasis>*
+ scheme,<emphasis>const</emphasis> <emphasis>char</emphasis>*
+ cert, <emphasis>int</emphasis> certLen, void_completion_t
+ completion, <emphasis>const</emphasis> <emphasis>void</emphasis>
+ *data);</para>
+ </listitem></itemizedlist>
+
+ <para>The application uses the zoo_add_auth function to
+ authenticate itself to the server. The function can be called
+ multiple times if the application wants to authenticate using
+ different schemes and/or identities.</para>
+
+ <itemizedlist><listitem>
+ <para><emphasis>int</emphasis> <emphasis>zoo_create</emphasis>
+ (zhandle_t *zh, <emphasis>const</emphasis> <emphasis>char</emphasis>
+ *path, <emphasis>const</emphasis> <emphasis>char</emphasis>
+ *value,<emphasis>int</emphasis>
+ valuelen, <emphasis>const</emphasis> <emphasis>struct</emphasis>
+ ACL_vector *acl, <emphasis>int</emphasis>
+ flags,<emphasis>char</emphasis>
+ *realpath, <emphasis>int</emphasis>
+ max_realpath_len);</para>
+ </listitem></itemizedlist>
+
+ <para>The zoo_create(...) operation creates a new node. The acl
+ parameter is a list of ACLs associated with the node. The parent
+ node must have the CREATE permission bit set.</para>
+
+ <itemizedlist><listitem>
+ <para><emphasis>int</emphasis> <emphasis>zoo_get_acl</emphasis>
+ (zhandle_t *zh, <emphasis>const</emphasis> <emphasis>char</emphasis>
+ *path,<emphasis>struct</emphasis> ACL_vector
+ *acl, <emphasis>struct</emphasis> Stat *stat);</para>
+ </listitem></itemizedlist>
+
+ <para>This operation returns a node’s ACL info.</para>
+
+ <itemizedlist><listitem>
+ <para><emphasis>int</emphasis> <emphasis>zoo_set_acl</emphasis>
+ (zhandle_t *zh, <emphasis>const</emphasis> <emphasis>char</emphasis>
+ *path, <emphasis>int</emphasis>
+ version,<emphasis>const</emphasis> <emphasis>struct</emphasis>
+ ACL_vector *acl);</para>
+ </listitem></itemizedlist>
+
+ <para>This function replaces the node’s ACL list with a new one. The
+ node must have the ADMIN permission set.</para>
+
+ <para>Here is sample code that makes use of the above APIs to
+ authenticate itself using the “<emphasis>foo</emphasis>” scheme
+ and create an ephemeral node “/xyz” with create-only
+ permissions.</para>
+
+ <note><para>This is a very simple example which is intended to show
+ how to interact with ZooKeeper ACLs
+ specifically. See <filename>.../trunk/src/c/src/cli.c</filename>
+ for an example of a C client implementation</para>
+ </note>
+
+ <programlisting>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+
+#include "zookeeper.h"
+
+static zhandle_t *zh;
+
+/**
+ * In this example this method gets the cert for your
+ * environment -- you must provide
+ */
+char *foo_get_cert_once(char* id) { return 0; }
+
+/** Watcher function -- empty for this example, not something you should
+ * do in real code */
+void watcher(zhandle_t *zzh, int type, int state, const char *path,
+             void *watcherCtx) {}
+
+int main(int argc, char **argv) {
+  char buffer[512];
+  char p[2048];
+  char *cert=0;
+  char appId[64];
+
+  strcpy(appId, "example.foo_test");
+  cert = foo_get_cert_once(appId);
+  if(cert!=0) {
+    fprintf(stderr,
+            "Certificate for appid [%s] is [%s]\n",appId,cert);
+    strncpy(p,cert, sizeof(p)-1);
+    p[sizeof(p)-1] = '\0'; /* strncpy does not terminate on truncation */
+    free(cert);
+  } else {
+    fprintf(stderr, "Certificate for appid [%s] not found\n",appId);
+    strcpy(p, "dummy");
+  }
+
+  zoo_set_debug_level(ZOO_LOG_LEVEL_DEBUG);
+
+  zh = zookeeper_init("localhost:3181", watcher, 10000, 0, 0, 0);
+  if (!zh) {
+    return errno;
+  }
+  if(zoo_add_auth(zh,"foo",p,strlen(p),0,0)!=ZOK)
+    return 2;
+
+  struct ACL CREATE_ONLY_ACL[] = {{ZOO_PERM_CREATE, ZOO_AUTH_IDS}};
+  struct ACL_vector CREATE_ONLY = {1, CREATE_ONLY_ACL};
+  int rc = zoo_create(zh,"/xyz","value", 5, &CREATE_ONLY, ZOO_EPHEMERAL,
+                      buffer, sizeof(buffer)-1);
+  if (rc) {
+    fprintf(stderr, "Error %d at line %d\n", rc, __LINE__);
+  }
+
+  /** this operation will fail with a ZNOAUTH error */
+  int buflen= sizeof(buffer);
+  struct Stat stat;
+  rc = zoo_get(zh, "/xyz", 0, buffer, &buflen, &stat);
+  if (rc) {
+    fprintf(stderr, "Error %d at line %d\n", rc, __LINE__);
+  }
+
+  zookeeper_close(zh);
+  return 0;
+}
+ </programlisting>
+ </section>
+ </section>
+ </section>
+
+ <section id="sc_ZooKeeperPluggableAuthentication">
+ <title>Pluggable ZooKeeper authentication</title>
+
+ <para>ZooKeeper runs in a variety of different environments with
+ various different authentication schemes, so it has a completely
+ pluggable authentication framework. Even the builtin authentication
+ schemes use the pluggable authentication framework.</para>
+
+ <para>To understand how the authentication framework works, first you must
+ understand the two main authentication operations. The framework
+ first must authenticate the client. This is usually done as soon as
+ the client connects to a server and consists of validating information
+ sent from or gathered about a client and associating it with the connection.
+ The second operation handled by the framework is finding the entries in an
+ ACL that correspond to the client. ACL entries are <<emphasis>idspec,
+ permissions</emphasis>> pairs. The <emphasis>idspec</emphasis> may be
+ a simple string match against the authentication information associated
+ with the connection or it may be an expression that is evaluated against that
+ information. It is up to the implementation of the authentication plugin
+ to do the match. Here is the interface that an authentication plugin must
+ implement:</para>
+
+ <programlisting>
+public interface AuthenticationProvider {
+ String getScheme();
+ KeeperException.Code handleAuthentication(ServerCnxn cnxn, byte authData[]);
+ boolean isValid(String id);
+ boolean matches(String id, String aclExpr);
+ boolean isAuthenticated();
+}
+ </programlisting>
+
+ <para>The first method <emphasis>getScheme</emphasis> returns the string
+ that identifies the plugin. Because we support multiple methods of authentication,
+ an authentication credential or an <emphasis>idspec</emphasis> will always be
+ prefixed with <emphasis>scheme:</emphasis>. The ZooKeeper server uses the scheme
+ returned by the authentication plugin to determine which ids the scheme
+ applies to.</para>
+
+ <para><emphasis>handleAuthentication</emphasis> is called when a client
+ sends authentication information to be associated with a connection. The
+ client specifies the scheme to which the information corresponds. The
+ ZooKeeper server passes the information to the authentication plugin whose
+ <emphasis>getScheme</emphasis> matches the scheme passed by the client. The
+ implementor of <emphasis>handleAuthentication</emphasis> will usually return
+ an error if it determines that the information is bad, or it will associate information
+ with the connection using <emphasis>cnxn.getAuthInfo().add(new Id(getScheme(), data))</emphasis>.
+ </para>
+
+ <para>The authentication plugin is involved in both setting and using ACLs. When an
+ ACL is set for a znode, the ZooKeeper server will pass the id part of the entry to
+ the <emphasis>isValid(String id)</emphasis> method. It is up to the plugin to verify
+ that the id has a correct form. For example, <emphasis>ip:172.16.0.0/16</emphasis>
+ is a valid id, but <emphasis>ip:host.com</emphasis> is not. If the new ACL includes
+ an "auth" entry, <emphasis>isAuthenticated</emphasis> is used to see if the
+ authentication information for this scheme that is associated with the connection
+ should be added to the ACL. Some schemes
+ should not be included in auth. For example, the IP address of the client is not
+ considered as an id that should be added to the ACL if auth is specified.</para>
+
+ <para>ZooKeeper invokes
+ <emphasis>matches(String id, String aclExpr)</emphasis> when checking an ACL. It
+ needs to match authentication information of the client against the relevant ACL
+ entries. To find the entries which apply to the client, the ZooKeeper server will
+ find the scheme of each entry and if there is authentication information
+ from that client for that scheme, <emphasis>matches(String id, String aclExpr)</emphasis>
+ will be called with <emphasis>id</emphasis> set to the authentication information
+ that was previously added to the connection by <emphasis>handleAuthentication</emphasis> and
+ <emphasis>aclExpr</emphasis> set to the id of the ACL entry. The authentication plugin
+ uses its own logic and matching scheme to determine if <emphasis>id</emphasis> is included
+ in <emphasis>aclExpr</emphasis>.
+ </para>
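+
+ <para>To make the role of <emphasis>matches</emphasis> concrete, here is a
+ minimal sketch of the matching logic for an
+ <emphasis>addr/bits</emphasis> style expression. It is not the code of the
+ built-in <emphasis>ip</emphasis> plugin: the class
+ <emphasis>IpMatchSketch</emphasis> and its
+ <emphasis>parseIPv4</emphasis> helper are hypothetical, and the sketch
+ assumes dotted-quad IPv4 addresses only.</para>
+
+ <programlisting>
+/** Sketch of prefix matching for "addr/bits" ACL expressions (IPv4 only). */
+final class IpMatchSketch {
+    /** Parses a dotted-quad IPv4 address into a 32-bit value. */
+    static int parseIPv4(String addr) {
+        String[] parts = addr.split("\\.");
+        if (parts.length != 4) {
+            throw new IllegalArgumentException("not an IPv4 address: " + addr);
+        }
+        int value = 0;
+        for (String part : parts) {
+            int octet = Integer.parseInt(part);
+            if (octet < 0 || octet > 255) {
+                throw new IllegalArgumentException("bad octet in: " + addr);
+            }
+            value = (value << 8) | octet;
+        }
+        return value;
+    }
+
+    /**
+     * id is the client address recorded at authentication time;
+     * aclExpr is "addr" or "addr/bits" taken from the ACL entry.
+     */
+    static boolean matches(String id, String aclExpr) {
+        String[] expr = aclExpr.split("/", 2);
+        int bits = expr.length == 2 ? Integer.parseInt(expr[1]) : 32;
+        if (bits < 0 || bits > 32) {
+            return false;
+        }
+        // Keep only the most significant bits of both addresses.
+        int mask = bits == 0 ? 0 : -1 << (32 - bits);
+        return (parseIPv4(id) & mask) == (parseIPv4(expr[0]) & mask);
+    }
+
+    public static void main(String[] args) {
+        System.out.println(matches("19.22.5.7", "19.22.0.0/16")); // prints true
+        System.out.println(matches("19.23.0.1", "19.22.0.0/16")); // prints false
+    }
+}
+ </programlisting>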
+
+ <para>There are two built in authentication plugins: <emphasis>ip</emphasis> and
+ <emphasis>digest</emphasis>. Additional plugins can be added using system properties. At
+ startup the ZooKeeper server will look for system properties that start with
+ "zookeeper.authProvider." and interpret the value of those properties as the class name
+ of an authentication plugin. These properties can be set using the
+ <emphasis>-Dzookeeper.authProvider.X=com.f.MyAuth</emphasis> JVM option or by adding entries such as
+ the following to the server configuration file:</para>
+
+ <programlisting>
+authProvider.1=com.f.MyAuth
+authProvider.2=com.f.MyAuth2
+ </programlisting>
+
+ <para>Care should be taken to ensure that the suffix on the property is unique. If there are
+ duplicates such as <emphasis>-Dzookeeper.authProvider.X=com.f.MyAuth -Dzookeeper.authProvider.X=com.f.MyAuth2</emphasis>,
+ only one will be used. Also all servers must have the same plugins defined, otherwise clients using
+ the authentication schemes provided by the plugins will have problems connecting to some servers.
+ </para>
+ </section>
+
+ <section id="ch_zkGuarantees">
+ <title>Consistency Guarantees</title>
+
+ <para>ZooKeeper is a high performance, scalable service. Both read and
+ write operations are designed to be fast, though reads are faster than
+ writes. The reason for this is that in the case of reads, ZooKeeper can
+ serve older data, which in turn is due to ZooKeeper's consistency
+ guarantees:</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>Sequential Consistency</term>
+
+ <listitem>
+ <para>Updates from a client will be applied in the order that they
+ were sent.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Atomicity</term>
+
+ <listitem>
+ <para>Updates either succeed or fail -- there are no partial
+ results.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Single System Image</term>
+
+ <listitem>
+ <para>A client will see the same view of the service regardless of
+ the server that it connects to.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Reliability</term>
+
+ <listitem>
+ <para>Once an update has been applied, it will persist from that
+ time forward until a client overwrites the update. This guarantee
+ has two corollaries:</para>
+
+ <orderedlist>
+ <listitem>
+ <para>If a client gets a successful return code, the update will
+ have been applied. On some failures (communication errors,
+ timeouts, etc) the client will not know if the update has
+ applied or not. We take steps to minimize the failures, but the
+ guarantee is only present with successful return codes.
+ (This is called the <emphasis>monotonicity condition</emphasis> in Paxos.)</para>
+ </listitem>
+
+ <listitem>
+ <para>Any updates that are seen by the client, through a read
+ request or successful update, will never be rolled back when
+ recovering from server failures.</para>
+ </listitem>
+ </orderedlist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Timeliness</term>
+
+ <listitem>
+ <para>The client's view of the system is guaranteed to be up-to-date
+ within a certain time bound (on the order of tens of seconds).
+ Either system changes will be seen by a client within this bound, or
+ the client will detect a service outage.</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>Using these consistency guarantees it is easy to build higher level
+ functions such as leader election, barriers, queues, and read/write
+ revocable locks solely at the ZooKeeper client (no additions needed to
+ ZooKeeper). See <ulink url="recipes.html">Recipes and Solutions</ulink>
+ for more details.</para>
+
+ <note>
+ <para>Sometimes developers mistakenly assume one other guarantee that
+ ZooKeeper does <emphasis>not</emphasis> in fact make. This is:</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>Simultaneously Consistent Cross-Client Views</term>
+
+ <listitem>
+ <para>ZooKeeper does not guarantee that at every instant in
+ time, two different clients will have identical views of
+ ZooKeeper data. Due to factors like network delays, one client
+ may perform an update before another client gets notified of the
+ change. Consider the scenario of two clients, A and B. If client
+ A sets the value of a znode /a from 0 to 1, then tells client B
+ to read /a, client B may read the old value of 0, depending on
+ which server it is connected to. If it
+ is important that Client A and Client B read the same value,
+ Client B should call the <emphasis
+ role="bold">sync()</emphasis> method from the ZooKeeper API
+ before it performs its read.</para>
+
+ <para>So, ZooKeeper by itself doesn't guarantee that changes occur
+ synchronously across all servers, but ZooKeeper
+ primitives can be used to construct higher level functions that
+ provide useful client synchronization. (For more information,
+ see the <ulink
+ url="recipes.html">ZooKeeper Recipes</ulink>.
+ <emphasis>[tbd:..]</emphasis>).</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </note>
+ </section>
+
+ <section id="ch_bindings">
+ <title>Bindings</title>
+
+ <para>The ZooKeeper client libraries come in two languages: Java and C.
+ The following sections describe these.</para>
+
+ <section>
+ <title>Java Binding</title>
+
+ <para>There are two packages that make up the ZooKeeper Java binding:
+ <emphasis role="bold">org.apache.zookeeper</emphasis> and <emphasis
+ role="bold">org.apache.zookeeper.data</emphasis>. The rest of the
+ packages that make up ZooKeeper are used internally or are part of the
+ server implementation. The <emphasis
+ role="bold">org.apache.zookeeper.data</emphasis> package is made up of
+ generated classes that are used simply as containers.</para>
+
+ <para>The main class used by a ZooKeeper Java client is the <emphasis
+ role="bold">ZooKeeper</emphasis> class. Its two constructors differ only
+ by an optional session id and password. ZooKeeper supports session
+ recovery across instances of a process. A Java program may save its
+ session id and password to stable storage, restart, and recover the
+ session that was used by the earlier instance of the program.</para>
+
+ <para>When a ZooKeeper object is created, two threads are created as
+ well: an IO thread and an event thread. All IO happens on the IO thread
+ (using Java NIO). All event callbacks happen on the event thread.
+ Session maintenance such as reconnecting to ZooKeeper servers and
+ maintaining heartbeat is done on the IO thread. Responses for
+ synchronous methods are also processed in the IO thread. All responses
+ to asynchronous methods and watch events are processed on the event
+ thread. There are a few things to notice that result from this
+ design:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>All completions for asynchronous calls and watcher callbacks
+ will be made in order, one at a time. The caller can do any
+ processing they wish, but no other callbacks will be processed
+ during that time.</para>
+ </listitem>
+
+ <listitem>
+ <para>Callbacks do not block the processing of the IO thread or the
+ processing of the synchronous calls.</para>
+ </listitem>
+
+ <listitem>
+ <para>Synchronous calls may not return in the correct order. For
+ example, assume a client does the following processing: issues an
+ asynchronous read of node <emphasis role="bold">/a</emphasis> with
+ <emphasis>watch</emphasis> set to true, and then in the completion
+ callback of the read it does a synchronous read of <emphasis
+ role="bold">/a</emphasis>. (Maybe not good practice, but not illegal
+ either, and it makes for a simple example.)</para>
+
+ <para>Note that if there is a change to <emphasis
+ role="bold">/a</emphasis> between the asynchronous read and the
+ synchronous read, the client library will receive the watch event
+ saying <emphasis role="bold">/a</emphasis> changed before the
+ response for the synchronous read, but because the completion
+ callback is blocking the event queue, the synchronous read will
+ return with the new value of <emphasis role="bold">/a</emphasis>
+ before the watch event is processed.</para>
+ </listitem>
+ </itemizedlist>
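+ <para>The last point can be reproduced with a toy model. The sketch
+ below does not use the ZooKeeper client at all: a single-threaded
+ executor stands in for the event thread, an AtomicReference stands in
+ for node /a, and every name in it is hypothetical. It only shows why a
+ synchronous read issued from a completion callback can see the new
+ value before the queued watch event is delivered.</para>
+ <programlisting><![CDATA[
```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Toy model only: a single-threaded executor stands in for the event thread.
// None of these names belong to the ZooKeeper client API.
final class EventOrderSketch {
    /** Runs the scenario and returns the order in which things were observed. */
    static List<String> run() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        AtomicReference<String> nodeA = new AtomicReference<>("old");
        ExecutorService eventThread = Executors.newSingleThreadExecutor();

        // The asynchronous read was answered while /a still held "old".
        String asyncResult = nodeA.get();
        // /a changes before the completion callback gets to run.
        nodeA.set("new");

        // Completion callback: queued first, so it runs first.
        eventThread.execute(() -> {
            log.add("completion saw " + asyncResult);
            // The synchronous read does not go through the event queue.
            log.add("sync read saw " + nodeA.get());
        });
        // Watch event: queued behind the completion, so it has to wait.
        eventThread.execute(() -> log.add("watch event for /a"));

        eventThread.shutdown();
        try {
            if (!eventThread.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("event thread did not drain");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return new ArrayList<>(log);
    }
}
```
+ ]]></programlisting>
+ <para>Because the executor has one thread, the two queued tasks run
+ strictly in submission order, which also mirrors the first point in
+ the list: callbacks are made one at a time.</para>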
+
+ <para>Finally, the rules associated with shutdown are straightforward:
+ once a ZooKeeper object is closed or receives a fatal event
+ (SESSION_EXPIRED and AUTH_FAILED), the ZooKeeper object becomes invalid.
+ On a close, the two threads shut down and any further access to the
+ ZooKeeper handle is undefined behavior and should be avoided.</para>
+ </section>
+
+ <section>
+ <title>C Binding</title>
+
+ <para>The C binding has a single-threaded and a multi-threaded library.
+ The multi-threaded library is easiest to use and is most similar to the
+ Java API. This library will create an IO thread and an event dispatch
+ thread for handling connection maintenance and callbacks. The
+ single-threaded library allows ZooKeeper to be used in event driven
+ applications by exposing the event loop used in the multi-threaded
+ library.</para>
+
+ <para>The package includes two shared libraries: zookeeper_st and
+ zookeeper_mt. The former only provides the asynchronous APIs and
+ callbacks for integrating into the application's event loop. The only
+ reason this library exists is to support platforms where a
+ <emphasis>pthread</emphasis> library is not available or is unstable
+ (e.g. FreeBSD 4.x). In all other cases, application developers should
+ link with zookeeper_mt, as it includes support for both the Sync and
+ Async APIs.</para>
+
+ <section>
+ <title>Installation</title>
+
+ <para>If you're building the client from a check-out from the Apache
+ repository, follow the steps outlined below. If you're building from a
+ project source package downloaded from Apache, skip to step <emphasis
+ role="bold">3</emphasis>.</para>
+
+ <orderedlist>
+ <listitem>
+ <para>Run <command>ant compile_jute</command> from the ZooKeeper
+ top level directory (<filename>.../trunk</filename>).
+ This will create a directory named "generated" under
+ <filename>.../trunk/src/c</filename>.</para>
+ </listitem>
+
+ <listitem>
+ <para>Change directory to <filename>.../trunk/src/c</filename>
+ and run <command>autoreconf -if</command> to bootstrap <emphasis
+ role="bold">autoconf</emphasis>, <emphasis
+ role="bold">automake</emphasis> and <emphasis
+ role="bold">libtool</emphasis>. Make sure you have <emphasis
+ role="bold">autoconf version 2.59</emphasis> or greater installed.
+ Skip to step <emphasis role="bold">4</emphasis>.</para>
+ </listitem>
+
+ <listitem>
+ <para>If you are building from a project source package,
+ unzip/untar the source tarball and cd to the
+ <filename>zookeeper-x.x.x/src/c</filename> directory.</para>
+ </listitem>
+
+ <listitem>
+ <para>Run <command>./configure &lt;your-options&gt;</command> to
+ generate the makefile. Here are some of the options the <emphasis
+ role="bold">configure</emphasis> utility supports that can be
+ useful in this step:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><command>--enable-debug</command></para>
+
+ <para>Enables optimization and enables debug info compiler
+ options. (Disabled by default.)</para>
+ </listitem>
+
+ <listitem>
+ <para><command>--without-syncapi</command></para>
+
+ <para>Disables Sync API support; the zookeeper_mt library won't be
+ built. (Sync API support is enabled by default.)</para>
+ </listitem>
+
+ <listitem>
+ <para><command>--disable-static</command></para>
+
+ <para>Do not build static libraries. (Static libraries are built by
+ default.)</para>
+ </listitem>
+
+ <listitem>
+ <para><command>--disable-shared</command></para>
+
+ <para>Do not build shared libraries. (Shared libraries are built by
+ default.)</para>
+ </listitem>
+ </itemizedlist>
+
+ <note>
+ <para>See INSTALL for general information about running
+ <emphasis role="bold">configure</emphasis>.</para>
+ </note>
+ </listitem>
+
+ <listitem>
+ <para>Run <command>make</command> or <command>make
+ install</command> to build the libraries and install them.</para>
+ </listitem>
+
+ <listitem>
+ <para>To generate doxygen documentation for the ZooKeeper API, run
+ <command>make doxygen-doc</command>. All documentation will be
+ placed in a new subfolder named docs. By default, this command
+ only generates HTML. For information on other document formats,
+ run <command>./configure --help</command>.</para>
+ </listitem>
+ </orderedlist>
+ </section>
+
+ <section>
+ <title>Building Your Own C Client</title>
+
+ <para>In order to use the ZooKeeper API in your application,
+ you have to remember to:</para>
+
+ <orderedlist>
+ <listitem>
+ <para>Include the ZooKeeper header: #include
+ &lt;zookeeper/zookeeper.h&gt;</para>
+ </listitem>
+
+ <listitem>
+ <para>If you are building a multithreaded client, compile with the
+ -DTHREADED compiler flag to enable the multi-threaded version of
+ the library, and then link against the
+ <emphasis>zookeeper_mt</emphasis> library. If you are building a
+ single-threaded client, do not compile with -DTHREADED, and be
+ sure to link against the <emphasis>zookeeper_st</emphasis>
+ library.</para>
+ </listitem>
+ </orderedlist>
+
+ <note><para>
+ See <filename>.../trunk/src/c/src/cli.c</filename>
+ for an example of a C client implementation.</para>
+ </note>
+ </section>
+ </section>
+ </section>
+
+ <section id="ch_guideToZkOperations">
+ <title>Building Blocks: A Guide to ZooKeeper Operations</title>
+
+ <para>This section surveys all the operations a developer can perform
+ against a ZooKeeper server. It is lower level information than the earlier
+ concepts chapters in this manual, but higher level than the ZooKeeper API
+ Reference. It covers these topics:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><xref linkend="sc_connectingToZk" /></para>
+ </listitem>
+ </itemizedlist>
+
+ <section id="sc_errorsZk">
+ <title>Handling Errors</title>
+
+ <para>Both the Java and C client bindings may report errors. The Java client binding does so by throwing KeeperException; calling code() on the exception returns the specific error code. The C client binding returns an error code as defined in the enum ZOO_ERRORS. API callbacks indicate the result code for both language bindings. See the API documentation (javadoc for Java, doxygen for C) for full details on the possible errors and their meaning.</para>
+ </section>
+
+ <section id="sc_connectingToZk">
+ <title>Connecting to ZooKeeper</title>
+
+ <para></para>
+ </section>
+
+ <section id="sc_readOps">
+ <title>Read Operations</title>
+
+ <para></para>
+ </section>
+
+ <section id="sc_writeOps">
+ <title>Write Operations</title>
+
+ <para></para>
+ </section>
+
+ <section id="sc_handlingWatches">
+ <title>Handling Watches</title>
+
+ <para></para>
+ </section>
+
+ <section id="sc_miscOps">
+ <title>Miscellaneous ZooKeeper Operations</title>
+ <para></para>
+ </section>
+
+
+ </section>
+
+ <section id="ch_programStructureWithExample">
+ <title>Program Structure, with Simple Example</title>
+
+ <para><emphasis>[tbd]</emphasis></para>
+ </section>
+
+ <section id="ch_gotchas">
+ <title>Gotchas: Common Problems and Troubleshooting</title>
+
+ <para>So now you know ZooKeeper. It's fast, simple, your application
+ works, but wait ... something's wrong. Here are some pitfalls that
+ ZooKeeper users fall into:</para>
+
+ <orderedlist>
+ <listitem>
+ <para>If you are using watches, you must look for the connected watch
+ event. When a ZooKeeper client disconnects from a server, you will
+ not receive notification of changes until reconnected. If you are
+ watching for a znode to come into existence, you will miss the event
+ if the znode is created and deleted while you are disconnected.</para>
+ </listitem>
+
+ <listitem>
+ <para>You must test ZooKeeper server failures. The ZooKeeper service
+ can survive failures as long as a majority of servers are active. The
+ question to ask is: can your application handle it? In the real world
+ a client's connection to ZooKeeper can break. (ZooKeeper server
+ failures and network partitions are common reasons for connection
+ loss.) The ZooKeeper client library takes care of recovering your
+ connection and letting you know what happened, but you must make sure
+ that you recover your state and any outstanding requests that failed.
+ Find out if you got it right in the test lab, not in production - test
+ with a ZooKeeper service made up of several servers and subject
+ them to reboots.</para>
+ </listitem>
+
+ <listitem>
+ <para>The list of ZooKeeper servers used by the client must match the
+ list of ZooKeeper servers that each ZooKeeper server has. Things can
+ work, although not optimally, if the client list is a subset of the
+ real list of ZooKeeper servers, but not if the client lists ZooKeeper
+ servers not in the ZooKeeper cluster.</para>
+ </listitem>
+
+ <listitem>
+ <para>Be careful where you put that transaction log. The most
+ performance-critical part of ZooKeeper is the transaction log.
+ ZooKeeper must sync transactions to media before it returns a
+ response. A dedicated transaction log device is key to consistent good
+ performance. Putting the log on a busy device will adversely affect
+ performance. If you only have one storage device, put trace files on
+ NFS and increase the snapshotCount; it doesn't eliminate the problem,
+ but it can mitigate it.</para>
+ </listitem>
+
+ <listitem>
+ <para>Set your Java max heap size correctly. It is very important to
+ <emphasis>avoid swapping.</emphasis> Going to disk unnecessarily will
+ almost certainly degrade your performance unacceptably. Remember, in
+ ZooKeeper, everything is ordered, so if one request hits the disk, all
+ other queued requests hit the disk.</para>
+
+ <para>To avoid swapping, try to set the heap size to the amount of
+ physical memory you have, minus the amount needed by the OS and cache.
+ The best way to determine an optimal heap size for your configurations
+ is to <emphasis>run load tests</emphasis>. If for some reason you
+ can't, be conservative in your estimates and choose a number well
+ below the limit that would cause your machine to swap. For example, on
+ a 4G machine, a 3G heap is a conservative estimate to start
+ with.</para>
+ </listitem>
+ </orderedlist>
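+ <para>Three of the checks above (the majority rule, the client server
+ list, and the heap size estimate) come down to simple arithmetic or set
+ comparisons. The following is a minimal sketch of them, not code from
+ ZooKeeper: the class and method names are hypothetical, the connect
+ string is assumed to be a plain comma-separated list of host:port
+ pairs, and the amount reserved for the OS and cache is a number you
+ would have to measure yourself.</para>
+ <programlisting><![CDATA[
```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative helpers only; the class and method names are hypothetical.
final class DeploymentChecks {
    /** True while strictly more than half of the ensemble is up. */
    static boolean hasMajority(int ensembleSize, int serversUp) {
        return serversUp > ensembleSize / 2;
    }

    /** Returns the servers in the client's list that the ensemble does not contain. */
    static Set<String> unknownServers(String clientConnectString, Set<String> ensemble) {
        Set<String> unknown = new LinkedHashSet<>();
        for (String hostPort : clientConnectString.split(",")) {
            String trimmed = hostPort.trim();
            if (!trimmed.isEmpty() && !ensemble.contains(trimmed)) {
                unknown.add(trimmed);
            }
        }
        return unknown;
    }

    /** Physical memory minus what is set aside for the OS and cache, in MB. */
    static long heapBudgetMb(long physicalMb, long reservedForOsAndCacheMb) {
        if (reservedForOsAndCacheMb >= physicalMb) {
            throw new IllegalArgumentException("reserve must be smaller than physical memory");
        }
        return physicalMb - reservedForOsAndCacheMb;
    }
}
```
+ ]]></programlisting>
+ <para>With 4096 MB of physical memory and an assumed 1024 MB reserve,
+ heapBudgetMb gives 3072 MB, which matches the 3G starting point
+ suggested above for a 4G machine. An empty result from unknownServers
+ means the client list is a subset of the ensemble.</para>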
+ </section>
+
+ <appendix id="apx_linksToOtherInfo">
+ <title>Links to Other Information</title>
+
+ <para>Outside the formal documentation, there are several other sources of
+ information for ZooKeeper developers.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>ZooKeeper Whitepaper <emphasis>[tbd: find url]</emphasis></term>
+
+ <listitem>
+ <para>The definitive discussion of ZooKeeper design and performance,
+ by Yahoo! Research</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>API Reference <emphasis>[tbd: find url]</emphasis></term>
+
+ <listitem>
+ <para>The complete reference to the ZooKeeper API</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><ulink
+ url="http://us.dl1.yimg.com/download.yahoo.com/dl/ydn/zookeeper.m4v">ZooKeeper
+ Talk at the Hadoop Summit 2008</ulink></term>
+
+ <listitem>
+ <para>A video introduction to ZooKeeper, by Benjamin Reed of Yahoo!
+ Research</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><ulink
+ url="https://cwiki.apache.org/confluence/display/ZOOKEEPER/Tutorial">Barrier and
+ Queue Tutorial</ulink></term>
+
+ <listitem>
+ <para>The excellent Java tutorial by Flavio Junqueira, implementing
+ simple barriers and producer-consumer queues using ZooKeeper.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><ulink
+ url="https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeperArticles">ZooKeeper
+ - A Reliable, Scalable Distributed Coordination System</ulink></term>
+
+ <listitem>
+ <para>An article by Todd Hoff (07/15/2008)</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><ulink url="recipes.html">ZooKeeper Recipes</ulink></term>
+
+ <listitem>
+ <para>Pseudocode-level discussion of the implementation of various
+ synchronization solutions with ZooKeeper: Event Handles, Queues,
+ Locks, and Two-phase Commits.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><emphasis>[tbd]</emphasis></term>
+
+ <listitem>
+ <para>Any other good sources anyone can think of...</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </appendix>
+</article>
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/c1efa954/zookeeper-docs/src/documentation/content/xdocs/zookeeperQuotas.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperQuotas.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperQuotas.xml
new file mode 100644
index 0000000..7668e6a
--- /dev/null
+++ b/zookeeper-docs/src/documentation/content/xdocs/zookeeperQuotas.xml
@@ -0,0 +1,71 @@
+<?xml version="1.0" encoding="UTF-8"?>
+ <!--
+ Copyright 2002-2004 The Apache Software Foundation Licensed under the
+ Apache License, Version 2.0 (the "License"); you may not use this file
+ except in compliance with the License. You may obtain a copy of the
+ License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
+ by applicable law or agreed to in writing, software distributed under
+ the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
+ CONDITIONS OF ANY KIND, either express or implied. See the License for
+ the specific language governing permissions and limitations under the
+ License.
+ -->
+ <!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
+ "http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
+<article id="bk_Quota">
+ <title>ZooKeeper Quotas Guide</title>
+ <subtitle>A Guide to Deployment and Administration</subtitle>
+ <articleinfo>
+ <legalnotice>
+ <para>
+ Licensed under the Apache License, Version 2.0 (the "License"); you
+ may not use this file except in compliance with the License. You may
+ obtain a copy of the License at
+ <ulink url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0
+ </ulink>
+ .
+ </para>
+ <para>Unless required by applicable law or agreed to in
+ writing, software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+ express or implied. See the License for the specific language
+ governing permissions and limitations under the License.</para>
+ </legalnotice>
+ <abstract>
+ <para>This document contains information about deploying,
+ administering and maintaining ZooKeeper. It also discusses best
+ practices and common problems.</para>
+ </abstract>
+ </articleinfo>
+ <section id="zookeeper_quotas">
+ <title>Quotas</title>
+ <para> ZooKeeper has both namespace and bytes quotas. You can use the ZooKeeperMain class to set up quotas.
+ ZooKeeper prints <emphasis>WARN</emphasis> messages if users exceed the quota assigned to them. The messages
+ are printed in the ZooKeeper log.
+ </para>
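+ <para>To make the "warn, do not reject" behavior concrete, here is a
+ toy model of such a check. It is a sketch under stated assumptions, not
+ ZooKeeper's implementation: QuotaSketch and its methods are
+ hypothetical, a negative limit is assumed to mean that no quota of that
+ kind is set, and the message text is made up rather than copied from a
+ real log.</para>
+ <programlisting><![CDATA[
```java
// Toy model of a soft quota check; names are hypothetical, not ZooKeeper classes.
final class QuotaSketch {
    /** Assumption: a negative limit means that this kind of quota is not set. */
    static boolean exceeded(long limit, long actual) {
        return limit >= 0 && actual > limit;
    }

    /**
     * Returns a WARN-style line when a quota is exceeded, or null when the
     * node is within its limits. The caller's operation is never rejected.
     */
    static String check(String path, long countLimit, long count, long bytesLimit, long bytes) {
        if (exceeded(countLimit, count)) {
            return "WARN namespace quota exceeded: " + path
                    + " count=" + count + " limit=" + countLimit;
        }
        if (exceeded(bytesLimit, bytes)) {
            return "WARN bytes quota exceeded: " + path
                    + " bytes=" + bytes + " limit=" + bytesLimit;
        }
        return null;
    }
}
```
+ ]]></programlisting>
+ <para>The first two numbers describe the namespace quota (a count of
+ znodes) and the last two describe the bytes quota for the same
+ path.</para>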
+ <para><computeroutput>$ bin/zkCli.sh -server host:port</computeroutput></para>
+ <para> The above command starts the command-line client, from which you can work with quotas.</para>
+ <section>
+ <title>Setting Quotas</title>
+ <para>You can use
+ <emphasis>setquota</emphasis> to set a quota on a ZooKeeper node. Use
+ -n to set a namespace quota
+ and -b to set a bytes quota.</para>
+ <para> ZooKeeper quotas are stored in ZooKeeper itself in /zookeeper/quota. To prevent other people from
+ changing the quotas, set the ACL for /zookeeper/quota such that only admins are able to read and write to it.
+ </para>
+ </section>
+ <section>
+ <title>Listing Quotas</title>
+ <para> You can use
+ <emphasis>listquota</emphasis> to list a quota on a ZooKeeper node.
+ </para>
+ </section>
+ <section>
+ <title>Deleting Quotas</title>
+ <para> You can use
+ <emphasis>delquota</emphasis> to delete a quota on a ZooKeeper node.
+ </para>
+ </section>
+ </section>
+ </article>