Posted to commits@zookeeper.apache.org by ma...@apache.org on 2009/06/25 03:48:18 UTC

svn commit: r788238 - in /hadoop/zookeeper/trunk: CHANGES.txt docs/zookeeperProgrammers.html docs/zookeeperProgrammers.pdf src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml

Author: mahadev
Date: Thu Jun 25 01:48:18 2009
New Revision: 788238

URL: http://svn.apache.org/viewvc?rev=788238&view=rev
Log:
document effects (latency) of storing large amounts of data in znodes. (breed via mahadev)

Modified:
    hadoop/zookeeper/trunk/CHANGES.txt
    hadoop/zookeeper/trunk/docs/zookeeperProgrammers.html
    hadoop/zookeeper/trunk/docs/zookeeperProgrammers.pdf
    hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml

Modified: hadoop/zookeeper/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/CHANGES.txt?rev=788238&r1=788237&r2=788238&view=diff
==============================================================================
--- hadoop/zookeeper/trunk/CHANGES.txt (original)
+++ hadoop/zookeeper/trunk/CHANGES.txt Thu Jun 25 01:48:18 2009
@@ -234,6 +234,9 @@
 
   ZOOKEEPER-356. Masking bookie failure during writes to a ledger (flavio via breed)
 
+  ZOOKEEPER-327. document effects (latency) of storing large amounts of data
+in znodes. (breed via mahadev)
+
 NEW FEATURES:
 
   ZOOKEEPER-371. jdiff documentation included in build/release (giri via phunt)

Modified: hadoop/zookeeper/trunk/docs/zookeeperProgrammers.html
URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/docs/zookeeperProgrammers.html?rev=788238&r1=788237&r2=788238&view=diff
==============================================================================
--- hadoop/zookeeper/trunk/docs/zookeeperProgrammers.html (original)
+++ hadoop/zookeeper/trunk/docs/zookeeperProgrammers.html Thu Jun 25 01:48:18 2009
@@ -530,13 +530,27 @@
         atomically. Reads get all the data bytes associated with a znode and a
         write replaces all the data. Each node has an Access Control List
         (ACL) that restricts who can do what.</p>
-<a name="N100E2"></a><a name="Ephemeral+Nodes"></a>
+<p>ZooKeeper was not designed to be a general database or large
+        object store. Instead, it manages coordination data. This data can
+        come in the form of configuration, status information, rendezvous, etc.
+        A common property of the various forms of coordination data is that
+        they are relatively small: measured in kilobytes.
+        The ZooKeeper client and the server implementations have sanity checks
+        to ensure that znodes have less than 1M of data, but the data should
+        be much less than that on average. Operating on relatively large data
+        sizes will cause some operations to take much more time than others and
+        will affect the latencies of some operations because of the extra time
+        needed to move more data over the network and onto storage media. If
+        large data storage is needed, the usual pattern of dealing with such
+        data is to store it on a bulk storage system, such as NFS or HDFS, and
+        store pointers to the storage locations in ZooKeeper.</p>
+<a name="N100E5"></a><a name="Ephemeral+Nodes"></a>
 <h4>Ephemeral Nodes</h4>
 <p>ZooKeeper also has the notion of ephemeral nodes. These znodes
         exists as long as the session that created the znode is active. When
         the session ends the znode is deleted. Because of this behavior
         ephemeral znodes are not allowed to have children.</p>
-<a name="N100EC"></a><a name="Sequence+Nodes+--+Unique+Naming"></a>
+<a name="N100EF"></a><a name="Sequence+Nodes+--+Unique+Naming"></a>
 <h4>Sequence Nodes -- Unique Naming</h4>
 <p>When creating a znode you can also request that
         ZooKeeper append a monotonicly increasing counter to the end
@@ -550,7 +564,7 @@
         (4bytes) maintained by the parent node, the counter will
         overflow when incremented beyond 2147483647 (resulting in a
         name "&lt;path&gt;-2147483647").</p>
-<a name="N100FB"></a><a name="sc_timeInZk"></a>
+<a name="N100FE"></a><a name="sc_timeInZk"></a>
 <h3 class="h4">Time in ZooKeeper</h3>
 <p>ZooKeeper tracks time multiple ways:</p>
 <ul>
@@ -619,7 +633,7 @@
 </li>
       
 </ul>
-<a name="N10133"></a><a name="sc_zkStatStructure"></a>
+<a name="N10136"></a><a name="sc_zkStatStructure"></a>
 <h3 class="h4">ZooKeeper Stat Structure</h3>
 <p>The Stat structure for each znode in ZooKeeper is made up of the
       following fields:</p>
@@ -754,7 +768,7 @@
 </div>
 
   
-<a name="N101A5"></a><a name="ch_zkSessions"></a>
+<a name="N101A8"></a><a name="ch_zkSessions"></a>
 <h2 class="h3">ZooKeeper Sessions</h2>
 <div class="section">
 <p>To create a client session the application code must provide
@@ -842,7 +856,7 @@
 </div>
 
   
-<a name="N101CF"></a><a name="ch_zkWatches"></a>
+<a name="N101D2"></a><a name="ch_zkWatches"></a>
 <h2 class="h3">ZooKeeper Watches</h2>
 <div class="section">
 <p>All of the read operations in ZooKeeper - <strong>getData()</strong>, <strong>getChildren()</strong>, and <strong>exists()</strong> - have the option of setting a watch as a
@@ -925,7 +939,7 @@
     general this all occurs transparently. There is one case where a watch
     may be missed: a watch for the existance of a znode not yet created will
     be missed if the znode is created and deleted while disconnected.</p>
-<a name="N10205"></a><a name="sc_WatchGuarantees"></a>
+<a name="N10208"></a><a name="sc_WatchGuarantees"></a>
 <h3 class="h4">What ZooKeeper Guarantees about Watches</h3>
 <p>With regard to watches, ZooKeeper maintains these
       guarantees:</p>
@@ -960,7 +974,7 @@
 </li>
       
 </ul>
-<a name="N1022A"></a><a name="sc_WatchRememberThese"></a>
+<a name="N1022D"></a><a name="sc_WatchRememberThese"></a>
 <h3 class="h4">Things to Remember about Watches</h3>
 <ul>
         
@@ -1019,7 +1033,7 @@
 </div>
 
   
-<a name="N10256"></a><a name="sc_ZooKeeperAccessControl"></a>
+<a name="N10259"></a><a name="sc_ZooKeeperAccessControl"></a>
 <h2 class="h3">ZooKeeper access control using ACLs</h2>
 <div class="section">
 <p>ZooKeeper uses ACLs to control access to its znodes (the
@@ -1054,7 +1068,7 @@
     example, the pair <em>(ip:19.22.0.0/16, READ)</em>
     gives the <em>READ</em> permission to any clients with
     an IP address that starts with 19.22.</p>
-<a name="N10289"></a><a name="sc_ACLPermissions"></a>
+<a name="N1028C"></a><a name="sc_ACLPermissions"></a>
 <h3 class="h4">ACL Permissions</h3>
 <p>ZooKeeper supports the following permissions:</p>
 <ul>
@@ -1110,7 +1124,7 @@
       node, but nothing more. (The problem is, if you want to call
       zoo_exists() on a node that doesn't exist, there is no
       permission to check.)</p>
-<a name="N102DF"></a><a name="sc_BuiltinACLSchemes"></a>
+<a name="N102E2"></a><a name="sc_BuiltinACLSchemes"></a>
 <h4>Builtin ACL Schemes</h4>
 <p>ZooKeeeper has the following built in schemes:</p>
 <ul>
@@ -1159,7 +1173,7 @@
 
       
 </ul>
-<a name="N10323"></a><a name="ZooKeeper+C+client+API"></a>
+<a name="N10326"></a><a name="ZooKeeper+C+client+API"></a>
 <h4>ZooKeeper C client API</h4>
 <p>The following constants are provided by the ZooKeeper C
       library:</p>
@@ -1381,7 +1395,7 @@
 </div>
 
   
-<a name="N1043A"></a><a name="sc_ZooKeeperPluggableAuthentication"></a>
+<a name="N1043D"></a><a name="sc_ZooKeeperPluggableAuthentication"></a>
 <h2 class="h3">Pluggable ZooKeeper authentication</h2>
 <div class="section">
 <p>ZooKeeper runs in a variety of different environments with
@@ -1467,7 +1481,7 @@
 </div>
       
   
-<a name="N104A6"></a><a name="ch_zkGuarantees"></a>
+<a name="N104A9"></a><a name="ch_zkGuarantees"></a>
 <h2 class="h3">Consistency Guarantees</h2>
 <div class="section">
 <p>ZooKeeper is a high performance, scalable service. Both reads and
@@ -1593,12 +1607,12 @@
 </div>
 
   
-<a name="N1050D"></a><a name="ch_bindings"></a>
+<a name="N10510"></a><a name="ch_bindings"></a>
 <h2 class="h3">Bindings</h2>
 <div class="section">
 <p>The ZooKeeper client libraries come in two languages: Java and C.
     The following sections describe these.</p>
-<a name="N10516"></a><a name="Java+Binding"></a>
+<a name="N10519"></a><a name="Java+Binding"></a>
 <h3 class="h4">Java Binding</h3>
 <p>There are two packages that make up the ZooKeeper Java binding:
       <strong>org.apache.zookeeper</strong> and <strong>org.apache.zookeeper.data</strong>. The rest of the
@@ -1665,7 +1679,7 @@
       (SESSION_EXPIRED and AUTH_FAILED), the ZooKeeper object becomes invalid.
       On a close, the two threads shut down and any further access on zookeeper
       handle is undefined behavior and should be avoided. </p>
-<a name="N1055F"></a><a name="C+Binding"></a>
+<a name="N10562"></a><a name="C+Binding"></a>
 <h3 class="h4">C Binding</h3>
 <p>The C binding has a single-threaded and multi-threaded library.
       The multi-threaded library is easiest to use and is most similar to the
@@ -1682,7 +1696,7 @@
       (i.e. FreeBSD 4.x). In all other cases, application developers should
       link with zookeeper_mt, as it includes support for both Sync and Async
       API.</p>
-<a name="N1056E"></a><a name="Installation"></a>
+<a name="N10571"></a><a name="Installation"></a>
 <h4>Installation</h4>
 <p>If you're building the client from a check-out from the Apache
         repository, follow the steps outlined below. If you're building from a
@@ -1813,7 +1827,7 @@
 </li>
         
 </ol>
-<a name="N10617"></a><a name="Using+the+C+Client"></a>
+<a name="N1061A"></a><a name="Using+the+C+Client"></a>
 <h4>Using the C Client</h4>
 <p>You can test your client by running a ZooKeeper server (see
         instructions on the project wiki page on how to run it) and connecting
@@ -1871,7 +1885,7 @@
 </div>
 
    
-<a name="N1065D"></a><a name="ch_guideToZkOperations"></a>
+<a name="N10660"></a><a name="ch_guideToZkOperations"></a>
 <h2 class="h3">Building Blocks: A Guide to ZooKeeper Operations</h2>
 <div class="section">
 <p>This section surveys all the operations a developer can perform
@@ -1889,28 +1903,28 @@
 </li>
     
 </ul>
-<a name="N10671"></a><a name="sc_errorsZk"></a>
+<a name="N10674"></a><a name="sc_errorsZk"></a>
 <h3 class="h4">Handling Errors</h3>
 <p>Both the Java and C client bindings may report errors. The Java client binding does so by throwing KeeperException, calling code() on the exception will return the specific error code. The C client binding returns an error code as defined in the enum ZOO_ERRORS. API callbacks indicate result code for both language bindings. See the API documentation (javadoc for Java, doxygen for C) for full details on the possible errors and their meaning.</p>
-<a name="N1067B"></a><a name="sc_connectingToZk"></a>
+<a name="N1067E"></a><a name="sc_connectingToZk"></a>
 <h3 class="h4">Connecting to ZooKeeper</h3>
 <p></p>
-<a name="N10684"></a><a name="sc_readOps"></a>
+<a name="N10687"></a><a name="sc_readOps"></a>
 <h3 class="h4">Read Operations</h3>
 <p></p>
-<a name="N1068D"></a><a name="sc_writeOps"></a>
+<a name="N10690"></a><a name="sc_writeOps"></a>
 <h3 class="h4">Write Operations</h3>
 <p></p>
-<a name="N10696"></a><a name="sc_handlingWatches"></a>
+<a name="N10699"></a><a name="sc_handlingWatches"></a>
 <h3 class="h4">Handling Watches</h3>
 <p></p>
-<a name="N1069F"></a><a name="sc_miscOps"></a>
+<a name="N106A2"></a><a name="sc_miscOps"></a>
 <h3 class="h4">Miscelleaneous ZooKeeper Operations</h3>
 <p></p>
 </div>
 
   
-<a name="N106A9"></a><a name="ch_programStructureWithExample"></a>
+<a name="N106AC"></a><a name="ch_programStructureWithExample"></a>
 <h2 class="h3">Program Structure, with Simple Example</h2>
 <div class="section">
 <p>
@@ -1919,7 +1933,7 @@
 </div>
 
   
-<a name="N106B4"></a><a name="ch_gotchas"></a>
+<a name="N106B7"></a><a name="ch_gotchas"></a>
 <h2 class="h3">Gotchas: Common Problems and Troubleshooting</h2>
 <div class="section">
 <p>So now you know ZooKeeper. It's fast, simple, your application

Modified: hadoop/zookeeper/trunk/docs/zookeeperProgrammers.pdf
URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/docs/zookeeperProgrammers.pdf?rev=788238&r1=788237&r2=788238&view=diff
==============================================================================
Binary files - no diff available.

Modified: hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml
URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml?rev=788238&r1=788237&r2=788238&view=diff
==============================================================================
--- hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml (original)
+++ hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml Thu Jun 25 01:48:18 2009
@@ -202,6 +202,21 @@
         atomically. Reads get all the data bytes associated with a znode and a
         write replaces all the data. Each node has an Access Control List
         (ACL) that restricts who can do what.</para>
+        
+        <para>ZooKeeper was not designed to be a general database or large
+        object store. Instead, it manages coordination data. This data can
+        come in the form of configuration, status information, rendezvous, etc.
+        A common property of the various forms of coordination data is that
+        they are relatively small: measured in kilobytes.
+        The ZooKeeper client and the server implementations have sanity checks
+        to ensure that znodes have less than 1M of data, but the data should
+        be much less than that on average. Operating on relatively large data
+        sizes will cause some operations to take much more time than others and
+        will affect the latencies of some operations because of the extra time
+        needed to move more data over the network and onto storage media. If
+        large data storage is needed, the usual pattern of dealing with such
+        data is to store it on a bulk storage system, such as NFS or HDFS, and
+        store pointers to the storage locations in ZooKeeper.</para> 
       </section>
 
       <section>