You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@directory.apache.org by bu...@apache.org on 2013/09/21 16:55:06 UTC

svn commit: r879335 - in /websites/staging/directory/trunk/content: ./ mavibot/user-guide/ mavibot/user-guide/images/

Author: buildbot
Date: Sat Sep 21 14:55:06 2013
New Revision: 879335

Log:
Staging update by buildbot for directory

Added:
    websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf2.graphml   (with props)
    websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf2.png   (with props)
Modified:
    websites/staging/directory/trunk/content/   (props changed)
    websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html
    websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf.graphml
    websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf.png

Propchange: websites/staging/directory/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Sat Sep 21 14:55:06 2013
@@ -1 +1 @@
-1525161
+1525247

Modified: websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html
==============================================================================
--- websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html (original)
+++ websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html Sat Sep 21 14:55:06 2013
@@ -214,10 +214,18 @@ it's something we might want to change l
 <p><img alt="BTrees" src="images/BTree.png" /></p>
 <p>Note that each <em>BTreeHeader</em> has at least one root page, even if it contains no data. In this schema, we show the root page just after the <em>BTree</em> it is associated to, but after a few updates, the root page may perfectly well be stored elswhere on the disk.</p>
 <h4 id="the-nodes-and-leaves">The Nodes and Leaves</h4>
-<p>Nodes and Leaves are logical <em>BTree</em> pages which are serialized on disk into one to many <em>PageIO</em>s. They have slightly different data structures, as <em>Node</em>s contains pointers to <em>Leaves</em>, and no data, while <em>Leaves</em> contains data. In any case, both contain the keys.</p>
+<p>Nodes and Leaves are logical <em>BTree</em> pages which are serialized on disk into one to many <em>PageIO</em>s. They have slightly different data structures, as <em>Node</em>s contains pointers to <em>Leaves</em>, and no data, while <em>Leaves</em> contains data. In any case, both contain the keys. The <em>Node</em> has one ore value than the <em>Leaf</em>, too.</p>
 <p>On disk, each <em>Node</em> and <em>Leaf</em> are stored in <em>PageIO</em>s, as we said. A <em>Node</em> will have pointers to some other logical pages, and on disk, those pointers will be offset of the first <em>PageIO</em> used to store the logical page it points to.</p>
 <p>Here is the <em>Node</em> and <em>Leaf</em> data structures once serialized :</p>
 <p><img alt="Node and Leaf" src="images/nodeLeaf.png" /></p>
+<p>Note that we store the size of the serialized data : this is necessary as we have to know how many <em>PageIO</em>s will be needed to store the logical page.</p>
+<p>The <em>rootPage</em> is just a <em>Node</em> or a <em>Leaf</em>.</p>
+<h4 id="potential-improvement">Potential improvement</h4>
+<p>We can get better performance by serializing the data differently. Instead of storing keys and values as byte arrays prefixed by their length, we could store an array of keys and values' offsets before the associated byte[]. Here is the resulting data structure, once serialized :</p>
+<p><img alt="Node and Leaf, improved" src="images/nodeLeaf2.png" /></p>
+<p>(The <em>Node</em> is not described, as it's basically the same data structure, but with one extra value).</p>
+<p>It does not need more space to serialize the data this way, as the offsets are ints, and in the previous version, those ints are used to store the length of the keys and values anyway.</p>
+<p>The gain is that we can have access to a given key and value without having to read all the previous keys and values. Also we can now read a leaf or a node without having to deserialize all the keys and values they contain.</p>
 
 
     <div class="nav">

Modified: websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf.graphml
==============================================================================
Binary files - no diff available.

Modified: websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf.png
==============================================================================
Binary files - no diff available.

Added: websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf2.graphml
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf2.graphml
------------------------------------------------------------------------------
    svn:mime-type = application/xml

Added: websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf2.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf2.png
------------------------------------------------------------------------------
    svn:mime-type = image/png