Posted to commits@directory.apache.org by bu...@apache.org on 2014/02/13 15:33:31 UTC

svn commit: r897765 - in /websites/staging/directory/trunk/content: ./ mavibot/user-guide/7.2-physical-storage.html mavibot/user-guide/images/btreeHeader.png

Author: buildbot
Date: Thu Feb 13 14:33:30 2014
New Revision: 897765

Log:
Staging update by buildbot for directory

Modified:
    websites/staging/directory/trunk/content/   (props changed)
    websites/staging/directory/trunk/content/mavibot/user-guide/7.2-physical-storage.html
    websites/staging/directory/trunk/content/mavibot/user-guide/images/btreeHeader.png

Propchange: websites/staging/directory/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Thu Feb 13 14:33:30 2014
@@ -1 +1 @@
-1567195
+1567940

Modified: websites/staging/directory/trunk/content/mavibot/user-guide/7.2-physical-storage.html
==============================================================================
--- websites/staging/directory/trunk/content/mavibot/user-guide/7.2-physical-storage.html (original)
+++ websites/staging/directory/trunk/content/mavibot/user-guide/7.2-physical-storage.html Thu Feb 13 14:33:30 2014
@@ -180,28 +180,24 @@ it's something we might want to change l
 <p>We keep a few bytes at the beginning of the file to store some critical information about the RecordManager. Here is the list of what is stored:</p>
 <ul>
 <li>The <em>PageIO</em> size (in bytes)</li>
-<li>The number of managed BTrees</li>
+<li>The number of managed B-Trees</li>
 <li>The offset of the first free page</li>
-<li>The offset of the last free page</li>
+<li>The offset of the current B-tree of B-trees</li>
+<li>The offset of the previous B-tree of B-trees, if an update operation is in progress</li>
+<li>The offset of the current CopiedPages B-tree</li>
+<li>The offset of the previous CopiedPages B-tree, if an update operation is in progress</li>
 </ul>
-<p>Here is a picture that shows the header content :</p>
+<p>Here is a picture that shows the header content in two different cases (while an update operation is still in progress, and once it has completed):</p>
 <p><img alt="RecordManager header" src="images/RMHeader.png" /></p>
 <p>We keep track of the free pages (a free page is a PageIO that is no longer used, for instance because the data have been deleted). This is done by keeping a link between each PageIO and by pointing to the first free PageIO and to the last free PageIO of this list.</p>
-<blockquote>
-<p><strong>Note</strong> We might get rid of the last free page offset.</p>
-</blockquote>
 <p>At startup, of course, we have no free pages, and those pointers contain the -1 offset.</p>
 <p>This header is stored in a <em>PageIO</em>, at the very beginning of the file.</p>
 <h3 id="the-recordmanager-structure">The RecordManager structure</h3>
 <p>The <em>RecordManager</em> manages <em>BTree</em>s, and we have to store them into <em>PageIO</em>s. How do we do that ?</p>
 <p>All the <em>BTree</em>s have a header that contains information about them and points to a <em>rootPage</em>, which is the current root (that is, the root for the latest revision). As a <em>RecordManager</em> can manage more than one <em>BTree</em>, we have to find a way to retrieve all the <em>BTree</em>s at startup: we use an internal link, so that a <em>BTree</em> points to the next <em>BTree</em>. At startup, we read the first <em>BTree</em>, which is stored in the second <em>PageIO</em> in the file (so just after the RecordManager header), then we read the next <em>BTree</em> pointed to by the first <em>BTree</em>, and so on.</p>
-<h4 id="the-btree-header">The BTree header</h4>
-<p>Each <em>BTree</em> has to keep many informations so that it can be used. Those informations are :</p>
+<h4 id="the-b-tree-info">The B-tree info</h4>
+<p>Each <em>B-tree</em> has some information that never changes over time. Here is the list:</p>
 <ul>
-<li>revision (8 bytes) : the current revision for this <em>BTree</em>. This value is updated after each modification in the <em>BTree</em>.</li>
-<li>nbElems (8 bytes) : the total number of elements we have in the <em>BTree</em>. This is updated after each modification either.</li>
-<li>rootPage offset (8 bytes) : the position in the file where the <em>rootPage</em> is stored</li>
-<li>nextBtree offset (8 bytes) : the position of the next <em>BTree</em> header in the file (or -1 if we don't have any other <em>BTree</em>)</li>
 <li>pageSize (4 bytes) : the number of elements we can store in a <em>Node</em> or a <em>Leaf</em>. It is unrelated to the <em>PageIO</em> size.</li>
 <li>nameSize (4 bytes) : The <em>BTree</em> name size</li>
 <li>name (nameSize bytes) : the <em>BTree</em> name</li>
@@ -211,13 +207,23 @@ it's something we might want to change l
 <li>valueSerializer (valueSerializerSize bytes): The java <em>FQCN</em> for the value serializer</li>
 <li>dupsAllowed (1 byte): tells if the <em>BTree</em> can have duplicated values.</li>
 </ul>
-<p>As we can see, thi sheader can have various length, and if one one the names is long, we may need more than one PageIOs to store it.</p>
-<p>Here is a diagram which present this data structure on disk :</p>
-<p><img alt="BTreeHeader header" src="images/btreeHeader.png" /></p>
-<p>Note that a <em>BTree</em> header can be stored on one or many <em>IOPage</em>s, depending on its size.</p>
+<p>As we can see, this header can vary in length, and if one of the names is long, we may need more than one <em>PageIO</em> to store it.</p>
+<p>Note that a <em>B-tree</em> info can be stored on one or many <em>PageIO</em>s, depending on its size.</p>
+<h4 id="the-btree-header">The BTree header</h4>
+<p>We may have many versions of the <em>B-tree</em> header, one per revision (unless we overwrite it when we don't need to keep many revisions on disk).</p>
+<p>Each <em>BTree</em> has to keep several pieces of information, which are:</p>
+<ul>
+<li>revision (8 bytes) : the current revision for this <em>BTree</em>. This value is updated after each modification in the <em>BTree</em>.</li>
+<li>nbElems (8 bytes) : the total number of elements we have in the <em>BTree</em>. This is also updated after each modification.</li>
+<li>rootPage offset (8 bytes) : the position in the file where the <em>rootPage</em> is stored</li>
+<li>btreeInfo offset (8 bytes) : the position of the <em>B-tree</em> info page in the file</li>
+</ul>
+<p>Here is a diagram which presents the <em>B-tree header</em> and the <em>B-tree info</em> data structures on disk:</p>
+<p><img alt="BTreeHeader" src="images/btreeHeader.png" /></p>
+<p>Note that they can be stored on one or many <em>PageIO</em>s, depending on their size.</p>
 <p>All in all, when we have more than one <em>BTree</em> stored in the file, the content of the file which stores the <em>BTree</em> headers will look like this one :</p>
 <p><img alt="BTrees" src="images/BTree.png" /></p>
-<p>Note that each <em>BTreeHeader</em> has at least one root page, even if it contains no data. In this schema, we show the root page just after the <em>BTree</em> it is associated to, but after a few updates, the root page may perfectly well be stored elswhere on the disk.</p>
+<p>Note that each <em>B-tree Header</em> has one root page, even if it contains no data. In this schema, we show the root page just after the <em>BTree</em> it is associated to, but after a few updates, the root page may perfectly well be stored elsewhere on the disk.</p>
 <h4 id="the-nodes-and-leaves">The Nodes and Leaves</h4>
 <p>Nodes and Leaves are logical <em>BTree</em> pages which are serialized on disk into one or more <em>PageIO</em>s. They have slightly different data structures, as <em>Node</em>s contain pointers to <em>Leaves</em>, and no data, while <em>Leaves</em> contain data. In any case, both contain the keys. The <em>Node</em> has one more value than the <em>Leaf</em>, too.</p>
 <p>On disk, each <em>Node</em> and <em>Leaf</em> is stored in <em>PageIO</em>s, as we said. A <em>Node</em> will have pointers to some other logical pages, and on disk, those pointers will be the offsets of the first <em>PageIO</em> used to store the logical page they point to.</p>
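
Illustration of the B-tree header layout described in the diff above: the four fixed-size fields (revision, nbElems, rootPage offset, btreeInfo offset, 8 bytes each) can be sketched in Java as shown here. This is only a minimal sketch and not Mavibot's actual code. The class name BTreeHeaderSketch, the field order on disk, and the big-endian encoding (the java.nio.ByteBuffer default) are assumptions made for the example.

```java
import java.nio.ByteBuffer;

// Hypothetical holder for the four fixed-size B-tree header fields.
// Name, field order and byte order are assumptions, not Mavibot's API.
final class BTreeHeaderSketch {
    // Four 8-byte fields, as listed in the user guide: 32 bytes in total.
    static final int HEADER_SIZE = 4 * Long.BYTES;

    final long revision;        // current revision of the B-tree
    final long nbElems;         // total number of elements in the B-tree
    final long rootPageOffset;  // file position of the root page
    final long btreeInfoOffset; // file position of the B-tree info page

    BTreeHeaderSketch(long revision, long nbElems,
                      long rootPageOffset, long btreeInfoOffset) {
        this.revision = revision;
        this.nbElems = nbElems;
        this.rootPageOffset = rootPageOffset;
        this.btreeInfoOffset = btreeInfoOffset;
    }

    // Writes the four fields one after the other into a fresh buffer,
    // then flips it so that it is ready to be read or written to a file.
    ByteBuffer serialize() {
        ByteBuffer buffer = ByteBuffer.allocate(HEADER_SIZE);
        buffer.putLong(revision);
        buffer.putLong(nbElems);
        buffer.putLong(rootPageOffset);
        buffer.putLong(btreeInfoOffset);
        buffer.flip();
        return buffer;
    }

    // Reads the four fields back in the same order they were written.
    static BTreeHeaderSketch deserialize(ByteBuffer buffer) {
        long revision = buffer.getLong();
        long nbElems = buffer.getLong();
        long rootPageOffset = buffer.getLong();
        long btreeInfoOffset = buffer.getLong();
        return new BTreeHeaderSketch(revision, nbElems, rootPageOffset, btreeInfoOffset);
    }
}
```

Because every field has a fixed size, a header of this shape always fits in 32 bytes, unlike the B-tree info, whose length depends on the name and serializer class names.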

Modified: websites/staging/directory/trunk/content/mavibot/user-guide/images/btreeHeader.png
==============================================================================
Binary files - no diff available.