Posted to commits@directory.apache.org by bu...@apache.org on 2013/09/20 19:59:30 UTC

svn commit: r879190 - in /websites/staging/directory/trunk/content: ./ mavibot/user-guide/2.1-file-format.html mavibot/user-guide/images/BTree.png mavibot/user-guide/images/RMHeader.png mavibot/user-guide/images/btreeHeader.png

Author: buildbot
Date: Fri Sep 20 17:59:30 2013
New Revision: 879190

Log:
Staging update by buildbot for directory

Added:
    websites/staging/directory/trunk/content/mavibot/user-guide/images/BTree.png   (with props)
    websites/staging/directory/trunk/content/mavibot/user-guide/images/RMHeader.png   (with props)
    websites/staging/directory/trunk/content/mavibot/user-guide/images/btreeHeader.png   (with props)
Modified:
    websites/staging/directory/trunk/content/   (props changed)
    websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html

Propchange: websites/staging/directory/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Sep 20 17:59:30 2013
@@ -1 +1 @@
-1525066
+1525070

Modified: websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html
==============================================================================
--- websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html (original)
+++ websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html Fri Sep 20 17:59:30 2013
@@ -149,7 +149,7 @@
 <h1 id="21-file-format">2.1 - File format</h1>
 <p>When associated with a RecordManager, Mavibot stores all the BTrees in one single file, which is split into many physical pages, all having the same size. </p>
 <blockquote>
-<p><strong>Note</strong> page size
+<p><strong>Note</strong>
 Currently, we use one single size for all the pages, regardless of the data we store in them. The rationale is to
 get close to the OS page size (frequently 512 bytes or 4096 bytes). This is not necessarily the best choice, though, and
 it's something we might want to change later.</p>
@@ -157,11 +157,61 @@ it's something we might want to change l
 <h2 id="general-file-structure">General file structure</h2>
 <p>The file we use to store the data is a plain binary file, used to store all the BTrees. We can store many BTrees in one single file.</p>
 <p>This file is treated as a file system with fixed-size 'pages' (a page is an array of bytes). The page size is chosen arbitrarily and fixed when the RecordManager is created. We store all logical data in those physical pages, which in most cases requires spreading the logical data across many pages.</p>
-<h2 id="pageio">PageIO</h2>
+<h3 id="pageio">PageIO</h3>
 <p>Let's first introduce the <em>PageIO</em>, which is used to store the data on disk.</p>
 <p>A <em>PageIO</em> contains some raw data. As we have to map logical data that may be larger than a fixed-size physical <em>PageIO</em>, we potentially use more than one <em>PageIO</em> to store the data, and we link the <em>PageIO</em>s together.</p>
 <p>Each <em>PageIO</em> has an eight-byte pointer at the beginning, pointing to the next <em>PageIO</em> (or to nothing, if there is no more <em>PageIO</em> in the chain), plus an extra 4 bytes on the first <em>PageIO</em> to define the number of bytes stored in the chain of <em>PageIO</em>s. Here is the mapping between a logical page and some <em>PageIO</em>s:</p>
 <p><img alt="PageIO mapping" src="images/PageIOLogical.png" /></p>
+<p>All the <em>PageIO</em>s are laid out contiguously on disk, but the <em>PageIO</em>s used to store a given logical page may be located anywhere in the file; they don't have to be contiguous.</p>
+<p>Here is the structure of a <em>PageIO</em> on disk:</p>
+<ul>
+<li>next page offset (8 bytes) : the offset of the next <em>PageIO</em>, or -1L if no more <em>PageIO</em> is needed</li>
+<li>data size (4 bytes) : present on the first <em>PageIO</em> only, the size of the stored data across all the <em>PageIO</em>s used to store a page</li>
+<li>data (N bytes) : a block of data, whose size is min(PageSize - 8 - 4, data size) for the first <em>PageIO</em> (the page size minus the next page offset and data size fields), or min(PageSize - 8, remaining data size) for any other <em>PageIO</em></li>
+</ul>
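+<p>The layout above can be sketched in Java as follows. This is only an illustration under stated assumptions, not Mavibot's actual code: the class name <em>PageIoSketch</em>, its method names, and the use of an in-memory <em>ByteBuffer</em> (big-endian, the Java default) in place of the real file are all hypothetical.</p>

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the PageIO chain layout described above.
final class PageIoSketch {
    static final int LINK_SIZE = 8;    // next page offset field
    static final int SIZE_FIELD = 4;   // data size field, first PageIO only
    static final long NO_NEXT = -1L;   // marks the end of the chain

    /** Number of PageIOs needed to hold dataSize bytes of logical data. */
    static int pagesNeeded(int pageSize, int dataSize) {
        int firstCapacity = pageSize - LINK_SIZE - SIZE_FIELD;
        if (dataSize <= firstCapacity) {
            return 1;
        }
        int otherCapacity = pageSize - LINK_SIZE;
        int remaining = dataSize - firstCapacity;
        // Round up: a partially filled PageIO still counts as one page.
        return 1 + (remaining + otherCapacity - 1) / otherCapacity;
    }

    /** Follows the chain starting at firstOffset and reassembles the logical data. */
    static byte[] readChain(ByteBuffer file, long firstOffset, int pageSize) {
        int pos = (int) firstOffset;
        long next = file.getLong(pos);
        int total = file.getInt(pos + LINK_SIZE);
        byte[] out = new byte[total];
        int copied = 0;
        int dataStart = pos + LINK_SIZE + SIZE_FIELD;
        int capacity = pageSize - LINK_SIZE - SIZE_FIELD;
        while (true) {
            int n = Math.min(capacity, total - copied);
            for (int i = 0; i < n; i++) {
                out[copied + i] = file.get(dataStart + i);
            }
            copied += n;
            if (copied == total || next == NO_NEXT) {
                break;
            }
            // Jump to the next PageIO, which may be anywhere in the file.
            pos = (int) next;
            next = file.getLong(pos);
            dataStart = pos + LINK_SIZE;
            capacity = pageSize - LINK_SIZE;
        }
        return out;
    }
}
```

+<p>With a 32-byte page, for instance, the first <em>PageIO</em> holds at most 20 bytes of data and each following one at most 24, so 30 bytes of logical data need two <em>PageIO</em>s.</p>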
+<h2 id="logical-structure-mapping-on-disk">Logical structure mapping on disk</h2>
+<p>We will now describe how each logical structure is serialized on disk.</p>
+<h3 id="recordmanager-header">RecordManager header</h3>
+<p>We keep a few bytes at the beginning of the file to store some critical information about the RecordManager. Here is the stored information:</p>
+<ul>
+<li>The <em>PageIO</em> size (in bytes)</li>
+<li>The number of managed BTrees</li>
+<li>The offset of the first free page</li>
+<li>The offset of the last free page</li>
+</ul>
+<p>Here is a picture that shows the header content:</p>
+<p><img alt="RecordManager header" src="images/RMHeader.png" /></p>
+<p>We keep track of the free pages (a free page is a PageIO that is no longer used, for instance because its data has been deleted). This is done by linking the free PageIOs together and by pointing to the first free PageIO and to the last free PageIO of this list.</p>
+<blockquote>
+<p><strong>Note</strong> We might get rid of the last free page offset.</p>
+</blockquote>
+<p>At startup, of course, we have no free pages, and those pointers contain the -1 offset.</p>
+<p>This header is stored in a <em>PageIO</em>, at the very beginning of the file.</p>
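+<p>As a rough illustration of the header described above, the following Java sketch writes and reads the four fields with a <em>ByteBuffer</em>. The field widths (4 bytes for the <em>PageIO</em> size and for the number of BTrees, 8 bytes for each offset), the field order, and the class name <em>RecordManagerHeaderSketch</em> are assumptions made for the example; this section does not specify them, and this is not Mavibot's actual code.</p>

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the RecordManager header; field widths are assumed.
final class RecordManagerHeaderSketch {
    static final long NO_PAGE = -1L;        // "no free page" marker
    static final int SIZE = 4 + 4 + 8 + 8;  // assumed total header size in bytes

    final int pageSize;        // the PageIO size, in bytes
    final int nbBtrees;        // the number of managed BTrees
    final long firstFreePage;  // offset of the first free page
    final long lastFreePage;   // offset of the last free page

    RecordManagerHeaderSketch(int pageSize, int nbBtrees, long firstFreePage, long lastFreePage) {
        this.pageSize = pageSize;
        this.nbBtrees = nbBtrees;
        this.firstFreePage = firstFreePage;
        this.lastFreePage = lastFreePage;
    }

    /** Header of a freshly created file: no BTree and no free page yet. */
    static RecordManagerHeaderSketch atStartup(int pageSize) {
        return new RecordManagerHeaderSketch(pageSize, 0, NO_PAGE, NO_PAGE);
    }

    /** Writes the four fields at the buffer's current position. */
    void writeTo(ByteBuffer buf) {
        buf.putInt(pageSize).putInt(nbBtrees).putLong(firstFreePage).putLong(lastFreePage);
    }

    /** Reads the four fields back, in the same order. */
    static RecordManagerHeaderSketch readFrom(ByteBuffer buf) {
        int pageSize = buf.getInt();
        int nbBtrees = buf.getInt();
        long first = buf.getLong();
        long last = buf.getLong();
        return new RecordManagerHeaderSketch(pageSize, nbBtrees, first, last);
    }
}
```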
+<h3 id="the-recordmanager-structure">The RecordManager structure</h3>
+<p>The <em>RecordManager</em> manages <em>BTree</em>s, and we have to store them into <em>PageIO</em>s. How do we do that?</p>
+<p>Each <em>BTree</em> has a header that contains information about it and points to a <em>rootPage</em>, which is the current root (that is, the root for the latest revision). As a <em>RecordManager</em> can manage more than one <em>BTree</em>, we need a way to retrieve all the <em>BTree</em>s at startup: we use an internal link, so that each <em>BTree</em> points to the next <em>BTree</em>. At startup, we read the first <em>BTree</em>, which is stored in the second <em>PageIO</em> in the file (just after the RecordManager header), then we read the next <em>BTree</em> pointed to by the first one, and so on.</p>
+<h4 id="the-btree-header">The BTree header</h4>
+<p>Each <em>BTree</em> has to keep a number of pieces of information so that it can be used. They are:</p>
+<ul>
+<li>revision (8 bytes) : the current revision for this <em>BTree</em>. This value is updated after each modification in the <em>BTree</em>.</li>
+<li>nbElems (8 bytes) : the total number of elements we have in the <em>BTree</em>. This is also updated after each modification.</li>
+<li>rootPage offset (8 bytes) : the position in the file where the <em>rootPage</em> is stored</li>
+<li>nextBtree offset (8 bytes) : the position of the next <em>BTree</em> header in the file (or -1 if we don't have any other <em>BTree</em>)</li>
+<li>pageSize (4 bytes) : the number of elements we can store in a <em>Node</em> or a <em>Leaf</em>. It is not related in any way to the <em>PageIO</em> size.</li>
+<li>nameSize (4 bytes) : The <em>BTree</em> name size</li>
+<li>name (nameSize bytes) : the <em>BTree</em> name</li>
+<li>keySerializerSize (4 bytes) : The size of the java <em>FQCN</em> for the key serializer</li>
+<li>keySerializer (keySerializerSize bytes) : The java <em>FQCN</em> for the key serializer</li>
+<li>valueSerializerSize (4 bytes) : The size of the java <em>FQCN</em> for the value serializer</li>
+<li>valueSerializer (valueSerializerSize bytes): The java <em>FQCN</em> for the value serializer</li>
+<li>dupsAllowed (1 byte): tells whether the <em>BTree</em> can have duplicate values.</li>
+</ul>
+<p>As we can see, this header has a variable length, and if one of the names is long, we may need more than one <em>PageIO</em> to store it.</p>
+<p>Here is a diagram which presents this data structure on disk:</p>
+<p><img alt="BTreeHeader header" src="images/btreeHeader.png" /></p>
+<p>Note that a <em>BTree</em> header can be stored on one or many <em>PageIO</em>s, depending on its size.</p>
+<p>All in all, when we have more than one <em>BTree</em> stored in the file, the part of the file which stores the <em>BTree</em> headers will look like this:</p>
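+<p>To make the variable length concrete, here is a minimal Java sketch that serializes the fields in the order listed above. The class name <em>BTreeHeaderSketch</em> is hypothetical, and encoding the name and the serializer class names as UTF-8 is an assumption (the encoding is not stated in this section); this is not Mavibot's actual code.</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the BTree header serialization; UTF-8 strings are assumed.
final class BTreeHeaderSketch {
    static byte[] serialize(long revision, long nbElems, long rootPageOffset,
                            long nextBtreeOffset, int pageSize, String name,
                            String keySerializer, String valueSerializer,
                            boolean dupsAllowed) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        byte[] keyBytes = keySerializer.getBytes(StandardCharsets.UTF_8);
        byte[] valueBytes = valueSerializer.getBytes(StandardCharsets.UTF_8);

        // Fixed part: 4 longs, pageSize, 3 length fields, 1 flag = 49 bytes.
        int size = 4 * 8 + 4 + 3 * 4 + 1
                + nameBytes.length + keyBytes.length + valueBytes.length;

        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putLong(revision);
        buf.putLong(nbElems);
        buf.putLong(rootPageOffset);
        buf.putLong(nextBtreeOffset);   // -1 if there is no other BTree
        buf.putInt(pageSize);           // elements per Node or Leaf, not the PageIO size
        buf.putInt(nameBytes.length);
        buf.put(nameBytes);
        buf.putInt(keyBytes.length);
        buf.put(keyBytes);
        buf.putInt(valueBytes.length);
        buf.put(valueBytes);
        buf.put((byte) (dupsAllowed ? 1 : 0));
        return buf.array();
    }
}
```

+<p>Under these assumptions the header takes 49 bytes plus the lengths of the three strings, so long serializer class names can push it past what a single small <em>PageIO</em> can hold.</p>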
+<p><img alt="BTrees" src="images/BTree.png" /></p>
 
 
     <div class="nav">

Added: websites/staging/directory/trunk/content/mavibot/user-guide/images/BTree.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/directory/trunk/content/mavibot/user-guide/images/BTree.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: websites/staging/directory/trunk/content/mavibot/user-guide/images/RMHeader.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/directory/trunk/content/mavibot/user-guide/images/RMHeader.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: websites/staging/directory/trunk/content/mavibot/user-guide/images/btreeHeader.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/directory/trunk/content/mavibot/user-guide/images/btreeHeader.png
------------------------------------------------------------------------------
    svn:mime-type = image/png