Posted to commits@jackrabbit.apache.org by th...@apache.org on 2018/08/21 10:31:38 UTC

svn commit: r1838538 [5/35] - in /jackrabbit/site/live/oak/docs: ./ architecture/ coldstandby/ features/ nodestore/ nodestore/document/ nodestore/segment/ oak-mongo-js/ oak-mongo-js/fonts/ oak-mongo-js/scripts/ oak-mongo-js/scripts/prettify/ oak-mongo-...

Modified: jackrabbit/site/live/oak/docs/nodestore/segmentmk.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/nodestore/segmentmk.html?rev=1838538&r1=1838537&r2=1838538&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/nodestore/segmentmk.html (original)
+++ jackrabbit/site/live/oak/docs/nodestore/segmentmk.html Tue Aug 21 10:31:37 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-08-10 
+ | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-02-21 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180810" />
+    <meta name="Date-Revision-yyyymmdd" content="20180221" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Segment Storage Design Overview</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.6.min.css" />
@@ -52,7 +52,6 @@
         <a href="#" class="dropdown-toggle" data-toggle="dropdown">Main APIs <b class="caret"></b></a>
         <ul class="dropdown-menu">
             <li><a href="http://www.day.com/specs/jcr/2.0/index.html" title="JCR API">JCR API</a></li>
-            <li><a href="https://jackrabbit.apache.org/jcr/jcr-api.html" title="Jackrabbit API">Jackrabbit API</a></li>
             <li><a href="../oak_api/overview.html" title="Oak API">Oak API</a></li>
         </ul>
       </li>
@@ -137,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-08-10<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-02-21<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -156,14 +155,12 @@
     <li><a href="../architecture/nodestate.html" title="The Node State Model"><span class="none"></span>The Node State Model</a>  </li>
           <li class="nav-header">Main APIs</li>
     <li><a href="http://www.day.com/specs/jcr/2.0/index.html" class="externalLink" title="JCR API"><span class="none"></span>JCR API</a>  </li>
-    <li><a href="https://jackrabbit.apache.org/jcr/jcr-api.html" class="externalLink" title="Jackrabbit API"><span class="none"></span>Jackrabbit API</a>  </li>
     <li><a href="../oak_api/overview.html" title="Oak API"><span class="none"></span>Oak API</a>  </li>
           <li class="nav-header">Features and Plugins</li>
     <li><a href="../nodestore/overview.html" title="Node Storage"><span class="icon-chevron-down"></span>Node Storage</a>
       <ul class="nav nav-list">
     <li><a href="../nodestore/documentmk.html" title="Document NodeStore"><span class="icon-chevron-down"></span>Document NodeStore</a>
       <ul class="nav nav-list">
-    <li><a href="../nodestore/document/mongo-document-store.html" title="MongoDB DocumentStore"><span class="none"></span>MongoDB DocumentStore</a>  </li>
     <li><a href="../nodestore/document/node-bundling.html" title="Node Bundling"><span class="none"></span>Node Bundling</a>  </li>
     <li><a href="../nodestore/document/secondary-store.html" title="Secondary Store"><span class="none"></span>Secondary Store</a>  </li>
     <li><a href="../nodestore/persistent-cache.html" title="Persistent Cache"><span class="none"></span>Persistent Cache</a>  </li>
@@ -242,33 +239,31 @@
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-  -->
-<h1>Segment Storage Design Overview</h1>
+  --><h1>Segment Storage Design Overview</h1>
 <p><i>NOTE:</i> The information on this page applies to an older version of the TarMK and is mainly of historical interest. For the documentation of the current versions see <a href="segment/overview.html">Oak Segment Tar</a>.</p>
 <p>The SegmentNodeStore is an Oak storage backend that stores content as various types of <i>records</i> within larger <i>segments</i>. One or more <i>journals</i> are used to track the latest state of the repository. In the Tar implementation only one &#x201c;root&#x201d; journal is used.</p>
 <p>The SegmentNodeStore was designed from the ground up based on the following key principles:</p>
-<ul>
 
+<ul>
+  
 <li>
-
-<p>Immutability. Segments are immutable, which makes it easy to cache frequently accessed segments. This also makes it less likely for programming or system errors to cause repository inconsistencies, and simplifies features like backups or master-slave clustering.</p>
-</li>
+<p>Immutability. Segments are immutable, which makes it easy to cache frequently accessed segments. This also makes it less likely for programming or system errors to cause repository inconsistencies, and simplifies features like backups or master-slave clustering.</p></li>
+  
 <li>
-
-<p>Compactness. The formatting of records is optimized for size to reduce IO costs and to fit as much content in caches as possible. A node stored in SegmentNodeStore typically consumes only a fraction of the size it would as a bundle in Jackrabbit Classic.</p>
-</li>
+<p>Compactness. The formatting of records is optimized for size to reduce IO costs and to fit as much content in caches as possible. A node stored in SegmentNodeStore typically consumes only a fraction of the size it would as a bundle in Jackrabbit Classic.</p></li>
+  
 <li>
-
-<p>Locality. Segments are written so that related records, like a node and its immediate children, usually end up stored in the same segment. This makes tree traversals very fast and avoids most cache misses for typical clients that access more than one related node per session.</p>
-</li>
+<p>Locality. Segments are written so that related records, like a node and its immediate children, usually end up stored in the same segment. This makes tree traversals very fast and avoids most cache misses for typical clients that access more than one related node per session.</p></li>
 </ul>
 <p>This document describes the overall design of the SegmentNodeStore. See the source code and javadocs in <tt>org.apache.jackrabbit.oak.plugins.segment</tt> for full details.</p>
 <h1>Segments</h1>
 <p>The content tree and all its revisions are stored in a collection of immutable segments. Each segment is identified by a UUID and typically contains a continuous subset of the content tree, for example a node with its properties and closest child nodes. Some segments might also be used to store commonly occurring property values or other shared data. Segments can be up to 256KiB in size.</p>
 <p>Segments come in two types: data and bulk segments. The type of a segment is encoded in its UUID and can thus be determined before reading the segment. The following bit patterns are used (each <tt>x</tt> represents four random bits):</p>
-<ul>
 
+<ul>
+  
 <li><tt>xxxxxxxx-xxxx-4xxx-axxx-xxxxxxxxxxxx</tt> data segment UUID</li>
+  
 <li><tt>xxxxxxxx-xxxx-4xxx-bxxx-xxxxxxxxxxxx</tt> bulk segment UUID</li>
 </ul>
 <p>(This encoding makes segment UUIDs appear as syntactically valid version 4 random UUIDs specified in RFC 4122.)</p>
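<p>As a minimal sketch of the UUID encoding described above (the class and method names here are hypothetical and are not taken from the Oak sources), the segment type can be read from the first hex digit of the fourth UUID group, which is the top four bits of the least significant 64 bits:</p>

```java
import java.util.UUID;

// Hypothetical helper, for illustration only; not part of the Oak API.
final class SegmentIdType {

    // The "a" or "b" digit that starts the fourth UUID group occupies the
    // top four bits of the least significant 64 bits of the UUID.
    static boolean isDataSegment(UUID id) {
        return (id.getLeastSignificantBits() >>> 60) == 0xAL;
    }

    static boolean isBulkSegment(UUID id) {
        return (id.getLeastSignificantBits() >>> 60) == 0xBL;
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("01234567-89ab-4cde-a123-456789abcdef");
        System.out.println("data segment: " + isDataSegment(id));
        System.out.println("bulk segment: " + isBulkSegment(id));
    }
}
```

<p>Because only the UUID is inspected, a check like this works before any segment data has been read.</p>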
@@ -276,27 +271,22 @@
 <h2><a name="Bulk_segments"></a>Bulk segments</h2>
 <p>Bulk segments contain raw binary data, interpreted simply as a sequence of block records with no headers or other extra metadata:</p>
 
-<div>
-<div>
-<pre class="source">[block 1] [block 2] ... [block N]
+<div class="source">
+<div class="source"><pre class="prettyprint">[block 1] [block 2] ... [block N]
 </pre></div></div>
-
 <p>A bulk segment whose length is <tt>n</tt> bytes consists of <tt>n div 4096</tt> block records of 4KiB each, possibly followed by a block record of <tt>n mod 4096</tt> bytes if any bytes remain in the segment. The structure of a bulk segment can thus be determined based only on the segment length.</p></div>
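<p>The block arithmetic for a bulk segment can be sketched as follows. The class and method names are hypothetical; only the 4096-byte block size and the div/mod rule come from the description above:</p>

```java
// Hypothetical helper, for illustration only; not part of the Oak API.
final class BulkSegmentLayout {

    static final int BLOCK_SIZE = 4096;

    // Number of full 4KiB block records in a bulk segment of n bytes.
    static int fullBlocks(int n) {
        return n / BLOCK_SIZE;
    }

    // Length of the trailing partial block record, or 0 if there is none.
    static int trailingBlockLength(int n) {
        return n % BLOCK_SIZE;
    }

    // Total number of block records, counting the partial one if present.
    static int blockCount(int n) {
        return fullBlocks(n) + (trailingBlockLength(n) > 0 ? 1 : 0);
    }

    public static void main(String[] args) {
        int n = 10000;
        System.out.println(fullBlocks(n) + " full blocks, "
                + trailingBlockLength(n) + " trailing bytes");
    }
}
```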
 <div class="section">
 <h2><a name="Data_segments"></a>Data segments</h2>
 <p>A data segment can contain any types of records, may refer to content in other segments, and comes with a segment header that guides the parsing of the segment. The overall structure of a data segment is:</p>
 
-<div>
-<div>
-<pre class="source">[segment header] [record 1] [record 2] ... [record N]
+<div class="source">
+<div class="source"><pre class="prettyprint">[segment header] [record 1] [record 2] ... [record N]
 </pre></div></div>
-
 <p>The header and each record are zero-padded to make their size a multiple of four bytes and to align the next record at a four-byte boundary.</p>
 <p>The segment header consists of the following fields:</p>
 
-<div>
-<div>
-<pre class="source">+--------+--------+--------+--------+--------+--------+--------+--------+
+<div class="source">
+<div class="source"><pre class="prettyprint">+--------+--------+--------+--------+--------+--------+--------+--------+
 | magic bytes: &quot;0aK&quot; ASCII |version |reserved|idcount |rootcount        |
 +--------+--------+--------+--------+--------+--------+--------+--------+
 | blobrefcount    | reserved (set to 0)                                 |
@@ -317,7 +307,6 @@
 |                                            | padding (set to 0)       |
 +--------+--------+--------+--------+--------+--------+--------+--------+
 </pre></div></div>
-
 <p>The first three bytes of a segment always contain the ASCII string &#x201c;0aK&#x201d;, which is intended to make the binary segment data format easily detectable. The next byte indicates the version of segment format, and is set to 10 for all segments that follow the format described here.</p>
 <p>The <tt>idcount</tt> byte indicates how many other segments are referenced by records within this segment. The identifiers of those segments are listed starting at offset 16 of the segment header. This lookup table of up to 255 segment identifiers is used to optimize garbage collection and to avoid having to repeat the 16-byte UUIDs whenever references to records in other segments are made.</p>
 <p>The 16-bit <tt>rootcount</tt> field indicates the number of root record references that follow after the segment identifier lookup table. The root record references are a debugging and recovery aid that is not needed during normal operation. They identify the types and locations of those records within this segment that are not accessible by following references in other records within this segment. <s>These root references give enough context for parsing all records within a segment without any external information.</s> See <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-2498">OAK-2498</a>.</p>
@@ -334,13 +323,11 @@
 <p>The content inside a segment is divided in records of different types: blocks, lists, maps, values, templates and nodes. These record types and their internal structures are described in subsections below.</p>
 <p>Each record is uniquely addressable by its location within the segment and the UUID of that segment. A single segment can contain up to 256KiB of data and references to up to 256 segments (including itself). Since all records are aligned at four-byte boundaries, 16 bits are needed to address all possible record locations within a segment. Thus only three bytes are needed to store a reference to any record in any segment (1 byte to identify the segment, 2 bytes for the record offset):</p>
 
-<div>
-<div>
-<pre class="source">+--------+--------+--------+
+<div class="source">
+<div class="source"><pre class="prettyprint">+--------+--------+--------+
 | refid  | offset          |
 +--------+--------+--------+
 </pre></div></div>
-
 <p>The <tt>refid</tt> field is the number of the referenced segment identifier, with refid zero interpreted as a reference to the current segment and refids 1-255 as references to the segment identifiers stored in the lookup table in the segment header.</p></div>
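<p>A three-byte record reference could be decoded roughly as shown here. This is a sketch under two assumptions that the text above does not state explicitly: the 16-bit offset is stored in big-endian byte order, and it holds the byte offset divided by four (which is what lets 16 bits cover a 256KiB segment). The class name is hypothetical.</p>

```java
// Hypothetical helper, for illustration only; not part of the Oak API.
final class RecordRef {

    final int refid;   // 0 = current segment, 1-255 = lookup table entry
    final int offset;  // byte offset of the record within the segment

    RecordRef(int refid, int offset) {
        this.refid = refid;
        this.offset = offset;
    }

    // Assumption: big-endian 16-bit field holding (byte offset / 4).
    static RecordRef decode(byte b0, byte b1, byte b2) {
        int refid = b0 & 0xFF;
        int stored = ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        return new RecordRef(refid, stored << 2);
    }

    public static void main(String[] args) {
        RecordRef ref = decode((byte) 0x02, (byte) 0x00, (byte) 0x10);
        System.out.println("refid=" + ref.refid + " offset=" + ref.offset);
    }
}
```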
 <div class="section">
 <h2><a name="Block_records"></a>Block records</h2>
@@ -350,9 +337,8 @@
 <p>List records are used as components of more complex record types. Lists are used for storing arrays of values for multi-valued properties and sequences of blocks for large binary values.</p>
 <p>The list of references is split into pieces of up to 255 references each and those pieces are stored as records. If there are more than 255 pieces like that, then a higher-level list is created of references to those pieces. This process is continued until the resulting list has fewer than 255 entries.</p>
 
-<div>
-<div>
-<pre class="source">+--------+--------+--------+-----+
+<div class="source">
+<div class="source"><pre class="prettyprint">+--------+--------+--------+-----+
 | sub-list ID 1            | ... |
 +--------+--------+--------+-----+
   |
@@ -361,7 +347,6 @@
 | record ID 1              | ... | record ID 255            |
 +--------+--------+--------+-----+--------+--------+--------+
 </pre></div></div>
-
 <p>The result is a hierarchically stored immutable list where each element can be accessed in O(log N) time and the size overhead of updating or appending list elements (and thus creating a new immutable list) is also O(log N).</p></div>
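<p>To illustrate the O(log N) claim, the number of list levels needed for a given element count can be estimated with a sketch like the following. It assumes a fan-out of exactly 255 references per list record, as described above; the class and method names are hypothetical.</p>

```java
// Hypothetical helper, for illustration only; not part of the Oak API.
final class ListLevels {

    static final int FAN_OUT = 255;

    // Number of levels in the hierarchical list for n >= 1 elements:
    // one level while everything fits in a single record of up to 255
    // references, plus one level each time the pieces must be grouped again.
    static int levels(long n) {
        int levels = 1;
        while (n > FAN_OUT) {
            n = (n - 1) / FAN_OUT + 1; // number of pieces, rounded up
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        System.out.println(levels(1_000_000L) + " levels for 1,000,000 elements");
    }
}
```

<p>Reading one element follows one reference per level, which is where the logarithmic access cost comes from.</p>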
 <div class="section">
 <h2><a name="Map_records"></a>Map records</h2>
@@ -374,32 +359,32 @@
 <p>Value records are byte arrays used for storing all names and values of the content tree. Since item names can be thought of as name values and since all JCR and Oak values can be expressed in binary form (strings encoded in UTF-8), it is easiest to simply use that form for storing all values. The size overhead of such a form for small value types like booleans or dates is amortized by the facts that those types are used only for a minority of values in typical content trees and that repeating copies of a value can be stored just once.</p>
 <p>There are four types of value records: small, medium, long and external. The small- and medium-sized values are stored in inline form, prepended by one or two bytes that indicate the length of the value. Long values of up to two exabytes (2^61) are stored as a list of block records. Finally an external value record contains the length of the value and a string reference (up to 4kB in length) to some external storage location.</p>
 <p>The type of a value record is encoded in the high-order bits of the first byte of the record. These bit patterns are:</p>
-<ul>
 
+<ul>
+  
 <li><tt>0xxxxxxx</tt>: small value, length (0 - 127 bytes) encoded in 7 bits</li>
+  
 <li><tt>10xxxxxx</tt>: medium value, length (128 - 16511 bytes) encoded in 6 + 8 bits</li>
+  
 <li><tt>110xxxxx</tt>: long value, length (up to 2^61 bytes) encoded in 5 + 7*8 bits</li>
+  
 <li><tt>1110xxxx</tt>: external value, reference string length encoded in 4 + 8 bits</li>
 </ul></div>
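<p>The bit patterns above can be checked with a small decoder such as this one. It is a sketch, not the Oak implementation: the names are hypothetical, the two-byte medium length is assumed to be big-endian, and the offset of 128 is inferred from the stated 128 - 16511 range (14 bits hold 0 - 16383).</p>

```java
// Hypothetical helper, for illustration only; not part of the Oak API.
final class ValueRecordHeader {

    enum Type { SMALL, MEDIUM, LONG, EXTERNAL }

    // Classifies a value record by the high-order bits of its first byte.
    static Type typeOf(int firstByte) {
        int b = firstByte & 0xFF;
        if ((b & 0x80) == 0x00) return Type.SMALL;    // 0xxxxxxx
        if ((b & 0xC0) == 0x80) return Type.MEDIUM;   // 10xxxxxx
        if ((b & 0xE0) == 0xC0) return Type.LONG;     // 110xxxxx
        if ((b & 0xF0) == 0xE0) return Type.EXTERNAL; // 1110xxxx
        throw new IllegalArgumentException("unknown value record type: " + b);
    }

    // Small values: length is the low 7 bits of the first byte.
    static int smallLength(int b0) {
        return b0 & 0x7F;
    }

    // Medium values: 6 + 8 bits, shifted by 128 (assumed big-endian).
    static int mediumLength(int b0, int b1) {
        return (((b0 & 0x3F) << 8) | (b1 & 0xFF)) + 128;
    }

    public static void main(String[] args) {
        System.out.println(typeOf(0x85) + " length " + mediumLength(0x85, 0x10));
    }
}
```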
 <div class="section">
 <h2><a name="Template_records"></a>Template records</h2>
 <p>A template record describes the common structure of a family of related nodes. Since the structures of most nodes in a typical content tree fall into a small set of common templates, it makes sense to store such templates separately instead of repeating that information separately for each node. For example, the property names and types as well as child node names of all nt:file nodes are typically the same. The presence of mixins and different subtypes increases the number of different templates, but they&#x2019;re typically still far fewer than nodes in the repository.</p>
 <p>A template record consists of a set of up to N (exact size TBD, N ~ 256) property name and type pairs. Additionally, since nodes that are empty or contain just a single child node are most common, a template record also records whether the node has zero, one or many child nodes. In case of a single child node, the template also contains the name of that node. For example, the template for typical mix:versionable nt:file nodes would be (using CND-like notation):</p>
-<ul>
-
-<li>jcr:primaryType (NAME)
-<ul>
 
-<li>jcr:mixinTypes (NAME) multiple</li>
-<li>jcr:created (DATE)</li>
-<li>jcr:uuid (STRING)</li>
-<li>jcr:versionHistory (REFERENCE)</li>
-<li>jcr:predecessors (REFERENCE) multiple</li>
-<li>jcr:baseVersion (REFERENCE)</li>
-<li>jcr:content</li>
-</ul>
-</li>
-</ul>
+<div class="source">
+<div class="source"><pre class="prettyprint">- jcr:primaryType (NAME)
+- jcr:mixinTypes (NAME) multiple
+- jcr:created (DATE)
+- jcr:uuid (STRING)
+- jcr:versionHistory (REFERENCE)
+- jcr:predecessors (REFERENCE) multiple
+- jcr:baseVersion (REFERENCE)
++ jcr:content
+</pre></div></div>
 <p>The names used in a template are stored as separate value records and included by reference. This way multiple templates that for example all contain the &#x201c;jcr:primaryType&#x201d; property name don&#x2019;t need to repeatedly store it.</p></div>
 <div class="section">
 <h2><a name="Node_records"></a>Node records</h2>
@@ -408,16 +393,25 @@
 <p>A node that contains more than N properties or M child nodes (exact size TBD, M ~ 1k) is stored differently, using map records for the properties and child nodes. This way a node can become arbitrarily large and still remain reasonably efficient to access and modify. The main downside of this alternative storage layout is that the ordering of child nodes is lost.</p>
 <h1>Tar</h1>
 <p>TODO:</p>
-<ul>
 
+<ul>
+  
 <li>tar entry checksums</li>
+  
 <li>graph and index entries</li>
+  
 <li>recovery mechanism</li>
+  
 <li>tar generations / cleanup</li>
+  
 <li>journal.log</li>
+  
 <li>compaction</li>
+  
 <li>cleanup</li>
+  
 <li>backup</li>
+  
 <li>slow startup / journal.log</li>
 </ul></div>
         </div>

Added: jackrabbit/site/live/oak/docs/oak-mongo-js/fonts/OpenSans-Bold-webfont.eot
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/oak-mongo-js/fonts/OpenSans-Bold-webfont.eot?rev=1838538&view=auto
==============================================================================
Binary file - no diff available.

Propchange: jackrabbit/site/live/oak/docs/oak-mongo-js/fonts/OpenSans-Bold-webfont.eot
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream