Posted to oak-commits@jackrabbit.apache.org by ju...@apache.org on 2014/02/24 22:11:26 UTC

svn commit: r1571441 - /jackrabbit/oak/trunk/oak-doc/src/site/markdown/segmentmk.md

Author: jukka
Date: Mon Feb 24 21:11:26 2014
New Revision: 1571441

URL: http://svn.apache.org/r1571441
Log:
OAK-593: Segment-based MK

More detailed documentation of the overall segment format, WIP

Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/segmentmk.md

Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/segmentmk.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/segmentmk.md?rev=1571441&r1=1571440&r2=1571441&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/segmentmk.md (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/segmentmk.md Mon Feb 24 21:11:26 2014
@@ -54,35 +54,81 @@ its properties and closest child nodes. 
 to store commonly occurring property values or other shared data. Segments
 can be up to 256KiB in size.
 
-Segments come in two types: data and bulk segments. A data segment can
-contain any types of records, may refer to content in other segments and
-comes with a segment header that guides the parsing of the segment.
-Bulk segments on the other hand only contain raw binary data, interpreted
-as a sequence of block records. Bulk segments are only used for storing
-large binary values, and are handled separately from data segments to
-prevent large binaries from disrupting the compactness of the rest of
-the stored tree data.
-
-The type of a segment is encoded in its UUID and can thus be determined
-already before reading the segment. The following bit patterns are used
-(each `x` represents 4 random bits):
-
-  * `xxxxxxxx-xxxx-4xxx-Axxx-xxxxxxxxxxxx`: data segment UUID
-  * `xxxxxxxx-xxxx-4xxx-Bxxx-xxxxxxxxxxxx`: bulk segment UUID
-
-This encoding makes segment UUIDs appear as syntactically valid version 4
-random UUIDs specified in RFC 4122.
-
-Content within a data segment can contain references to content within
-other segments. Each segment keeps a list of the UUIDs of all other segments
-it references. This list of segment references can be used to optimize
-both internal storage (as seen below) and garbage collection. Segments
-that are no longer referenced can be efficiently identified by
-traversing the graph of segment-level references without having to
-parse or even fetch the contents of each segment.
+Segments come in two types: data and bulk segments. The type of a segment
+is encoded in its UUID and can thus be determined already before reading
+the segment. The following bit patterns are used (each `x` represents four
+random bits):
 
-The internal record structure of nodes is described in a moment once
-we first cover journal documents.
+  * `xxxxxxxx-xxxx-4xxx-Axxx-xxxxxxxxxxxx` data segment UUID
+  * `xxxxxxxx-xxxx-4xxx-Bxxx-xxxxxxxxxxxx` bulk segment UUID
+
+(This encoding makes segment UUIDs appear as syntactically valid version 4
+random UUIDs specified in RFC 4122.)
+
+Bulk segments
+-------------
+
+Bulk segments contain raw binary data, interpreted simply as a sequence
+of block records with no headers or other extra metadata:
+
+    [block 1] [block 2] ... [block N]
+
+A bulk segment whose length is `n` bytes consists of `n div 4096` block
+records of 4KiB each, possibly followed by a block record of `n mod 4096`
+bytes if any bytes remain in the segment. The structure of a
+bulk segment can thus be parsed based only on the segment length.
+
+Data segments
+-------------
+
+A data segment can contain any type of record, may refer to content in
+other segments, and comes with a segment header that guides the parsing
+of the segment. The overall structure of a data segment is:
+
+    [segment header] [record 1] [record 2] ... [record N] [checksum]
+
+The header and each record are zero-padded to make their size a multiple of
+four bytes and to align the next record at a four-byte boundary. The last
+four bytes of a segment contain the Adler-32 checksum of all the preceding
+bytes.
+
+The segment header consists of the following fields:
+
+    +--------+--------+--------+--------+--------+--------+--------+--------+
+    | magic bytes: "0aK\n" in ASCII     |version |idcount |rootcount        |
+    +--------+--------+--------+--------+--------+--------+--------+--------+
+    | nanosecond timestamp/counter (8 bytes)                                |
+    +--------+--------+--------+--------+--------+--------+--------+--------+
+    | Referenced segment identifiers  (idcount x 16 bytes)                  |
+    |                                                                       |
+    |                            ......                                     |
+    |                                                                       |
+    +--------+--------+--------+--------+--------+--------+--------+--------+
+    | Root record references  (rootcount x 3 bytes)                         |
+    |                                                                       |
+    |                            ......          +--------+--------+--------+
+    |                                            |
+    +--------+--------+--------+--------+--------+
+
+The first four bytes of a segment always contain the ASCII string "0aK\n",
+which is intended to make the binary segment data format easily detectable.
+The next byte indicates the version of the segment format, and is set to zero
+for all segments that follow the format described here.
+
+The `idcount` byte indicates how many other segments are referenced by
+records within this segment. The identifiers of those segments are listed
+starting at offset 16 of the segment header. This lookup table of up to
+255 segment identifiers is used to optimize garbage collection and to avoid
+having to repeat the 16-byte UUIDs whenever references to records in other
+segments are made.
+
+The 16-bit `rootcount` field indicates the number of root record references
+that follow after the segment identifier lookup table. The root record
+references are a debugging and recovery aid and are not needed during
+normal operation. They identify the types and locations of those records
+within this segment that are not accessible by following references in
+other records within this segment. These root references give enough context
+for parsing all records within a segment without any external information.
 
 Journals
 ========
@@ -257,4 +303,4 @@ TBD, M ~ 1k) is stored differently, usin
 and child nodes. This way a node can become arbitrarily large and still
 remain reasonably efficient to access and modify. The main downside of
 this alternative storage layout is that the ordering of child nodes is
-lost.
\ No newline at end of file
+lost.
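
The segment layout documented in the added lines of the first hunk can be exercised with a short Python sketch. It is only an illustration and is not taken from the Oak code base: the function names are hypothetical, multi-byte fields are assumed to be big-endian (the text above does not state the byte order), the timestamp is read as an unsigned value, and root record references are returned as raw 3-byte chunks because their internal layout is not described in this commit.

```python
import struct
import uuid
import zlib

BLOCK_SIZE = 4096      # block records in a bulk segment are 4KiB each
MAGIC = b"0aK\n"       # first four bytes of every data segment


def segment_type(segment_id):
    """Classify a segment by the first hex digit of the fourth UUID group."""
    nibble = segment_id.hex[16]  # hex digits 0-15 cover the first three groups
    if nibble == "a":
        return "data"
    if nibble == "b":
        return "bulk"
    raise ValueError("not a segment UUID: %s" % segment_id)


def bulk_block_lengths(n):
    """Lengths of the block records in a bulk segment of n bytes."""
    lengths = [BLOCK_SIZE] * (n // BLOCK_SIZE)
    if n % BLOCK_SIZE:
        lengths.append(n % BLOCK_SIZE)  # trailing partial block, if any
    return lengths


def parse_data_segment_header(segment):
    """Parse the header fields of a data segment (big-endian is assumed)."""
    magic, version, idcount, rootcount, timestamp = struct.unpack_from(
        ">4sBBHQ", segment, 0)  # 4 + 1 + 1 + 2 + 8 = 16 bytes
    if magic != MAGIC:
        raise ValueError("bad magic bytes: %r" % magic)
    ids_end = 16 + 16 * idcount          # identifier table starts at offset 16
    roots_end = ids_end + 3 * rootcount  # root references follow the table
    if roots_end > len(segment) - 4:
        raise ValueError("header does not fit in the segment")
    segment_ids = [uuid.UUID(bytes=segment[o:o + 16])
                   for o in range(16, ids_end, 16)]
    root_refs = [segment[o:o + 3] for o in range(ids_end, roots_end, 3)]
    return {
        "version": version,
        "timestamp": timestamp,
        "segment_ids": segment_ids,
        "root_refs": root_refs,
        # the header is zero-padded to a multiple of four bytes
        "header_size": (roots_end + 3) & ~3,
    }


def checksum_ok(segment):
    """Compare the trailing four bytes with Adler-32 of all preceding bytes."""
    (stored,) = struct.unpack(">I", segment[-4:])
    return zlib.adler32(segment[:-4]) == stored
```

For example, `bulk_block_lengths(10000)` yields two full 4096-byte blocks followed by one 1808-byte block, which matches the `n div 4096` and `n mod 4096` rule. Keeping the checksum check separate from header parsing lets a reader inspect the header of a damaged segment during recovery.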