You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@iotdb.apache.org by hx...@apache.org on 2019/07/22 12:35:35 UTC
[incubator-iotdb] branch master updated: TsFile Docs Update for the
hierarchy of TsFile (#288)
This is an automated email from the ASF dual-hosted git repository.
hxd pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-iotdb.git
The following commit(s) were added to refs/heads/master by this push:
new 4b1c927 TsFile Docs Update for the hierarchy of TsFile (#288)
4b1c927 is described below
commit 4b1c927c84686ef3112378528e53c1b34df3201b
Author: Zihan Meng <40...@users.noreply.github.com>
AuthorDate: Mon Jul 22 20:34:04 2019 +0800
TsFile Docs Update for the hierarchy of TsFile (#288)
* TsFile hierarchy with graph, updated content directory
---
.../UserGuide/7-TsFile/3-Hierarchy.md | 241 +++++++++++++++++++++
docs/Documentation/UserGuideV0.7.0/0-Content.md | 1 +
.../UserGuideV0.7.0/7-TsFile/1-Installation.md | 2 +-
.../UserGuideV0.7.0/7-TsFile/2-Usage.md | 2 +-
4 files changed, 244 insertions(+), 2 deletions(-)
diff --git a/docs/Documentation/UserGuide/7-TsFile/3-Hierarchy.md b/docs/Documentation/UserGuide/7-TsFile/3-Hierarchy.md
new file mode 100644
index 0000000..643417f
--- /dev/null
+++ b/docs/Documentation/UserGuide/7-TsFile/3-Hierarchy.md
@@ -0,0 +1,241 @@
+<!--
+
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+-->
+# Chapter 7: TsFile
+## TsFile Hierarchy
+ Here is a brief introduction of the structure of a TsFile file.
+
+## Variable Storage
+ * **Big Endian**
+
+ * For Example, the `int` `0x8` will be stored as `00 00 00 08`, not `08 00 00 00`
+
+ * **String with Variable Length**
+
+ * The format is `int size` plus `String literal`. Size can be zero.
+
+ * Size equals the number of bytes this string will take, and it may not equal to the length of the string.
+
+ * For example "sensor_1" will be stored as `00 00 00 08` plus the encoding(ASCII) of "sensor_1".
+
+ * Note that for the "Magic String"(file signature) "TsFilev0.8.0", the size(12) and encoding(ASCII)
+ is fixed so there is no need to put the size before this string literal.
+
+ * **Data Type Hardcode**
+ * 0: BOOLEAN
+ * 1: INT32 (`int`)
+ * 2: INT64 (`long`)
+ * 3: FLOAT
+ * 4: DOUBLE
+ * 5: TEXT (`String`)
+ * **Encoding Type Hardcode**
+ * 0: PLAIN
+ * 1: PLAIN_DICTIONARY
+ * 2: RLE
+ * 3: DIFF
+ * 4: TS_2DIFF
+ * 5: BITMAP
+ * 6: GORILLA
+ * 7: REGULAR
+ * **Compressing Type Hardcode**
+ * 0: UNCOMPRESSED
+ * 1: SNAPPY
+
+
+## TsFile Overview
+Here is a graph about the TsFile structure.
+
+![TsFile Breakdown](https://user-images.githubusercontent.com/40447846/61616997-6fad1300-ac9c-11e9-9c17-46785ebfbc88.png)
+
+## Magic String
+There is a 12 bytes magic string:
+
+`TsFilev0.8.0`
+
+It is in both the beginning and end of a TsFile file as signature.
+
+## Data
+
+The content of a TsFile file can be divided as two parts: data and metadata. There is a byte `0x02` as the marker between
+data and metadata.
+
+The data section is an array of `ChunkGroup`, each ChuckGroup represents a *device*.
+
+#### ChuckGroup
+
+The `ChunkGroup` has an array of `Chunk`, a following byte `0x00` as the marker, and a `ChunkFooter`.
+
+##### Chunk
+
+A `Chunk` represents a *sensor*. There is a byte `0x01` as the marker, following a `ChunkHeader` and an array of `Page`.
+
+###### ChunkHeader
+<center>
+ <table style="text-align:center">
+ <tr><th>Member Description</th><th>Member Type</td></tr>
+ <tr><td>The name of this sensor(measurementID)</td><td>String</td>
+ <tr><td>Size of this chunk</td><td>int</td>
+ <tr><td>Data type of this chuck</td><td>short</td>
+ <tr><td>Number of pages</td><td>int</td>
+ <tr><td>Compression Type</td><td>short</td>
+ <tr><td>Encoding Type</td><td>short</td>
+ <tr><td>Max Tombstone Time</td><td>long</td>
+ </table>
+</center>
+
+###### Page
+
+A `Page` represents some data in a `Chunk`. It contains a `PageHeader` and the actual data (The encoded time-value pair).
+
+PageHeader Structure
+
+<center>
+ <table style="text-align:center">
+ <tr><th>Member Description</th><th>Member Type</td></tr>
+ <tr><td>Data size before compressing</td><td>int</td>
+ <tr><td>Data size after compressing(if use SNAPPY)</td><td>int</td>
+ <tr><td>Number of values</td><td>int</td>
+ <tr><td>Minimum time stamp</td><td>long</td>
+ <tr><td>Maximum time stamp</td><td>long</td>
+ <tr><td>Minimum value of the page</td><td>Type of the page</td>
+ <tr><td>Maximum value of the page</td><td>Type of the page</td>
+ <tr><td>First value of the page</td><td>Type of the page</td>
+ <tr><td>Last value of the page</td><td>Type of the page</td>
+ <tr><td>Sum of the Page</td><td>double</td>
+ </table>
+</center>
+
+###### ChunkGroupFooter
+
+<center>
+ <table style="text-align:center">
+ <tr><th>Member Description</th><th>Member Type</td></tr>
+ <tr><td>Deviceid</td><td>String</td>
+ <tr><td>Data size of the ChunkGroup</td><td>long</td>
+ <tr><td>Number of chunks</td><td>int</td>
+ </table>
+</center>
+
+## Metadata
+
+### TsDeviceMetaData
+The first part of metadata is `TsDeviceMetaData`
+
+<center>
+ <table style="text-align:center">
+ <tr><th>Member Description</th><th>Member Type</td></tr>
+ <tr><td>Start time</td><td>long</td>
+ <tr><td>End time</td><td>long</td>
+ <tr><td>Number of chunk groups</td><td>int</td>
+ </table>
+</center>
+
+Then there is an array of `ChunkGroupMetaData` after `TsDeviceMetaData`
+### ChunkGroupMetaData
+
+<center>
+ <table style="text-align:center">
+ <tr><th>Member Description</th><th>Member Type</td></tr>
+ <tr><td>Deviceid</td><td>String</td>
+ <tr><td>Start offset of the ChunkGroup</td><td>long</td>
+ <tr><td>End offset of the ChunkGroup</td><td>long</td>
+ <tr><td>Version</td><td>long</td>
+ <tr><td>Number of ChunkMetaData</td><td>int</td>
+ </table>
+</center>
+
+Then there is an array of `ChunkMetadata` for each `ChunkGroupMetadata`
+
+##### ChunkMetaData
+
+<center>
+ <table style="text-align:center">
+ <tr><th>Member Description</th><th>Member Type</td></tr>
+ <tr><td>Measurementid</td><td>String</td>
+ <tr><td>Start offset of ChunkHeader</td><td>long</td>
+ <tr><td>Number of data points</td><td>long</td>
+ <tr><td>Start time</td><td>long</td>
+ <tr><td>End time</td><td>long</td>
+ <tr><td>Data type</td><td>short</td>
+ <tr><td>Number of statistics</td><td>int</td>
+ <tr><td>The statistics of this chunk</td><td>TsDigest</td>
+ </table>
+</center>
+
+###### TsDigest
+
+There are five statistics: `min, last, sum, first, max`
+
+The storage format is a name-value pair. The name is a string (remember the length is before the literal).
+
+But for the value, there is also a size integer before the data even if it is not string. For example, if the `min` is 3, then it will be
+stored as 3 "min" 4 3 in the TsFile.
+
+#### File Metadata
+
+After the array of `ChunkGroupMetadata`, here is the last part of the metadata.
+
+<center>
+ <table style="text-align:center">
+ <tr><th>Member Description</th><th>Member Type</td></tr>
+ <tr><td>Number of Devices</td><td>int</td>
+ <tr><td>Array of DeviceIndexMetadata</td><td>DeviceIndexMetadata</td>
+ <tr><td>Number of Measurements</td><td>int</td>
+ <tr><td>Array of Measurement name and schema</td><td>String, MeasurementSchema pair</td>
+ <tr><td>Current Version(3 for now)</td><td>int</td>
+ <tr><td>Author byte</td><td>byte</td>
+ <tr><td>Author(if author byte is 0x01)</td><td>String</td>
+ <tr><td>File Metadata size(not including itself)</td><td>int</td>
+ </table>
+</center>
+
+##### DeviceIndexMetadata
+<center>
+ <table style="text-align:center">
+ <tr><th>Member Description</th><th>Member Type</td></tr>
+ <tr><td>Deviceid</td><td>String</td>
+ <tr><td>Start offset of ChunkGroupMetaData(Or TsDeviceMetaData if it's the first one)</td><td>long</td>
+ <tr><td>length</td><td>int</td>
+ <tr><td>Start time</td><td>long</td>
+ <tr><td>End time</td><td>long</td>
+ </table>
+</center>
+
+##### MeasurementSchema
+<center>
+ <table style="text-align:center">
+ <tr><th>Member Description</th><th>Member Type</td></tr>
+ <tr><td>Measurementid</td><td>String</td></tr>
+ <tr><td>Data type</td><td>short</td>
+ <tr><td>Encoding</td><td>short</td>
+ <tr><td>Compressor</td><td>short</td>
+ <tr><td>Size of props</td><td>int</td>
+ </table>
+</center>
+
+If size of props is greater than 0, there is an array of <String, String> pair as properties of this measurement.
+
+Such as "max_point_number""2".
+
+## Done
+
+After the `FileMetaData`, there will be another Magic String and you have finished the journey of discovering TsFile!
+
+You can also use /tsfile/example/TsFileSequenceRead to read and validate a TsFile.
\ No newline at end of file
diff --git a/docs/Documentation/UserGuideV0.7.0/0-Content.md b/docs/Documentation/UserGuideV0.7.0/0-Content.md
index d483d97..1b245d2 100644
--- a/docs/Documentation/UserGuideV0.7.0/0-Content.md
+++ b/docs/Documentation/UserGuideV0.7.0/0-Content.md
@@ -52,6 +52,7 @@
# Chapter 7: TsFile
* 1-Installation
* 2-Usage
+* 3-Hierarchy
# Chapter 8: System Tools
* 1-Sync
* 2-Memory Estimation Tool
diff --git a/docs/Documentation/UserGuideV0.7.0/7-TsFile/1-Installation.md b/docs/Documentation/UserGuideV0.7.0/7-TsFile/1-Installation.md
index f5e1a67..c856e47 100644
--- a/docs/Documentation/UserGuideV0.7.0/7-TsFile/1-Installation.md
+++ b/docs/Documentation/UserGuideV0.7.0/7-TsFile/1-Installation.md
@@ -19,7 +19,7 @@
-->
-# Chaper7: TsFile
+# Chapter 7: TsFile
## Installation
diff --git a/docs/Documentation/UserGuideV0.7.0/7-TsFile/2-Usage.md b/docs/Documentation/UserGuideV0.7.0/7-TsFile/2-Usage.md
index 77a24cd..822e72d 100644
--- a/docs/Documentation/UserGuideV0.7.0/7-TsFile/2-Usage.md
+++ b/docs/Documentation/UserGuideV0.7.0/7-TsFile/2-Usage.md
@@ -19,7 +19,7 @@
-->
-# Chaper7: TsFile
+# Chapter 7: TsFile
## Usage