Posted to commits@iotdb.apache.org by xu...@apache.org on 2021/06/27 15:50:52 UTC

[iotdb] branch master updated: [IOTDB-1455] Update documents for new IoTDB concepts and TsFile structures (#3456)

This is an automated email from the ASF dual-hosted git repository.

xuekaifeng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/iotdb.git


The following commit(s) were added to refs/heads/master by this push:
     new 30678c1  [IOTDB-1455] Update documents for new IoTDB concepts and TsFile structures (#3456)
30678c1 is described below

commit 30678c1e9edad793bcd6b8f177d5aa28559908d2
Author: Zesong Sun <sz...@mails.tsinghua.edu.cn>
AuthorDate: Sun Jun 27 23:50:27 2021 +0800

    [IOTDB-1455] Update documents for new IoTDB concepts and TsFile structures (#3456)
    
    * [IOTDB-1455] Update documents for new IoTDB concepts and TsFile structures
    
    * English version of User Guide 3.1 Data Concept
    
    * English version of System Design 2.2 TsFile Format
---
 docs/SystemDesign/TsFile/Format.md                 | 190 +++++++++++----------
 .../Data-Concept/Data-Model-and-Terminology.md     |  96 +++++------
 docs/zh/SystemDesign/TsFile/Format.md              | 166 +++++++++---------
 .../Data-Concept/Data-Model-and-Terminology.md     |  99 +++++------
 4 files changed, 288 insertions(+), 263 deletions(-)

diff --git a/docs/SystemDesign/TsFile/Format.md b/docs/SystemDesign/TsFile/Format.md
index 16b759f..d1b2d3a 100644
--- a/docs/SystemDesign/TsFile/Format.md
+++ b/docs/SystemDesign/TsFile/Format.md
@@ -66,49 +66,60 @@
 
 ### 1.2 TsFile Overview
 
+<!-- TODO
+
 Here is the structure diagram of TsFile.
 
 <img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/33376433/123052025-f47aab80-d434-11eb-94c2-9b75429e5c54.png">
 
-This TsFile contains two devices: d1, d2. Each device contains two measurements: s1, s2. 4 timeseries in total. Each timeseries contains 2 Chunks.
+This TsFile contains two entities: d1, d2. Each entity contains two measurements: s1, s2. 4 timeseries in total. Each timeseries contains 2 Chunks.
 
-Here is another representation of the TsFile structure:
+-->
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/98808354-ed2f0080-2456-11eb-8e7f-b11a4759d560.png">
+There are two parts in TsFile: **Data Area** and **Index Area**.
 
-This TsFile contains two devices: d1, d2. Each device contains three measurements: s1, s2, s3. 6 timeseries in total. Each timeseries contains 2 Chunks.
+There are three concepts, from small to large, in **Data Area:**
 
-There are three parts of metadata
+* **Page**: A page holds a sequence of data points of one timeseries. It is the smallest unit in which a data block is deserialized.
 
-* A list of ChunkMetadata organized by timeseries.
-* A list of TimeseriesMetadata organized by timeseries.
-* TsFileMetadata
+* **Chunk**: A chunk contains several pages in one timeseries. It is the smallest unit in which a data block is read by IO.
 
-Query Process:e.g., read d1.s1
+* **ChunkGroup**: A chunk group contains several chunks in one entity.
 
-* deserialize TsFileMetadata,get the position of TimeseriesMetadata of d1.s1
-* deserialize and get the TimeseriesMetadata of d1.s1
-* according to TimeseriesMetadata of d1.s1,deserialize all ChunkMetadata of d1.s1 
-* according to each ChunkMetadata of d1.s1,read its Chunk
+There are three parts in **Index Area**:
 
-#### 1.2.1 Magic String and Version Number
+* **TimeseriesIndex** organized by timeseries, containing a header and a list of ChunkIndex. The header records the data type and statistics (maximum and minimum timestamps, etc.) of a timeseries in the file. The ChunkIndex list records the offsets of the chunks in the file and their related statistics (maximum and minimum timestamps, etc.).
+* **IndexOfTimeseriesIndex**, which indexes the offsets of each TimeseriesIndex in the file.
+* **BloomFilter** for entities.
 
-A TsFile begins with a 6-byte magic string (`TsFile`) and a 6-byte version number (`000002`).
 
-#### 1.2.2 Data
 
-The content of a TsFile file can be divided as two parts: data (Chunk) and metadata (XXMetadata). There is a byte `0x02` as the marker between
-data and metadata.
+Here is the structure diagram of TsFile:
 
-The data section is an array of `ChunkGroup`, each ChunkGroup represents a *device*.
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/123542462-6710c180-d77c-11eb-9afb-a1b495c82ea9.png">
+
+This TsFile contains two entities: d1, d2. Each entity contains three measurements: s1, s2, s3. 6 timeseries in total. Each timeseries contains 2 Chunks.
+
+Query Process of reading d1.s1:
+
+* Deserialize IndexOfTimeseriesIndex, get the position of TimeseriesIndex of d1.s1
+* Deserialize and get the TimeseriesIndex of d1.s1
+* According to TimeseriesIndex of d1.s1, deserialize all ChunkIndex of d1.s1 
+* According to each ChunkIndex of d1.s1, read its Chunk
+
+#### 1.2.1 Magic String and Version Number
+
+A TsFile begins with a 6-byte magic string (`TsFile`) and a 6-byte version number (`000002`).
+
+#### 1.2.2 Data Area
 
 ##### ChunkGroup
 
-The `ChunkGroup` consists of several `Chunk`, a byte delimiter`0x00` and a `ChunkFooter`.
+A `ChunkGroup` stores the data of an entity for a period of time. It consists of several `Chunk`s, a byte delimiter `0x00` and a `ChunkGroupFooter`.
 
 ##### Chunk
 
-A `Chunk` represents the data of a *measurement* in a time range, data points in Chunks are in time ascending order. There is a byte `0x01` as the marker, following a `ChunkHeader` and an array of `Page`.
+A `Chunk` stores the data of a measurement for a period of time. The data in a chunk is stored in ascending time order. It consists of a marker byte `0x01`, followed by a `ChunkHeader` and an array of `Page`s.
 
 ##### ChunkHeader
 |             Member             |  Type  | Description |
@@ -122,9 +133,9 @@ A `Chunk` represents the data of a *measurement* in a time range, data points in
 
 ##### Page
 
-A `Page` represents some data in a `Chunk`. It contains a `PageHeader` and the actual data (The encoded time-value pair).
+A `Page` stores a sequence of data points of one timeseries. It is the smallest unit in which a data block is deserialized. It contains a `PageHeader` and the actual data (encoded time-value pairs).
 
-PageHeader Structure
+PageHeader Structure:
 
 |             Member             |  Type  | Description |
 | :----------------------------------: | :--------------: | :----: |
@@ -134,31 +145,31 @@ PageHeader Structure
 
 Here is the detailed information for `statistics`:
 
- |             Member               | Description | DoubleStatistics | FloatStatistics | IntegerStatistics | LongStatistics | BinaryStatistics | BooleanStatistics |
- | :----------------------------------: | :--------------: | :----: | :----: | :----: | :----: | :----: | :----: |
- | count  | number of time-value points | long | long | long | long | long | long | 
- | startTime | start time | long | long | long | long | long | long | 
- | endTime | end time | long | long | long | long | long | long | 
- | minValue | min value | double | float | int | long | - | - |
- | maxValue | max value | double | float | int | long | - | - |
- | firstValue | first value | double | float | int | long | Binary | boolean|
- | lastValue | last value | double | float | int | long | Binary | boolean|
- | sumValue | sum value | double | double | double | double | - | - |
- | extreme | extreme value | double | float | int | long | - | - |
- 
+|             Member               | Description | DoubleStatistics | FloatStatistics | IntegerStatistics | LongStatistics | BinaryStatistics | BooleanStatistics |
+| :----------------------------------: | :--------------: | :----: | :----: | :----: | :----: | :----: | :----: |
+| count  | number of time-value points | long | long | long | long | long | long |
+| startTime | start time | long | long | long | long | long | long |
+| endTime | end time | long | long | long | long | long | long |
+| minValue | min value | double | float | int | long | - | - |
+| maxValue | max value | double | float | int | long | - | - |
+| firstValue | first value | double | float | int | long | Binary | boolean|
+| lastValue | last value | double | float | int | long | Binary | boolean|
+| sumValue | sum value | double | double | double | double | - | - |
+| extreme | extreme value | double | float | int | long | - | - |
+
 ##### ChunkGroupFooter
 
 |             Member             |  Type  | Description |
 | :--------------------------------: | :----: | :----: |
-|         deviceID          | String | Name of device |
+|         entityID  | String | Name of entity |
 |      dataSize      |  long  | Data size of the ChunkGroup |
 | numberOfChunks |  int   | Number of chunks |
 
-#### 1.2.3  Metadata
+#### 1.2.3  Index Area
 
-##### 1.2.3.1 ChunkMetadata
+##### 1.2.3.1 ChunkIndex
 
-The first part of metadata is `ChunkMetadata` 
+The first part of the index is `ChunkIndex`:
 
 |             Member             |  Type  | Description |
 | :------------------------------------------------: | :------: | :----: |
@@ -167,73 +178,74 @@ The first part of metadata is `ChunkMetadata`
 |                tsDataType                |  TSDataType   | Data type |
 |   statistics    |       Statistics        | Statistic values |
 
-##### 1.2.3.2 TimeseriesMetadata
+##### 1.2.3.2 TimeseriesIndex
 
-The second part of metadata is `TimeseriesMetadata`.
+The second part of the index is `TimeseriesIndex`:
 
 |             Member             |  Type  | Description |
 | :------------------------------------------------: | :------: | :------: |
 |             measurementUid            |  String  | Name of measurement |
 |               tsDataType                |  short   |  Data type |
-| startOffsetOfChunkMetadataList |  long  | Start offset of ChunkMetadata list |
-|  chunkMetaDataListDataSize  |  int  | ChunkMetadata list size |
+| startOffsetOfChunkIndexList |  long  | Start offset of ChunkIndex list |
+|  ChunkIndexListDataSize  |  int  | ChunkIndex list size |
 |   statistics    |       Statistics        | Statistic values |
 
-##### 1.2.3.3 TsFileMetaData
+##### 1.2.3.3 IndexOfTimeseriesIndex (Secondary Index)
 
-The third part of metadata is `TsFileMetaData`.
+The third part of the index is `IndexOfTimeseriesIndex`:
 
 |             Member             |  Type  | Description |
 | :-------------------------------------------------: | :---------------------: | :---: |
-|       MetadataIndex              |   MetadataIndexNode      | MetadataIndex node |
-|           totalChunkNum            |                int                 | total chunk num |
-|          invalidChunkNum           |                int                 | invalid chunk num |
-|                versionInfo         |             List<Pair<Long, Long>>       | version information |
-|        metaOffset   |                long                 | offset of MetaMarker.SEPARATOR |
+|       IndexTree       |   IndexNode      | Root index node of IndexTree |
+| offsetOfIndexArea   |                long                 | offset of index area |
 |                bloomFilter                 |                BloomFilter      | bloom filter |
 
-MetadataIndexNode has members as below:
+IndexNode has members as below:
 
 |             Member             |  Type  | Description |
 | :------------------------------------: | :----: | :---: |
-|      children    | List<MetadataIndexEntry> | MetadataIndexEntry list |
-|       endOffset      | long |    EndOffset of this MetadataIndexNode |
-|   nodeType    | MetadataIndexNodeType | MetadataIndexNode type |
+|      children    | List<IndexEntry> | IndexEntry list |
+|       endOffset      | long |    EndOffset of this IndexNode |
+|   nodeType    | IndexNodeType | IndexNode type |
 
-MetadataIndexEntry has members as below:
+IndexEntry has members as below:
 
 |             Member             |  Type  | Description |
 | :------------------------------------: | :----: | :---: |
-|  name    | String | Name of related device or measurement |
+|  name    | String | Name of related entity or measurement |
 |     offset     | long   | offset |
 
-All MetadataIndexNode forms a **metadata index tree**, which consists of no more than two levels: device index level and measurement index level. In different situation, the tree could have different components. The MetadataIndexNodeType has four enums: `INTERNAL_DEVICE`, `LEAF_DEVICE`, `INTERNAL_MEASUREMENT`, `LEAF_MEASUREMENT`, which indicates the internal or leaf node of device index level and measurement index level respectively. Only the `LEAF_MEASUREMENT` nodes point to `Timeseries [...]
+All IndexNodes form an **index tree (secondary index)** similar to a B+ tree, which consists of two levels: the entity index level and the measurement index level. The IndexNodeType enum has four values: `INTERNAL_ENTITY`, `LEAF_ENTITY`, `INTERNAL_MEASUREMENT` and `LEAF_MEASUREMENT`, which indicate the internal and leaf nodes of the entity index level and the measurement index level respectively. Only the `LEAF_MEASUREMENT` nodes point to `TimeseriesIndex`.
+
+Here are four detailed examples.
+
+The degree of the index tree (that is, the max number of each node's children) could be configured by users, and is 256 by default. In the examples below, we assume `max_degree_of_index_node = 10`.
 
-To describe the structure of metadata index tree more clearly, we will give four examples here in details.
+* Example 1: 5 entities with 5 measurements each
 
-The max degree of the metadata index tree (that is, the max number of each node's children) could be configured by users, and is 256 by default. In the examples below, we assume `max_degree_of_index_node = 10` in the following examples.
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/122677230-134e2780-d214-11eb-9603-ac7b95bc0668.png">
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/81935219-de3fd080-9622-11ea-9aa1-a59bef1c0001.png">
+In the case of 5 entities with 5 measurements each: Since the numbers of entities and measurements are both no more than `max_degree_of_index_node`, the tree has only the measurement index level by default. In this level, each IndexNode is composed of no more than 10 index entries. The root node is of `INTERNAL_MEASUREMENT` type, and its 5 index entries point to the index nodes of the related entities. These nodes point to `TimeseriesIndex` directly, as they are of `LEAF_MEASUREMENT` type.
 
-In the case of 5 devices with 5 measurements each: Since the numbers of devices and measurements are both no more than `max_degree_of_index_node`, the tree has only measurement index level by default. In this level, each MetadataIndexNode is composed of no more than 10 MetadataIndex entries. The root nonde is `INTERNAL_MEASUREMENT` type, and the 5 MetadataIndex entries point to MetadataIndex nodes of related devices. These nodes point to  `TimeseriesMetadata` directly, as they are `LEAF_ [...]
+* Example 2: 1 entity with 150 measurements
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/81935210-d97b1c80-9622-11ea-8a69-2c2c5f05a876.png">
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/122677233-15b08180-d214-11eb-8d09-c741cca59262.png">
 
-In the case of 1 device with 150 measurements: The number of measurements exceeds `max_degree_of_index_node`, so the tree has only measurement index level by default. In this level, each MetadataIndexNode is composed of no more than 10 MetadataIndex entries. The nodes that point to `TimeseriesMetadata` directly are `LEAF_MEASUREMENT` type. Other nodes and root node of index tree are not leaf nodes of measurement index level, so they are `INTERNAL_MEASUREMENT` type.
+In the case of 1 entity with 150 measurements: The number of measurements exceeds `max_degree_of_index_node`, so the tree has only measurement index level by default. In this level, each IndexNode is composed of no more than 10 index entries. The nodes that point to `TimeseriesIndex` directly are `LEAF_MEASUREMENT` type. Other nodes and root node of index tree are not leaf nodes of measurement index level, so they are `INTERNAL_MEASUREMENT` type.
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/95592841-c0fd1a00-0a7b-11eb-9b46-dfe8b2f73bfb.png">
+* Example 3: 150 entities with 1 measurement each
 
-In the case of 150 device with 1 measurement each: The number of devices exceeds `max_degree_of_index_node`, so the device index level and measurement index level of the tree are both formed. In these two levels, each MetadataIndexNode is composed of no more than 10 MetadataIndex entries. The nodes that point to `TimeseriesMetadata` directly are `LEAF_MEASUREMENT` type. The root nodes of measurement index level are also the leaf nodes of device index level, which are `LEAF_DEVICE` type.  [...]
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/122771008-9a64d380-d2d8-11eb-9044-5ac794dd38f7.png">
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/81935138-b6e90380-9622-11ea-94f9-c97bd2b5d050.png">
+In the case of 150 entities with 1 measurement each: The number of entities exceeds `max_degree_of_index_node`, so the entity index level and measurement index level of the tree are both formed. In these two levels, each IndexNode is composed of no more than 10 index entries. The nodes that point to `TimeseriesIndex` directly are `LEAF_MEASUREMENT` type. The root nodes of measurement index level are also the leaf nodes of entity index level, which are `LEAF_ENTITY` type. Other nodes and  [...]
 
-In the case of 150 device with 150 measurements each: The numbers of devices and measurements both exceed `max_degree_of_index_node`, so the device index level and measurement index level are both formed. In these two levels, each MetadataIndexNode is composed of no more than 10 MetadataIndex entries. As is described before, from the root node to the leaf nodes of device index level, their types are `INTERNAL_DEVICE` and `LEAF_DEVICE`; each leaf node of device index level can be seen as  [...]
+* Example 4: 150 entities with 150 measurements each
 
-The MetadataIndex is designed as tree structure so that not all the `TimeseriesMetadata` need to be read when the number of devices or measurements is too large. Only reading specific MetadataIndex nodes according to requirement and reducing I/O could speed up the query. More reading process of TsFile in details will be described in the last section of this chapter.
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/122677241-1a753580-d214-11eb-817f-17bcf797251f.png">
 
-##### 1.2.3.4 TsFileMetadataSize
+In the case of 150 entities with 150 measurements each: The numbers of entities and measurements both exceed `max_degree_of_index_node`, so the entity index level and measurement index level are both formed. In these two levels, each IndexNode is composed of no more than 10 index entries. As is described before, from the root node to the leaf nodes of entity index level, their types are `INTERNAL_ENTITY` and `LEAF_ENTITY`; each leaf node of entity index level can be seen as the root node [...]
 
-After the TsFileMetaData, there is an int indicating the size of the TsFileMetaData.
+The IndexTree is designed as a tree structure so that not every `TimeseriesIndex` needs to be read when the number of entities or measurements is large. Reading only the IndexTree nodes that a query requires reduces I/O and speeds up the query. The reading process of TsFile is described in more detail in the last section of this chapter.
 
 
 #### 1.2.4 Magic String
@@ -512,39 +524,39 @@ file length: 33436
                     |   [marker] 3
                     |   [version] 102
                30009| [marker] 2
-               30010| [ChunkMetadataList] of root.group_12.d0.s_BOOLEANe_PLAIN, tsDataType:BOOLEAN
+               30010| [ChunkIndexList] of root.group_12.d0.s_BOOLEANe_PLAIN, tsDataType:BOOLEAN
                     | [startTime: 1 endTime: 10000 count: 10000 [firstValue:true,lastValue:true]] 
-               30066| [ChunkMetadataList] of root.group_12.d0.s_BOOLEANe_RLE, tsDataType:BOOLEAN
+               30066| [ChunkIndexList] of root.group_12.d0.s_BOOLEANe_RLE, tsDataType:BOOLEAN
                     | [startTime: 1 endTime: 10000 count: 10000 [firstValue:true,lastValue:true]] 
-               30120| [ChunkMetadataList] of root.group_12.d1.s_INT32e_PLAIN, tsDataType:INT32
+               30120| [ChunkIndexList] of root.group_12.d1.s_INT32e_PLAIN, tsDataType:INT32
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1,maxValue:1,firstValue:1,lastValue:1,sumValue:10000.0]] 
-               30196| [ChunkMetadataList] of root.group_12.d1.s_INT32e_RLE, tsDataType:INT32
+               30196| [ChunkIndexList] of root.group_12.d1.s_INT32e_RLE, tsDataType:INT32
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1,maxValue:1,firstValue:1,lastValue:1,sumValue:10000.0]] 
-               30270| [ChunkMetadataList] of root.group_12.d1.s_INT32e_TS_2DIFF, tsDataType:INT32
+               30270| [ChunkIndexList] of root.group_12.d1.s_INT32e_TS_2DIFF, tsDataType:INT32
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1,maxValue:1,firstValue:1,lastValue:1,sumValue:10000.0]] 
-               30349| [ChunkMetadataList] of root.group_12.d2.s_INT64e_PLAIN, tsDataType:INT64
+               30349| [ChunkIndexList] of root.group_12.d2.s_INT64e_PLAIN, tsDataType:INT64
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1,maxValue:1,firstValue:1,lastValue:1,sumValue:10000.0]] 
-               30441| [ChunkMetadataList] of root.group_12.d2.s_INT64e_RLE, tsDataType:INT64
+               30441| [ChunkIndexList] of root.group_12.d2.s_INT64e_RLE, tsDataType:INT64
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1,maxValue:1,firstValue:1,lastValue:1,sumValue:10000.0]] 
-               30531| [ChunkMetadataList] of root.group_12.d2.s_INT64e_TS_2DIFF, tsDataType:INT64
+               30531| [ChunkIndexList] of root.group_12.d2.s_INT64e_TS_2DIFF, tsDataType:INT64
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1,maxValue:1,firstValue:1,lastValue:1,sumValue:10000.0]] 
-               30626| [ChunkMetadataList] of root.group_12.d3.s_FLOATe_GORILLA, tsDataType:FLOAT
+               30626| [ChunkIndexList] of root.group_12.d3.s_FLOATe_GORILLA, tsDataType:FLOAT
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1.1,maxValue:1.1,firstValue:1.1,lastValue:1.1,sumValue:11000.00023841858]] 
-               30704| [ChunkMetadataList] of root.group_12.d3.s_FLOATe_PLAIN, tsDataType:FLOAT
+               30704| [ChunkIndexList] of root.group_12.d3.s_FLOATe_PLAIN, tsDataType:FLOAT
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1.1,maxValue:1.1,firstValue:1.1,lastValue:1.1,sumValue:11000.00023841858]] 
-               30780| [ChunkMetadataList] of root.group_12.d3.s_FLOATe_RLE, tsDataType:FLOAT
+               30780| [ChunkIndexList] of root.group_12.d3.s_FLOATe_RLE, tsDataType:FLOAT
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1.1,maxValue:1.1,firstValue:1.1,lastValue:1.1,sumValue:11000.00023841858]] 
-               30854| [ChunkMetadataList] of root.group_12.d3.s_FLOATe_TS_2DIFF, tsDataType:FLOAT
+               30854| [ChunkIndexList] of root.group_12.d3.s_FLOATe_TS_2DIFF, tsDataType:FLOAT
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1.1,maxValue:1.1,firstValue:1.1,lastValue:1.1,sumValue:11000.00023841858]] 
-               30933| [ChunkMetadataList] of root.group_12.d4.s_DOUBLEe_GORILLA, tsDataType:DOUBLE
+               30933| [ChunkIndexList] of root.group_12.d4.s_DOUBLEe_GORILLA, tsDataType:DOUBLE
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1.1,maxValue:1.1,firstValue:1.1,lastValue:1.1,sumValue:11000.000000002045]] 
-               31028| [ChunkMetadataList] of root.group_12.d4.s_DOUBLEe_PLAIN, tsDataType:DOUBLE
+               31028| [ChunkIndexList] of root.group_12.d4.s_DOUBLEe_PLAIN, tsDataType:DOUBLE
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1.1,maxValue:1.1,firstValue:1.1,lastValue:1.1,sumValue:11000.00000000123]] 
-               31121| [ChunkMetadataList] of root.group_12.d4.s_DOUBLEe_RLE, tsDataType:DOUBLE
+               31121| [ChunkIndexList] of root.group_12.d4.s_DOUBLEe_RLE, tsDataType:DOUBLE
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1.1,maxValue:1.1,firstValue:1.1,lastValue:1.1,sumValue:11000.000000001224]] 
-               31212| [ChunkMetadataList] of root.group_12.d4.s_DOUBLEe_TS_2DIFF, tsDataType:DOUBLE
+               31212| [ChunkIndexList] of root.group_12.d4.s_DOUBLEe_TS_2DIFF, tsDataType:DOUBLE
                     | [startTime: 1 endTime: 10000 count: 10000 [minValue:1.1,maxValue:1.1,firstValue:1.1,lastValue:1.1,sumValue:11000.000000002045]] 
-               31308| [ChunkMetadataList] of root.group_12.d5.s_TEXTe_PLAIN, tsDataType:TEXT
+               31308| [ChunkIndexList] of root.group_12.d5.s_TEXTe_PLAIN, tsDataType:TEXT
                     | [startTime: 1 endTime: 10000 count: 10000 [firstValue:version_test,lastValue:version_test]] 
                32840| [MetadataIndex] of root.group_12.d0
                32881| [MetadataIndex] of root.group_12.d1
@@ -552,7 +564,7 @@ file length: 33436
                32959| [MetadataIndex] of root.group_12.d3
                33000| [MetadataIndex] of root.group_12.d4
                33042| [MetadataIndex] of root.group_12.d5
-               33080| [TsFileMetadata]
+               33080| [IndexOfTimeseriesIndex]
                     |   [num of devices] 6
                     |   6 key&TsMetadataIndex
                     |   [totalChunkNum] 17
@@ -561,7 +573,7 @@ file length: 33436
                     |   [bloom filter bit vector byte array] 
                     |   [bloom filter number of bits] 256
                     |   [bloom filter number of hash functions] 5
-               33426| [TsFileMetadataSize] 346
+               33426| [IndexOfTimeseriesIndexSize] 346
                33430| [magic tail] TsFile
                33436| END of TsFile
 
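The layout described in the Format.md changes above (a 6-byte head magic string, a 6-byte version number, and a size field followed by a 6-byte tail magic string) can be illustrated with a minimal Python sketch. It is not taken from the IoTDB code base: the function name `locate_index` is hypothetical, and the size field is assumed to be a 4-byte big-endian int, which is consistent with the offsets in the sample dump (33426 to 33430) but is not stated explicitly in the document.

```python
import struct

MAGIC = b"TsFile"      # 6-byte magic string at the head and the tail of the file
VERSION_LEN = 6        # 6-byte version number that follows the head magic
SIZE_FIELD_LEN = 4     # assumption: the size field is a 4-byte big-endian int


def locate_index(data):
    """Return (version, index_size, index_offset) for an in-memory TsFile image.

    Hypothetical helper for illustration only; it checks both magic strings,
    reads the size field just before the tail magic, and derives the offset
    at which the IndexOfTimeseriesIndex section starts.
    """
    if data[:len(MAGIC)] != MAGIC:
        raise ValueError("missing head magic string")
    if data[-len(MAGIC):] != MAGIC:
        raise ValueError("missing tail magic string")

    version = data[len(MAGIC):len(MAGIC) + VERSION_LEN]

    size_end = len(data) - len(MAGIC)          # where the tail magic begins
    size_start = size_end - SIZE_FIELD_LEN     # where the size field begins
    (index_size,) = struct.unpack(">i", data[size_start:size_end])

    index_offset = size_start - index_size
    if index_size < 0 or index_offset < len(MAGIC) + VERSION_LEN:
        raise ValueError("size field is inconsistent with the file length")
    return version, index_size, index_offset


if __name__ == "__main__":
    # Synthetic image with the same length and size field as the sample dump;
    # the body is zero padding, not real chunk data.
    body = b"\x00" * (33426 - 12)
    image = MAGIC + b"000002" + body + struct.pack(">i", 346) + MAGIC
    print(locate_index(image))
```

With the figures from the sample dump (file length 33436, size field 346), the derived offset is 33436 - 6 - 4 - 346 = 33080, which matches the position printed for `[IndexOfTimeseriesIndex]`.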
diff --git a/docs/UserGuide/Data-Concept/Data-Model-and-Terminology.md b/docs/UserGuide/Data-Concept/Data-Model-and-Terminology.md
index 4f0a5e8..147f56f 100644
--- a/docs/UserGuide/Data-Concept/Data-Model-and-Terminology.md
+++ b/docs/UserGuide/Data-Concept/Data-Model-and-Terminology.md
@@ -21,47 +21,73 @@
 # Data Concept
 ## Data Model
 
-In this section, a power scenario is taken as an example to illustrate how to creat a correct data model in IoTDB. For convenience, a sample data file is attached for you to practise IoTDB.
+A wind power IoT scenario is taken as an example to illustrate how to create a correct data model in IoTDB.
 
-Download the attachment: [IoTDB-SampleData.txt](https://github.com/thulab/iotdb/files/4438687/OtherMaterial-Sample.Data.txt).
+According to the enterprise organization structure and equipment entity hierarchy, the data is expressed as an attribute hierarchy structure, as shown below. The hierarchy from top to bottom is: power group layer - power plant layer - entity layer - measurement layer. ROOT is the root node, and each node of the measurement layer is a leaf node. In IoTDB, the attributes on the path from the ROOT node to each leaf node are joined with ".", thus forming the name of a ti [...]
 
-According to the data attribute layers, it is expressed as an attribute hierarchy structure based on the coverage of attributes and the subordinate relationship between them, as shown below. The hierarchical from top to bottom is: power group layer - power plant layer - device layer - sensor layer. ROOT is the root node, and each node of sensor layer is a leaf node. In the process of using IoTDB, the attributes on the path from ROOT node is directly connected to each leaf node with ".",  [...]
+<center><img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/122668849-b1c69280-d1ec-11eb-83cb-3b73c40bdf72.png"></center>
 
-<center><img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/13203019/51577327-7aa50780-1ef4-11e9-9d75-cadabb62444e.jpg"></center>
+Here are the basic concepts of the model involved in IoTDB:
 
-**Attribute hierarchy structure**
+* Measurement (Also called field)
 
-After getting the name of the timeseries, we need to set up the storage group according to the actual scenario and scale of the data. Because in the scenario of this chapter data is usually arrived in the unit of groups (i.e., data may be across electric fields and devices), in order to avoid frequent switch of IO when writing data, and meet the user's requirement of physical isolation of data in the unit of groups, storage group is set at the group layer.
+**Univariable or multi-variable measurement**. A measurement is information recorded by detection equipment in a real scenario; the equipment transforms the sensed information into an electrical signal or another desired form of output and sends it to IoTDB. In IoTDB, all data and paths stored are organized in units of measurements.
 
-Here are the basic concepts of the model involved in IoTDB:
+* Sub-measurement
 
-* Device
+A multi-variable measurement contains several sub-measurements. For example, GPS is a multi-variable measurement that includes three sub-measurements: longitude, latitude and altitude. The sub-measurements of a multi-variable measurement are usually collected at the same time and share a time series.
 
-A device is an installation equipped with sensors in real scenarios. In IoTDB, all sensors should have their corresponding devices.
+For a univariable measurement, the sub-measurement name is the same as the measurement name. For example, temperature is a univariable measurement.
 
-* Sensor
+* Entity (Also called device)
 
-A sensor is a detection equipment in an actual scene, which can sense the information to be measured, and can transform the sensed information into an electrical signal or other desired form of information output and send it to IoTDB. In IoTDB, all data and paths stored are organized in units of sensors.
+**An entity** is a piece of equipment with measurements in real scenarios. In IoTDB, all measurements should have their corresponding entities.
 
 * Storage Group
 
-Storage groups are used to let users define how to organize and isolate different time series data on disk. Time series belonging to the same storage group is continuously written to the same file in the corresponding folder. The file may be closed due to user commands or system policies, and hence the data coming next from these sensors will be stored in a new file in the same folder. Time series belonging to different storage groups are stored in different folders.
+**A group of entities.** Users can set any prefix path as a storage group. Suppose there are four timeseries `root.ln.wf01.wt01.status`, `root.ln.wf01.wt01.temperature`, `root.ln.wf02.wt02.hardware`, and `root.ln.wf02.wt02.status`. The two entities `wt01` and `wt02` under the path `root.ln` may belong to the same owner or the same manufacturer, so they are closely related. In this case, the prefix path `root.ln` can be designated as a storage group, which will enable IoTDB to store al [...]
 
-Users can set any prefix path as a storage group. Provided that there are four time series `root.vehicle.d1.s1`, `root.vehicle.d1.s2`, `root.vehicle.d2.s1`, `root.vehicle.d2.s2`, two devices `d1` and `d2` under the path `root.vehicle` may belong to the same owner or the same manufacturer, so d1 and d2 are closely related. At this point, the prefix path root.vehicle can be designated as a storage group, which will enable IoTDB to store all devices under it in the same folder. Newly added  [...]
+> Note1: A full path (`root.ln.wf01.wt01.status` as in the above example) is not allowed to be set as a storage group.
+>
+> Note2: The prefix of a timeseries must belong to a storage group. Before creating a timeseries, users must set which storage group the series belongs to. Only timeseries whose storage group is set can be persisted to disk.
 
-> Note: A full path (`root.vehicle.d1.s1` as in the above example) is not allowed to be set as a storage group.
+Once a prefix path is set as a storage group, the storage group settings cannot be changed.
 
-Setting a reasonable number of storage groups can lead to performance gains: there is neither the slowdown of the system due to frequent switching of IO (which will also take up a lot of memory and result in frequent memory-file switching) caused by too many storage files (or folders), nor the block of write commands caused by too few storage files (or folders) (which reduces concurrency).
+After a storage group is set, the ancestral layers, children and descendant layers of the corresponding prefix path are not allowed to be set up again (for example, after `root.ln` is set as the storage group, the root layer and `root.ln.wf01` are not allowed to be set as storage groups).
 
-Users should balance the storage group settings of storage files according to their own data size and usage scenarios to achieve better system performance. (There will be officially provided storage group scale and performance test reports in the future).
+The layer name of a storage group can consist only of characters, numbers, underscores and hyphens, like `root.storagegroup_1-sg1`.
 
-> Note: The prefix of a time series must belong to a storage group. Before creating a time series, the user must set which storage group the series belongs to. Only the time series whose storage group is set can be persisted to disk.
+* Data point
 
-Once a prefix path is set as a storage group, the storage group settings cannot be changed.
+**A "time-value" pair**.
 
-After a storage group is set, all parent and child layers of the corresponding prefix path are not allowed to be set up again (for example, after `root.ln` is set as the storage group, the root layer and `root.ln.wf01` are not allowed to be set as storage groups).
+* Timeseries (A measurement of an entity corresponds to a timeseries. Also called meter or timeline; often called tag or parameter in real-time databases)
 
-The Layer Name of storage group can only consist of characters, numbers, underscores and hyphen, like `root.storagegroup_1-sg1`.
+**The record of a measurement of an entity on the time axis.** A timeseries is a series of data points.
+
+For example, if entity wt01 in power plant wf01 of power group ln has a measurement named status, its timeseries can be expressed as: `root.ln.wf01.wt01.status`.
+
+
+* Multi-variable timeseries (Also called aligned timeseries, from v0.13)
+
+A multi-variable measurement of an entity corresponds to a multi-variable timeseries. These timeseries are called **multi-variable timeseries**, also called **aligned timeseries**.
+
+Multi-variable timeseries need to be created, inserted and deleted at the same time. However, when querying, you can query each sub-measurement separately.
+
+With multi-variable timeseries, the timestamp column of a group of aligned timeseries needs to be stored only once in memory and on disk when inserting data, instead of once per timeseries:
+
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/114125919-f4850800-9929-11eb-8211-81d4c04af1ec.png">
+
+In the following chapters of data definition language, data operation language and Java Native Interface, various operations related to multi-variable timeseries will be introduced one by one.
+
+
+* Measurement template (From v0.13)
+
+In real scenarios, many entities collect the same measurements, that is, their measurements have the same names and types. A **measurement template** can be declared to define the set of collectable measurements. A measurement template can be mounted on any node of the tree-structured data schema, which means that all entities under that node share the same measurement set.
+
+Currently you can only set one **measurement template** on a specific path. An entity will use its own measurement template or its nearest ancestor's measurement template.
+
+In the following chapters of data definition language, data operation language and Java Native Interface, various operations related to measurement template will be introduced one by one.
 
 * Path
 
@@ -93,36 +119,6 @@ The characters supported in LayerName without double quotes are as below:
 > 
 > Besides, if deploy on Windows system, the LayerName is case-insensitive, which means it's not allowed to set storage groups `root.ln` and `root.LN` at the same time.
 
-* Timeseries Path
-
-The timeseries path is the core concept in IoTDB. A timeseries path can be thought of as the complete path of a sensor that produces the time series data. All timeseries paths in IoTDB must start with root and end with the sensor. A timeseries path can also be called a full path.
-
-For example, if device1 of the vehicle type has a sensor named sensor1, its timeseries path can be expressed as: `root.vehicle.device1.sensor1`. Double quotes can be nested with escape characters, e.g. `root.sg.d1."s.\"t\"1"`.
-
-> Note: The layer of timeseries paths supported by the current IoTDB must be greater than or equal to four (it will be changed to two in the future).
-
-
-* Aligned timeseries (From v0.13)
-
-When a group of sensors detects data at the same time, multiple timeseries with the same timestamp will be produced, which are called **aligned timeseries** in IoTDB (and are also called **multivariate timeseries** academically. It contains multiple unary timeseries as components, and the sampling time of each unary timeseries is the same.)
-
-Aligned timeseries can be created, inserted values, and deleted at the same time. However, when querying, each sensor can be queried separately.
-
-By using aligned timeseries, the timestamp column could be stored only once in memory and disk when inserting data, instead of stored as many times as the number of timeseries:
-
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/114125919-f4850800-9929-11eb-8211-81d4c04af1ec.png">
-
-In the following chapters of data definition language, data operation language and Java Native Interface, various operations related to aligned timeseries will be introduced one by one.
-
-
-* Device template (From v0.13)
-
-In the actual scenario, there are many devices with the same model, that is, they have the same working condition name and type. To save system resources, you can declare a **device template** for the same type of device, mount it to any node in the path.
-
-Currently you can only set one **device template** on a specific path. Device will use it's own device template or nearest ancestor's device template.
-
-In the following chapters of data definition language, data operation language and Java Native Interface, various operations related to device template will be introduced one by one.
-
 
 * Prefix Path
 
diff --git a/docs/zh/SystemDesign/TsFile/Format.md b/docs/zh/SystemDesign/TsFile/Format.md
index 3aa07c9..ec2517a 100644
--- a/docs/zh/SystemDesign/TsFile/Format.md
+++ b/docs/zh/SystemDesign/TsFile/Format.md
@@ -65,50 +65,64 @@
 
 ### 1.2 TsFile 概述
 
+<!-- TODO
+
 下图是关于TsFile的结构图。
 
 <img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/33376433/123052025-f47aab80-d434-11eb-94c2-9b75429e5c54.png">
 
 此文件包括两个设备 d1、d2,每个设备包含两个测点 s1、s2,共 4 个时间序列。每个时间序列包含两个 Chunk。
 
-下图是另一种关于TsFile的结构表示:
+-->
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/98808354-ed2f0080-2456-11eb-8e7f-b11a4759d560.png">
+TsFile 整体分为两大部分:**数据区**和**索引区**。
 
-此文件包括两个设备 d1、d2,每个设备包含三个测点 s1、s2、s3,共 6 个时间序列。每个时间序列包含两个 Chunk。
+**数据区**所包含的概念由小到大有如下三个:
 
-元数据分为三部分
+* **Page数据页**:一段时间序列,是数据块被反序列化的最小单元;
 
-* 按时间序列组织的 ChunkMetadata 列表
-* 按时间序列组织的 TimeseriesMetadata
-* TsFileMetadata
+* **Chunk数据块**:包含一条时间序列的多个 Page ,是数据块被IO读取的最小单元;
 
-查询流程:以查 d1.s1 为例
+* **ChunkGroup数据块组**:包含一个实体的多个 Chunk。
 
-* 反序列化 TsFileMetadata,得到 d1.s1 的 TimeseriesMetadata 的位置
-* 反序列化得到 d1.s1 的 TimeseriesMetadata
-* 根据 d1.s1 的 TimeseriesMetadata,反序列化其所有 ChunkMetadata
-* 根据 d1.s1 的每一个 ChunkMetadata,读取其 Chunk 数据
+**索引区**分为三部分:
 
-#### 1.2.1 文件签名和版本号
+* 按时间序列组织的 **TimeseriesIndex**,包含1个头信息和数据块索引(ChunkIndex)列表。头信息记录文件内某条时间序列的数据类型、统计信息(最大最小时间戳等);数据块索引列表记录该序列各Chunk在文件中的 offset,并记录相关统计信息(最大最小时间戳等);
 
-TsFile文件头由 6 个字节的 "Magic String" (`TsFile`) 和 6 个字节的版本号 (`000002`)组成。
+* **IndexOfTimeseriesIndex**,用于索引各TimeseriesIndex在文件中的 offset;
+
+* **BloomFilter**,针对实体(Entity)的布隆过滤器。
+
+> 注:ChunkIndex 旧称 ChunkMetadata;TimeseriesIndex 旧称 TimeseriesMetadata;IndexOfTimeseriesIndex 旧称 TsFileMetadata。v0.13版本起,根据其索引区的特性和实际所索引的内容重新命名。
+
+下图是关于 TsFile 的结构图。
+
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/123542462-6710c180-d77c-11eb-9afb-a1b495c82ea9.png">
 
-#### 1.2.2 数据文件
+此文件包括两个实体 d1、d2,每个实体分别包含三个物理量 s1、s2、s3,共 6 个时间序列。每个时间序列包含两个 Chunk。
 
-TsFile文件的内容可以划分为两个部分: 数据(Chunk)和元数据(XXMetadata)。数据和元数据之间是由一个字节的 `0x02` 做为分隔符。
+TsFile 的查询流程,以查 d1.s1 为例:
 
-`ChunkGroup` 存储了一个 *设备(device)* 一段时间的数据。
+* 反序列化 IndexOfTimeseriesIndex,得到 d1.s1 的 TimeseriesIndex 的位置
+* 反序列化得到 d1.s1 的 TimeseriesIndex
+* 根据 d1.s1 的 TimeseriesIndex,反序列化其所有 ChunkIndex
+* 根据 d1.s1 的每一个 ChunkIndex,读取其 Chunk 数据
 
-##### ChunkGroup
+#### 1.2.1 文件签名和版本号
+
+TsFile文件头由 6 个字节的 "Magic String" (`TsFile`) 和 6 个字节的版本号 (`000002`)组成。
+
+#### 1.2.2 数据区
 
-`ChunkGroup` 由若干个 `Chunk`, 一个字节的分隔符 `0x00` 和 一个`ChunkFooter`组成。
+##### ChunkGroup 数据块组
 
-##### Chunk
+`ChunkGroup` 存储了一个实体(Entity) 一段时间的数据。由若干个 `Chunk`, 一个字节的分隔符 `0x00` 和 一个`ChunkFooter`组成。
 
-一个 `Chunk` 存储了一个 *测点(measurement)* 一段时间的数据,Chunk 内数据是按时间递增序存储的。`Chunk` 是由一个字节的分隔符 `0x01`, 一个 `ChunkHeader` 和若干个 `Page` 构成。
+##### Chunk 数据块
 
-##### ChunkHeader
+一个 `Chunk` 存储了一个物理量(Measurement) 一段时间的数据,Chunk 内数据是按时间递增序存储的。`Chunk` 是由一个字节的分隔符 `0x01`, 一个 `ChunkHeader` 和若干个 `Page` 构成。
+
+##### ChunkHeader 数据块头
 
 |             成员             |  类型  | 解释 |
 | :--------------------------: | :----: | :----: |
@@ -119,11 +133,11 @@ TsFile文件的内容可以划分为两个部分: 数据(Chunk)和元数据
 |    encodingType    | TSEncoding  | 编码类型 |
 |  numOfPages  |  int   | 包含的page数量 |
 
-##### Page
+##### Page 数据页
 
-一个 `Page` 页存储了 `Chunk` 的一些数据。 它包含一个 `PageHeader` 和实际的数据(time-value 编码的键值对)。
+一个 `Page` 页存储了一段时间序列,是数据块被反序列化的最小单元。 它包含一个 `PageHeader` 和实际的数据(time-value 编码的键值对)。
 
-PageHeader 结构
+PageHeader 结构:
 
 |                 成员                 |       类型       | 解释 |
 | :----------------------------------: | :--------------: | :----: |
@@ -133,31 +147,31 @@ PageHeader 结构
 
 这里是`statistics`的详细信息:
 
- |             成员               | 描述 | DoubleStatistics | FloatStatistics | IntegerStatistics | LongStatistics | BinaryStatistics | BooleanStatistics |
- | :----------------------------------: | :--------------: | :----: | :----: | :----: | :----: | :----: | :----: |
- | count  | 数据点个数 | long | long | long | long | long | long | 
- | startTime | 开始时间 | long | long | long | long | long | long | 
- | endTime | 结束时间 | long | long | long | long | long | long | 
- | minValue | 最小值 | double | float | int | long | - | - |
- | maxValue | 最大值 | double | float | int | long | - | - |
- | firstValue | 第一个值 | double | float | int | long | Binary | boolean|
- | lastValue | 最后一个值 | double | float | int | long | Binary | boolean|
- | sumValue | 和 | double | double | double | double | - | - |
- | extreme | 极值 | double | float | int | long | - | - |
- 
-##### ChunkGroupFooter
+|             成员               | 描述 | DoubleStatistics | FloatStatistics | IntegerStatistics | LongStatistics | BinaryStatistics | BooleanStatistics |
+| :----------------------------------: | :--------------: | :----: | :----: | :----: | :----: | :----: | :----: |
+| count  | 数据点个数 | long | long | long | long | long | long |
+| startTime | 开始时间 | long | long | long | long | long | long |
+| endTime | 结束时间 | long | long | long | long | long | long |
+| minValue | 最小值 | double | float | int | long | - | - |
+| maxValue | 最大值 | double | float | int | long | - | - |
+| firstValue | 第一个值 | double | float | int | long | Binary | boolean|
+| lastValue | 最后一个值 | double | float | int | long | Binary | boolean|
+| sumValue | 和 | double | double | double | double | - | - |
+| extreme | 极值 | double | float | int | long | - | - |
+
+##### ChunkGroupFooter 数据块组结尾
 
 |                成员                |  类型  | 解释 |
 | :--------------------------------: | :----: | :----: |
-|         deviceID          | String | 设备名称 |
+| entityID  | String | 实体名称 |
 |      dataSize      |  long  | ChunkGroup 大小 |
 | numberOfChunks |  int   | 包含的 chunks 的数量 |
 
-#### 1.2.3  元数据
+#### 1.2.3  索引区
 
-##### 1.2.3.1 ChunkMetadata
+##### 1.2.3.1 ChunkIndex 数据块索引
 
-第一部分的元数据是 `ChunkMetadata` 
+第一部分的索引是 `ChunkIndex` :
 
 |                        成员                        |   类型   | 解释 |
 | :------------------------------------------------: | :------: | :----: |
@@ -166,79 +180,79 @@ PageHeader 结构
 |                tsDataType                |  TSDataType   | 数据类型 |
 |   statistics    |       Statistics        | 统计量 |
 
-##### 1.2.3.2 TimeseriesMetadata
+##### 1.2.3.2 TimeseriesIndex 时间序列索引
 
-第二部分的元数据是 `TimeseriesMetadata`。
+第二部分的索引是 `TimeseriesIndex`:
 
 |                        成员                        |   类型   | 解释 |
 | :------------------------------------------------: | :------: | :------: |
-|             measurementUid            |  String  | 传感器名称 |
+|             measurementUid            |  String  | 物理量名称 |
 |               tsDataType                |  TSDataType   |  数据类型 |
-| startOffsetOfChunkMetadataList |  long  | 文件中 ChunkMetadata 列表开始的偏移量 |
-|  chunkMetaDataListDataSize  |  int  | ChunkMetadata 列表的大小 |
+| startOffsetOfChunkIndexList |  long  | 文件中 ChunkIndex 列表开始的偏移量 |
+|  ChunkIndexListDataSize  |  int  | ChunkIndex 列表的大小 |
 |   statistics    |       Statistics        | 统计量 |
 
-##### 1.2.3.3 TsFileMetaData
+##### 1.2.3.3 IndexOfTimeseriesIndex 时间序列索引的索引(二级索引)
 
-第三部分的元数据是 `TsFileMetaData`。
+第三部分的索引是 `IndexOfTimeseriesIndex`:
 
 |                        成员                        |   类型   | 解释 |
 | :-------------------------------------------------: | :---------------------: | :---:|
-|       MetadataIndex              |   MetadataIndexNode      |元数据索引节点 |
-|           totalChunkNum            |                int                 | 包含的 Chunk 总数 |
-|          invalidChunkNum           |                int                 | 失效的 Chunk 总数 |
-|                versionInfo         |             List<Pair<Long, Long>>       | 版本信息映射 |
-|        metaOffset   |                long                 | MetaMarker.SEPARATOR偏移量 |
+| IndexTree     |   IndexNode      |索引节点 |
+| offsetOfIndexArea   |                long                 | 索引区的偏移量 |
 |                bloomFilter                 |                BloomFilter      | 布隆过滤器 |
 
-元数据索引节点 (MetadataIndexNode) 的成员和类型具体如下:
+索引节点 (IndexNode) 的成员和类型具体如下:
 
 |                  成员                  |  类型  | 解释 |
 | :------------------------------------: | :----: | :---: |
-|      children    | List<MetadataIndexEntry> | 节点元数据索引项列表 |
-|       endOffset      | long |    此元数据索引节点的结束偏移量 |
-|   nodeType    | MetadataIndexNodeType | 节点类型 |
+|      children    | List<IndexEntry> | 节点索引项列表 |
+|       endOffset      | long |    此索引节点的结束偏移量 |
+|   nodeType    | IndexNodeType | 节点类型 |
 
-元数据索引项 (MetadataIndexEntry) 的成员和类型具体如下:
+索引项 (IndexEntry) 的成员和类型具体如下:
 
 |                  成员                  |  类型  | 解释 |
 | :------------------------------------: | :----: | :---: |
-|  name    | String | 对应设备或传感器的名字 |
+|  name    | String | 对应实体或物理量的名字 |
 |     offset     | long   | 偏移量 |
 
-所有的元数据索引节点构成一棵**元数据索引树**,这棵树最多由两个层级组成:设备索引层级和传感器索引层级,在不同的情况下会有不同的组成方式。元数据索引节点类型有四种,分别是`INTERNAL_DEVICE`、`LEAF_DEVICE`、`INTERNAL_MEASUREMENT`、`LEAF_MEASUREMENT`,分别对应设备索引层级的中间节点和叶子节点,和传感器索引层级的中间节点和叶子节点。
-只有传感器索引层级的叶子节点(`LEAF_MEASUREMENT`) 指向 `TimeseriesMetadata`。
+所有的索引节点构成一棵类B+树结构的**索引树(二级索引)**,这棵树由两部分组成:实体索引部分和物理量索引部分。索引节点类型有四种,分别是`INTERNAL_ENTITY`、`LEAF_ENTITY`、`INTERNAL_MEASUREMENT`、`LEAF_MEASUREMENT`,分别对应实体索引部分的中间节点和叶子节点,和物理量索引部分的中间节点和叶子节点。 只有物理量索引部分的叶子节点(`LEAF_MEASUREMENT`) 指向 `TimeseriesIndex`。
+
+下面,我们使用四个例子来加以详细说明。
+
+索引树节点的度(即每个节点的最大子节点个数)可以由用户进行配置,配置项为`max_degree_of_index_node`,其默认值为256。在以下例子中,我们假定 `max_degree_of_index_node = 10`。
 
-为了更清楚的说明元数据索引树的结构,这里我们使用四个例子来加以详细说明。
+* 例1:5个实体,每个实体有5个物理量
 
-元数据索引树的最大度(即每个节点的最大子节点个数)是可以由用户进行配置的,配置项为`max_degree_of_index_node`,其默认值为256。在以下例子中,为了简化,我们假定 `max_degree_of_index_node = 10`。
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/122677230-134e2780-d214-11eb-9603-ac7b95bc0668.png">
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/81935219-de3fd080-9622-11ea-9aa1-a59bef1c0001.png">
+在5个实体,每个实体有5个物理量的情况下,由于实体数和物理量数均不超过 `max_degree_of_index_node`,因此索引树只有默认的物理量部分。在这部分中,每个 IndexNode 最多由10个 IndexEntry 组成。根节点的 IndexNode 是 `INTERNAL_MEASUREMENT` 类型,其中的5个 IndexEntry 指向对应的实体的 IndexNode,这些节点直接指向 `TimeseriesIndex`,是 `LEAF_MEASUREMENT`。
 
-在5个设备,每个设备有5个传感器的情况下,由于设备数和传感器树均不超过 `max_degree_of_index_node`,因此元数据索引树只有默认的传感器层级。在这个层级里,每个 MetadataIndexNode 最多由10个 MetadataIndexEntry 组成。根节点的 MetadataIndexNode 是 `INTERNAL_MEASUREMENT` 类型,其中的5个 MetadataIndexEntry 指向对应的设备的 MetadataIndexNode,这些节点直接指向 `TimeseriesMetadata`,是 `LEAF_MEASUREMENT`。
+* 例2:1个实体,150个物理量
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/81935210-d97b1c80-9622-11ea-8a69-2c2c5f05a876.png">
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/122677233-15b08180-d214-11eb-8d09-c741cca59262.png">
 
-在1个设备,设备中有150个传感器的情况下,传感器个数超过了 `max_degree_of_index_node`,元数据索引树有默认的传感器层级。在这个层级里,每个 MetadataIndexNode 最多由10个 MetadataIndexEntry 组成。直接指向 `TimeseriesMetadata`的节点类型均为 `LEAF_MEASUREMENT`;而后续产生的中间节点和根节点不是传感器索引层级的叶子节点,这些节点是 `INTERNAL_MEASUREMENT`。
+在1个实体,实体中有150个物理量的情况下,物理量个数超过了 `max_degree_of_index_node`,索引树有默认的物理量层级。在这个层级里,每个 IndexNode 最多由10个 IndexEntry 组成。直接指向 `TimeseriesIndex`的节点类型均为 `LEAF_MEASUREMENT`;而后续产生的中间节点和根节点不是物理量索引层级的叶子节点,这些节点是 `INTERNAL_MEASUREMENT`。
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/95592841-c0fd1a00-0a7b-11eb-9b46-dfe8b2f73bfb.png">
+* 例3:150个实体,每个实体有1个物理量
 
-在150个设备,每个设备中有1个传感器的情况下,设备个数超过了 `max_degree_of_index_node`,形成元数据索引树的传感器层级和设备索引层级。在这两个层级里,每个 MetadataIndexNode 最多由10个 MetadataIndexEntry 组成。直接指向 `TimeseriesMetadata` 的节点类型为 `LEAF_MEASUREMENT`,传感器索引层级的根节点同时作为设备索引层级的叶子节点,其节点类型为 `LEAF_DEVICE`;而后续产生的中间节点和根节点不是设备索引层级的叶子节点,因此节点类型为 `INTERNAL_DEVICE`。
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/122771008-9a64d380-d2d8-11eb-9044-5ac794dd38f7.png">
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/81935138-b6e90380-9622-11ea-94f9-c97bd2b5d050.png">
+在150个实体,每个实体中有1个物理量的情况下,实体个数超过了 `max_degree_of_index_node`,形成索引树的物理量层级和实体索引层级。在这两个层级里,每个 IndexNode 最多由10个 IndexEntry 组成。直接指向 `TimeseriesIndex` 的节点类型为 `LEAF_MEASUREMENT`,物理量索引层级的根节点同时作为实体索引层级的叶子节点,其节点类型为 `LEAF_ENTITY`;而后续产生的中间节点和根节点不是实体索引层级的叶子节点,因此节点类型为 `INTERNAL_ENTITY`。
 
-在150个设备,每个设备中有150个传感器的情况下,传感器和设备个数均超过了 `max_degree_of_index_node`,形成元数据索引树的传感器层级和设备索引层级。在这两个层级里,每个 MetadataIndexNode 均最多由10个 MetadataIndexEntry 组成。如前所述,从根节点到设备索引层级的叶子节点,类型分别为`INTERNAL_DEVICE` 和 `LEAF_DEVICE`,而每个设备索引层级的叶子节点都是传感器索引层级的根节点,从这里到传感器索引层级的叶子节点,类型分别为`INTERNAL_MEASUREMENT` 和 `LEAF_MEASUREMENT`。
+* 例4:150个实体,每个实体有150个物理量
 
-元数据索引采用树形结构进行设计的目的是在设备数或者传感器数量过大时,可以不用一次读取所有的 `TimeseriesMetadata`,只需要根据所读取的传感器定位对应的节点,从而减少 I/O,加快查询速度。有关 TsFile 的读流程将在本章最后一节加以详细说明。
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/122677241-1a753580-d214-11eb-817f-17bcf797251f.png">
 
-##### 1.2.3.4 TsFileMetadataSize
+在150个实体,每个实体中有150个物理量的情况下,物理量和实体个数均超过了 `max_degree_of_index_node`,形成索引树的物理量层级和实体索引层级。在这两个层级里,每个 IndexNode 均最多由10个 IndexEntry 组成。如前所述,从根节点到实体索引层级的叶子节点,类型分别为`INTERNAL_ENTITY` 和 `LEAF_ENTITY`,而每个实体索引层级的叶子节点都是物理量索引层级的根节点,从这里到物理量索引层级的叶子节点,类型分别为`INTERNAL_MEASUREMENT` 和 `LEAF_MEASUREMENT`。
 
-在TsFileMetaData之后,有一个int值用来表示TsFileMetaData的大小。
+索引采用树形结构进行设计的目的是在实体数或者物理量数量过大时,可以不用一次读取所有的 `TimeseriesIndex`,只需要根据所读取的物理量定位对应的节点,从而减少 I/O,加快查询速度。有关 TsFile 的读流程将在本章最后一节加以详细说明。
 
 
 #### 1.2.4 Magic String
 
-TsFile 是以6个字节的magic string (`TsFile`) 作为结束.
+TsFile 是以6个字节的magic string (`TsFile`) 作为结束。
 
 
 恭喜您, 至此您已经完成了 TsFile 的探秘之旅,祝您玩儿的开心!
diff --git a/docs/zh/UserGuide/Data-Concept/Data-Model-and-Terminology.md b/docs/zh/UserGuide/Data-Concept/Data-Model-and-Terminology.md
index 469442f..5690847 100644
--- a/docs/zh/UserGuide/Data-Concept/Data-Model-and-Terminology.md
+++ b/docs/zh/UserGuide/Data-Concept/Data-Model-and-Terminology.md
@@ -23,44 +23,76 @@
 
 ## 数据模型
 
-本节,我们以电力场景为例,说明如何在IoTDB中创建一个正确的数据模型。
+我们以风电场物联网场景为例,说明如何在 IoTDB 中创建一个正确的数据模型。
 
-根据属性层级,属性涵盖范围以及数据之间的从属关系,我们可将其数据模型表示为如下图所示的属性层级组织结构,即电力集团层-电厂层-设备层-传感器层。其中ROOT为根节点,传感器层的每一个节点为叶子节点。IoTDB的语法规定,ROOT节点到叶子节点的路径以“.”连接,以此完整路径命名IoTDB中的一个时间序列。例如,下图最左侧路径对应的时间序列名称为`ROOT.ln.wf01.wt01.status`。
+根据企业组织结构和设备实体层次结构,我们将其物联网数据模型表示为如下图所示的属性层级组织结构,即电力集团层-风电场层-实体层-物理量层。其中 ROOT 为根节点,物理量层的每一个节点为叶子节点。IoTDB 采用树形结构定义数据模式,以从ROOT 节点到叶子节点的路径来命名一个时间序列,层次间以“.”连接。例如,下图最左侧路径对应的时间序列名称为`ROOT.ln.wf01.wt01.status`。
 
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/13203019/51577327-7aa50780-1ef4-11e9-9d75-cadabb62444e.jpg">
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/123542457-5f511d00-d77c-11eb-8006-562d83069baa.png">
 
-**属性层级组织结构**
+IoTDB模型结构涉及如下基本概念:
 
-得到时间序列的名称之后,我们需要根据数据的实际场景和规模设置存储组。由于在本文所述场景中,每次到达的数据通常以集团为单位(即数据可能为跨电场、跨设备的),为了写入数据时避免频繁切换IO降低系统速度,且满足用户以集团为单位进行物理隔离数据的要求,我们将存储组设置在集团层。
+* 物理量(Measurement,也称工况、字段 field)
 
-根据模型结构,IoTDB中涉及如下基本概念:
+**一元或多元物理量**,是在实际场景中检测装置所记录的测量信息,且可以按一定规律变换成为电信号或其他所需形式的信息输出并发送给 IoTDB。在 IoTDB 当中,存储的所有数据及路径,都是以物理量为单位进行组织。
 
-* 设备
+* 物理分量(SubMeasurement、分量)
 
-设备指的是在实际场景中拥有传感器的装置。在IoTDB当中,所有的传感器都应有其对应的归属的设备。
+在多元物理量中,包括多个分量。如 GPS 是一个多元物理量,包含3个分量:经度、纬度、海拔。多元物理量通常被同时采集,共享时间列。
 
-* 传感器
+一元物理量则将分量名和物理量名字重合。如温度是一个一元物理量。
 
-传感器是指在实际场景中的一种检测装置,它能感受到被测量的信息,并能将感受到的信息按一定规律变换成为电信号或其他所需形式的信息输出并发送给IoTDB。在IoTDB当中,存储的所有的数据及路径,都是以传感器为单位进行组织。
+* 实体(Entity,也称设备,device)
 
-* 存储组
+**一个物理实体**,是在实际场景中拥有物理量的设备或装置。在IoTDB当中,所有的物理量都有其对应的归属实体。
 
-用户可以将任意前缀路径设置成存储组。如有4条时间序列`root.vehicle.d1.s1`, `root.vehicle.d1.s2`, `root.vehicle.d2.s1`, `root.vehicle.d2.s2`,路径`root.vehicle`下的两个设备d1,d2可能属于同一个业主,或者同一个厂商,因此关系紧密。这时候就可以将前缀路径`root.vehicle`指定为一个存储组,这将使得IoTDB将其下的所有设备的数据存储在同一个文件夹下。未来`root.vehicle`下增加了新的设备,也将属于该存储组。
+* 存储组(Storage group)
 
-> 注意:不允许将一个完整路径(如上例的`root.vehicle.d1.s1`)设置成存储组。
+**一组物理实体**,用户可以将任意前缀路径设置成存储组。如有4条时间序列`root.ln.wf01.wt01.status`, `root.ln.wf01.wt01.temperature`, `root.ln.wf02.wt02.hardware`, `root.ln.wf02.wt02.status`,路径`root.ln`下的两个实体 `wt01`, `wt02`可能属于同一个业主,或者同一个制造商,这时候就可以将前缀路径`root.ln`指定为一个存储组。未来`root.ln`下增加了新的实体,也将属于该存储组。
 
-设置合理数量的存储组可以带来性能的提升:既不会因为产生过多的存储文件(夹)导致频繁切换IO降低系统速度(并且会占用大量内存且出现频繁的内存-文件切换),也不会因为过少的存储文件夹(降低了并发度从而)导致写入命令阻塞。
+一个存储组中的所有实体的数据会存储在同一个文件夹下,不同存储组的实体数据会存储在磁盘的不同文件夹下,从而实现物理隔离。
 
-用户应根据自己的数据规模和使用场景,平衡存储文件的存储组设置,以达到更好的系统性能。(未来会有官方提供的存储组规模与性能测试报告)
+> 注意1:不允许将一个完整路径(如上例的`root.ln.wf01.wt01.status`)设置成存储组。
+>
+> 注意2:一个时间序列其前缀必须属于某个存储组。在创建时间序列之前,用户必须设定该序列属于哪个存储组(Storage Group)。只有设置了存储组的时间序列才可以被持久化在磁盘上。
 
-> 注意:一个时间序列其前缀必须属于某个存储组。在创建时间序列之前,用户必须设定该序列属于哪个存储组(Storage Group)。只有设置了存储组的时间序列才可以被持久化在磁盘上。
+一个前缀路径一旦被设定成存储组后就不可以再更改这个存储组的设定。
 
-一个前缀路径一旦被设定成存储组后就不可以再更改这个存储组的设置。
-
-一个存储组设定后,其对应的前缀路径的所有父层级与子层级也不允许再设置存储组(如,`root.ln`设置存储组后,root层级与`root.ln.wf01`不允许被设置为存储组)。
+一个存储组设定后,其对应的前缀路径的祖先层级与孩子及后裔层级也不允许再设置存储组(如,`root.ln`设置存储组后,root 层级与`root.ln.wf01`不允许被设置为存储组)。
 
 存储组节点名只支持中英文字符、数字、下划线和中划线的组合。例如`root.存储组_1-组1` 。
 
+* 数据点(Data point)
+
+**一个“时间-值”对**。
+
+* 时间序列(一个实体的某个物理量对应一个时间序列,Timeseries,也称测点meter、时间线timeline,实时数据库中常被称作标签tag、参数parameter)
+
+**一个物理实体的某个物理量在时间轴上的记录**,是数据点的序列。
+
+例如,ln电力集团、wf01风电场的实体 wt01有名为 status的物理量,则它的时间序列可以表示为:`root.ln.wf01.wt01.status`。 
+
+* 一元时间序列(single-variable timeseries 或timeseries,v0.1 起支持)
+
+一个实体的一个一元物理量对应一个一元时间序列。实体+物理量=时间序列
+
+* 多元时间序列(Multi-variable timeseries 或 Aligned timeseries,v0.13 起支持)
+
+一个实体的一个多元物理量对应一个多元时间序列。这些时间序列称为**多元时间序列**,也叫**对齐时间序列**。
+
+多元时间序列需要被同时创建、同时插入值,删除时也必须同时删除。不过在查询的时候,可以对于每一个分量单独查询。
+
+通过使用对齐的时间序列,在插入数据时,一组对齐序列的时间戳列在内存和磁盘中仅需存储一次,而不是每个时间序列存储一次:
+
+<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/123542458-62e4a400-d77c-11eb-8c45-ca516f1b7eba.png">
+
+在后续数据定义语言、数据操作语言和 Java 原生接口章节,将对涉及到对齐时间序列的各种操作进行逐一介绍。
+
+* 物理量模板(Measurement template,v0.13 起支持)
+
+实际应用中有许多实体所采集的物理量相同,即具有相同的工况名称和类型,可以声明一个**物理量模板**来定义可采集的物理量集合。将物理量模版挂在树形数据模式的任意节点上,表示该节点下的所有实体具有相同的物理量集合。
+
+目前每一个路径节点仅允许挂载一个物理量模板,实体将使用其自身或最近祖先的物理量模板作为有效模板。
+
 
 * 路径
 
@@ -89,35 +121,6 @@ LayerName: Identifier | STAR
 
 > 注意: storage group中的LayerName只支持数字,字母,汉字,下划线和中划线。另外,如果在Windows系统上部署,存储组层级名称是大小写不敏感的。例如同时创建`root.ln` 和 `root.LN` 是不被允许的。
 
-* 时间序列
-
-时间序列是IoTDB中的核心概念。时间序列可以被看作产生时序数据的传感器的所在完整路径,在IoTDB中所有的时间序列必须以root开始、以传感器作为结尾。一个时间序列也可称为一个全路径。
-
-例如,vehicle种类的device1有名为sensor1的传感器,则它的时间序列可以表示为:`root.vehicle.device1.sensor1`。 
-
-> 注意:当前IoTDB支持的时间序列必须大于等于四层(之后会更改为两层)。
-
-* 对齐时间序列(v0.13 起支持)
-
-在同一个时间戳有多个传感器同时采样,会形成具有相同时间戳的多条时间序列,在 IoTDB 中,这些时间序列成为**对齐时间序列**(在学术上也称为**多元时间序列**,即包含多个一元时间序列作为分量, 各个一元时间序列的采样时间点相同)。
-
-对齐时间序列可以被同时创建,同时插入值,删除时也必须同时删除。不过在查询的时候,可以对于每一个传感器单独查询。
-
-通过使用对齐的时间序列,在插入数据时,一组对齐序列的时间戳列在内存和磁盘中仅需存储一次,而不是时间序列的条数次:
-
-<img style="width:100%; max-width:800px; max-height:600px; margin-left:auto; margin-right:auto; display:block;" src="https://user-images.githubusercontent.com/19167280/114125919-f4850800-9929-11eb-8211-81d4c04af1ec.png">
-
-在后续数据定义语言、数据操作语言和 Java 原生接口章节,将对涉及到对齐时间序列的各种操作进行逐一介绍。
-
-
-* 设备模板(v0.13 起支持)
-
-实际场景中有许多设备型号相同,即具有相同的工况名称和类型,为了节省系统资源,可以声明一个**设备模板**表征同一类型的设备,设备模版可以挂在到路径的任意节点上。
-
-目前每一个路径节点仅允许挂载一个设备模板,具体的设备将使用其自身或最近祖先的设备模板作为有效模板。
-
-在后续数据定义语言、数据操作语言和 Java 原生接口章节,将对涉及到设备模板的各种操作进行逐一介绍。
-
 * 前缀路径
 
 前缀路径是指一个时间序列的前缀所在的路径,一个前缀路径包含以该路径为前缀的所有时间序列。例如当前我们有`root.vehicle.device1.sensor1`, `root.vehicle.device1.sensor2`, `root.vehicle.device2.sensor1`三个传感器,则`root.vehicle.device1`前缀路径包含`root.vehicle.device1.sensor1`、`root.vehicle.device1.sensor2`两个时间序列,而不包含`root.vehicle.device2.sensor1`。