Posted to reviews@iotdb.apache.org by GitBox <gi...@apache.org> on 2019/07/17 08:19:54 UTC

[GitHub] [incubator-iotdb] qiaojialin commented on a change in pull request #247: TsFile Docs

URL: https://github.com/apache/incubator-iotdb/pull/247#discussion_r304272184
 
 

 ##########
 File path: docs/Documentation/UserGuideV0.7.0/7-TsFile/2-Usage.md
 ##########
 @@ -0,0 +1,765 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+Now, you’re ready to start doing some awesome things with TsFile. This section demonstrates the detailed usage of TsFile.
+
+### Time-series Data
+A time series is considered a set of quadruples. A quadruple is defined as (deltaObject, measurement, time, value).
+
+* **deltaObject**: In many situations, a device which contains many sensors can be considered as a deltaObject.
+* **measurement**: A sensor can be considered as a measurement.
+
+
+Table 1 illustrates a set of time-series data. The set shown in the following table contains one deltaObject named "device\_1" with three measurements named "sensor\_1", "sensor\_2" and "sensor\_3".
+
+<center>
+<table style="text-align:center">
+	<tr><th colspan="6">device_1</th></tr>
+	<tr><th colspan="2">sensor_1</th><th colspan="2">sensor_2</th><th colspan="2">sensor_3</th></tr>
+	<tr><th>time</th><th>value</th><th>time</th><th>value</th><th>time</th><th>value</th></tr>
+	<tr><td>1</td><td>1.2</td><td>1</td><td>20</td><td>2</td><td>50</td></tr>
+	<tr><td>3</td><td>1.4</td><td>2</td><td>20</td><td>4</td><td>51</td></tr>
+	<tr><td>5</td><td>1.1</td><td>3</td><td>21</td><td>6</td><td>52</td></tr>
+	<tr><td>7</td><td>1.8</td><td>4</td><td>20</td><td>8</td><td>53</td></tr>
+</table>
+<span>A set of time-series data</span>
+</center>
+
+**One Line of Data**: In many industrial applications, a device normally contains more than one sensor, and these sensors may have values at the same timestamp. Such a group of values is called one line of data.
+
+Formally, one line of data consists of a `deltaObject_id`, a timestamp giving the number of milliseconds since January 1, 1970, 00:00:00, and several data pairs, each composed of a `measurement_id` and its corresponding `value`. All data pairs in one line belong to this `deltaObject_id` and share the same timestamp. If a measurement has no `value` at that timestamp, use a space instead (TsFile does not actually store null values). The format is as follows:
+
+```
+deltaObject_id, timestamp, <measurement_id, value>...
+```
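
As a rough illustration of how individual (measurement, time, value) points of one deltaObject collapse into lines of this format, here is a minimal Java sketch. The class `LineGrouping` and its method `toLines` are hypothetical names introduced for this note and are not part of the TsFile API; the sample points are taken from the first rows of Table 1.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of the TsFile API.
class LineGrouping {

  /** Groups {measurement, time, value} points of one deltaObject into lines ordered by timestamp. */
  static List<String> toLines(String deltaObjectId, Object[][] points) {
    // timestamp -> (measurement -> value); the TreeMap keeps timestamps sorted
    Map<Long, Map<String, Object>> byTime = new TreeMap<>();
    for (Object[] p : points) {
      String measurement = (String) p[0];
      long time = (Long) p[1];
      byTime.computeIfAbsent(time, t -> new LinkedHashMap<>()).put(measurement, p[2]);
    }
    List<String> lines = new ArrayList<>();
    for (Map.Entry<Long, Map<String, Object>> e : byTime.entrySet()) {
      StringBuilder sb = new StringBuilder(deltaObjectId).append(", ").append(e.getKey());
      // Measurements without a value at this timestamp are simply absent from the line.
      for (Map.Entry<String, Object> pair : e.getValue().entrySet()) {
        sb.append(", ").append(pair.getKey()).append(", ").append(pair.getValue());
      }
      lines.add(sb.toString());
    }
    return lines;
  }

  public static void main(String[] args) {
    Object[][] points = {
      {"sensor_1", 1L, 1.2}, {"sensor_1", 3L, 1.4},
      {"sensor_2", 1L, 20}, {"sensor_2", 2L, 20}, {"sensor_2", 3L, 21},
      {"sensor_3", 2L, 50}
    };
    for (String line : toLines("device_1", points)) {
      System.out.println(line);
    }
  }
}
```

With these points, timestamp 1 yields one line carrying `sensor_1` and `sensor_2`, while `sensor_3` is left out of that line because it has no value at that time.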
+
+An example follows. In this example, the data types of the three measurements are `INT32`, `FLOAT` and `ENUMS` respectively.
+
+```
+device_1, 1490860659000, m1, 10, m2, 12.12, m3, MAN
+```
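
To make the line format concrete, the following sketch parses such a line into its parts. `OneLineParser` is a hypothetical class written for this note; TsFile itself is not claimed to expose a parser like this. Values are kept as strings here, on the assumption that their types would come from the schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for illustration only; not part of the TsFile API.
class OneLineParser {
  final String deltaObjectId;
  final long timestamp;
  // measurement_id -> raw value text, in the order the pairs appear
  final Map<String, String> values = new LinkedHashMap<>();

  OneLineParser(String line) {
    // limit -1 keeps trailing empty fields, so a missing last value is still seen
    String[] parts = line.split(",", -1);
    if (parts.length < 2 || parts.length % 2 != 0) {
      throw new IllegalArgumentException(
          "expected: deltaObject_id, timestamp, <measurement_id, value>...");
    }
    deltaObjectId = parts[0].trim();
    timestamp = Long.parseLong(parts[1].trim());
    for (int i = 2; i < parts.length; i += 2) {
      String value = parts[i + 1].trim();
      // A blank value means the measurement has no value at this timestamp; skip it.
      if (!value.isEmpty()) {
        values.put(parts[i].trim(), value);
      }
    }
  }

  public static void main(String[] args) {
    OneLineParser p =
        new OneLineParser("device_1, 1490860659000, m1, 10, m2, 12.12, m3, MAN");
    System.out.println(p.deltaObjectId + " @ " + p.timestamp + " -> " + p.values);
  }
}
```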
+
+
+### Writing TsFile
+
+#### Generate a TsFile
+A TsFile can be generated with the following steps; the complete code is given in the section "Example for writing TsFile".
+
+* First, construct a `TsFileWriter` instance.
+	```
+	public TsFileWriter(File file) throws WriteProcessException, IOException
+	```
+	
+	**Parameters:**
+	
+	* file : The TsFile to write
+
+* Second, add measurements
+
+	```
+	public void addMeasurement(MeasurementSchema measurementSchema) throws WriteProcessException
+	```
+	
+	**Parameters:**
+	
+	* measurementSchema : The measurement information including name, data type and encoding
+	
+	Or use a JSON object:
+	```
+	public void addMeasurementByJson(JSONObject measurement) throws WriteProcessException
+	```
+	**Parameters:**
+    	
+    * measurement : The JSON object including name, data type, encoding, and compression type. See the "Format of Schema JSON" section below.
+    
+        > **Notice:** Although one measurement name can be used in multiple deltaObjects, its properties cannot be changed, i.e. it is not allowed to add the same measurement name multiple times with a different type or encoding.
+    Here is a bad example:
+
+        ```
+        // The measurement "sensor_1" is float type
+        addMeasurement(new MeasurementSchema("sensor_1", TSDataType.FLOAT, TSEncoding.RLE));
+        // This call will throw a WriteProcessException
+        addMeasurement(new MeasurementSchema("sensor_1", TSDataType.INT32, TSEncoding.RLE));
+        ```
+* Third, write data continually.
+	
+	```
+	public void write(TSRecord record) throws IOException, WriteProcessException
+	```
+	
+	Use this constructor to create a new TSRecord (a timestamp and device pair).
+	
+	```
+	public TSRecord(long timestamp, String deviceId)
+	```
+	Then create a DataPoint (a measurement and value pair) and use the addTuple method to add it to the
+	TSRecord.
+	
+* Finally, call `close` to finish this writing process. 
+	
+	```
+	public void close() throws IOException
+	```
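
The rule from the notice in the second step (one measurement name, one fixed set of properties) can be pictured with a toy registry. This is only a model of the rule: `MeasurementRegistry` is a hypothetical class, it throws `IllegalStateException` rather than the `WriteProcessException` mentioned above, and accepting an identical re-registration is an assumption of this sketch, not documented TsFile behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the "properties cannot be changed" rule; not the TsFile implementation.
class MeasurementRegistry {
  // measurement name -> "dataType/encoding" as first registered
  private final Map<String, String> registered = new HashMap<>();

  /** Registers a measurement; rejects the same name with a different type or encoding. */
  void add(String measurementId, String dataType, String encoding) {
    String properties = dataType + "/" + encoding;
    String previous = registered.putIfAbsent(measurementId, properties);
    if (previous != null && !previous.equals(properties)) {
      throw new IllegalStateException(
          measurementId + " is already defined as " + previous + ", cannot redefine as " + properties);
    }
  }

  public static void main(String[] args) {
    MeasurementRegistry registry = new MeasurementRegistry();
    registry.add("sensor_1", "FLOAT", "RLE");
    try {
      // Same name, different data type: rejected, mirroring the bad example above.
      registry.add("sensor_1", "INT32", "RLE");
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```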
+
+#### Format of Schema JSON
+`SchemaJSON` is a schema array specifying a list of allowable time series. The schema describes each measurement's `measurement_id`, `data_type`, `encoding`, `compressor`, and
+properties specific to its data type.
+
+An example is shown as follows:
+
+``` json
+{
+    "schema": [
+        {
+            "measurement_id": "m1",
+            "data_type": "INT32",
+            "encoding": "RLE"
+        },
+        {
+            "measurement_id": "m2",
+            "data_type": "FLOAT",
+            "encoding": "TS_2DIFF",
+            "max_point_number": 2
+        },
+        {
+            "measurement_id": "m3",
+            "data_type": "ENUMS",
+            "encoding": "BITMAP",
+            "enum_values":["MAN","WOMAN"]
+        },
+        {
+            "measurement_id": "m4",
+            "data_type": "INT64",
+            "encoding": "RLE",            
+            "compressor": "SNAPPY"
+        }
+    ]
+}
+```
+`SchemaJSON` consists of a `JSONArray` of schema objects. Each schema object corresponds to one time series; its fields are described below:
+
+| key      | is required|     description | allowed values|
+| :-------- | --------:| :------:| :------:|
+| measurement_id    |**required**	|name of the time series |any combination of letters, numbers and other symbols like `_` `.`  |
+| data_type    		|**required**	|data type|`BOOLEAN`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `ENUM` and `TEXT`(namely `String`)|
+| encoding    		|**required**	| encoding approach for time domain. |`PLAIN`(for all data types), {`TS_2DIFF`, `RLE`}(for `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `ENUM`), `BITMAP`(`ENUM`)|
+| compressor    		|**required**	| the type of compression.| `SNAPPY` and `UNCOMPRESSED`|
+| enum_values 		|required if `data_type` is `ENUM`	| the fields of `ENUM`  	|  in format of `["MAN","WOMAN"]`|
+| max\_point\_number    		|optional	| the number of reserved decimal digits. It's useful if the data type is `FLOAT`, `DOUBLE` or `BigDecimal`| natural number, defaults to 2|
+|max\_string\_length	|optional	| maximal length of string. It's useful if the data type is `TEXT`.  | positive integer, defaults to 128|
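
The table can be read as a small set of validation rules. The sketch below applies them to a schema object held in a plain `Map`. `SchemaCheck`, `encodingAllowed` and `normalize` are hypothetical names, and the code follows the table literally (for example, it treats `compressor` as required and spells the type `ENUM`); it is not claimed to match how TsFile itself validates a schema.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical validator that follows the field table literally; not TsFile code.
class SchemaCheck {
  static final List<String> DATA_TYPES =
      Arrays.asList("BOOLEAN", "INT32", "INT64", "FLOAT", "DOUBLE", "ENUM", "TEXT");
  static final List<String> NUMERIC_OR_ENUM =
      Arrays.asList("INT32", "INT64", "FLOAT", "DOUBLE", "ENUM");

  /** Encoding rules from the table: PLAIN for all types, TS_2DIFF/RLE for numeric and ENUM, BITMAP for ENUM. */
  static boolean encodingAllowed(String dataType, String encoding) {
    switch (encoding) {
      case "PLAIN":
        return DATA_TYPES.contains(dataType);
      case "TS_2DIFF":
      case "RLE":
        return NUMERIC_OR_ENUM.contains(dataType);
      case "BITMAP":
        return "ENUM".equals(dataType);
      default:
        return false;
    }
  }

  /** Returns a copy with the documented defaults applied, or throws if a rule is violated. */
  static Map<String, Object> normalize(Map<String, Object> schema) {
    for (String key : new String[] {"measurement_id", "data_type", "encoding", "compressor"}) {
      if (!schema.containsKey(key)) {
        throw new IllegalArgumentException("missing required key: " + key);
      }
    }
    String type = (String) schema.get("data_type");
    String encoding = (String) schema.get("encoding");
    if (!encodingAllowed(type, encoding)) {
      throw new IllegalArgumentException(encoding + " is not allowed for " + type);
    }
    if ("ENUM".equals(type) && !schema.containsKey("enum_values")) {
      throw new IllegalArgumentException("enum_values is required for ENUM");
    }
    Map<String, Object> out = new HashMap<>(schema);
    if ("FLOAT".equals(type) || "DOUBLE".equals(type)) {
      out.putIfAbsent("max_point_number", 2); // documented default
    }
    if ("TEXT".equals(type)) {
      out.putIfAbsent("max_string_length", 128); // documented default
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Object> m2 = new HashMap<>();
    m2.put("measurement_id", "m2");
    m2.put("data_type", "FLOAT");
    m2.put("encoding", "TS_2DIFF");
    m2.put("compressor", "SNAPPY");
    System.out.println(normalize(m2));
  }
}
```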
+
+
+
+#### Example for writing TsFile
+
+Before running the examples, install TsFile into your local Maven repository.
+
+See reference: [Installation](./1-Installation.md)
+
+
+
+##### Writing TsFile using a JSON schema
+
+```java
+package org.apache.iotdb.tsfile;
+
+import java.io.File;
+import java.io.IOException;
+import com.alibaba.fastjson.JSONArray;
+import com.alibaba.fastjson.JSONObject;
+import org.apache.iotdb.tsfile.exception.write.WriteProcessException;
+import org.apache.iotdb.tsfile.file.metadata.enums.TSDataType;
+import org.apache.iotdb.tsfile.file.metadata.enums.TSEncoding;
+import org.apache.iotdb.tsfile.write.TsFileWriter;
+import org.apache.iotdb.tsfile.write.record.TSRecord;
+import org.apache.iotdb.tsfile.write.record.datapoint.DataPoint;
+import org.apache.iotdb.tsfile.write.record.datapoint.FloatDataPoint;
+import org.apache.iotdb.tsfile.write.record.datapoint.IntDataPoint;
+import org.apache.iotdb.tsfile.write.schema.MeasurementSchema;
+/**
+ * An example of writing data to TsFile
+ */
+public class TsFileWrite {
+  /**
+   * There are two ways to construct a TsFile instance; both generate an identical TsFile.
+   * This method uses the first interface:
+   * public void addMeasurementByJson(JSONObject measurement) throws WriteProcessException
+   * The corresponding JSON string is provided below.
+   * {
+   *     "schema": [
+   *         {
+   *             "measurement_id": "sensor_1",
+   *             "data_type": "FLOAT",
+   *             "encoding": "RLE",
+   *             "compressor": "UNCOMPRESSED"
+   *         },
+   *         {
+   *             "measurement_id": "sensor_2",
+   *             "data_type": "INT32",
+   *             "encoding": "TS_2DIFF",
+   *             "compressor": "UNCOMPRESSED"
+   *         },
+   *         {
+   *             "measurement_id": "sensor_3",
+   *             "data_type": "INT32",
+   *             "encoding": "TS_2DIFF",
+   *             "compressor": "UNCOMPRESSED"
+   *         }
+   *     ]
+   * }
+   */
+  public static void main(String[] args) {
+    try {
+        String path = "testWithJson.tsfile";
+        String jsonText = "{\n" +
+                "    \"schema\": [\n" +
+                "        {\n" +
+                "            \"measurement_id\": \"sensor_1\",\n" +
+                "            \"data_type\": \"FLOAT\",\n" +
+                "            \"encoding\": \"RLE\",\n" +
+                "            \"compressor\" : \"UNCOMPRESSED\"\n" +
+                "        },\n" +
+                "        {\n" +
+                "            \"measurement_id\": \"sensor_2\",\n" +
+                "            \"data_type\": \"INT32\",\n" +
+                "            \"encoding\": \"TS_2DIFF\",\n" +
+                "            \"compressor\" : \"UNCOMPRESSED\"\n" +
+                "\n" +
+                "        },\n" +
+                "        {\n" +
+                "            \"measurement_id\": \"sensor_3\",\n" +
+                "            \"data_type\": \"INT32\",\n" +
+                "            \"encoding\": \"TS_2DIFF\",        \n" +
+                "            \"compressor\" : \"UNCOMPRESSED\"\n" +
+                "\n" +
+                "  }\n" +
+                "    ]\n" +
+                "}";
+        File f = new File(path);
+        if (f.exists()) {
+          f.delete();
+        }
+        TsFileWriter tsFileWriter = new TsFileWriter(f);
+        JSONObject j = JSONObject.parseObject(jsonText);
+        JSONArray schemas = j.getJSONArray("schema");
+        // add measurements into file schema
+        for (int i = 0; i < schemas.size(); ++i) {
+          tsFileWriter.addMeasurementByJson(schemas.getJSONObject(i));
+        }
+        // construct TSRecord
+        TSRecord tsRecord = new TSRecord(1, "device_1");
+        DataPoint dPoint1 = new FloatDataPoint("sensor_1", 1.2f);
+        DataPoint dPoint2 = new IntDataPoint("sensor_2", 20);
+        DataPoint dPoint3;
+        tsRecord.addTuple(dPoint1);
+        tsRecord.addTuple(dPoint2);
+    
+        // write a TSRecord to TsFile
+        tsFileWriter.write(tsRecord);
+    
+        tsRecord = new TSRecord(2, "device_1");
+        dPoint2 = new IntDataPoint("sensor_2", 20);
+        dPoint3 = new IntDataPoint("sensor_3", 50);
+        tsRecord.addTuple(dPoint2);
+        tsRecord.addTuple(dPoint3);
+        tsFileWriter.write(tsRecord);
+    
+        tsRecord = new TSRecord(3, "device_1");
+        dPoint1 = new FloatDataPoint("sensor_1", 1.4f);
+        dPoint2 = new IntDataPoint("sensor_2", 21);
+        tsRecord.addTuple(dPoint1);
+        tsRecord.addTuple(dPoint2);
+        tsFileWriter.write(tsRecord);
+    
+        tsRecord = new TSRecord(4, "device_1");
+        dPoint1 = new FloatDataPoint("sensor_1", 1.2f);
+        dPoint2 = new IntDataPoint("sensor_2", 20);
+        dPoint3 = new IntDataPoint("sensor_3", 51);
+        tsRecord.addTuple(dPoint1);
+        tsRecord.addTuple(dPoint2);
+        tsRecord.addTuple(dPoint3);
+        tsFileWriter.write(tsRecord);
+    
+        tsRecord = new TSRecord(6, "device_1");
+        dPoint1 = new FloatDataPoint("sensor_1", 7.2f);
+        dPoint2 = new IntDataPoint("sensor_2", 10);
+        dPoint3 = new IntDataPoint("sensor_3", 11);
+        tsRecord.addTuple(dPoint1);
+        tsRecord.addTuple(dPoint2);
+        tsRecord.addTuple(dPoint3);
+        tsFileWriter.write(tsRecord);
+    
+        tsRecord = new TSRecord(7, "device_1");
+        dPoint1 = new FloatDataPoint("sensor_1", 6.2f);
+        dPoint2 = new IntDataPoint("sensor_2", 20);
+        dPoint3 = new IntDataPoint("sensor_3", 21);
+        tsRecord.addTuple(dPoint1);
+        tsRecord.addTuple(dPoint2);
+        tsRecord.addTuple(dPoint3);
+        tsFileWriter.write(tsRecord);
+    
+        tsRecord = new TSRecord(8, "device_1");
+        dPoint1 = new FloatDataPoint("sensor_1", 9.2f);
+        dPoint2 = new IntDataPoint("sensor_2", 30);
+        dPoint3 = new IntDataPoint("sensor_3", 31);
+        tsRecord.addTuple(dPoint1);
+        tsRecord.addTuple(dPoint2);
+        tsRecord.addTuple(dPoint3);
+        tsFileWriter.write(tsRecord);
+    
+        // close TsFile
+        tsFileWriter.close();        
+    } catch (Throwable e) {
+      e.printStackTrace();
+      System.out.println(e.getMessage());
+    }
+  }
+}
+
+```
+
+##### Writing TsFile directly without defining the schema by json
 
 Review comment:
   ```suggestion
   ##### Writing TsFile with defining the schema by API
   ```
