Posted to reviews@iotdb.apache.org by GitBox <gi...@apache.org> on 2019/07/18 02:23:23 UTC

[GitHub] [incubator-iotdb] jt2594838 commented on a change in pull request #247: TsFile Docs

jt2594838 commented on a change in pull request #247: TsFile Docs
URL: https://github.com/apache/incubator-iotdb/pull/247#discussion_r304709106
 
 

 ##########
 File path: docs/Documentation/UserGuideV0.7.0/7-TsFile/2-Usage.md
 ##########
 @@ -0,0 +1,457 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+Now, you’re ready to start doing some awesome things with TsFile. This section demonstrates the detailed usage of TsFile.
+
+### Time-series Data
+A time-series is considered a set of quadruples. A quadruple is defined as (deltaObject, measurement, time, value).
+
+* **deltaObject**: In many situations, a device which contains many sensors can be considered as a deltaObject.
+* **measurement**: A sensor can be considered as a measurement.
+
+
+Table 1 illustrates a set of time-series data. The set shown in the following table contains one deltaObject named "device\_1" with three measurements named "sensor\_1", "sensor\_2" and "sensor\_3".
+
+<center>
+<table style="text-align:center">
+	<tr><th colspan="6">device_1</th></tr>
+	<tr><th colspan="2">sensor_1</th><th colspan="2">sensor_2</th><th colspan="2">sensor_3</th></tr>
+	<tr><th>time</th><th>value</th><th>time</th><th>value</th><th>time</th><th>value</th></tr>
+	<tr><td>1</td><td>1.2</td><td>1</td><td>20</td><td>2</td><td>50</td></tr>
+	<tr><td>3</td><td>1.4</td><td>2</td><td>20</td><td>4</td><td>51</td></tr>
+	<tr><td>5</td><td>1.1</td><td>3</td><td>21</td><td>6</td><td>52</td></tr>
+	<tr><td>7</td><td>1.8</td><td>4</td><td>20</td><td>8</td><td>53</td></tr>
+</table>
+<span>A set of time-series data</span>
+</center>
+
+**One Line of Data**: In many industrial applications, a device normally contains more than one sensor, and these sensors may have values at the same timestamp. Such a group of values is called one line of data.
+
+Formally, one line of data consists of a `deltaObject_id`, a timestamp that indicates the number of milliseconds since January 1, 1970, 00:00:00, and several data pairs, each composed of a `measurement_id` and the corresponding `value`. All data pairs in one line belong to this `deltaObject_id` and have the same timestamp. If one of the measurements does not have a `value` at that timestamp, use a space instead (actually, TsFile does not store null values). The format is as follows:
+
+```
+deltaObject_id, timestamp, <measurement_id, value>...
+```
+
+An example is shown below. In this example, the data types of the three measurements are `INT32`, `FLOAT` and `ENUMS`, respectively.
+
+```
+device_1, 1490860659000, m1, 10, m2, 12.12, m3, MAN
+```
+
+
+### Writing TsFile
+
+#### Generate a TsFile
+A TsFile can be generated by the following steps; the complete code is given in the section "Example for writing TsFile".
+
+* First, use the constructor to create a TsFileWriter instance.
+	```
+	public TsFileWriter(File file) throws WriteProcessException, IOException
+	```
+	
+	**Parameters:**
+	
+	* file : The TsFile to write
+
+* Second, add measurements
+
+	```
+	public void addMeasurement(MeasurementSchema measurementSchema) throws WriteProcessException
+	```
+	
+	**Parameters:**
+	
+	* measurementSchema : The measurement information including name, data type and encoding
+	
+        > **Notice:** Although one measurement name can be used in multiple deltaObjects, its properties cannot be changed, i.e.,
+        > it is not allowed to add the same measurement name multiple times with different data types or encodings.
+        > Here is a bad example:
+
+        ```
+        // The measurement "sensor_1" is float type
+        addMeasurement(new MeasurementSchema("sensor_1", TSDataType.FLOAT, TSEncoding.RLE));
+        // This call will throw a WriteProcessException
+        addMeasurement(new MeasurementSchema("sensor_1", TSDataType.INT32, TSEncoding.RLE));
+        ```
+* Third, write data continually.
+	
+	```
+	public void write(TSRecord record) throws IOException, WriteProcessException
+	```
+	
+	Use the following constructor to create a new TSRecord (a timestamp and device pair).
+	
+	```
+	public TSRecord(long timestamp, String deviceId)
+	```
+	Then create a DataPoint (a measurement and value pair), and use the addTuple method to add the DataPoint to the correct
+	TSRecord.
+	
+* Finally, call `close` to finish this writing process. 
+	
+	```
+	public void close() throws IOException
+	```
+
+#### Example for writing TsFile
+
+You should first install TsFile to your local Maven repository.
+
+See reference: [Installation](./1-Installation.md)
+
+A more thorough example can be found at `/tsfile/example/src/main/java/org/apache/iotdb/tsfile/TsFileWrite.java`
+
+```java
+package org.apache.iotdb.tsfile;
+
+import java.io.File;
+import org.apache.iotdb.tsfile.file.metadata.enums.TSDataType;
+import org.apache.iotdb.tsfile.file.metadata.enums.TSEncoding;
+import org.apache.iotdb.tsfile.write.TsFileWriter;
+import org.apache.iotdb.tsfile.write.record.TSRecord;
+import org.apache.iotdb.tsfile.write.record.datapoint.DataPoint;
+import org.apache.iotdb.tsfile.write.record.datapoint.FloatDataPoint;
+import org.apache.iotdb.tsfile.write.record.datapoint.IntDataPoint;
+import org.apache.iotdb.tsfile.write.schema.MeasurementSchema;
+/**
+ * An example of writing data to TsFile
+ * It uses the interface:
+ * public void addMeasurement(MeasurementSchema MeasurementSchema) throws WriteProcessException
+ */
+public class TsFileWrite {
+
+  public static void main(String[] args) {
+    try {
+      String path = "test.tsfile";
+      File f = new File(path);
+      if (f.exists()) {
+        f.delete();
+      }
+      TsFileWriter tsFileWriter = new TsFileWriter(f);
+
+      // add measurements into file schema
+      tsFileWriter
+              .addMeasurement(new MeasurementSchema("sensor_1", TSDataType.FLOAT, TSEncoding.RLE));
+      tsFileWriter
+              .addMeasurement(new MeasurementSchema("sensor_2", TSDataType.INT32, TSEncoding.TS_2DIFF));
+      tsFileWriter
+              .addMeasurement(new MeasurementSchema("sensor_3", TSDataType.INT32, TSEncoding.TS_2DIFF));
+      // construct TSRecord
+      TSRecord tsRecord = new TSRecord(1, "device_1");
+      DataPoint dPoint1 = new FloatDataPoint("sensor_1", 1.2f);
+      DataPoint dPoint2 = new IntDataPoint("sensor_2", 20);
+
+      // For time 1 in device_1, the data will be 1.2, 20, null
+      tsRecord.addTuple(dPoint1);
+      tsRecord.addTuple(dPoint2);
+
+      // write a TSRecord to TsFile
+      tsFileWriter.write(tsRecord);
+      // close TsFile
+      tsFileWriter.close();
+    } catch (Throwable e) {
+      e.printStackTrace();
+      System.out.println(e.getMessage());
+    }
+  }
+}
+
+```
+
+### Interface for Reading TsFile
+
+#### Before the Start
+
+The set of time-series data from the section "Time-series Data" is used here as a concrete example. The set shown in the following table contains one deltaObject named "device\_1" with three measurements named "sensor\_1", "sensor\_2" and "sensor\_3". The measurements have been simplified for illustration: each contains only 4 time-value pairs.
+
+<center>
+<table style="text-align:center">
+	<tr><th colspan="6">device_1</th></tr>
+	<tr><th colspan="2">sensor_1</th><th colspan="2">sensor_2</th><th colspan="2">sensor_3</th></tr>
+	<tr><th>time</th><th>value</th><th>time</th><th>value</th><th>time</th><th>value</th></tr>
+	<tr><td>1</td><td>1.2</td><td>1</td><td>20</td><td>2</td><td>50</td></tr>
+	<tr><td>3</td><td>1.4</td><td>2</td><td>20</td><td>4</td><td>51</td></tr>
+	<tr><td>5</td><td>1.1</td><td>3</td><td>21</td><td>6</td><td>52</td></tr>
+	<tr><td>7</td><td>1.8</td><td>4</td><td>20</td><td>8</td><td>53</td></tr>
+</table>
+<span>A set of time-series data</span>
+</center>
+
+#### Definition of Path
+
+A path represents a series instance in TsFile. In the example given above, "device\_1.sensor\_1" is a path.
 
 Review comment:
   I would suggest a more specific description like:
   A path is a dot-separated string which uniquely identifies a time-series in TsFile, e.g., "root.area\_1.device\_1.sensor\_1". The last section "sensor\_1" is called the "measurementId", while the remaining part "root.area\_1.device\_1" is called the "deviceId". As mentioned above, the same measurement in different devices has the same data type and encoding, and devices are also unique.
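
   To make the suggested split concrete, here is a minimal sketch in plain Java. The names `PathSplit` and `split` are hypothetical and used only for illustration; they are not part of the TsFile API. The sketch assumes the measurementId is simply everything after the last dot, and the deviceId is everything before it.

```java
class PathSplit {

    // Hypothetical helper (not a TsFile API): splits a dot-separated path
    // at its last dot into { deviceId, measurementId }.
    static String[] split(String path) {
        int idx = path.lastIndexOf('.');
        // Reject paths with no dot, a leading dot, or a trailing dot.
        if (idx <= 0 || idx == path.length() - 1) {
            throw new IllegalArgumentException("Not a valid path: " + path);
        }
        return new String[] {path.substring(0, idx), path.substring(idx + 1)};
    }

    public static void main(String[] args) {
        String[] parts = split("root.area_1.device_1.sensor_1");
        System.out.println("deviceId      = " + parts[0]);
        System.out.println("measurementId = " + parts[1]);
    }
}
```

   With the example path above, the first element would be "root.area\_1.device\_1" and the second "sensor\_1". A path without any dot is rejected here, since it could not name both a device and a measurement.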
