Posted to commits@iotdb.apache.org by qi...@apache.org on 2020/06/19 07:21:10 UTC
[incubator-iotdb] branch master updated: Fix spark tsfile master
(#1389)
This is an automated email from the ASF dual-hosted git repository.
qiaojialin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-iotdb.git
The following commit(s) were added to refs/heads/master by this push:
new e42bbfe Fix spark tsfile master (#1389)
e42bbfe is described below
commit e42bbfe3cf02a6a3b61b22073dfbf2f2d301b047
Author: SilverNarcissus <15...@smail.nju.edu.cn>
AuthorDate: Fri Jun 19 15:21:00 2020 +0800
Fix spark tsfile master (#1389)
* fix spark-tsfile query all measurement bug
---
docs/SystemDesign/Connector/Spark-TsFile.md | 22 +++++++++++++++++++++-
docs/zh/SystemDesign/Connector/Spark-TsFile.md | 16 ++++++++++++++++
.../apache/iotdb/spark/tsfile/DefaultSource.scala | 13 ++++++++++++-
3 files changed, 49 insertions(+), 2 deletions(-)
diff --git a/docs/SystemDesign/Connector/Spark-TsFile.md b/docs/SystemDesign/Connector/Spark-TsFile.md
index 8b1aea3..eea8d15 100644
--- a/docs/SystemDesign/Connector/Spark-TsFile.md
+++ b/docs/SystemDesign/Connector/Spark-TsFile.md
@@ -68,7 +68,27 @@ The main logic is the buildReader function in src/main/scala/org/apache
The main logic of the SQL analysis of the wide table structure is in src/main/scala/org/apache/iotdb/spark/tsfile/WideConverter.scala. This structure is basically the same as the TsFile native query structure, so no special processing is required: the SQL statement is converted directly into the corresponding query expression
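The direct mapping for the wide table form can be sketched as follows; the expression classes here are illustrative stand-ins, not the actual TsFile or WideConverter API:

```java
// Illustrative expression model (hypothetical classes) showing how a
// wide-table SQL filter maps one-to-one onto a query expression tree,
// with no device-specific rewriting.
interface Expr {}

final class TimeGt implements Expr {
    final long t;
    TimeGt(long t) { this.t = t; }
    @Override public String toString() { return "[time > " + t + "]"; }
}

final class TimeLt implements Expr {
    final long t;
    TimeLt(long t) { this.t = t; }
    @Override public String toString() { return "[time < " + t + "]"; }
}

final class And implements Expr {
    final Expr left, right;
    And(Expr left, Expr right) { this.left = left; this.right = right; }
    @Override public String toString() { return left + " and " + right; }
}

public class WideTableSketch {
    public static void main(String[] args) {
        // WHERE time > t1 AND time < t2 converts directly into the tree.
        Expr e = new And(new TimeGt(1588953600000L), new TimeLt(1589040000000L));
        System.out.println(e);
        // prints "[time > 1588953600000] and [time < 1589040000000]"
    }
}
```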
#### 4. Narrow table structure
-The main logic of the SQL analysis of the wide table structure is src / main / scala / org / apache / iotdb / spark / tsfile / NarrowConverter.scala. After the SQL is converted to an expression, the narrow table structure is different from the Tsfile native query structure. The expression is converted into a disjunction expression related to the device before it can be converted into a query of Tsfile. The conversion code is in src / main / java / org / apache / iotdb / spark / tsfile / qp
+The main logic of the SQL analysis of the narrow table structure is in src/main/scala/org/apache/iotdb/spark/tsfile/NarrowConverter.scala.
+
+First, we use the required schema to decide which timeseries should be read from the TsFile:
+```
+requiredSchema.foreach((field: StructField) => {
+ if (field.name != QueryConstant.RESERVED_TIME
+ && field.name != NarrowConverter.DEVICE_NAME) {
+ measurementNames += field.name
+ }
+})
+```
+
+After the SQL is converted to an expression, because the narrow table structure differs from the TsFile native query structure, the expression must first be converted into a disjunction expression over devices before it can become a TsFile query. The conversion code is in src/main/java/org/apache/iotdb/spark/tsfile/qp
+
+Example:
+```
+select time, device_name, s1 from tsfile_table where time > 1588953600000 and time < 1589040000000 and device_name = 'root.group1.d1'
+```
+Here we only need the timeseries 'root.group1.d1.s1', and the resulting expression is [time > 1588953600000] and [time < 1589040000000]
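The measurement selection for this example can be sketched in Java as follows; the reserved column names mirror the commit (QueryConstant.RESERVED_TIME, NarrowConverter.DEVICE_NAME), but the helper itself is hypothetical, not connector API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the measurement selection described above: drop the reserved
// time and device_name columns, then prefix the queried device to obtain
// the full timeseries paths to read from the TsFile.
public class NarrowSchemaSketch {
    static final String RESERVED_TIME = "time";
    static final String DEVICE_NAME = "device_name";

    static List<String> requiredTimeseries(List<String> requiredColumns, String device) {
        List<String> series = new ArrayList<>();
        for (String col : requiredColumns) {
            // Everything except the reserved columns is a measurement.
            if (!col.equals(RESERVED_TIME) && !col.equals(DEVICE_NAME)) {
                series.add(device + "." + col);
            }
        }
        return series;
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("time", "device_name", "s1");
        System.out.println(requiredTimeseries(cols, "root.group1.d1"));
        // prints "[root.group1.d1.s1]"
    }
}
```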
+
+
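The device-related disjunction rewrite can be sketched like this; the string-based representation is purely illustrative, while the real conversion in the qp package builds TsFile expression objects:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the rewrite done in the qp package: the filter is
// turned into a disjunction with one branch per matching device, each branch
// carrying the remaining (time) predicates.
public class DisjunctionSketch {
    static String toDisjunction(List<String> devices, String timeFilter) {
        List<String> branches = new ArrayList<>();
        for (String device : devices) {
            branches.add("(device = " + device + " and " + timeFilter + ")");
        }
        // The per-device branches are OR-ed together into one query.
        return String.join(" or ", branches);
    }

    public static void main(String[] args) {
        String filter = "[time > 1588953600000] and [time < 1589040000000]";
        // device_name = 'root.group1.d1' leaves a single matching device.
        System.out.println(toDisjunction(Arrays.asList("root.group1.d1"), filter));
    }
}
```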
#### 5. Query execution
The actual data query execution is performed by the TsFile native component, see:
diff --git a/docs/zh/SystemDesign/Connector/Spark-TsFile.md b/docs/zh/SystemDesign/Connector/Spark-TsFile.md
index bb10bc7..ccf947a 100644
--- a/docs/zh/SystemDesign/Connector/Spark-TsFile.md
+++ b/docs/zh/SystemDesign/Connector/Spark-TsFile.md
@@ -78,9 +78,25 @@ SQL parsing is divided into wide table and narrow table structures
The main logic of the SQL analysis of the narrow table structure is in src/main/scala/org/apache/iotdb/spark/tsfile/NarrowConverter.scala
+First, we determine which timeseries to read based on the required schema, querying in the TsFile only the timeseries that appear in the SQL
+```
+requiredSchema.foreach((field: StructField) => {
+ if (field.name != QueryConstant.RESERVED_TIME
+ && field.name != NarrowConverter.DEVICE_NAME) {
+ measurementNames += field.name
+ }
+})
+```
+
After the SQL is converted to an expression, because the narrow table structure differs from the TsFile native query structure, the expression must first be converted into a disjunction expression over devices before it can become a TsFile query. The conversion code is in src/main/java/org/apache/iotdb/spark/tsfile/qp
+Example:
+```
+select time, device_name, s1 from tsfile_table where time > 1588953600000 and time < 1589040000000 and device_name = 'root.group1.d1'
+```
+Here only the timeseries root.group1.d1.s1 is queried, and the filter expression is [time > 1588953600000] and [time < 1589040000000]
+
#### 5. Query execution
The actual data query execution is performed by the TsFile native component, see:
diff --git a/spark-tsfile/src/main/scala/org/apache/iotdb/spark/tsfile/DefaultSource.scala b/spark-tsfile/src/main/scala/org/apache/iotdb/spark/tsfile/DefaultSource.scala
index 257a0f2..ed613fa 100755
--- a/spark-tsfile/src/main/scala/org/apache/iotdb/spark/tsfile/DefaultSource.scala
+++ b/spark-tsfile/src/main/scala/org/apache/iotdb/spark/tsfile/DefaultSource.scala
@@ -41,6 +41,9 @@ import org.apache.spark.sql.execution.datasources.{FileFormat, OutputWriterFacto
import org.apache.spark.sql.sources.{DataSourceRegister, Filter}
import org.apache.spark.sql.types._
import org.slf4j.LoggerFactory
+import scala.collection.JavaConversions._
+import scala.collection.mutable
+import scala.collection.mutable.ListBuffer
private[tsfile] class DefaultSource extends FileFormat with DataSourceRegister {
@@ -116,7 +119,15 @@ private[tsfile] class DefaultSource extends FileFormat with DataSourceRegister {
if (options.getOrElse(DefaultSource.isNarrowForm, "").equals("narrow_form")) {
val deviceNames = reader.getAllDevices()
- val measurementNames = reader.getAllMeasurements.keySet()
+
+ val measurementNames = new java.util.HashSet[String]()
+
+ requiredSchema.foreach((field: StructField) => {
+ if (field.name != QueryConstant.RESERVED_TIME
+ && field.name != NarrowConverter.DEVICE_NAME) {
+ measurementNames += field.name
+ }
+ })
// construct queryExpression based on queriedSchema and filters
val queryExpressions = NarrowConverter.toQueryExpression(dataSchema, deviceNames,