You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@iotdb.apache.org by GitBox <gi...@apache.org> on 2022/09/20 13:59:41 UTC

[GitHub] [iotdb] Cpaulyz commented on a diff in pull request #7359: [IOTDB-4391] Apply PatternTreeMap in parsing .mods file to optimize query

Cpaulyz commented on code in PR #7359:
URL: https://github.com/apache/iotdb/pull/7359#discussion_r975404054


##########
tsfile/src/main/java/org/apache/iotdb/tsfile/file/metadata/ChunkMetadata.java:
##########
@@ -197,32 +198,30 @@ public void setVersion(long version) {
   }
 
   public List<TimeRange> getDeleteIntervalList() {
-    return deleteIntervalList;
-  }
-
-  public void setDeleteIntervalList(List<TimeRange> list) {
-    this.deleteIntervalList = list;
+    return deleteIntervalSet == null ? null : new ArrayList<>(deleteIntervalSet);
   }
 
   public void insertIntoSortedDeletions(long startTime, long endTime) {
-    List<TimeRange> resultInterval = new ArrayList<>();
-    if (deleteIntervalList != null) {
-      for (TimeRange interval : deleteIntervalList) {
-        if (interval.getMax() < startTime) {
-          resultInterval.add(interval);
-        } else if (interval.getMin() > endTime) {
-          resultInterval.add(new TimeRange(startTime, endTime));
-          startTime = interval.getMin();
-          endTime = interval.getMax();
-        } else if (interval.getMax() >= startTime || interval.getMin() <= endTime) {
-          startTime = Math.min(interval.getMin(), startTime);
-          endTime = Math.max(interval.getMax(), endTime);
-        }
+    TimeRange timeRange = new TimeRange(startTime, endTime);
+
+    if (deleteIntervalSet != null) {
+      TimeRange interval = deleteIntervalSet.floor(timeRange);
+      while (interval != null && timeRange.intersects(interval)) {
+        timeRange.merge(interval);
+        deleteIntervalSet.remove(interval);
+        interval = deleteIntervalSet.floor(timeRange);
       }
+      interval = deleteIntervalSet.ceiling(timeRange);
+      while (interval != null && timeRange.intersects(interval)) {
+        timeRange.merge(interval);
+        deleteIntervalSet.remove(interval);
+        interval = deleteIntervalSet.ceiling(timeRange);
+      }
+    } else {
+      deleteIntervalSet = new TreeSet<>();
     }
 
-    resultInterval.add(new TimeRange(startTime, endTime));
-    deleteIntervalList = resultInterval;
+    deleteIntervalSet.add(timeRange);

Review Comment:
   > BTW, I think we can move this merge phase into the generation of PatternTreeMap(make the `BiConsumer<V, Set<V>> appendFunction` in `getModsPatternTreeMap` more complicated) for Deletion and of course we need to take more thing into account like `version` and `offset`
   
   Yes.. In fact, my original implementation was to do the merge in appendFunction ([code](https://github.com/apache/iotdb/pull/7359/commits/ea497089cecc0902baaee0b154dd0c775bc97b07#diff-b24f923a2084b6c2e8abab3aa2be961842fef5b22d51466daddbdeb989a309cb)). But I found that it is not a good idea for some cases because it will merge all modification in modFile. For example, there are 10000 modifications about `root.sg1.d1.s1` and 1 modification about `root.sg1.d1.s2`. If query is about `root.sg1.d1.s2`, the merge operation about `root.sg1.d1.s1` is a waste of time. 
   
   I think maybe I can apply range merge algorithm before `return allModifications.getOverlapped(path);`. It ensures the return list of `QueryContext#getPathModifications` is in order. And upper layer no need to sort time range again (seems save 1 time).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org