You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@iotdb.apache.org by GitBox <gi...@apache.org> on 2020/05/27 03:57:24 UTC

[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #1269: [IOTDB-707] Optimize TsFileResource memory usage

fanhualta commented on a change in pull request #1269:
URL: https://github.com/apache/incubator-iotdb/pull/1269#discussion_r430815501



##########
File path: server/src/main/java/org/apache/iotdb/db/engine/storagegroup/TsFileResource.java
##########
@@ -339,12 +363,91 @@ public long getFileSize() {
     return file.length();
   }
 
-  public Map<String, Long> getStartTimeMap() {
-    return startTimeMap;
+  public long getStartTime(String deviceId) {
+    if (!deviceToIndex.containsKey(deviceId)) {
+      return -1;
+    }
+    return startTimes[deviceToIndex.get(deviceId)];
   }
 
-  public Map<String, Long> getEndTimeMap() {
-    return endTimeMap;
+  public long getEndTime(String deviceId) {
+    if (!deviceToIndex.containsKey(deviceId)) {
+      return -1;
+    }
+    return endTimes[deviceToIndex.get(deviceId)];
+  }
+
+  public long getOrDefaultStartTime(String deviceId, long defaultTime) {
+    return getStartTime(deviceId) >= 0 ? startTimes[deviceToIndex.get(deviceId)] : defaultTime;
+  }
+
+  public long getOrDefaultEndTime(String deviceId, long defaultTime) {
+    return getEndTime(deviceId) >= 0 ? endTimes[deviceToIndex.get(deviceId)] : defaultTime;

Review comment:
       You will read the time twice if it's valid. It's better to recording the result of getStartTime(deviceId) with temporary variables.

##########
File path: server/src/main/java/org/apache/iotdb/db/engine/storagegroup/TsFileResource.java
##########
@@ -63,16 +63,22 @@
   public static final String RESOURCE_SUFFIX = ".resource";
   static final String TEMP_SUFFIX = ".temp";
   private static final String CLOSING_SUFFIX = ".closing";
+  private static final int INIT_ARRAY_SIZE = 64;

Review comment:
       In the worst case, it may waste nearly half memory of `startTimes` and `endTimes`. From this point of view, I think it may be not a better choice than `Map<String, long[]> deviceToStartEndTime`.

##########
File path: server/src/main/java/org/apache/iotdb/db/engine/storagegroup/TsFileResource.java
##########
@@ -339,12 +363,91 @@ public long getFileSize() {
     return file.length();
   }
 
-  public Map<String, Long> getStartTimeMap() {
-    return startTimeMap;
+  public long getStartTime(String deviceId) {
+    if (!deviceToIndex.containsKey(deviceId)) {
+      return -1;
+    }
+    return startTimes[deviceToIndex.get(deviceId)];
   }
 
-  public Map<String, Long> getEndTimeMap() {
-    return endTimeMap;
+  public long getEndTime(String deviceId) {
+    if (!deviceToIndex.containsKey(deviceId)) {
+      return -1;
+    }
+    return endTimes[deviceToIndex.get(deviceId)];
+  }
+
+  public long getOrDefaultStartTime(String deviceId, long defaultTime) {
+    return getStartTime(deviceId) >= 0 ? startTimes[deviceToIndex.get(deviceId)] : defaultTime;

Review comment:
       You will read the time twice if it's valid. It's better to recording the result of `getStartTime(deviceId)` with temporary variables.

##########
File path: server/src/main/java/org/apache/iotdb/db/engine/merge/selector/MaxFileMergeFileSelector.java
##########
@@ -213,18 +213,18 @@ private boolean checkClosed(TsFileResource unseqFile) {
   private void selectOverlappedSeqFiles(TsFileResource unseqFile) {
 
     int tmpSelectedNum = 0;
-    for (Entry<String, Long> deviceStartTimeEntry : unseqFile.getStartTimeMap().entrySet()) {
+    for (Entry<String, Integer> deviceStartTimeEntry : unseqFile.getDeviceToIndexMap().entrySet()) {
       String deviceId = deviceStartTimeEntry.getKey();
-      Long unseqStartTime = deviceStartTimeEntry.getValue();
-      Long unseqEndTime = unseqFile.getEndTimeMap().get(deviceId);
+      long unseqStartTime = unseqFile.getStartTime(deviceId);
+      long unseqEndTime = unseqFile.getEndTime(deviceId);

Review comment:
       If we already have the index info, that is `entry.getValue()`. It will be more efficient to get from endTimes by index directly. Otherwise, it will hash twice and find in bucket in `ConcurrentHashMap.` Other iterations have similar problems, please have a look too.

##########
File path: server/src/main/java/org/apache/iotdb/db/engine/storagegroup/StorageGroupProcessor.java
##########
@@ -318,12 +318,18 @@ private void recover() throws StorageGroupProcessorException {
 
     for (TsFileResource resource : sequenceFileTreeSet) {
       long timePartitionId = resource.getTimePartition();
+      Map<String, Long> endTimeMap = new HashMap<>();
+      for (Entry<String, Integer> entry : resource.getDeviceToIndexMap().entrySet()) {
+        String deviceId = entry.getKey();
+        long endTime = resource.getEndTime(deviceId);

Review comment:
       Same issue as above.

##########
File path: server/src/main/java/org/apache/iotdb/db/engine/storagegroup/StorageGroupProcessor.java
##########
@@ -344,16 +350,16 @@ private void updateLastestFlushedTime() throws IOException {
     VersionController versionController = new SimpleFileVersionController(storageGroupSysDir.getPath());
     long currentVersion = versionController.currVersion();
     for (TsFileResource resource : upgradeSeqFileList) {
-      for (Entry<String, Long> entry : resource.getEndTimeMap().entrySet()) {
+      for (Entry<String, Integer> entry : resource.getDeviceToIndexMap().entrySet()) {
         String deviceId = entry.getKey();
-        long endTime = entry.getValue();
+        long endTime = resource.getEndTime(deviceId);

Review comment:
       Same issue as above.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org