Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/10/15 18:09:30 UTC

[GitHub] [druid] jihoonson commented on a change in pull request #10335: Configurable Index Type

jihoonson commented on a change in pull request #10335:
URL: https://github.com/apache/druid/pull/10335#discussion_r505740755



##########
File path: server/src/main/java/org/apache/druid/segment/indexing/TuningConfig.java
##########
@@ -32,11 +35,43 @@
 public interface TuningConfig
 {
   boolean DEFAULT_LOG_PARSE_EXCEPTIONS = false;
+  AppendableIndexSpec DEFAULT_APPENDABLE_INDEX = new OnheapIncrementalIndex.Spec();
   int DEFAULT_MAX_PARSE_EXCEPTIONS = Integer.MAX_VALUE;
   int DEFAULT_MAX_SAVED_PARSE_EXCEPTIONS = 0;
   int DEFAULT_MAX_ROWS_IN_MEMORY = 1_000_000;
-  // We initially estimated this to be 1/3(max jvm memory), but bytesCurrentlyInMemory only
-  // tracks active index and not the index being flushed to disk, to account for that
-  // we halved default to 1/6(max jvm memory)
-  long DEFAULT_MAX_BYTES_IN_MEMORY = JvmUtils.getRuntimeInfo().getMaxHeapSizeBytes() / 6;
+
+  /**
+   * The incremental index implementation to use
+   */
+  AppendableIndexSpec getAppendableIndexSpec();
+
+  /**
+   * Maximum number of bytes (estimated) to store in memory before persisting to local storage
+   */
+  long getMaxBytesInMemory();
+
+  /**
+   * Maximum number of bytes (estimated) to store in memory before persisting to local storage.
+   * If getMaxBytesInMemory() returns 0, the appendable index default will be used.
+   */
+  default long getMaxBytesInMemoryOrDefault()
+  {
+    // In the main tuningConfig class constructor, we set the maxBytes to 0 if null to avoid setting
+    // maxBytes to max jvm memory of the process that starts first. Instead we set the default based on
+    // the actual task node's jvm memory.
+    final long maxBytesInMemory = getMaxBytesInMemory();
+    if (maxBytesInMemory > 0) {
+      return maxBytesInMemory;
+    } else if (maxBytesInMemory == 0) {
+      return getAppendableIndexSpec().getDefaultMaxBytesInMemory();
+    } else {
+      return Long.MAX_VALUE;
+    }
+  }
+
+  PartitionsSpec getPartitionsSpec();
+
+  IndexSpec getIndexSpec();
+
+  IndexSpec getIndexSpecForIntermediatePersists();

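    (Editorial note, not part of the PR diff above.) The new default method resolves the effective in-memory byte limit in three ways: a positive value is used as-is, zero falls back to the appendable index spec's default, and a negative value disables the byte-based limit. A minimal, self-contained illustration of that rule, using a hypothetical helper class rather than the real Druid types:

    // Hypothetical, self-contained illustration (not Druid code) of the
    // resolution rule implemented by getMaxBytesInMemoryOrDefault() above.
    public class MaxBytesResolutionExample
    {
      static long resolve(long configuredMaxBytes, long specDefault)
      {
        if (configuredMaxBytes > 0) {
          return configuredMaxBytes;   // explicit limit wins
        } else if (configuredMaxBytes == 0) {
          return specDefault;          // fall back to the index spec's default
        } else {
          return Long.MAX_VALUE;       // negative: effectively no byte-based limit
        }
      }

      public static void main(String[] args)
      {
        final long specDefault = Runtime.getRuntime().maxMemory() / 6; // e.g. 1/6 of max heap
        System.out.println(resolve(500_000_000L, specDefault)); // 500000000
        System.out.println(resolve(0L, specDefault));           // spec default
        System.out.println(resolve(-1L, specDefault));          // 9223372036854775807
      }
    }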
Review comment:
       Oops, sorry. I missed the previous ping. 
   
   > I was wondering why they are in `AppenderatorConfig` in the first place.
   
    `indexSpecForIntermediatePersists` was added in #7919. I'm not aware of the exact reason, but it seems reasonable to keep it only in `AppenderatorConfig` because that parameter is coupled with how `Appenderator` spills intermediate segments. The Hadoop task is sort of special: I think it was implemented before we added `Appenderator` (or at least before `AppenderatorDriver`), and we were able to implement `Appenderator` in a more structured way based on the lessons we learned from the Hadoop task. So it works similarly to the other tasks that use `Appenderator`, but is not exactly the same.
   
    Regarding the interface change, even though all of the current tasks happen to share a similar spilling mechanism, that might not be true in the future. So, I would say it is better not to modify the interface unless you have to.
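    (Editorial note, not part of jihoonson's comment.) A rough sketch of the layering the comment argues for, using hypothetical interface names rather than the real Druid classes: parameters tied to how `Appenderator` spills intermediate segments stay on the appenderator-specific config instead of the shared base interface.

    // Hypothetical sketch (not the real Druid interfaces) of keeping the
    // intermediate-persist index spec off the shared tuning config.
    interface IndexSpecSketch {}  // stand-in for org.apache.druid.segment.IndexSpec

    interface TuningConfigSketch
    {
      IndexSpecSketch getIndexSpec();  // needed by every ingestion task
    }

    interface AppenderatorConfigSketch extends TuningConfigSketch
    {
      // Coupled with how Appenderator spills intermediate segments, so only
      // appenderator-based tasks implement it; the Hadoop task, which predates
      // Appenderator, never has to.
      IndexSpecSketch getIndexSpecForIntermediatePersists();
    }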



