You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/10/05 22:28:11 UTC

[GitHub] [druid] jihoonson commented on a change in pull request #10479: Fix compaction task slot computation in auto compaction

jihoonson commented on a change in pull request #10479:
URL: https://github.com/apache/druid/pull/10479#discussion_r499906319



##########
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java
##########
@@ -277,7 +300,8 @@ private CoordinatorStats doRun(
         );
         LOG.infoSegments(segmentsToCompact, "Compacting segments");
         // Count the compaction task itself + its sub tasks
-        numSubmittedTasks += findNumMaxConcurrentSubTasks(config.getTuningConfig()) + 1;
+        numSubmittedTasks++;
+        numCompactionTasksAndSubtasks += findMaxNumTaskSlotsUsedByOneCompactionTask(config.getTuningConfig()) + 1;

Review comment:
       Oops, good catch. It was always counting an extra slot regardless of the mode of the compaction task. Added tests.

##########
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java
##########
@@ -277,7 +300,8 @@ private CoordinatorStats doRun(
         );
         LOG.infoSegments(segmentsToCompact, "Compacting segments");
         // Count the compaction task itself + its sub tasks
-        numSubmittedTasks += findNumMaxConcurrentSubTasks(config.getTuningConfig()) + 1;
+        numSubmittedTasks++;

Review comment:
       I think the previous one was wrong. It should count only the number of compaction tasks, not including its subtasks. I think this change would be fine since it's not documented yet. But yes, we should document it and all missing metrics.

##########
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java
##########
@@ -180,24 +183,43 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
   }
 
   /**
-   * Each compaction task can run a parallel indexing task. When we count the number of current running
-   * compaction tasks, we should count the sub tasks of the parallel indexing task as well. However, we currently
-   * don't have a good way to get the number of current running sub tasks except poking each supervisor task,
-   * which is complex to handle all kinds of failures. Here, we simply return {@code maxNumConcurrentSubTasks} instead
-   * to estimate the number of sub tasks conservatively. This should be ok since it won't affect to the performance of
-   * other ingestion types.
+   * Returns the maximum number of task slots used by one compaction task at any time when the task is issued with
+   * the given tuningConfig.
    */
-  private int findNumMaxConcurrentSubTasks(@Nullable ClientCompactionTaskQueryTuningConfig tuningConfig)
+  @VisibleForTesting
+  static int findMaxNumTaskSlotsUsedByOneCompactionTask(@Nullable ClientCompactionTaskQueryTuningConfig tuningConfig)
   {
-    if (tuningConfig != null && tuningConfig.getMaxNumConcurrentSubTasks() != null) {
-      // The actual number of subtasks might be smaller than the configured max.
-      // However, we use the max to simplify the estimation here.
-      return tuningConfig.getMaxNumConcurrentSubTasks();
+    if (isParallelMode(tuningConfig)) {
+      @Nullable Integer maxNumConcurrentSubTasks = tuningConfig.getMaxNumConcurrentSubTasks();
+      // Max number of task slots used in parallel mode = maxNumConcurrentSubTasks + 1 (supervisor task)
+      return (maxNumConcurrentSubTasks == null ? 1 : maxNumConcurrentSubTasks) + 1;
     } else {
-      return 0;
+      return 1;
     }
   }
 
+  /**
+   * Returns true if the compaction task can run in the parallel mode with the given tuningConfig.
+   * This method should be synchronized with ParallelIndexSupervisorTask.isParallelMode(InputSource, ParallelIndexTuningConfig).
+   */
+  @VisibleForTesting
+  static boolean isParallelMode(@Nullable ClientCompactionTaskQueryTuningConfig tuningConfig)

Review comment:
       They can't be merged for now since they are in different modules. There will be a cyclic dependency if I add a dependency for `indexing-service` in `server` since `indexing-service` already has one for `server`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org