Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/09/28 16:34:25 UTC

[GitHub] [druid] abhishekagarwal87 opened a new pull request #10445: Fix the task id creation in CompactionTask

abhishekagarwal87 opened a new pull request #10445:
URL: https://github.com/apache/druid/pull/10445


   Fixes a corner case in compaction when `partitionsSpec` is of type `SingleDimensionPartitionsSpec` and `maxNumConcurrentSubTasks` is set to 1. Compaction fails because the subtasks are created with a `supervisorTaskId` that is not the task ID of the compaction task, so they cannot find the supervisor task.
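
   To make the failure mode concrete, here is a minimal, self-contained sketch of the two ID-selection rules. It is not Druid code: `SubtaskIdSketch`, `specId`, and the `"_" + i` suffix format are hypothetical stand-ins for `CompactionTask`, `createIndexTaskSpecId(i)`, and its real ID format, and the `parallelMode` flag stands in for the result of `ParallelIndexSupervisorTask.isParallelMode(inputSource, tuningConfig)`.

   ```java
   // Hypothetical stand-in for the ID selection in CompactionTask; not Druid code.
   final class SubtaskIdSketch {
       // Stand-in for createIndexTaskSpecId(i); the suffix format is an assumption.
       static String specId(String compactionTaskId, int i) {
           return compactionTaskId + "_" + i;
       }

       // Old rule: any task with maxNumConcurrentSubTasks == 1 (or no tuning
       // config, modelled here as null) received a fake per-spec ID.
       static String oldSubtaskId(String compactionTaskId, int i, Integer maxNumConcurrentSubTasks) {
           return (maxNumConcurrentSubTasks == null || maxNumConcurrentSubTasks == 1)
                  ? specId(compactionTaskId, i)
                  : compactionTaskId;
       }

       // New rule: a task that runs in parallel mode keeps the compaction task ID,
       // so its subtasks can look up the supervisor task by that ID.
       static String newSubtaskId(String compactionTaskId, int i, boolean parallelMode) {
           return parallelMode ? compactionTaskId : specId(compactionTaskId, i);
       }
   }
   ```

   In the failing scenario the task still spawns subtasks even though `maxNumConcurrentSubTasks` is 1, so the old rule hands it a fake ID while the new rule keeps the compaction task's own ID.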
   
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `ParallelIndexSupervisorTask`
    * `CompactionTask`
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10445: Fix the task id creation in CompactionTask

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on a change in pull request #10445:
URL: https://github.com/apache/druid/pull/10445#discussion_r497285788



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/common/task/CompactionTask.java
##########
@@ -361,9 +362,13 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
           // a new Appenderator on its own instead. As a result, they should use different sequence names to allocate
           // new segmentIds properly. See IndexerSQLMetadataStorageCoordinator.allocatePendingSegments() for details.
           // In this case, we use different fake IDs for each created index task.
-          final String subtaskId = tuningConfig == null || tuningConfig.getMaxNumConcurrentSubTasks() == 1
-                                   ? createIndexTaskSpecId(i)
-                                   : getId();
+          ParallelIndexIngestionSpec ingestionSpec = ingestionSpecs.get(i);
+          InputSource inputSource = ingestionSpec.getIOConfig().getNonNullInputSource(
+              ingestionSpec.getDataSchema().getParser()
+          );
+          final String subtaskId = ParallelIndexSupervisorTask.isParallelMode(inputSource, tuningConfig)
+                                   ? getId()
+                                   : createIndexTaskSpecId(i);
           return newTask(subtaskId, ingestionSpecs.get(i));

Review comment:
       Done






[GitHub] [druid] abhishekagarwal87 commented on pull request #10445: Fix the task id creation in CompactionTask

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on pull request #10445:
URL: https://github.com/apache/druid/pull/10445#issuecomment-702308950


   > I've re-triggered LGTM and Travis since it looks like intermittent failures. Should we add an integration test for this edge case in a follow up PR?
   
   I added `testRunParallelWithRangePartitioningWithSingleTask` to cover this case. Do you think we still need an integration test?




[GitHub] [druid] maytasm commented on a change in pull request #10445: Fix the task id creation in CompactionTask

Posted by GitBox <gi...@apache.org>.
maytasm commented on a change in pull request #10445:
URL: https://github.com/apache/druid/pull/10445#discussion_r497100121



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/common/task/CompactionTask.java
##########
@@ -361,9 +362,13 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
           // a new Appenderator on its own instead. As a result, they should use different sequence names to allocate
           // new segmentIds properly. See IndexerSQLMetadataStorageCoordinator.allocatePendingSegments() for details.
           // In this case, we use different fake IDs for each created index task.
-          final String subtaskId = tuningConfig == null || tuningConfig.getMaxNumConcurrentSubTasks() == 1
-                                   ? createIndexTaskSpecId(i)
-                                   : getId();
+          ParallelIndexIngestionSpec ingestionSpec = ingestionSpecs.get(i);
+          InputSource inputSource = ingestionSpec.getIOConfig().getNonNullInputSource(
+              ingestionSpec.getDataSchema().getParser()
+          );
+          final String subtaskId = ParallelIndexSupervisorTask.isParallelMode(inputSource, tuningConfig)
+                                   ? getId()
+                                   : createIndexTaskSpecId(i);
           return newTask(subtaskId, ingestionSpecs.get(i));

Review comment:
       nit: You can use `ingestionSpec` here (instead of repeating `ingestionSpecs.get(i)`)

##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java
##########
@@ -466,13 +466,20 @@ private void initializeSubTaskCleaner()
     registerResourceCloserOnAbnormalExit(currentSubTaskHolder);
   }
 
-  private boolean isParallelMode()
+  public static boolean isParallelMode(InputSource inputSource, @Nullable ParallelIndexTuningConfig tuningConfig)
   {
+    if (null == tuningConfig) {
+      return false;
+    }
+    boolean useRangePartitions = tuningConfig.getGivenOrDefaultPartitionsSpec() instanceof SingleDimensionPartitionsSpec;

Review comment:
       nit: Can you create a static method `useRangePartitions` that takes `tuningConfig` as an argument, to avoid repeating the code/logic?
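
       A minimal sketch of what such a helper could look like. The types below are simplified stand-ins so that the example compiles on its own; `TuningConfigStub` and `RangePartitionHelperSketch` are hypothetical names, and only the `instanceof` check is taken from the diff above.

       ```java
       // Stand-in types; the real Druid classes have many more members.
       interface PartitionsSpec {}

       final class SingleDimensionPartitionsSpec implements PartitionsSpec {}

       final class HashedPartitionsSpec implements PartitionsSpec {}

       // Hypothetical stand-in for ParallelIndexTuningConfig.
       final class TuningConfigStub {
           private final PartitionsSpec partitionsSpec;

           TuningConfigStub(PartitionsSpec partitionsSpec) {
               this.partitionsSpec = partitionsSpec;
           }

           PartitionsSpec getGivenOrDefaultPartitionsSpec() {
               return partitionsSpec;
           }
       }

       final class RangePartitionHelperSketch {
           // Single place that decides whether range partitioning is in use,
           // so callers do not repeat the instanceof check.
           static boolean useRangePartitions(TuningConfigStub tuningConfig) {
               return tuningConfig.getGivenOrDefaultPartitionsSpec() instanceof SingleDimensionPartitionsSpec;
           }
       }
       ```

       Callers would then use `useRangePartitions(tuningConfig)` wherever they currently repeat the check.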






[GitHub] [druid] maytasm merged pull request #10445: Fix the task id creation in CompactionTask

Posted by GitBox <gi...@apache.org>.
maytasm merged pull request #10445:
URL: https://github.com/apache/druid/pull/10445


   




[GitHub] [druid] suneet-s commented on pull request #10445: Fix the task id creation in CompactionTask

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #10445:
URL: https://github.com/apache/druid/pull/10445#issuecomment-702258772


   I've re-triggered LGTM and Travis since it looks like intermittent failures. Should we add an integration test for this edge case in a follow up PR? 




[GitHub] [druid] ccaominh commented on a change in pull request #10445: Fix the task id creation in CompactionTask

Posted by GitBox <gi...@apache.org>.
ccaominh commented on a change in pull request #10445:
URL: https://github.com/apache/druid/pull/10445#discussion_r497840982



##########
File path: indexing-service/src/test/java/org/apache/druid/indexing/common/task/CompactionTaskParallelRunTest.java
##########
@@ -219,6 +219,37 @@ public void testRunParallelWithRangePartitioning()
     }
   }
 
+  @Test
+  public void testRunParallelWithRangePartitioningWithSingleTask()
+  {
+    // Range partitioning is not supported with segment lock yet
+    if (lockGranularity == LockGranularity.SEGMENT) {
+      return;
+    }
+    runIndexTask(null, true);
+
+    final Builder builder = new Builder(
+        DATA_SOURCE,
+        getSegmentLoaderFactory(),
+        RETRY_POLICY_FACTORY
+    );
+    final CompactionTask compactionTask = builder
+        .inputSpec(new CompactionIntervalSpec(INTERVAL_TO_INDEX, null))
+        .tuningConfig(newTuningConfig(new SingleDimensionPartitionsSpec(7, null, "dim", false), 1, true))
+        .build();
+
+    final Set<DataSegment> compactedSegments = runTask(compactionTask);
+    final CompactionState expectedState = new CompactionState(
+        new SingleDimensionPartitionsSpec(7, null, "dim", false),
+        compactionTask.getTuningConfig().getIndexSpec().asMap(getObjectMapper())
+    );
+    for (DataSegment segment : compactedSegments) {
+      // Expecte compaction state to exist as store compaction state by default

Review comment:
       There's at least one typo in this comment

##########
File path: indexing-service/src/test/java/org/apache/druid/indexing/common/task/CompactionTaskParallelRunTest.java
##########
@@ -219,6 +219,37 @@ public void testRunParallelWithRangePartitioning()
     }
   }
 
+  @Test
+  public void testRunParallelWithRangePartitioningWithSingleTask()
+  {
+    // Range partitioning is not supported with segment lock yet
+    if (lockGranularity == LockGranularity.SEGMENT) {
+      return;
+    }

Review comment:
       [JUnit has an `Assume` API](https://junit.org/junit4/javadoc/4.12/org/junit/Assume.html), which could be a good fit here.



