You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/11/03 18:17:19 UTC

[GitHub] [druid] maytasm opened a new pull request #11874: Support changing dimension schema in Auto Compaction

maytasm opened a new pull request #11874:
URL: https://github.com/apache/druid/pull/11874


   Support changing dimension schema in Auto Compaction 
   
   ### Description
   
   Changing dimensionsSpec is already supported in manual compaction task. This PR adds it to the auto compaction.
   By changing dimensionsSpec, the user will be able to change the dimension ordering, remove dimension, and change dimension data type. 
   
   This PR has:
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] maytasm commented on a change in pull request #11874: Support changing dimension schema in Auto Compaction

Posted by GitBox <gi...@apache.org>.
maytasm commented on a change in pull request #11874:
URL: https://github.com/apache/druid/pull/11874#discussion_r744318026



##########
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java
##########
@@ -440,6 +442,19 @@ private boolean needsCompaction(DataSourceCompactionConfig config, SegmentsToCom
       }
     }
 
+    if (config.getDimensionsSpec() != null) {
+      final DimensionsSpec existingDimensionsSpec = lastCompactionState.getDimensionsSpec();
+      // Checks for list of dimensions
+      if (config.getDimensionsSpec().getDimensions() != null) {
+        final List<DimensionSchema> existingDimensions = existingDimensionsSpec != null ?
+                                                         existingDimensionsSpec.getDimensions() :
+                                                         null;
+        if (!config.getDimensionsSpec().getDimensions().equals(existingDimensions)) {
+          return true;

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] maytasm commented on a change in pull request #11874: Support changing dimension schema in Auto Compaction

Posted by GitBox <gi...@apache.org>.
maytasm commented on a change in pull request #11874:
URL: https://github.com/apache/druid/pull/11874#discussion_r744317560



##########
File path: docs/configuration/index.md
##########
@@ -988,7 +989,7 @@ The below is a list of the supported configurations for auto compaction.
 |`chatHandlerNumRetries`|Retries for reporting the pushed segments in worker tasks.|no (default = 5)|
 
 ###### Automatic compaction granularitySpec
-You can optionally use the `granularitySpec` object to configure the segment granularity of the compacted segments.
+You can optionally use the `granularitySpec` object to configure the granularity and rollup of the compacted segments.

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #11874: Support changing dimension schema in Auto Compaction

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #11874:
URL: https://github.com/apache/druid/pull/11874#discussion_r743839200



##########
File path: docs/configuration/index.md
##########
@@ -988,7 +989,7 @@ The below is a list of the supported configurations for auto compaction.
 |`chatHandlerNumRetries`|Retries for reporting the pushed segments in worker tasks.|no (default = 5)|
 
 ###### Automatic compaction granularitySpec
-You can optionally use the `granularitySpec` object to configure the segment granularity of the compacted segments.
+You can optionally use the `granularitySpec` object to configure the granularity and rollup of the compacted segments.

Review comment:
       nit: These fields are defined below, so it seems redundant to mention them again here. 

##########
File path: core/src/test/java/org/apache/druid/timeline/DataSegmentTest.java
##########
@@ -120,6 +121,7 @@ public void testV1Serialization() throws Exception
         new NumberedShardSpec(3, 0),
         new CompactionState(
             new HashedPartitionsSpec(100000, null, ImmutableList.of("dim1")),
+            new DimensionsSpec(DimensionsSpec.getDefaultSchemas(ImmutableList.of("dim1", "bar", "foo")), null, null),

Review comment:
       Can we add a test for de-serialization of a CompactionState object that does not have the `dimensionsSpec` field

##########
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java
##########
@@ -440,6 +442,19 @@ private boolean needsCompaction(DataSourceCompactionConfig config, SegmentsToCom
       }
     }
 
+    if (config.getDimensionsSpec() != null) {
+      final DimensionsSpec existingDimensionsSpec = lastCompactionState.getDimensionsSpec();
+      // Checks for list of dimensions
+      if (config.getDimensionsSpec().getDimensions() != null) {
+        final List<DimensionSchema> existingDimensions = existingDimensionsSpec != null ?
+                                                         existingDimensionsSpec.getDimensions() :
+                                                         null;
+        if (!config.getDimensionsSpec().getDimensions().equals(existingDimensions)) {
+          return true;

Review comment:
       nit: Please add a log line here for operators to know why the segment was selected for compaction similar to segmentGranularity above.
   
   Similar comments for rollup + queryGranularity

##########
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java
##########
@@ -440,6 +442,19 @@ private boolean needsCompaction(DataSourceCompactionConfig config, SegmentsToCom
       }
     }
 
+    if (config.getDimensionsSpec() != null) {
+      final DimensionsSpec existingDimensionsSpec = lastCompactionState.getDimensionsSpec();
+      // Checks for list of dimensions
+      if (config.getDimensionsSpec().getDimensions() != null) {

Review comment:
       If `config.getDimensionsSpec().getDimensions()` == null and lastCompactionState.getDimensionsSpec() is non-null then we want needsCompaction to return false. Is that understanding correct?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] maytasm commented on a change in pull request #11874: Support changing dimension schema in Auto Compaction

Posted by GitBox <gi...@apache.org>.
maytasm commented on a change in pull request #11874:
URL: https://github.com/apache/druid/pull/11874#discussion_r744318448



##########
File path: server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java
##########
@@ -440,6 +442,19 @@ private boolean needsCompaction(DataSourceCompactionConfig config, SegmentsToCom
       }
     }
 
+    if (config.getDimensionsSpec() != null) {
+      final DimensionsSpec existingDimensionsSpec = lastCompactionState.getDimensionsSpec();
+      // Checks for list of dimensions
+      if (config.getDimensionsSpec().getDimensions() != null) {

Review comment:
       Yes. If `config.getDimensionsSpec().getDimensions() == null` then needsCompaction will return false regardless of lastCompactionState.getDimensionsSpec() value




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] maytasm commented on a change in pull request #11874: Support changing dimension schema in Auto Compaction

Posted by GitBox <gi...@apache.org>.
maytasm commented on a change in pull request #11874:
URL: https://github.com/apache/druid/pull/11874#discussion_r744317612



##########
File path: core/src/test/java/org/apache/druid/timeline/DataSegmentTest.java
##########
@@ -120,6 +121,7 @@ public void testV1Serialization() throws Exception
         new NumberedShardSpec(3, 0),
         new CompactionState(
             new HashedPartitionsSpec(100000, null, ImmutableList.of("dim1")),
+            new DimensionsSpec(DimensionsSpec.getDefaultSchemas(ImmutableList.of("dim1", "bar", "foo")), null, null),

Review comment:
       Done. See DataSegmentTest#testDeserializationDataSegmentLastCompactionStateWithoutDimensionsSpec




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] maytasm merged pull request #11874: Support changing dimension schema in Auto Compaction

Posted by GitBox <gi...@apache.org>.
maytasm merged pull request #11874:
URL: https://github.com/apache/druid/pull/11874


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org