Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/03/16 00:53:10 UTC

[GitHub] [druid] jihoonson opened a new issue #10999: Auto compaction can fail to find segments for compaction when segment versions are mixed in the same time chunk based on new segment granularity

jihoonson opened a new issue #10999:
URL: https://github.com/apache/druid/issues/10999


   ### Affected Version
   
   The master branch
   
   ### Description
   
   #10843 added support for segment granularity in auto compaction. This change can cause auto compaction to fail to find candidate segments for compaction when those segments have mixed versions, especially when you change the segment granularity from a smaller one to a larger one. When you change the segment granularity, auto compaction internally creates another timeline that is populated based on the new segment granularity. Here is a code snippet of [how we populate the new timeline](https://github.com/apache/druid/blob/master/server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java#L135-L148).
   
   ```java
                 DataSegment segmentsForCompact = segment.withShardSpec(new NumberedShardSpec(partitionNum, partitions));
                 // PartitionHolder can only holds chunks of one partition space
                 // However, partition in the new timeline (timelineWithConfiguredSegmentGranularity) can be hold multiple
                 // partitions of the original timeline (when the new segmentGranularity is larger than the original
                 // segmentGranularity). Hence, we group all the segments of the original timeline into intervals bucket
                 // by the new configuredSegmentGranularity. We then convert each segment into a new partition space so that
                 // there is no duplicate partitionNum across all segments of each new Interval. We will have to save the
                 // original ShardSpec to convert the segment back when returning from the iterator.
                 originalShardSpecs.put(new Pair<>(interval, segmentsForCompact.getId()), segment.getShardSpec());
                 timelineWithConfiguredSegmentGranularity.add(
                     interval,
                     segmentsForCompact.getVersion(),
                     NumberedPartitionChunk.make(partitionNum, partitions, segmentsForCompact)
                 );
   ```
   
   As shown in the snippet, we use the segment version directly when populating the new timeline. Since the `interval` in the snippet is a time chunk based on the new segment granularity, segments of mixed versions can be added to the same time chunk in the new timeline. We also replace the shardSpec of each of those segments with a new one whose `partitions` is the number of segments in the new time chunk. As a result, none of those mixed-version segments will be visible, since there will always be fewer non-overshadowed segments than `partitions` in that time chunk.
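   
   To make the counting argument concrete, here is a minimal, self-contained sketch. It does not use any Druid classes: `MixedVersionTimelineSketch`, `chunksPerVersion`, and `hasCompleteVersion` are hypothetical names, and the completeness rule (a version is visible only when it holds exactly `partitions` chunks) is a simplified assumption about how the timeline treats each version in a time chunk, not the actual implementation.
   
   ```java
   import java.util.Arrays;
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   
   // Toy model only; none of these names exist in Druid.
   class MixedVersionTimelineSketch {
   
       // Counts how many chunks each version contributes to one new time chunk.
       static Map<String, Integer> chunksPerVersion(List<String> segmentVersions) {
           Map<String, Integer> counts = new HashMap<>();
           for (String version : segmentVersions) {
               counts.merge(version, 1, Integer::sum);
           }
           return counts;
       }
   
       // Mirrors the snippet: every rewritten shardSpec declares
       // partitions = total number of segments in the new time chunk.
       // Assumption: a version is complete only if it holds that many chunks.
       static boolean hasCompleteVersion(List<String> segmentVersions) {
           int partitions = segmentVersions.size();
           for (int count : chunksPerVersion(segmentVersions).values()) {
               if (count == partitions) {
                   return true;
               }
           }
           return false;
       }
   
       public static void main(String[] args) {
           // Three segments of a single version: that version holds 3 of 3 chunks.
           System.out.println(hasCompleteVersion(Arrays.asList("v1", "v1", "v1")));
           // Mixed versions: v1 holds 2 of 3 and v2 holds 1 of 3, so neither is complete.
           System.out.println(hasCompleteVersion(Arrays.asList("v1", "v2", "v1")));
       }
   }
   ```
   
   Under this assumption, any time chunk that contains two or more distinct versions can never have a complete version, because each version holds strictly fewer chunks than the total used for `partitions`.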


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] maytasm closed issue #10999: Auto compaction can fail to find segments for compaction when segment versions are mixed in the same time chunk based on new segment granularity

Posted by GitBox <gi...@apache.org>.
maytasm closed issue #10999:
URL: https://github.com/apache/druid/issues/10999


   

