Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/09/07 05:22:02 UTC

jihoonson opened a new issue #8489: Stateful auto compaction
URL: https://github.com/apache/incubator-druid/issues/8489
 
 
   ### Motivation
   
   In auto compaction, the coordinator searches for segments to compact based on their byte size. The current algorithm is stateless: at each coordinator run, it traverses all segments of all datasources from the latest to the oldest, compares each segment's size against `targetCompactionSizeBytes` (this check is currently missing, which causes #8481), and issues a compaction task whenever it finds segments smaller than `targetCompactionSizeBytes`.
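
   For illustration, a compact sketch of that stateless selection in Java. `Segment` here is a simplified stand-in for `DataSegment`, and the method is illustrative rather than the actual coordinator code:
   
   ```java
   import java.util.ArrayList;
   import java.util.Comparator;
   import java.util.List;
   
   public class StatelessSelectionSketch
   {
     // Simplified stand-in for DataSegment with only the fields this sketch needs.
     static class Segment
     {
       final String interval; // e.g., "2019-09-06/2019-09-07"
       final long sizeBytes;
   
       Segment(String interval, long sizeBytes)
       {
         this.interval = interval;
         this.sizeBytes = sizeBytes;
       }
     }
   
     // Traverse segments from the latest interval to the oldest and collect
     // everything smaller than the target size, as the current algorithm does.
     static List<Segment> findCandidates(List<Segment> segments, long targetCompactionSizeBytes)
     {
       List<Segment> sorted = new ArrayList<>(segments);
       sorted.sort(Comparator.comparing((Segment s) -> s.interval).reversed());
       List<Segment> candidates = new ArrayList<>();
       for (Segment segment : sorted) {
         if (segment.sizeBytes < targetCompactionSizeBytes) {
           candidates.add(segment);
         }
       }
       return candidates;
     }
   }
   ```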
   
   However, comparing the segment size against `targetCompactionSizeBytes` alone is not enough to tell whether a given segment requires further compaction, because the segment could have been created with any type of `partitionsSpec` and later compacted with another. (As of now, auto compaction supports only `maxRowsPerSegment`, `maxTotalRows`, and `targetCompactionSizeBytes`, but it should support all `partitionsSpec` types in the future.)
   
   As of now, we have three `partitionsSpec` types: `DynamicPartitionsSpec`, `HashedPartitionsSpec`, and `SingleDimensionPartitionsSpec`.
   
   - `DynamicPartitionsSpec` has `maxRowsPerSegment` and `maxTotalRows`.
   - `HashedPartitionsSpec` has `targetPartitionSize` (target number of rows per segment), `numShards`, and `partitionDimensions`.
   - `SingleDimensionPartitionsSpec` has `targetPartitionSize` (target number of rows per segment), `maxPartitionSize` (max number of rows per segment), and `partitionDimensions`.
   
   In the coordinator, most of these configurations are hard to use for finding segments that need compaction because the number of rows in each segment is not available in the coordinator. Even if it were available, hash- or range-partitioned segments could have a totally different number of rows from `targetPartitionSize`, which makes it hard to tell whether a given segment needs compaction. Note that a segment doesn't need further compaction if it was already compacted with the same `partitionsSpec`. Auto compaction based on parallel indexing complicates things further: in parallel indexing with `DynamicPartitionsSpec`, the last segment created by each task can have fewer rows than `maxRowsPerSegment`, so a small segment is not necessarily an uncompacted one.
   
   ### Proposed changes
   
   To address this issue, I propose to store the state of auto compaction in the metadata store, so that auto compaction can search for compaction candidates among the segments which are not compacted yet.
   
   #### Metadata store change
   
   A new column `meta_payload` will be added to the `segments` table. This will store the below JSON blob per segment for now, but can be extended to store more information in the future if needed.
   
   ```json
   {
     "compactionPartitionsSpec" : {
       "type" : "dynamic",
       "maxRowsPerSegment" : 1000000,
       "maxTotalRows" : null
     }
   }
   ```
   
   This will be deserialized into the following Java class.
   
   ```java
   public class SegmentMetadata
   {
     // The partitionsSpec used by the compaction task that created this
     // segment; null if the segment has not been compacted yet.
     private final PartitionsSpec compactionPartitionsSpec;
   }
   ```
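
   For concreteness, a minimal, self-contained sketch of that deserialization with Jackson. The `PartitionsSpec` stand-ins below are simplified for illustration only; the real hierarchy lives in Druid and registers all three subtypes:
   
   ```java
   import com.fasterxml.jackson.annotation.JsonProperty;
   import com.fasterxml.jackson.annotation.JsonSubTypes;
   import com.fasterxml.jackson.annotation.JsonTypeInfo;
   import com.fasterxml.jackson.databind.ObjectMapper;
   
   public class MetaPayloadExample
   {
     // Simplified stand-in for Druid's polymorphic PartitionsSpec hierarchy;
     // only the "dynamic" type is modeled here.
     @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type")
     @JsonSubTypes(@JsonSubTypes.Type(name = "dynamic", value = DynamicSpec.class))
     interface PartitionsSpec {}
   
     static class DynamicSpec implements PartitionsSpec
     {
       @JsonProperty Long maxRowsPerSegment;
       @JsonProperty Long maxTotalRows;
     }
   
     static class SegmentMetadata
     {
       @JsonProperty PartitionsSpec compactionPartitionsSpec;
     }
   
     public static void main(String[] args) throws Exception
     {
       String payload = "{\"compactionPartitionsSpec\": {\"type\": \"dynamic\","
                        + " \"maxRowsPerSegment\": 1000000, \"maxTotalRows\": null}}";
       SegmentMetadata meta = new ObjectMapper().readValue(payload, SegmentMetadata.class);
       // Prints 1000000, confirming the polymorphic round trip.
       System.out.println(((DynamicSpec) meta.compactionPartitionsSpec).maxRowsPerSegment);
     }
   }
   ```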
   
   #### Changes in publishing segments
   
   When a _compaction_ task publishes segments, it sends its `partitionsSpec` along with the segments to the overlord (`SegmentTransactionalInsertAction`). The overlord constructs the corresponding `SegmentMetadata` and stores it in the `segments` table (`IndexerSQLMetadataStorageCoordinator`).
   
   #### Changes in the coordinator
   
   `SQLMetadataSegmentManager` loads `SegmentMetadata` as well as `DataSegment` and keeps both in memory. Since only a handful of distinct `PartitionsSpec` values typically exist across all segments, interning them would reduce memory usage.
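
   A minimal sketch of that interning with Guava, assuming `SegmentMetadata` gains a constructor and getter (omitted in the class above) and that the `PartitionsSpec` implementations define `equals()`/`hashCode()`:
   
   ```java
   import com.google.common.collect.Interner;
   import com.google.common.collect.Interners;
   
   public class SegmentMetadataInterner
   {
     // Equal specs collapse to one canonical instance, so millions of segment
     // rows can share a handful of PartitionsSpec objects.
     private static final Interner<PartitionsSpec> SPEC_INTERNER = Interners.newWeakInterner();
   
     public static SegmentMetadata canonicalize(SegmentMetadata meta)
     {
       if (meta.getCompactionPartitionsSpec() == null) {
         return meta;
       }
       return new SegmentMetadata(SPEC_INTERNER.intern(meta.getCompactionPartitionsSpec()));
     }
   }
   ```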
   
   `DruidCoordinatorSegmentCompactor` checks whether `SegmentMetadata` is available for a given segment. If it's missing, the segment must be a new segment created by a non-compaction task, so it becomes a compaction candidate. If it exists, the coordinator compares the `compactionPartitionsSpec` in `SegmentMetadata` with the one in the auto compaction configuration; the segment becomes a compaction candidate if the two differ.
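
   The per-segment decision could look like the following illustrative sketch (a hypothetical helper, not the actual coordinator code; the getter on `SegmentMetadata` is assumed):
   
   ```java
   import javax.annotation.Nullable;
   
   // segmentMetadata is the stored state (null if the segment was never
   // compacted); configuredSpec comes from the datasource's auto compaction
   // configuration.
   static boolean needsCompaction(
       @Nullable SegmentMetadata segmentMetadata,
       PartitionsSpec configuredSpec
   )
   {
     if (segmentMetadata == null) {
       // Created by a non-compaction task: always a candidate.
       return true;
     }
     // Already compacted: recompact only if the configured spec has changed.
     return !configuredSpec.equals(segmentMetadata.getCompactionPartitionsSpec());
   }
   ```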
   
   ### Rationale
   
   #### Dropped alternative
   
   The coordinator could get the number of rows in each segment from the system schema to find compaction candidates. However, as mentioned in the motivation section, the number of rows is not enough to determine whether a given segment needs compaction.
   
   ### Operational impact
   
   The metadata store change requires altering the `segments` table before the upgrade. To minimize the operational impact and avoid an unwanted alter operation, the overlord and the coordinator can check whether the new column exists in the metadata store on startup and fall back to the current implementation if it's missing.
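
   A minimal sketch of that startup check over plain JDBC (the helper name is illustrative):
   
   ```java
   import java.sql.Connection;
   import java.sql.ResultSet;
   import java.sql.SQLException;
   
   // Returns true if the segments table already has the meta_payload column,
   // i.e., stateful auto compaction can be enabled; false means fall back.
   static boolean hasMetaPayloadColumn(Connection connection) throws SQLException
   {
     try (ResultSet rs = connection.getMetaData()
                                   .getColumns(null, null, "segments", "meta_payload")) {
       return rs.next();
     }
   }
   ```
   
   Note that identifier case in `getColumns` is database-dependent, so a real check would need to account for the metadata store's dialect.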
   
   For operators who want to use the new stateful auto compaction, we should update the rolling update document to note that the table alteration is required before the upgrade.
   
   ### Test plan
   
   - Will add unit tests
   - Will test in our internal cluster
