Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/06/06 09:24:52 UTC

[GitHub] [flink-table-store] LadyForest commented on pull request #119: POC for ALTER TABLE ... COMPACT

LadyForest commented on PR #119:
URL: https://github.com/apache/flink-table-store/pull/119#issuecomment-1147245509

   As discussed offline, the new implementation has been revised as follows.
   
   * For non-rescale-bucket compaction, we do not perform a scan at the planning phase. Instead, we pass a flag along with the partition spec to indicate that this is an ordinary manually triggered compaction (see the first sketch after this list).
   
   * Introduce a new compaction strategy that "deep cleans" the data layout. The current `UniversalCompaction` operates on `LevelSortedRun`s and focuses on mitigating write amplification. For a manually triggered compaction, however, we want to eliminate all intersecting key ranges, so that after the compaction a scan can read by simple concatenation rather than by merging. Meanwhile, we want to compact small files on a best-effort basis. The proposed strategy works as follows (see the second sketch after this list).
     1. Use the `IntervalPartition` algorithm to partition the data files from the scanned manifest entries into sections (`List<List<SortedRun>>`). Key ranges do not overlap across sections, while the sorted runs within the same section have overlapping key ranges.
     2. As a result, filtering out the sections that contain more than one sorted run finds all overlapping files. In addition, a section containing a single sorted run can still be picked if it holds more than two small data files.
     3. **IMPORTANT**: compaction is performed within each section, never across sections. Consequently, this strategy may pick a list of `CompactUnit`s for a single bucket, which differs from `UniversalCompaction`.
     
   * Introduce a new `PrecommittingSinkWriter` implementation to perform the dedicated compaction tasks. This writer scans the table and selects the partitions and buckets that belong to the current sub-task id, then creates a per-bucket compact writer to submit the compaction. Since no data is shuffled between source and sink, all compaction work is performed when `SinkWriterOperator#endInput` is invoked (see the third sketch after this list).
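
   First sketch, for the flag mentioned in the first bullet: the plan carries nothing more than the partition spec and a boolean. `ManualCompactionRequest` is a hypothetical name used only for illustration; the actual classes and names in the PR may differ.

```java
import java.io.Serializable;
import java.util.Map;

/**
 * Hypothetical illustration only: the plan does not scan any files at planning
 * time; it just carries the partition spec plus a flag marking an ordinary
 * manually triggered compaction.
 */
public class ManualCompactionRequest implements Serializable {

    private static final long serialVersionUID = 1L;

    // e.g. {"dt" -> "2022-06-06"}; an empty map means the whole table
    private final Map<String, String> partitionSpec;

    // true for ALTER TABLE ... COMPACT without rescaling the bucket number
    private final boolean ordinaryManualCompaction;

    public ManualCompactionRequest(
            Map<String, String> partitionSpec, boolean ordinaryManualCompaction) {
        this.partitionSpec = partitionSpec;
        this.ordinaryManualCompaction = ordinaryManualCompaction;
    }

    public Map<String, String> partitionSpec() {
        return partitionSpec;
    }

    public boolean isOrdinaryManualCompaction() {
        return ordinaryManualCompaction;
    }
}
```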

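   Second sketch, outlining the section-wise picking logic from the second bullet. `DataFile`, `SortedRun`, and `CompactUnit` are simplified stand-ins rather than the real table store classes, the `IntervalPartition` step is assumed to have already produced the sections, and `smallFileThreshold` is an assumed parameter.

```java
import java.util.ArrayList;
import java.util.List;

public class FullCompactionPickerSketch {

    /** Simplified stand-in for a data file's metadata. */
    static class DataFile {
        final long fileSizeInBytes;

        DataFile(long fileSizeInBytes) {
            this.fileSizeInBytes = fileSizeInBytes;
        }
    }

    /** Simplified stand-in for a sorted run: files whose key ranges do not overlap. */
    static class SortedRun {
        final List<DataFile> files;

        SortedRun(List<DataFile> files) {
            this.files = files;
        }
    }

    /** Simplified stand-in for one unit of compaction work within a bucket. */
    static class CompactUnit {
        final List<SortedRun> runs;

        CompactUnit(List<SortedRun> runs) {
            this.runs = runs;
        }
    }

    /**
     * @param sections output of the IntervalPartition step: key ranges do not
     *     overlap across sections, while the runs inside one section overlap
     * @param smallFileThreshold files below this size count as "small"
     */
    public List<CompactUnit> pick(List<List<SortedRun>> sections, long smallFileThreshold) {
        List<CompactUnit> units = new ArrayList<>();
        for (List<SortedRun> section : sections) {
            if (section.size() > 1) {
                // More than one sorted run means overlapping key ranges inside
                // this section: rewrite it so a later scan can concatenate
                // instead of merge.
                units.add(new CompactUnit(section));
            } else if (countSmallFiles(section.get(0), smallFileThreshold) > 2) {
                // A single sorted run with more than two small files: compact
                // it on a best-effort basis to reduce the file count.
                units.add(new CompactUnit(section));
            }
            // Each unit stays confined to its own section; units are never
            // merged across sections, so one bucket may yield several units.
        }
        return units;
    }

    private static int countSmallFiles(SortedRun run, long threshold) {
        int count = 0;
        for (DataFile file : run.files) {
            if (file.fileSizeInBytes < threshold) {
                count++;
            }
        }
        return count;
    }
}
```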

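   Third sketch, showing the intended control flow of the dedicated compaction writer. It deliberately avoids the real Flink `PrecommittingSinkWriter` interface, and `scanBucketsOwnedBy` / `compactBucket` are hypothetical helpers used only to make the flow explicit.

```java
import java.util.ArrayList;
import java.util.List;

public class CompactionSinkWriterSketch {

    /** Identifies one (partition, bucket) pair that this subtask owns. */
    static class BucketEntry {
        String partition;
        int bucket;
    }

    /** Simplified stand-in for the committable produced per bucket. */
    static class Committable {
        String partition;
        int bucket;
    }

    private final int subtaskId;
    private final int numSubtasks;

    public CompactionSinkWriterSketch(int subtaskId, int numSubtasks) {
        this.subtaskId = subtaskId;
        this.numSubtasks = numSubtasks;
    }

    /** No records flow through this writer; compaction happens at end of input. */
    public void write(Object element) {
        // intentionally a no-op: no data is shuffled from source to sink
    }

    /** Called when the bounded job reaches end of input (cf. SinkWriterOperator#endInput). */
    public List<Committable> prepareCommit() {
        List<Committable> committables = new ArrayList<>();
        for (BucketEntry entry : scanBucketsOwnedBy(subtaskId, numSubtasks)) {
            // Create a per-bucket compact writer, run the section-wise
            // compaction picked by the strategy above, and collect the result.
            committables.add(compactBucket(entry));
        }
        return committables;
    }

    // --- hypothetical helpers, shown only to make the control flow explicit ---

    private List<BucketEntry> scanBucketsOwnedBy(int subtaskId, int numSubtasks) {
        // Scan the latest snapshot and keep the buckets assigned to this
        // subtask, e.g. (hash(partition, bucket) % numSubtasks) == subtaskId.
        return new ArrayList<>();
    }

    private Committable compactBucket(BucketEntry entry) {
        // Submit the compaction for this bucket and collect the new files.
        return new Committable();
    }
}
```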