Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/04/21 04:12:20 UTC

[GitHub] [flink-table-store] JingsongLi opened a new pull request, #97: [FLINK-27335] Optimize async compaction in MergeTreeWriter

JingsongLi opened a new pull request, #97:
URL: https://github.com/apache/flink-table-store/pull/97

   Currently, full compaction may block the writer, which increases LogStore latency.
   
   We need to decouple compaction from writing and make compaction completely asynchronous.
   However, too many files make reads unstable. When compaction cannot keep up with the writer and files accumulate, the writer needs to be back-pressured.
   
   Stop parameter: num-sorted-run.stop-trigger, default 10
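
   The back-pressure rule described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `BackPressureSketch`, `compactionMillis`, and the "merge everything into one run" compaction are hypothetical stand-ins, and only the `runs > stopTrigger` check mirrors the diff.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a merge-tree writer; not the real MergeTreeWriter. */
final class BackPressureSketch implements AutoCloseable {
    private final int stopTrigger;        // plays the role of num-sorted-run.stop-trigger
    private final long compactionMillis;  // assumption: simulated cost of one compaction
    private final AtomicInteger sortedRuns = new AtomicInteger();
    private final ExecutorService compactExecutor = Executors.newSingleThreadExecutor();
    private Future<?> compaction;         // most recently submitted compaction, if any
    private int maxObservedRuns;
    private int blockedFlushes;

    BackPressureSketch(int stopTrigger, long compactionMillis) {
        if (stopTrigger < 1) {
            throw new IllegalArgumentException("stopTrigger must be at least 1");
        }
        this.stopTrigger = stopTrigger;
        this.compactionMillis = compactionMillis;
    }

    /** Adds one sorted run. Must only be called from a single writer thread. */
    void flush() throws Exception {
        boolean blocked = false;
        // Back pressure: wait for compaction only while there are too many runs.
        while (sortedRuns.get() > stopTrigger) {
            blocked = true;
            submitCompactionIfIdle();
            compaction.get();
        }
        if (blocked) {
            blockedFlushes++;
        }
        maxObservedRuns = Math.max(maxObservedRuns, sortedRuns.incrementAndGet());
        // Normal path: compaction is submitted but never awaited.
        submitCompactionIfIdle();
    }

    private void submitCompactionIfIdle() {
        if (compaction == null || compaction.isDone()) {
            compaction = compactExecutor.submit(() -> {
                Thread.sleep(compactionMillis); // simulate a slow merge
                sortedRuns.set(1);              // assumption: all runs merge into one
                return null;
            });
        }
    }

    int maxObservedRuns() {
        return maxObservedRuns;
    }

    int blockedFlushes() {
        return blockedFlushes;
    }

    @Override
    public void close() throws InterruptedException {
        compactExecutor.shutdown();
        compactExecutor.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        try (BackPressureSketch writer = new BackPressureSketch(3, 20)) {
            for (int i = 0; i < 20; i++) {
                writer.flush();
            }
            System.out.println("max runs: " + writer.maxObservedRuns()
                    + ", blocked flushes: " + writer.blockedFlushes());
        }
    }
}
```

   In this sketch the writer never holds more than `stopTrigger + 1` sorted runs, and it waits on compaction only when the run count exceeds the stop trigger. How often it blocks depends on timing.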


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink-table-store] tsreaper commented on a diff in pull request #97: [FLINK-27335] Optimize async compaction in MergeTreeWriter

Posted by GitBox <gi...@apache.org>.
tsreaper commented on code in PR #97:
URL: https://github.com/apache/flink-table-store/pull/97#discussion_r858505932


##########
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/MergeTreeWriter.java:
##########
@@ -111,7 +114,10 @@ public void write(ValueKind valueKind, RowData key, RowData value) throws Except
 
     private void flush() throws Exception {
         if (memTable.size() > 0) {
-            finishCompaction();
+            if (levels.numberOfSortedRuns() > numSortedRunStopTrigger) {

Review Comment:
   Currently the number of sorted runs is limited by `compaction due to file num`. With async compaction, the number of sorted runs is very likely to exceed that limit. What should happen in this scenario?





[GitHub] [flink-table-store] JingsongLi merged pull request #97: [FLINK-27335] Optimize async compaction in MergeTreeWriter

Posted by GitBox <gi...@apache.org>.
JingsongLi merged PR #97:
URL: https://github.com/apache/flink-table-store/pull/97




[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #97: [FLINK-27335] Optimize async compaction in MergeTreeWriter

Posted by GitBox <gi...@apache.org>.
JingsongLi commented on code in PR #97:
URL: https://github.com/apache/flink-table-store/pull/97#discussion_r858605411


##########
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/MergeTreeWriter.java:
##########
@@ -111,7 +114,10 @@ public void write(ValueKind valueKind, RowData key, RowData value) throws Except
 
     private void flush() throws Exception {
         if (memTable.size() > 0) {
-            finishCompaction();
+            if (levels.numberOfSortedRuns() > numSortedRunStopTrigger) {

Review Comment:
   In fact the biggest benefit is the decoupling of the compactionTrigger parameter and the stopTrigger parameter, giving the user the flexibility to define the behavior.





[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #97: [FLINK-27335] Optimize async compaction in MergeTreeWriter

Posted by GitBox <gi...@apache.org>.
JingsongLi commented on code in PR #97:
URL: https://github.com/apache/flink-table-store/pull/97#discussion_r854830708


##########
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/MergeTreeOptions.java:
##########
@@ -115,6 +125,7 @@ public MergeTreeOptions(
         this.pageSize = pageSize;
         this.targetFileSize = targetFileSize;
         this.numSortedRunMax = numSortedRunMax;
+        this.numSortedRunStopTrigger = numSortedRunStopTrigger;

Review Comment:
   Also add documentation for this option?





[GitHub] [flink-table-store] tsreaper commented on a diff in pull request #97: [FLINK-27335] Optimize async compaction in MergeTreeWriter

Posted by GitBox <gi...@apache.org>.
tsreaper commented on code in PR #97:
URL: https://github.com/apache/flink-table-store/pull/97#discussion_r858346622


##########
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/MergeTreeOptions.java:
##########
@@ -115,6 +125,7 @@ public MergeTreeOptions(
         this.pageSize = pageSize;
         this.targetFileSize = targetFileSize;
         this.numSortedRunMax = numSortedRunMax;
+        this.numSortedRunStopTrigger = Math.max(numSortedRunMax, numSortedRunStopTrigger);

Review Comment:
   Make sure that `numSortedRunMax >= numSortedRunStopTrigger`? Otherwise the number of sorted runs may exceed `numSortedRunMax` which contradicts the description of that config option.



##########
flink-table-store-core/src/test/java/org/apache/flink/table/store/file/mergetree/compact/CompactManagerTest.java:
##########
@@ -197,7 +197,7 @@ private void innerTest(
                 new CompactManager(
                         service, strategy, comparator, 2, testRewriter(expectedDropDelete));
         manager.submitCompaction(levels);
-        manager.finishCompaction(levels);

Review Comment:
   Add tests in which the compaction thread randomly hangs, to simulate heavy compaction tasks.





[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #97: [FLINK-27335] Optimize async compaction in MergeTreeWriter

Posted by GitBox <gi...@apache.org>.
JingsongLi commented on code in PR #97:
URL: https://github.com/apache/flink-table-store/pull/97#discussion_r858604620


##########
flink-table-store-core/src/test/java/org/apache/flink/table/store/file/mergetree/compact/CompactManagerTest.java:
##########
@@ -197,7 +197,7 @@ private void innerTest(
                 new CompactManager(
                         service, strategy, comparator, 2, testRewriter(expectedDropDelete));
         manager.submitCompaction(levels);
-        manager.finishCompaction(levels);

Review Comment:
   `MergeTreeTest` can cover this.
   Its compaction is very slow.





[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #97: [FLINK-27335] Optimize async compaction in MergeTreeWriter

Posted by GitBox <gi...@apache.org>.
JingsongLi commented on code in PR #97:
URL: https://github.com/apache/flink-table-store/pull/97#discussion_r858599817


##########
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/MergeTreeOptions.java:
##########
@@ -115,6 +125,7 @@ public MergeTreeOptions(
         this.pageSize = pageSize;
         this.targetFileSize = targetFileSize;
         this.numSortedRunMax = numSortedRunMax;
+        this.numSortedRunStopTrigger = Math.max(numSortedRunMax, numSortedRunStopTrigger);

Review Comment:
   I think we can rename `numSortedRunMax` to `numSortedRunCompactionTrigger`.
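
   As a rough illustration of the two thresholds discussed in this thread, the sketch below clamps the stop trigger with `Math.max`, as in the diff above, and uses the proposed name `numSortedRunCompactionTrigger`. The class `SortedRunOptionsSketch` and the method `mustBlockWriter` are hypothetical and are not part of `MergeTreeOptions`.

```java
/** Hypothetical holder for the two sorted-run thresholds; not the real MergeTreeOptions. */
final class SortedRunOptionsSketch {
    final int numSortedRunCompactionTrigger; // name proposed in this review thread
    final int numSortedRunStopTrigger;

    SortedRunOptionsSketch(int compactionTrigger, int stopTrigger) {
        if (compactionTrigger < 1) {
            throw new IllegalArgumentException("compactionTrigger must be at least 1");
        }
        this.numSortedRunCompactionTrigger = compactionTrigger;
        // Clamp as in the diff: the stop trigger is never below the compaction trigger.
        this.numSortedRunStopTrigger = Math.max(compactionTrigger, stopTrigger);
    }

    /** Mirrors the check in flush(): block only when runs exceed the stop trigger. */
    boolean mustBlockWriter(int numberOfSortedRuns) {
        return numberOfSortedRuns > numSortedRunStopTrigger;
    }

    public static void main(String[] args) {
        SortedRunOptionsSketch options = new SortedRunOptionsSketch(5, 3);
        System.out.println("effective stop trigger: " + options.numSortedRunStopTrigger);
    }
}
```

   With the clamp, a stop trigger configured below the compaction trigger is silently raised rather than rejected. Throwing on an invalid combination would be an alternative design.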





[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #97: [FLINK-27335] Optimize async compaction in MergeTreeWriter

Posted by GitBox <gi...@apache.org>.
JingsongLi commented on code in PR #97:
URL: https://github.com/apache/flink-table-store/pull/97#discussion_r854826594


##########
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/MergeTreeOptions.java:
##########
@@ -115,6 +125,7 @@ public MergeTreeOptions(
         this.pageSize = pageSize;
         this.targetFileSize = targetFileSize;
         this.numSortedRunMax = numSortedRunMax;
+        this.numSortedRunStopTrigger = numSortedRunStopTrigger;

Review Comment:
   Check that `numSortedRunStopTrigger` is larger than `numSortedRunMax`.


