Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2023/01/09 04:41:22 UTC

[GitHub] [hudi] trushev opened a new pull request, #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

trushev opened a new pull request, #7626:
URL: https://github.com/apache/hudi/pull/7626

   This PR significantly reduces the memory footprint of workloads with thousands of active partitions between checkpoints. Such workloads arise when the checkpoint interval is long. More precisely, an active partition here is a special case of an active fileId.
   The write client holds a map of write handles so that it can create a ReplaceHandle between checkpoints. This leads to `OutOfMemoryError` on such workloads because a write handle is a huge object.
   Essentially, it is enough to hold only the write path instead of the whole handle.
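
As a rough illustration of the idea above (not code from this PR), the sketch below keeps only a write path per file ID once a handle is closed. `HeavyHandle` and `ClosedHandlePaths` are hypothetical names invented for this example, and the 1 MiB buffer merely stands in for whatever state a real write handle retains.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Demo entry point: closes 3000 hypothetical handles and keeps only their paths.
class ClosedHandlePathsDemo {
  public static void main(String[] args) {
    ClosedHandlePaths paths = new ClosedHandlePaths();
    for (int i = 0; i < 3000; i++) {
      String path = "/tmp/sink/part" + i + "/file-" + i + ".parquet";
      paths.closeAndRemember("file-" + i, new HeavyHandle(path));
    }
    System.out.println(paths.size() + " paths retained");
    System.out.println(paths.pathFor("file-42").orElse("<none>"));
  }
}

// Hypothetical stand-in for a heavyweight write handle; not a Hudi class.
final class HeavyHandle {
  private final String writePath;
  private byte[] buffer = new byte[1 << 20]; // simulates about 1 MiB of writer state

  HeavyHandle(String writePath) {
    this.writePath = writePath;
  }

  String getWritePath() {
    return writePath;
  }

  void close() {
    buffer = null; // release the simulated writer state
  }
}

// Remembers only the write path of each closed handle, keyed by file ID.
final class ClosedHandlePaths {
  private final Map<String, String> fileIdToPath = new HashMap<>();

  // Closes the handle and keeps only its path, so the handle itself can be garbage collected.
  void closeAndRemember(String fileId, HeavyHandle handle) {
    fileIdToPath.put(fileId, handle.getWritePath());
    handle.close();
  }

  Optional<String> pathFor(String fileId) {
    return Optional.ofNullable(fileIdToPath.get(fileId));
  }

  // Intended to be called once a checkpoint completes.
  void clear() {
    fileIdToPath.clear();
  }

  int size() {
    return fileIdToPath.size();
  }
}
```

With this shape, the retained state per file ID is a short string rather than a whole handle object, which is the trade-off this description argues for.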
   
   ### Change Logs
   
   1. Released the writer once a create handle is closed. The append and merge handles already use the same approach.
   1. Introduced `FlinkClosedHandle`, which has a minimal memory footprint. It is needed because a create handle is a huge object even after its writer is released.
   1. Removed the append handle from the handle map because it is not used anyway. This reduces the memory footprint and fixes a potential NPE.
   1. Replaced `HoodieWriteHandle<?, ?, ?, ?>` with `MiniBatchHandle` in the Flink modules. This is needed because the lightweight `FlinkClosedHandle` implements `MiniBatchHandle` and does not extend `HoodieWriteHandle`. The refactoring is safe because all Flink handles implement `MiniBatchHandle`.
   
   ### Impact
   
   Workload test with `-Xmx2048m`
   ```SQL
   create table source (
       `id` int,
       `data` string
   ) with (
       'connector' = 'datagen',
       'rows-per-second' = '100',
       'fields.id.kind' = 'sequence',
       'fields.id.start' = '0',
       'fields.id.end' = '3000'
   );
   create table sink (
       `id` int primary key,
       `data` string,
       `part` string
   ) partitioned by (`part`) with (
       'connector' = 'hudi',
       'path' = '/tmp/sink',
       'write.batch.size' = '0.001',  -- 1024 bytes
       'write.task.max.size' = '101.001',  -- 101.001MB
       'write.merge.max_memory' = '1'  -- 1024 bytes
   );
   
   insert into sink select `id`, `data`, concat('part', cast(`id` as string)) as `part` from source;
   ```
   
   #### Before
   
   The job fails with `OutOfMemoryError` at partition 550. The heap dump shows that the handle map holds 1.4 GB.
   
   <img width="1300" alt="Screenshot 2023-01-09 at 11 13 13" src="https://user-images.githubusercontent.com/42293632/211240084-c1c136d9-b8bf-46c2-9dde-48404fc5f38a.png">
   
   ![fix-oom-before](https://user-images.githubusercontent.com/42293632/211239403-d0609d83-e56d-4326-9d5d-64db3c80bce9.png)
   
   #### After 
   3000 rows written
   ![fix-oom-after](https://user-images.githubusercontent.com/42293632/211239453-4898444f-1b76-4132-8bfc-891c716ca92a.png)
   
   ### Risk level (write none, low medium or high below)
   
   Low
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] trushev commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1067712881


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java:
##########
@@ -94,7 +94,7 @@
    * FileID to write handle mapping in order to record the write handles for each file group,
    * so that we can append the mini-batch data buffer incrementally.
    */
-  private final Map<String, HoodieWriteHandle<?, ?, ?, ?>> bucketToHandles;
+  private final Map<String, MiniBatchHandle> bucketToHandles;
 

Review Comment:
   Thanks, that is a good point, I'll check it





[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1386565755

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5a1b3a0d551128fd1f54a8c1b43c021330b7d056",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14184",
       "triggerID" : "5a1b3a0d551128fd1f54a8c1b43c021330b7d056",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cdfeaad2dde6819cf04e9db4395115cf46c2f35",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14221",
       "triggerID" : "9cdfeaad2dde6819cf04e9db4395115cf46c2f35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fad68a69ecf7e7eba8d43307e1b0fa9da6244857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14236",
       "triggerID" : "fad68a69ecf7e7eba8d43307e1b0fa9da6244857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ae0b2c787c8e3afd7f9a3f6cc04676f910373657",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14381",
       "triggerID" : "ae0b2c787c8e3afd7f9a3f6cc04676f910373657",
       "triggerType" : "PUSH"
     }, {
       "hash" : "784dd7c8e7f8d6b7013071df04bfb57121b1d6c9",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14382",
       "triggerID" : "784dd7c8e7f8d6b7013071df04bfb57121b1d6c9",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ae0b2c787c8e3afd7f9a3f6cc04676f910373657 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14381) 
   * 784dd7c8e7f8d6b7013071df04bfb57121b1d6c9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14382) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] trushev commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1396449286

   > Thanks for great effort @trushev, can we revert the changes/refactoring for `#performWriteOperation`, it is not related with this issue, and we can address it in another PR.
   > 
   > We can still abstract out some utilities for the `try-finally-close` handling.
   
   Thank you. I replaced the refactoring with a simple `try-finally-close` and rebased the branch on the latest master.




[GitHub] [hudi] trushev commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1386515641

   > The thing which I want to share is that caching write handles could take a lot of memory, because each handle obtains an instance of `HoodieTable`, and there is a `viewManager` in every `HoodieTable`, which will load all pending compaction plans from Hudi timeline when the `FileSystemViewManager#getFileSystemView` is called.
   
   @TengHuo I've reworked this PR; we no longer cache handles at all. If possible, could you please apply this commit and verify that your scenario is fixed?
    




[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1386860644

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5a1b3a0d551128fd1f54a8c1b43c021330b7d056",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14184",
       "triggerID" : "5a1b3a0d551128fd1f54a8c1b43c021330b7d056",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cdfeaad2dde6819cf04e9db4395115cf46c2f35",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14221",
       "triggerID" : "9cdfeaad2dde6819cf04e9db4395115cf46c2f35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fad68a69ecf7e7eba8d43307e1b0fa9da6244857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14236",
       "triggerID" : "fad68a69ecf7e7eba8d43307e1b0fa9da6244857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ae0b2c787c8e3afd7f9a3f6cc04676f910373657",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14381",
       "triggerID" : "ae0b2c787c8e3afd7f9a3f6cc04676f910373657",
       "triggerType" : "PUSH"
     }, {
       "hash" : "784dd7c8e7f8d6b7013071df04bfb57121b1d6c9",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14382",
       "triggerID" : "784dd7c8e7f8d6b7013071df04bfb57121b1d6c9",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 784dd7c8e7f8d6b7013071df04bfb57121b1d6c9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14382) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1397828189

   All the Flink-related tests have passed, so I will merge the PR soon.




[GitHub] [hudi] trushev commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1071735088


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java:
##########
@@ -94,7 +94,7 @@
    * FileID to write handle mapping in order to record the write handles for each file group,
    * so that we can append the mini-batch data buffer incrementally.
    */
-  private final Map<String, HoodieWriteHandle<?, ?, ?, ?>> bucketToHandles;
+  private final Map<String, MiniBatchHandle> bucketToHandles;
 

Review Comment:
   I think you are right. There is no need to store writers because they are closed by `HoodieConsumer.finish()`. So if we have `List<WriteStatus> writeStatus`, the writer is already closed. We should keep only one `currentWriter` to handle exceptions during the write operation. I'm preparing an updated fix now.
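
To make the "only one `currentWriter`" idea concrete, here is a minimal sketch under my own assumptions; it is not the code that landed in this PR. `SketchWriter` and `SingleWriterClient` are hypothetical names. The point is that the writer is closed in a `finally` block whether or not the write throws, so nothing needs to be cached afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Demo entry point: one successful batch, then one batch that fails midway.
class SingleWriterDemo {
  public static void main(String[] args) {
    SingleWriterClient client = new SingleWriterClient();
    System.out.println("written: " + client.writeBatch(Arrays.asList("a", "b")));
    try {
      client.writeBatch(Arrays.asList("a", null)); // the null record makes the write fail
    } catch (IllegalArgumentException e) {
      System.out.println("write failed, open writer left behind: " + client.hasOpenWriter());
    }
  }
}

// Hypothetical minimal writer; stands in for a write handle, not a Hudi class.
final class SketchWriter implements AutoCloseable {
  private final List<String> written = new ArrayList<>();
  private boolean closed;

  void write(String record) {
    if (closed) {
      throw new IllegalStateException("writer is closed");
    }
    if (record == null) {
      throw new IllegalArgumentException("null record");
    }
    written.add(record);
  }

  int count() {
    return written.size();
  }

  @Override
  public void close() {
    closed = true;
  }
}

// Keeps a reference to at most one writer, and only while a write is in progress.
final class SingleWriterClient {
  private SketchWriter currentWriter;

  int writeBatch(List<String> records) {
    SketchWriter writer = new SketchWriter();
    currentWriter = writer;
    try {
      for (String record : records) {
        writer.write(record);
      }
      return writer.count();
    } finally {
      // Runs on success and on failure, so the writer never outlives the call.
      writer.close();
      currentWriter = null;
    }
  }

  boolean hasOpenWriter() {
    return currentWriter != null;
  }
}
```

Because the reference is dropped in `finally`, the number of live writers stays at one regardless of how many partitions are written between checkpoints.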





[GitHub] [hudi] TengHuo commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1072031352


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/BucketHandles.java:
##########
@@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.common.util.VisibleForTesting;
+import org.apache.hudi.io.HoodieWriteHandle;
+import org.apache.hudi.io.MiniBatchHandle;
+
+import org.apache.hadoop.fs.Path;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * This class is responsible for keeping write handles between checkpoints.
+ * It keeps paths instead of closed handles to reduce memory footprint.
+ * These write paths are used to create ReplaceHandle.
+ */
+public final class BucketHandles {
+  private final Map<String, MiniBatchHandle> fileToHandle;
+  private final Map<String, Path> fileToPath;
+

Review Comment:
   Got it, np.
   
   Then I will try to implement a temporary fix in our internal version.





[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1397582224

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5a1b3a0d551128fd1f54a8c1b43c021330b7d056",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14184",
       "triggerID" : "5a1b3a0d551128fd1f54a8c1b43c021330b7d056",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cdfeaad2dde6819cf04e9db4395115cf46c2f35",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14221",
       "triggerID" : "9cdfeaad2dde6819cf04e9db4395115cf46c2f35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fad68a69ecf7e7eba8d43307e1b0fa9da6244857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14236",
       "triggerID" : "fad68a69ecf7e7eba8d43307e1b0fa9da6244857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ae0b2c787c8e3afd7f9a3f6cc04676f910373657",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14381",
       "triggerID" : "ae0b2c787c8e3afd7f9a3f6cc04676f910373657",
       "triggerType" : "PUSH"
     }, {
       "hash" : "784dd7c8e7f8d6b7013071df04bfb57121b1d6c9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14382",
       "triggerID" : "784dd7c8e7f8d6b7013071df04bfb57121b1d6c9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cfe72cb4a06010b19c96abf25b3ebf9d7f6e895",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14441",
       "triggerID" : "1cfe72cb4a06010b19c96abf25b3ebf9d7f6e895",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1cfe72cb4a06010b19c96abf25b3ebf9d7f6e895 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14441) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1378275192

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5a1b3a0d551128fd1f54a8c1b43c021330b7d056",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14184",
       "triggerID" : "5a1b3a0d551128fd1f54a8c1b43c021330b7d056",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cdfeaad2dde6819cf04e9db4395115cf46c2f35",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14221",
       "triggerID" : "9cdfeaad2dde6819cf04e9db4395115cf46c2f35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fad68a69ecf7e7eba8d43307e1b0fa9da6244857",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14236",
       "triggerID" : "fad68a69ecf7e7eba8d43307e1b0fa9da6244857",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9cdfeaad2dde6819cf04e9db4395115cf46c2f35 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14221) 
   * fad68a69ecf7e7eba8d43307e1b0fa9da6244857 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14236) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1080788718


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java:
##########
@@ -496,4 +474,67 @@ private List<String> getAllExistingFileIds(HoodieFlinkTable<T> table, String par
     // because new commit is not complete. it is safe to mark all existing file Ids as old files
     return table.getSliceView().getLatestFileSlices(partitionPath).map(FileSlice::getFileId).distinct().collect(Collectors.toList());
   }
+
+  private List<WriteStatus> performWriteOperation(
+      List<HoodieRecord<T>> records,
+      String instantTime,
+      WriteOperationType operationType,
+      WriteOperationAction<T> writeOperationAction
+  ) {
+    HoodieFlinkTable<T> table = initTable(operationType, instantTime);
+    return performWriteOperation(table, records, instantTime, operationType, writeOperationAction);
+  }
+
+  private List<WriteStatus> performWriteOperation(
+      HoodieFlinkTable<T> table,
+      List<HoodieRecord<T>> records,
+      String instantTime,
+      WriteOperationType operationType,
+      WriteOperationAction<T> writeOperationAction
+  ) {
+    HoodieWriteMetadata<List<WriteStatus>> result;
+    HoodieWriteHandle<?, ?, ?, ?> writeHandle = null;
+    try {
+      writeHandle = getOrCreateWriteHandle(records.get(0), getConfig(), instantTime, table, records.listIterator());
+      result = writeOperationAction.perform(table, writeHandle);
+    } finally {
+      if (writeHandle != null) {
+        // ensure the writer is closed
+        ((MiniBatchHandle) writeHandle).closeGracefully();

Review Comment:
   Check for `writeHandle != null` seems better?





[GitHub] [hudi] trushev commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1396468460

   @TengHuo 
   I tried the following workload with a MOR table, 2000 partitions, and compaction (the checkpoint here triggers compaction):
   ```java
   public class TestPartitionsWorkloadWithCompaction extends TestWriteMergeOnReadWithCompact {
     @Test
     public void write2000partitions() throws Exception {
       int partitionCount = 2000; // aka rowCount
       List<RowData> oneRowPerPartitionData = IntStream.range(0, partitionCount).mapToObj(counter -> TestData.insertRow(
           StringData.fromString("id" + counter),
           StringData.fromString("Name"),
           0,
           TimestampData.fromEpochMillis(counter),
           StringData.fromString("par" + counter))
       ).collect(Collectors.toList());
       conf.setDouble(FlinkOptions.WRITE_BATCH_SIZE, 0.001); // 1024 bytes
       conf.setDouble(FlinkOptions.WRITE_TASK_MAX_SIZE, 101.001); // 101.001MB - 1024 bytes
       conf.setInteger(FlinkOptions.WRITE_MERGE_MAX_MEMORY, 1); // 1024 bytes
       preparePipeline(conf)
           .consume(oneRowPerPartitionData)
           .assertNextEvent()
           .checkpoint(1)
           .checkpointComplete(1)
           .end();
     }
   }
   ```
   It looks like this PR fixes your problem.
   
   <img width="714" alt="Screenshot 2023-01-19 at 12 39 46" src="https://user-images.githubusercontent.com/42293632/213364715-c0a4c125-7415-4cab-8d94-6916ba85172e.png">
   




[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1396492779

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "5a1b3a0d551128fd1f54a8c1b43c021330b7d056",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14184",
       "triggerID" : "5a1b3a0d551128fd1f54a8c1b43c021330b7d056",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9cdfeaad2dde6819cf04e9db4395115cf46c2f35",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14221",
       "triggerID" : "9cdfeaad2dde6819cf04e9db4395115cf46c2f35",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fad68a69ecf7e7eba8d43307e1b0fa9da6244857",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14236",
       "triggerID" : "fad68a69ecf7e7eba8d43307e1b0fa9da6244857",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ae0b2c787c8e3afd7f9a3f6cc04676f910373657",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14381",
       "triggerID" : "ae0b2c787c8e3afd7f9a3f6cc04676f910373657",
       "triggerType" : "PUSH"
     }, {
       "hash" : "784dd7c8e7f8d6b7013071df04bfb57121b1d6c9",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14382",
       "triggerID" : "784dd7c8e7f8d6b7013071df04bfb57121b1d6c9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cfe72cb4a06010b19c96abf25b3ebf9d7f6e895",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14441",
       "triggerID" : "1cfe72cb4a06010b19c96abf25b3ebf9d7f6e895",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 784dd7c8e7f8d6b7013071df04bfb57121b1d6c9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14382) 
   * 1cfe72cb4a06010b19c96abf25b3ebf9d7f6e895 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14441) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] trushev commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1065499424


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java:
##########
@@ -94,7 +94,7 @@
    * FileID to write handle mapping in order to record the write handles for each file group,
    * so that we can append the mini-batch data buffer incrementally.
    */
-  private final Map<String, HoodieWriteHandle<?, ?, ?, ?>> bucketToHandles;
+  private final Map<String, MiniBatchHandle> bucketToHandles;
 

Review Comment:
   - Reverted the replacement of `HoodieWriteHandle` with `MiniBatchHandle`.
   - Reverted `FlinkClosedHandle`.
   - Introduced the class `BucketHandles`, which replaces `Map<String, HoodieWriteHandle<?, ?, ?, ?>> bucketToHandles` and keeps both handles and write paths.
   - Added a test for `BucketHandles`.
   
   The workload benchmark results are the same.





[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1386560274

   ## CI report:
   
   * fad68a69ecf7e7eba8d43307e1b0fa9da6244857 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14236) 
   * ae0b2c787c8e3afd7f9a3f6cc04676f910373657 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14381) 
   * 784dd7c8e7f8d6b7013071df04bfb57121b1d6c9 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1378271217

   ## CI report:
   
   * 9cdfeaad2dde6819cf04e9db4395115cf46c2f35 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14221) 
   * fad68a69ecf7e7eba8d43307e1b0fa9da6244857 UNKNOWN
   


[GitHub] [hudi] trushev commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1384831292

   > May I ask if we can lazy load `HoodieTableFileSystemView` in `PriorityBasedFileSystemView` when creating `FlinkAppendHandle`? It can also reduce memory usage for active partitions.
   
    @TengHuo thank you for the report. I'll try to reproduce the scenario and consider it here
   




[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1375133022

   ## CI report:
   
   * 5a1b3a0d551128fd1f54a8c1b43c021330b7d056 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14184) 
   


[GitHub] [hudi] TengHuo commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1072001434


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/BucketHandles.java:
##########
@@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.common.util.VisibleForTesting;
+import org.apache.hudi.io.HoodieWriteHandle;
+import org.apache.hudi.io.MiniBatchHandle;
+
+import org.apache.hadoop.fs.Path;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * This class is responsible for keeping write handles between checkpoints.
+ * It keeps paths instead of closed handles to reduce memory footprint.
+ * These write paths are used to create ReplaceHandle.
+ */
+public final class BucketHandles {
+  private final Map<String, MiniBatchHandle> fileToHandle;
+  private final Map<String, Path> fileToPath;
+

Review Comment:
   I think we can add a `HoodieTable` attribute here and share this table object across the handles.
   
   I can raise a PR to your branch. What do you think?





[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1386553951

   ## CI report:
   
   * fad68a69ecf7e7eba8d43307e1b0fa9da6244857 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14236) 
   * ae0b2c787c8e3afd7f9a3f6cc04676f910373657 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14381) 
   


[GitHub] [hudi] TengHuo commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
TengHuo commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1397989682

   > @TengHuo I tried the following workload with MOR table, 2000 partitions and compaction (checkpoint here triggers compaction)
   
   Got it, thanks so much @trushev 
   




[GitHub] [hudi] trushev commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1065293780


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java:
##########
@@ -449,6 +450,7 @@ private boolean flushBucket(DataBucket bucket) {
 
     this.eventGateway.sendEventToCoordinator(event);
     writeStatuses.addAll(writeStatus);
+    writeClient.cleanHandle(bucket.fileID);
     return true;
   }

Review Comment:
   Do you mean: do we still need to clean all handles with `this.writeClient.cleanHandles()` in `flushRemaining`, given that we already clean the handle here?
   
   To be honest, I'm not sure. The handle map could hold an unbounded number of handles, and even lightweight closed handles could exceed the heap size. I guess an LRU cache would solve this problem, but I'm not sure about the benefits of such an approach here.
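
For reference, the LRU idea mentioned above can be sketched with the JDK's `LinkedHashMap` in access order. This only illustrates the bounding technique: the class name `LruPathCache` and the choice of string paths are assumptions, and the sketch does not settle whether an evicted path could be recovered when it is needed later.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded map from file id to write path with LRU eviction. */
final class LruPathCache extends LinkedHashMap<String, String> {
  private static final long serialVersionUID = 1L;

  private final int maxEntries;

  LruPathCache(int maxEntries) {
    // accessOrder = true: iteration order runs from least to most recently accessed.
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  /** Called by LinkedHashMap after each insertion; true evicts the eldest entry. */
  @Override
  protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
    return size() > maxEntries;
  }
}
```

A cache built as `new LruPathCache(2)` keeps at most two entries; reading an entry with `get` marks it as recently used, so the next insertion evicts the other one.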
   





[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1377542041

   ## CI report:
   
   * 9cdfeaad2dde6819cf04e9db4395115cf46c2f35 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14221) 
   


[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1375130533

   ## CI report:
   
   * 5a1b3a0d551128fd1f54a8c1b43c021330b7d056 UNKNOWN
   


[GitHub] [hudi] TengHuo commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
TengHuo commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1384918691

   > > May I ask if we can lazy load `HoodieTableFileSystemView` in `PriorityBasedFileSystemView` when creating `FlinkAppendHandle`? It can also reduce memory usage for active partitions.
   > 
   > @TengHuo thank you for the report. I'll try to reproduce the scenario and consider it here
   
   Thanks @trushev for your reply. Sorry, my initial idea may not be the proper way to solve this issue.
   
   What I want to share is that caching write handles can take a lot of memory: each handle obtains its own instance of `HoodieTable`, and every `HoodieTable` has a `viewManager` that loads all pending compaction plans from the Hudi timeline when `FileSystemViewManager#getFileSystemView` is called.
   
   So I'm wondering whether it is feasible to share one instance of `HoodieTable` when creating a new handle in `HoodieFlinkWriteClient#upsert` and other similar methods. As you have implemented `BucketHandles` for caching all active handles, we could create just one table instance and drop it when all handles are removed from the map.
   
   To reproduce the memory issue I hit, you need to use a MOR table and set up thousands of partitions. As soon as one compaction plan is generated, you will find that each `FlinkAppendHandle` takes a lot of memory because of `CompactionOperation`.
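
The sharing proposal above (one table instance for all active handles, dropped when the last handle goes away) could look roughly like the following sketch. `SharedTableHolder` and its methods are hypothetical names, the table type is left generic instead of using `HoodieTable`, and the class is not thread-safe.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical holder that shares one expensive instance among all active handles. */
final class SharedTableHolder<T> {
  private final Supplier<T> factory;
  private final Set<String> activeFileIds = new HashSet<>();
  private T table; // null while no handle is active

  SharedTableHolder(Supplier<T> factory) {
    this.factory = factory;
  }

  /** Returns the shared instance for a new handle, creating it on first use. */
  T acquire(String fileId) {
    if (table == null) {
      table = factory.get();
    }
    activeFileIds.add(fileId);
    return table;
  }

  /** Drops the shared instance once the last active handle is released. */
  void release(String fileId) {
    activeFileIds.remove(fileId);
    if (activeFileIds.isEmpty()) {
      table = null;
    }
  }

  /** True while a shared instance is being held. */
  boolean isLoaded() {
    return table != null;
  }
}
```

Under this scheme the cost of loading pending compaction plans would be paid once per shared instance rather than once per handle, assuming the handles can safely use the same table object.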




[GitHub] [hudi] trushev commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1071735088


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java:
##########
@@ -94,7 +94,7 @@
    * FileID to write handle mapping in order to record the write handles for each file group,
    * so that we can append the mini-batch data buffer incrementally.
    */
-  private final Map<String, HoodieWriteHandle<?, ?, ?, ?>> bucketToHandles;
+  private final Map<String, MiniBatchHandle> bucketToHandles;
 

Review Comment:
   I think you are right. There is no need to store writers because they are closed by `HoodieConsumer.finish()`. So if we have `List<WriteStatus> writeStatus`, then the writer is definitely already closed. We should keep only one `currentWriter` to handle an exception during the write operation. I'm preparing an updated fix now.
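
A rough sketch of the "single `currentWriter`" idea follows. It is an illustration only, not the updated fix: `CurrentWriterTracker`, its nested `Writer` interface, and the method names are hypothetical, and records are plain strings.

```java
import java.util.List;

/** Hypothetical sketch: only the in-flight writer is referenced, for cleanup on failure. */
final class CurrentWriterTracker {

  /** Hypothetical stand-in for a file writer. */
  public interface Writer {
    void write(String record);

    void close();
  }

  private Writer currentWriter; // null when no write is in flight

  /** Writes a batch; the writer is referenced only until the write has finished. */
  public void writeBatch(Writer writer, List<String> records) {
    currentWriter = writer;
    for (String record : records) {
      writer.write(record); // may throw, in which case currentWriter stays set
    }
    writer.close();
    currentWriter = null;
  }

  /** Failure hook: closes the writer left open by a failed write, if there is one. */
  public void cleanUpAfterFailure() {
    if (currentWriter != null) {
      currentWriter.close();
      currentWriter = null;
    }
  }

  /** True if a failed write has left a writer open. */
  public boolean hasOpenWriter() {
    return currentWriter != null;
  }
}
```

Once `writeBatch` returns normally, nothing is retained, so the number of live writers no longer grows with the number of file groups written between checkpoints.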





[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1375483526

   ## CI report:
   
   * 5a1b3a0d551128fd1f54a8c1b43c021330b7d056 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14184) 
   


[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1378613152

   ## CI report:
   
   * fad68a69ecf7e7eba8d43307e1b0fa9da6244857 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14236) 
   


[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1376966521

   ## CI report:
   
   * 5a1b3a0d551128fd1f54a8c1b43c021330b7d056 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14184) 
   * 9cdfeaad2dde6819cf04e9db4395115cf46c2f35 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14221) 
   


[GitHub] [hudi] trushev commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1065294170


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java:
##########
@@ -94,7 +94,7 @@
    * FileID to write handle mapping in order to record the write handles for each file group,
    * so that we can append the mini-batch data buffer incrementally.
    */
-  private final Map<String, HoodieWriteHandle<?, ?, ?, ?>> bucketToHandles;
+  private final Map<String, MiniBatchHandle> bucketToHandles;
 

Review Comment:
   Thanks, I'll try to rework it





[GitHub] [hudi] danny0405 commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1064456753


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java:
##########
@@ -94,7 +94,7 @@
    * FileID to write handle mapping in order to record the write handles for each file group,
    * so that we can append the mini-batch data buffer incrementally.
    */
-  private final Map<String, HoodieWriteHandle<?, ?, ?, ?>> bucketToHandles;
+  private final Map<String, MiniBatchHandle> bucketToHandles;
 

Review Comment:
   Can we just cache the write path instead of the write handles, so that we do not introduce too many unnecessary changes?



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java:
##########
@@ -449,6 +450,7 @@ private boolean flushBucket(DataBucket bucket) {
 
     this.eventGateway.sendEventToCoordinator(event);
     writeStatuses.addAll(writeStatus);
+    writeClient.cleanHandle(bucket.fileID);
     return true;
   }

Review Comment:
   Do we need to clean the handles eagerly for `#flushRemaining`?





[GitHub] [hudi] danny0405 commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1066607740


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java:
##########
@@ -94,7 +94,7 @@
    * FileID to write handle mapping in order to record the write handles for each file group,
    * so that we can append the mini-batch data buffer incrementally.
    */
-  private final Map<String, HoodieWriteHandle<?, ?, ?, ?>> bucketToHandles;
+  private final Map<String, MiniBatchHandle> bucketToHandles;
 

Review Comment:
   We need to clarify whether `HoodieFlinkWriteClient#cleanHandlesGracefully` is still valid. I wrote that logic intending it to work like a post-task-kill hook: if a writing task is force-killed while it is writing a file, how can we handle this more gracefully?
   
   In most cases, if the writing task was force-killed or canceled, the file writer should have been closed, right? I'm skeptical that it is still needed.
   
   This is related to this issue, because if `cleanHandlesGracefully` is no longer valid, there is no need to cache any write handles at all.





[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1386503348

   ## CI report:
   
   * fad68a69ecf7e7eba8d43307e1b0fa9da6244857 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14236) 
   * ae0b2c787c8e3afd7f9a3f6cc04676f910373657 UNKNOWN
   


[GitHub] [hudi] TengHuo commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
TengHuo commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1384806549

   Hi @trushev 
   
   Nice feature! We are hitting a similar memory problem in our Flink Hudi MOR pipeline: we found a heap OOM exception and abnormal GC activity in the task managers.
   
   Task manager GC metrics panel
   
   ![tm_gc](https://user-images.githubusercontent.com/7539060/212806036-e3a83720-ba72-42b0-9247-af9ca0913b0c.png)
   
   After checking, we noticed that `CompactionOperation` objects take up an unusually large amount of memory. This appears to be caused by `HoodieTableFileSystemView`, because each instance of `HoodieTableFileSystemView` loads all pending compaction plans from the timeline into memory.
   
   This is the part of the task manager heap histogram that shows the abnormal memory usage caused by `CompactionOperation`.
   
   ```log
      9:       2091712       83668480  org.apache.hudi.common.model.CompactionOperation
    479:            27           4752  org.apache.hudi.io.FlinkAppendHandle
    686:            28           2016  org.apache.hudi.common.table.view.HoodieTableFileSystemView
    800:            28           1344  org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView
    806:            27           1296  org.apache.hudi.table.HoodieFlinkMergeOnReadTable
    954:            27            864  org.apache.hudi.common.table.view.FileSystemViewManager
   1064:            28            672  org.apache.hudi.common.table.view.PriorityBasedFileSystemView
   ```
   
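   In the timeline of our pipeline, there was only 1 unfinished compaction plan, which contained 74704 operations. With 28 `HoodieTableFileSystemView` instances each holding its own copy of the plan, that gives `74704 * 28 = 2091712` `CompactionOperation` instances, which matches the histogram.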
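
The figures can be cross-checked with a little arithmetic. The sketch below assumes the histogram columns are rank, instance count, and shallow bytes (the layout printed by `jmap -histo`); the class and method names are made up for illustration.

```java
// Cross-check of the heap histogram figures quoted above.
// Assumption: the columns are rank, #instances, #bytes (shallow), as in `jmap -histo`.
final class CompactionHistogramCheck {
  static final long OPERATIONS_IN_PLAN = 74_704L;     // operations in the single pending plan
  static final long FILE_SYSTEM_VIEWS = 28L;          // HoodieTableFileSystemView instances
  static final long OBSERVED_INSTANCES = 2_091_712L;  // CompactionOperation instances
  static final long OBSERVED_BYTES = 83_668_480L;     // shallow bytes of those instances

  // Instances expected if every view keeps its own copy of every operation.
  static long expectedInstances() {
    return OPERATIONS_IN_PLAN * FILE_SYSTEM_VIEWS;
  }

  // Shallow size of one CompactionOperation according to the histogram.
  static long shallowBytesPerInstance() {
    return OBSERVED_BYTES / OBSERVED_INSTANCES;
  }

  public static void main(String[] args) {
    System.out.println("expected == observed: " + (expectedInstances() == OBSERVED_INSTANCES));
    System.out.println("shallow bytes per instance: " + shallowBytesPerInstance());
    System.out.println("total MiB (shallow): " + OBSERVED_BYTES / (1024.0 * 1024.0));
  }
}
```

If that column reading is right, the roughly 80 MiB counts only the shallow size of the `CompactionOperation` objects; the strings and collections they reference would come on top of that.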
   
   May I ask if we can lazily load `HoodieTableFileSystemView` in `PriorityBasedFileSystemView` when creating `FlinkAppendHandle`? That would also reduce memory usage for active partitions.
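
A minimal sketch of what lazy loading could look like, assuming the expensive view can be built on demand from a supplier. `Lazy` and `LazyViewSketch` are hypothetical names, and the `String` value is only a stand-in for a loaded view; none of this is Hudi's actual API.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical memoizing wrapper: the expensive value is built on the first get().
final class Lazy<T> implements Supplier<T> {
  private Supplier<? extends T> loader; // dropped after the first load
  private T value;
  private boolean loaded;

  Lazy(Supplier<? extends T> loader) {
    this.loader = loader;
  }

  @Override
  public synchronized T get() {
    if (!loaded) {
      value = loader.get();
      loaded = true;
      loader = null; // let whatever the loader captured be garbage collected
    }
    return value;
  }

  synchronized boolean isLoaded() {
    return loaded;
  }
}

final class LazyViewSketch {
  // Counts how often the (pretend) expensive view is built.
  static final AtomicInteger BUILDS = new AtomicInteger();

  // Stand-in for constructing a view that loads all pending compaction plans.
  static Lazy<String> newLazyView() {
    return new Lazy<>(() -> {
      BUILDS.incrementAndGet();
      return "view with pending compaction plans loaded";
    });
  }

  public static void main(String[] args) {
    Lazy<String> view = newLazyView();     // nothing is loaded yet
    System.out.println(view.isLoaded());
    System.out.println(view.get());        // first access triggers the load
    System.out.println(view.get());        // second access reuses the cached value
    System.out.println(BUILDS.get());
  }
}
```

With such a wrapper, a handle that never touches the view never pays for loading the plans; whether that holds for `FlinkAppendHandle` depends on which code paths actually read the view.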
   
   




[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1376958366

   ## CI report:
   
   * 5a1b3a0d551128fd1f54a8c1b43c021330b7d056 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14184) 
   * 9cdfeaad2dde6819cf04e9db4395115cf46c2f35 UNKNOWN
   




[GitHub] [hudi] trushev commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1376652385

   > Nice catch, @trushev , curious why the closed handle is also taking huge resource, we may need to figure it out first.
   > 
   > But I still think the change is valid.
   
   Thank you for the review.
   
   Here is the create handle layout with a nulled writer. It was my mistake to call it "a huge object even with released writer". More precisely, it is just bigger than we need: 14 KB versus 522 bytes.
   
   <img width="878" alt="Screenshot 2023-01-10 at 09 31 27" src="https://user-images.githubusercontent.com/42293632/211450420-b6fd1e92-6552-4b9f-af78-2244fa22d41e.png">
   
   




[GitHub] [hudi] danny0405 merged pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
danny0405 merged PR #7626:
URL: https://github.com/apache/hudi/pull/7626




[GitHub] [hudi] trushev commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1386555492

   @danny0405 Could you please take another look? New solution:
   - Replaced `Map<String, HoodieWriteHandle<?, ?, ?, ?>>` with `Map<String, Path>`
   - All handles are now guaranteed to be closed by a `finally` block that calls `closeGracefully()`
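
The idea behind these two points can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PR: `SketchHandle`, `RecordingHandle`, and `PathOnlyRegistry` are hypothetical, and `java.nio.file.Path` stands in for Hadoop's `Path` so the example is self-contained.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a write handle; this is not Hudi's HoodieWriteHandle.
interface SketchHandle {
  void write(String record);
  Path path();
  void closeGracefully();
}

// In-memory handle used only to make the sketch runnable.
final class RecordingHandle implements SketchHandle {
  private final Path path;
  private final boolean failOnWrite;
  int written;
  boolean closed;

  RecordingHandle(Path path, boolean failOnWrite) {
    this.path = path;
    this.failOnWrite = failOnWrite;
  }

  @Override
  public void write(String record) {
    if (failOnWrite) {
      throw new IllegalStateException("simulated write failure");
    }
    written++;
  }

  @Override
  public Path path() {
    return path;
  }

  @Override
  public void closeGracefully() {
    closed = true; // a real handle would release its writer and buffers here
  }
}

final class PathOnlyRegistry {
  // Only the lightweight write path is kept between checkpoints, not the handle.
  private final Map<String, Path> fileIdToPath = new HashMap<>();

  void writeBatch(String fileId, SketchHandle handle, Iterable<String> records) {
    try {
      for (String record : records) {
        handle.write(record);
      }
      fileIdToPath.put(fileId, handle.path()); // remembered only after a successful write
    } finally {
      handle.closeGracefully(); // runs on both the success and the failure path
    }
  }

  Path pathFor(String fileId) {
    return fileIdToPath.get(fileId);
  }
}
```

Because the map holds no reference to the handle, the handle becomes unreachable as soon as `writeBatch` returns, while the stored path remains available for building a later replace handle.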




[GitHub] [hudi] trushev commented on a diff in pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
trushev commented on code in PR #7626:
URL: https://github.com/apache/hudi/pull/7626#discussion_r1072027683


##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/BucketHandles.java:
##########
@@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.common.util.VisibleForTesting;
+import org.apache.hudi.io.HoodieWriteHandle;
+import org.apache.hudi.io.MiniBatchHandle;
+
+import org.apache.hadoop.fs.Path;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * This class is responsible for keeping write handles between checkpoints.
+ * It keeps paths instead of closed handles to reduce memory footprint.
+ * These write paths are used to create ReplaceHandle.
+ */
+public final class BucketHandles {
+  private final Map<String, MiniBatchHandle> fileToHandle;
+  private final Map<String, Path> fileToPath;
+

Review Comment:
   Thank you for the help. I'm reworking the PR according to https://github.com/apache/hudi/pull/7626#discussion_r1066607740, and `BucketHandles` is going to be removed entirely.





[GitHub] [hudi] hudi-bot commented on pull request #7626: [HUDI-5516] Reduce memory footprint on workload with thousand active partitions

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7626:
URL: https://github.com/apache/hudi/pull/7626#issuecomment-1396486170

   ## CI report:
   
   * 784dd7c8e7f8d6b7013071df04bfb57121b1d6c9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14382) 
   * 1cfe72cb4a06010b19c96abf25b3ebf9d7f6e895 UNKNOWN
   

