Posted to commits@hudi.apache.org by "zhangyue19921010 (via GitHub)" <gi...@apache.org> on 2023/02/06 09:18:38 UTC

[GitHub] [hudi] zhangyue19921010 opened a new pull request, #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

zhangyue19921010 opened a new pull request, #7865:
URL: https://github.com/apache/hudi/pull/7865

   ### Change Logs
   When MDT is enabled and KEEP_LATEST_FILE_VERSIONS is the clean policy, current master reads the MDT and checks each partition one by one to pick up the data files to delete.
   
   If the number of partitions is large, this can take too much time.
   
   This PR loads all of these partitions and their files into the filesystemView in advance, using the MDT `listPartitions(list)` API.
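
To make the motivation concrete, here is a minimal, self-contained sketch of the difference between per-partition lookups and one batched preload. It is not Hudi code: `FakeMetadataTable`, `FakeFileSystemView`, and every method name in it are hypothetical stand-ins, and the lookup counter only models "one round trip to the MDT".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PreloadSketch {

    /** Hypothetical stand-in for the metadata table; counts the lookups it serves. */
    static class FakeMetadataTable {
        private final Map<String, List<String>> filesByPartition;
        int lookups = 0;

        FakeMetadataTable(Map<String, List<String>> filesByPartition) {
            this.filesByPartition = filesByPartition;
        }

        // One lookup that returns the files of a single partition.
        List<String> listFilesInPartition(String partition) {
            lookups++;
            return filesByPartition.getOrDefault(partition, new ArrayList<>());
        }

        // One batched lookup that returns the files of many partitions.
        Map<String, List<String>> listFilesInPartitions(List<String> partitions) {
            lookups++;
            Map<String, List<String>> result = new HashMap<>();
            for (String partition : partitions) {
                result.put(partition, filesByPartition.getOrDefault(partition, new ArrayList<>()));
            }
            return result;
        }
    }

    /** Hypothetical in-memory view that caches file listings per partition. */
    static class FakeFileSystemView {
        private final FakeMetadataTable mdt;
        private final Map<String, List<String>> cache = new HashMap<>();

        FakeFileSystemView(FakeMetadataTable mdt) {
            this.mdt = mdt;
        }

        // Eager path: fill the cache for all partitions with one batched lookup.
        void loadAllPartitions(List<String> partitions) {
            cache.putAll(mdt.listFilesInPartitions(partitions));
        }

        // Lazy path: fall back to a per-partition lookup on a cache miss.
        List<String> getFiles(String partition) {
            return cache.computeIfAbsent(partition, mdt::listFilesInPartition);
        }
    }

    /** Visits every partition once and reports how many lookups were issued. */
    static int lookupsDuringCleanPlanning(boolean preload) {
        Map<String, List<String>> table = new HashMap<>();
        table.put("2023/02/04", Arrays.asList("f1_001.parquet", "f1_002.parquet"));
        table.put("2023/02/05", Arrays.asList("f2_001.parquet"));
        table.put("2023/02/06", Arrays.asList("f3_001.parquet"));
        List<String> partitions = new ArrayList<>(table.keySet());

        FakeMetadataTable mdt = new FakeMetadataTable(table);
        FakeFileSystemView view = new FakeFileSystemView(mdt);
        if (preload) {
            view.loadAllPartitions(partitions);
        }
        for (String partition : partitions) {
            view.getFiles(partition);
        }
        return mdt.lookups;
    }

    public static void main(String[] args) {
        System.out.println("lazy lookups:    " + lookupsDuringCleanPlanning(false));
        System.out.println("preload lookups: " + lookupsDuringCleanPlanning(true));
    }
}
```

With three partitions the lazy path issues three lookups and the preload path issues one. Whether that translates into a real speed-up depends on how expensive a batched lookup is compared with many single ones, which this sketch does not model.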
   
   ### Impact
   
   Affects tables that enable MDT and use KEEP_LATEST_FILE_VERSIONS as the clean policy.
   
   ### Risk level (write none, low medium or high below)
   
   low
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] stream2000 commented on a diff in pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "stream2000 (via GitHub)" <gi...@apache.org>.
stream2000 commented on code in PR #7865:
URL: https://github.com/apache/hudi/pull/7865#discussion_r1098438588


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##########
@@ -101,6 +102,16 @@ public CleanPlanner(HoodieEngineContext context, HoodieTable<T, I, K, O> hoodieT
     this.fgIdToPendingLogCompactionOperations = fileSystemView.getPendingLogCompactionOperations()
         .map(entry -> Pair.of(new HoodieFileGroupId(entry.getValue().getPartitionPath(), entry.getValue().getFileId()), entry.getValue()))
         .collect(Collectors.toMap(Pair::getKey, Pair::getValue));
+
+    // load all partitions in advance if necessary.
+    if (config.isMetadataTableEnabled()
+        && config.getCleanerPolicy().equals(HoodieCleaningPolicy.KEEP_LATEST_FILE_VERSIONS)
+        && (config.getViewStorageConfig().getStorageType().equals(FileSystemViewStorageType.MEMORY)
+            || (config.getViewStorageConfig().getStorageType().equals(FileSystemViewStorageType.REMOTE_FIRST)
+                && config.getViewStorageConfig().getSecondaryStorageType().equals(FileSystemViewStorageType.MEMORY)))) {
+      LOG.info("Load all partitions and files into file system view in advance when using KEEP_LATEST_FILE_VERSIONS.");
+      fileSystemView.loadAllPartitions();

Review Comment:
   Will this approach be slower than the master branch implementation in Spark? In the Spark engine, partitions are loaded in parallel in subtasks, while here they are loaded by a single thread.
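
The trade-off raised here can be sketched as follows. This is an illustration only: threads in one JVM stand in for Spark subtasks, and `loadPartition` is a hypothetical placeholder for whatever a real per-partition load does. Both strategies build the same view; they differ only in how many workers do the loading.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class LoadStrategySketch {

    // Hypothetical per-partition load; a real one would read file listings.
    static List<String> loadPartition(String partition) {
        return Arrays.asList(partition + "/file-1", partition + "/file-2");
    }

    // Single-threaded: partitions are loaded one after another.
    static Map<String, List<String>> loadSequentially(List<String> partitions) {
        Map<String, List<String>> view = new ConcurrentHashMap<>();
        for (String partition : partitions) {
            view.put(partition, loadPartition(partition));
        }
        return view;
    }

    // Parallel: partitions are spread over the common fork-join pool.
    // ConcurrentHashMap makes the concurrent put calls safe.
    static Map<String, List<String>> loadInParallel(List<String> partitions) {
        Map<String, List<String>> view = new ConcurrentHashMap<>();
        partitions.parallelStream()
            .forEach(partition -> view.put(partition, loadPartition(partition)));
        return view;
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("p=1", "p=2", "p=3", "p=4");
        System.out.println(loadSequentially(partitions).equals(loadInParallel(partitions)));
    }
}
```

Which variant is faster in practice depends on the per-partition cost and on how much a single batched read saves, so the question can only be settled by measurement.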





[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1418812872

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1420523825

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 5179963c4e969c2fd90eb48dd6f8671a57a5d677 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975) 
   * 7c7293aa41d632051edac2b57fa22a4a39bd4cee Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1545925359

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993",
       "triggerID" : "1420911211",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502",
       "triggerID" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17039",
       "triggerID" : "bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17039) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1420055667

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 5179963c4e969c2fd90eb48dd6f8671a57a5d677 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1516388607

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993",
       "triggerID" : "1420911211",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 7c7293aa41d632051edac2b57fa22a4a39bd4cee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993) 
   * d58c1e20faba3f484e8fa38c474a955b1b1dd0f0 UNKNOWN
   




[GitHub] [hudi] Zouxxyy commented on a diff in pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "Zouxxyy (via GitHub)" <gi...@apache.org>.
Zouxxyy commented on code in PR #7865:
URL: https://github.com/apache/hudi/pull/7865#discussion_r1105393423


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##########
@@ -101,6 +102,16 @@ public CleanPlanner(HoodieEngineContext context, HoodieTable<T, I, K, O> hoodieT
     this.fgIdToPendingLogCompactionOperations = fileSystemView.getPendingLogCompactionOperations()
         .map(entry -> Pair.of(new HoodieFileGroupId(entry.getValue().getPartitionPath(), entry.getValue().getFileId()), entry.getValue()))
         .collect(Collectors.toMap(Pair::getKey, Pair::getValue));
+
+    // load all partitions in advance if necessary.
+    if (config.isMetadataTableEnabled()
+        && config.getCleanerPolicy().equals(HoodieCleaningPolicy.KEEP_LATEST_FILE_VERSIONS)
+        && (config.getViewStorageConfig().getStorageType().equals(FileSystemViewStorageType.MEMORY)
+            || (config.getViewStorageConfig().getStorageType().equals(FileSystemViewStorageType.REMOTE_FIRST)
+                && config.getViewStorageConfig().getSecondaryStorageType().equals(FileSystemViewStorageType.MEMORY)))) {
+      LOG.info("Load all partitions and files into file system view in advance when using KEEP_LATEST_FILE_VERSIONS.");
+      fileSystemView.loadAllPartitions();

Review Comment:
   I think the idea of this PR is similar to https://github.com/apache/hudi/pull/7690. Maybe we can refer to its conditions?
   ![image](https://user-images.githubusercontent.com/37108074/218666353-749c4d94-3112-493b-b0cc-69ecccc96d05.png)
   





[GitHub] [hudi] bvaradar commented on a diff in pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "bvaradar (via GitHub)" <gi...@apache.org>.
bvaradar commented on code in PR #7865:
URL: https://github.com/apache/hudi/pull/7865#discussion_r1109280858


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##########
@@ -101,6 +102,16 @@ public CleanPlanner(HoodieEngineContext context, HoodieTable<T, I, K, O> hoodieT
     this.fgIdToPendingLogCompactionOperations = fileSystemView.getPendingLogCompactionOperations()
         .map(entry -> Pair.of(new HoodieFileGroupId(entry.getValue().getPartitionPath(), entry.getValue().getFileId()), entry.getValue()))
         .collect(Collectors.toMap(Pair::getKey, Pair::getValue));
+
+    // load all partitions in advance if necessary.
+    if (config.isMetadataTableEnabled()
+        && config.getCleanerPolicy().equals(HoodieCleaningPolicy.KEEP_LATEST_FILE_VERSIONS)
+        && (config.getViewStorageConfig().getStorageType().equals(FileSystemViewStorageType.MEMORY)
+            || (config.getViewStorageConfig().getStorageType().equals(FileSystemViewStorageType.REMOTE_FIRST)
+                && config.getViewStorageConfig().getSecondaryStorageType().equals(FileSystemViewStorageType.MEMORY)))) {
+      LOG.info("Load all partitions and files into file system view in advance when using KEEP_LATEST_FILE_VERSIONS.");
+      fileSystemView.loadAllPartitions();

Review Comment:
   +1 to @Zouxxyy's comment. For the cleaner, we would need to load old file slices too (not just the latest file slices).
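
A minimal sketch of why the older slices matter, under stated assumptions: each version of a file group is identified by a commit-time string that sorts lexicographically, and `versionsToClean` is a hypothetical helper, not Hudi's planner logic. A keep-latest-N policy can only pick the older versions for deletion if they are present in the view.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class KeepLatestVersionsSketch {

    /**
     * Returns the versions of one file group that fall outside the newest
     * `retained` versions. Commit times are assumed to sort lexicographically.
     */
    static List<String> versionsToClean(List<String> commitTimes, int retained) {
        List<String> newestFirst = new ArrayList<>(commitTimes);
        newestFirst.sort(Comparator.reverseOrder());
        if (newestFirst.size() <= retained) {
            return new ArrayList<>();
        }
        return new ArrayList<>(newestFirst.subList(retained, newestFirst.size()));
    }

    public static void main(String[] args) {
        // All slices loaded: the two older versions are candidates for cleaning.
        System.out.println(versionsToClean(Arrays.asList("001", "003", "002"), 1));
        // Only the latest slice loaded: nothing is found to clean.
        System.out.println(versionsToClean(Collections.singletonList("003"), 1));
    }
}
```

If the view held only the latest slice per file group, the second call shows the outcome: the planner would see nothing to delete even though older versions exist on storage.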





[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1516491001

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993",
       "triggerID" : "1420911211",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502",
       "triggerID" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 7c7293aa41d632051edac2b57fa22a4a39bd4cee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993) 
   * d58c1e20faba3f484e8fa38c474a955b1b1dd0f0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502) 
   




[GitHub] [hudi] xushiyan commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "xushiyan (via GitHub)" <gi...@apache.org>.
xushiyan commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546891936

   The CI failure is fixed and is unrelated to this change.




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546856358

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993",
       "triggerID" : "1420911211",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502",
       "triggerID" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17039",
       "triggerID" : "bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80",
       "triggerType" : "PUSH"
     }, {
       "hash" : "bcc1a4521c3d16a0e391a79428ce5efed8d88687",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "bcc1a4521c3d16a0e391a79428ce5efed8d88687",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17039) 
   * bcc1a4521c3d16a0e391a79428ce5efed8d88687 UNKNOWN
   






[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1421231387

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993",
       "triggerID" : "1420911211",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 7c7293aa41d632051edac2b57fa22a4a39bd4cee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1418824082

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 5179963c4e969c2fd90eb48dd6f8671a57a5d677 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1545669303

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993",
       "triggerID" : "1420911211",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502",
       "triggerID" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * d58c1e20faba3f484e8fa38c474a955b1b1dd0f0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502) 
   * bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] xushiyan merged pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "xushiyan (via GitHub)" <gi...@apache.org>.
xushiyan merged PR #7865:
URL: https://github.com/apache/hudi/pull/7865




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546903848

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993",
       "triggerID" : "1420911211",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502",
       "triggerID" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17039",
       "triggerID" : "bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80",
       "triggerType" : "PUSH"
     }, {
       "hash" : "bcc1a4521c3d16a0e391a79428ce5efed8d88687",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17051",
       "triggerID" : "bcc1a4521c3d16a0e391a79428ce5efed8d88687",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d26c15f3bf8ad3d9097b46d4ab663849b3ec50d3",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d26c15f3bf8ad3d9097b46d4ab663849b3ec50d3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bcc1a4521c3d16a0e391a79428ce5efed8d88687 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17051) 
   * d26c15f3bf8ad3d9097b46d4ab663849b3ec50d3 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1420921460

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993",
       "triggerID" : "1420911211",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 7c7293aa41d632051edac2b57fa22a4a39bd4cee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1420511223

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 5179963c4e969c2fd90eb48dd6f8671a57a5d677 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975) 
   * 7c7293aa41d632051edac2b57fa22a4a39bd4cee UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1419407255

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 5179963c4e969c2fd90eb48dd6f8671a57a5d677 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1420820141

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 7c7293aa41d632051edac2b57fa22a4a39bd4cee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1420203905

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 5179963c4e969c2fd90eb48dd6f8671a57a5d677 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhangyue19921010 commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "zhangyue19921010 (via GitHub)" <gi...@apache.org>.
zhangyue19921010 commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1420031966

   @hudi-bot run azure




[GitHub] [hudi] zhangyue19921010 commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "zhangyue19921010 (via GitHub)" <gi...@apache.org>.
zhangyue19921010 commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1420911211

   @hudi-bot run azure
   




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1517147284

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993",
       "triggerID" : "1420911211",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502",
       "triggerID" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * d58c1e20faba3f484e8fa38c474a955b1b1dd0f0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546881551

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14975",
       "triggerID" : "1420031966",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14987",
       "triggerID" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7c7293aa41d632051edac2b57fa22a4a39bd4cee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14993",
       "triggerID" : "1420911211",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502",
       "triggerID" : "d58c1e20faba3f484e8fa38c474a955b1b1dd0f0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17039",
       "triggerID" : "bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80",
       "triggerType" : "PUSH"
     }, {
       "hash" : "bcc1a4521c3d16a0e391a79428ce5efed8d88687",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17051",
       "triggerID" : "bcc1a4521c3d16a0e391a79428ce5efed8d88687",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bcc1a4521c3d16a0e391a79428ce5efed8d88687 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17051) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1418837282

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "59c457e89bef1b404627f9b3700d65235044387c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59c457e89bef1b404627f9b3700d65235044387c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960",
       "triggerID" : "5179963c4e969c2fd90eb48dd6f8671a57a5d677",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * 5179963c4e969c2fd90eb48dd6f8671a57a5d677 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14960) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhangyue19921010 commented on a diff in pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "zhangyue19921010 (via GitHub)" <gi...@apache.org>.
zhangyue19921010 commented on code in PR #7865:
URL: https://github.com/apache/hudi/pull/7865#discussion_r1104187459


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##########
@@ -101,6 +102,16 @@ public CleanPlanner(HoodieEngineContext context, HoodieTable<T, I, K, O> hoodieT
     this.fgIdToPendingLogCompactionOperations = fileSystemView.getPendingLogCompactionOperations()
         .map(entry -> Pair.of(new HoodieFileGroupId(entry.getValue().getPartitionPath(), entry.getValue().getFileId()), entry.getValue()))
         .collect(Collectors.toMap(Pair::getKey, Pair::getValue));
+
+    // load all partitions in advance if necessary.
+    if (config.isMetadataTableEnabled()
+        && config.getCleanerPolicy().equals(HoodieCleaningPolicy.KEEP_LATEST_FILE_VERSIONS)
+        && (config.getViewStorageConfig().getStorageType().equals(FileSystemViewStorageType.MEMORY)
+            || (config.getViewStorageConfig().getStorageType().equals(FileSystemViewStorageType.REMOTE_FIRST)
+                && config.getViewStorageConfig().getSecondaryStorageType().equals(FileSystemViewStorageType.MEMORY)))) {
+      LOG.info("Load all partitions and files into file system view in advance when using KEEP_LATEST_FILE_VERSIONS.");
+      fileSystemView.loadAllPartitions();

Review Comment:
   Thanks @stream2000 for your attention. I will post our detailed test results later. In summary:
   
   - Data size: 300 TB
   - Number of partitions: 63,000
   - Clean policy: KEEP_LATEST_FILE_VERSIONS
   
   | Configuration | Clean time |
   | ---- | ---- |
   | Disable MDT | 2.3 min |
   | Enable MDT | 1.1 h |
   | Enable MDT + current PR | 7 s |
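
For scale, the reported clean times can be converted to seconds and compared. This is a minimal sketch that assumes the table values are exact; they are rounded, single measurements, so the ratios are only indicative. The class and method names are hypothetical.

```java
// Hypothetical helper: compares the clean times reported in the table.
final class CleanSpeedup {
    // Reported values converted to whole seconds.
    static final long MDT_DISABLED_SECONDS = 138;  // 2.3 min
    static final long MDT_ENABLED_SECONDS = 3960;  // 1.1 h
    static final long WITH_PR_SECONDS = 7;         // 7 s

    // How many times faster the second run is than the first.
    static double speedup(long slowerSeconds, long fasterSeconds) {
        if (fasterSeconds <= 0) {
            throw new IllegalArgumentException("fasterSeconds must be positive");
        }
        return (double) slowerSeconds / fasterSeconds;
    }

    public static void main(String[] args) {
        System.out.printf("vs. MDT enabled:  %.1fx%n",
            speedup(MDT_ENABLED_SECONDS, WITH_PR_SECONDS));
        System.out.printf("vs. MDT disabled: %.1fx%n",
            speedup(MDT_DISABLED_SECONDS, WITH_PR_SECONDS));
    }
}
```

With these inputs the clean is roughly 566 times faster than with MDT enabled on master, and roughly 20 times faster than with MDT disabled.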
   
   Disable MDT
   ![image](https://user-images.githubusercontent.com/69956021/218418777-a14b94ba-302c-483b-b7a8-401c45b62fc0.png)
   
   Enable MDT
   ![image](https://user-images.githubusercontent.com/69956021/218418908-8f17c673-5102-426c-b0d9-a8f7f935ffe2.png)
   
   
   Enable MDT + current PR
   ![image](https://user-images.githubusercontent.com/69956021/218418593-8b1a3b90-198e-4c77-9e1d-8fd67e61e6ce.png)
   
   
   CC @yihua and @nsivabalan 
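
To illustrate the difference between checking partitions one by one and loading them in advance, here is a minimal sketch. It assumes a toy in-memory store that only counts lookups; `CountingMetadataStore`, `listFiles`, `listFilesBatch`, and `PreloadSketch` are hypothetical names for illustration and are not Hudi APIs. It does not reproduce the timings above; it only shows that the number of metadata lookups drops from one per partition to one in total.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a metadata-backed file listing; not Hudi's API. */
final class CountingMetadataStore {
  private final Map<String, List<String>> filesByPartition = new HashMap<>();
  private int lookupCalls = 0;

  void put(String partition, List<String> files) {
    filesByPartition.put(partition, new ArrayList<>(files));
  }

  /** Simulates one metadata lookup for a single partition. */
  List<String> listFiles(String partition) {
    lookupCalls++;
    return filesByPartition.getOrDefault(partition, Collections.emptyList());
  }

  /** Simulates one metadata lookup that covers all requested partitions. */
  Map<String, List<String>> listFilesBatch(List<String> partitions) {
    lookupCalls++;
    Map<String, List<String>> result = new HashMap<>();
    for (String partition : partitions) {
      result.put(partition, filesByPartition.getOrDefault(partition, Collections.emptyList()));
    }
    return result;
  }

  int lookupCalls() {
    return lookupCalls;
  }
}

final class PreloadSketch {
  /** Visits partitions one by one: one lookup per partition. */
  static int countFilesOneByOne(CountingMetadataStore store, List<String> partitions) {
    int total = 0;
    for (String partition : partitions) {
      total += store.listFiles(partition).size();
    }
    return total;
  }

  /** Loads every partition up front: a single lookup, then in-memory reads. */
  static int countFilesPreloaded(CountingMetadataStore store, List<String> partitions) {
    Map<String, List<String>> view = store.listFilesBatch(partitions);
    int total = 0;
    for (String partition : partitions) {
      total += view.get(partition).size();
    }
    return total;
  }
}
```

Both methods return the same file count; only the number of simulated lookups differs, which is the cost this PR targets for tables with many partitions.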
   
   





[GitHub] [hudi] Zouxxyy commented on a diff in pull request #7865: [HUDI-5710] Load all partitions in advance when using KEEP_LATEST_FILE_VERSIONS clean policy and MDT enable

Posted by "Zouxxyy (via GitHub)" <gi...@apache.org>.
Zouxxyy commented on code in PR #7865:
URL: https://github.com/apache/hudi/pull/7865#discussion_r1105393423


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##########
@@ -101,6 +102,16 @@ public CleanPlanner(HoodieEngineContext context, HoodieTable<T, I, K, O> hoodieT
     this.fgIdToPendingLogCompactionOperations = fileSystemView.getPendingLogCompactionOperations()
         .map(entry -> Pair.of(new HoodieFileGroupId(entry.getValue().getPartitionPath(), entry.getValue().getFileId()), entry.getValue()))
         .collect(Collectors.toMap(Pair::getKey, Pair::getValue));
+
+    // load all partitions in advance if necessary.
+    if (config.isMetadataTableEnabled()
+        && config.getCleanerPolicy().equals(HoodieCleaningPolicy.KEEP_LATEST_FILE_VERSIONS)
+        && (config.getViewStorageConfig().getStorageType().equals(FileSystemViewStorageType.MEMORY)
+            || (config.getViewStorageConfig().getStorageType().equals(FileSystemViewStorageType.REMOTE_FIRST)
+                && config.getViewStorageConfig().getSecondaryStorageType().equals(FileSystemViewStorageType.MEMORY)))) {
+      LOG.info("Load all partitions and files into file system view in advance when using KEEP_LATEST_FILE_VERSIONS.");
+      fileSystemView.loadAllPartitions();

Review Comment:
   I think the idea of this PR is similar to https://github.com/apache/hudi/pull/7690. Maybe we can reuse its condition, `shouldUseBatchLookup`?
   ![image](https://user-images.githubusercontent.com/37108074/218666353-749c4d94-3112-493b-b0cc-69ecccc96d05.png)
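
As a rough sketch of what a shared condition could look like, the check from the diff above can be pulled into a single named predicate. The enums and the `shouldLoadAllPartitions` method below are simplified stand-ins written for this example; they mirror the condition in this PR's diff and are not the actual `shouldUseBatchLookup` implementation from #7690.

```java
/** Illustrative only: stand-in types, not the real Hudi config classes. */
final class CleanPreloadCondition {
  /** Stand-in for the cleaning policy enum; only the values needed here. */
  enum CleaningPolicy { KEEP_LATEST_COMMITS, KEEP_LATEST_FILE_VERSIONS }

  /** Stand-in for the view storage type enum; only the values needed here. */
  enum ViewStorageType { MEMORY, SPILLABLE_DISK, REMOTE_FIRST }

  private CleanPreloadCondition() {
  }

  /**
   * Returns true when the metadata table is enabled, the policy is
   * KEEP_LATEST_FILE_VERSIONS, and the file system view is held in memory,
   * either directly or as the secondary view behind REMOTE_FIRST.
   */
  static boolean shouldLoadAllPartitions(boolean metadataTableEnabled,
                                         CleaningPolicy policy,
                                         ViewStorageType primary,
                                         ViewStorageType secondary) {
    boolean inMemoryView = primary == ViewStorageType.MEMORY
        || (primary == ViewStorageType.REMOTE_FIRST && secondary == ViewStorageType.MEMORY);
    return metadataTableEnabled
        && policy == CleaningPolicy.KEEP_LATEST_FILE_VERSIONS
        && inMemoryView;
  }
}
```

Naming the condition keeps the constructor short and would let both call sites share one definition, if the two PRs turn out to need the same check.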
   





[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546857667

   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17039) 
   * bcc1a4521c3d16a0e391a79428ce5efed8d88687 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17051) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1545679619

   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * d58c1e20faba3f484e8fa38c474a955b1b1dd0f0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16502) 
   * bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17039) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546984187

   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * d26c15f3bf8ad3d9097b46d4ab663849b3ec50d3 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17054) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546916162

   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bcc1a4521c3d16a0e391a79428ce5efed8d88687 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17051) 
   * d26c15f3bf8ad3d9097b46d4ab663849b3ec50d3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17054) 
   

