Posted to commits@hudi.apache.org by "loukey-lj (via GitHub)" <gi...@apache.org> on 2023/02/15 09:28:36 UTC

[GitHub] [hudi] loukey-lj opened a new pull request, #7962: [HUDI-5801] Speed metaTable initializeFileGroups

loukey-lj opened a new pull request, #7962:
URL: https://github.com/apache/hudi/pull/7962

   ### Change Logs
   
   `org.apache.hudi.metadata.HoodieBackedTableMetadataWriter#initializeFileGroups`
   is too slow when there are many file groups. This change creates the file groups in parallel through the engine context.
   
   ### Impact
   
   NA
   
   ### Risk level (write none, low, medium or high below)
   
   NA
   
   ### Documentation Update
   
   NA
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1442703937

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202",
       "triggerID" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15374",
       "triggerID" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * bd715641ef0532c50771d1ae02fdeb5f39e6a52c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202) 
   * a3c0dc7bddb55332966676136a55d9cd59dd6bb6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15374) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1442698881

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202",
       "triggerID" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * bd715641ef0532c50771d1ae02fdeb5f39e6a52c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202) 
   * a3c0dc7bddb55332966676136a55d9cd59dd6bb6 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1431022243

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * bd715641ef0532c50771d1ae02fdeb5f39e6a52c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1458317252

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202",
       "triggerID" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15374",
       "triggerID" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0933c7e5b0ac4343a94d2b5e7d86536566db1c9b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15608",
       "triggerID" : "0933c7e5b0ac4343a94d2b5e7d86536566db1c9b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0933c7e5b0ac4343a94d2b5e7d86536566db1c9b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15608) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] loukey-lj commented on a diff in pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "loukey-lj (via GitHub)" <gi...@apache.org>.
loukey-lj commented on code in PR #7962:
URL: https://github.com/apache/hudi/pull/7962#discussion_r1127750519


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -721,17 +721,22 @@ public void initializeMetadataPartitions(HoodieTableMetaClient dataMetaClient, L
    */
   private void initializeFileGroups(HoodieTableMetaClient dataMetaClient, MetadataPartitionType metadataPartition, String instantTime,
                                     int fileGroupCount) throws IOException {
-    final HashMap<HeaderMetadataType, String> blockHeader = new HashMap<>();
-    blockHeader.put(HeaderMetadataType.INSTANT_TIME, instantTime);
-    // Archival of data table has a dependency on compaction(base files) in metadata table.
-    // It is assumed that as of time Tx of base instant (/compaction time) in metadata table,
-    // all commits in data table is in sync with metadata table. So, we always start with log file for any fileGroup.
-    final HoodieDeleteBlock block = new HoodieDeleteBlock(new DeleteRecord[0], blockHeader);
-
-    LOG.info(String.format("Creating %d file groups for partition %s with base fileId %s at instant time %s",
-        fileGroupCount, metadataPartition.getPartitionPath(), metadataPartition.getFileIdPrefix(), instantTime));
+    final List<Integer> list = new ArrayList<>(fileGroupCount);
     for (int i = 0; i < fileGroupCount; ++i) {
-      final String fileGroupFileId = String.format("%s%04d", metadataPartition.getFileIdPrefix(), i);
+      list.add(i);
+    }
+    engineContext.setJobStatus(this.getClass().getSimpleName(), "Initializing metadata table file groups: " + metadataPartition);
+    engineContext.parallelize(list, fileGroupCount).map(x -> {

Review Comment:
   Agreed.
   
   





[GitHub] [hudi] hudi-bot commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1458072201

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202",
       "triggerID" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15374",
       "triggerID" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0933c7e5b0ac4343a94d2b5e7d86536566db1c9b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "0933c7e5b0ac4343a94d2b5e7d86536566db1c9b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a3c0dc7bddb55332966676136a55d9cd59dd6bb6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15374) 
   * 0933c7e5b0ac4343a94d2b5e7d86536566db1c9b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7962:
URL: https://github.com/apache/hudi/pull/7962#discussion_r1119752923


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -721,17 +721,22 @@ public void initializeMetadataPartitions(HoodieTableMetaClient dataMetaClient, L
    */
   private void initializeFileGroups(HoodieTableMetaClient dataMetaClient, MetadataPartitionType metadataPartition, String instantTime,
                                     int fileGroupCount) throws IOException {
-    final HashMap<HeaderMetadataType, String> blockHeader = new HashMap<>();
-    blockHeader.put(HeaderMetadataType.INSTANT_TIME, instantTime);
-    // Archival of data table has a dependency on compaction(base files) in metadata table.
-    // It is assumed that as of time Tx of base instant (/compaction time) in metadata table,
-    // all commits in data table is in sync with metadata table. So, we always start with log file for any fileGroup.
-    final HoodieDeleteBlock block = new HoodieDeleteBlock(new DeleteRecord[0], blockHeader);
-
-    LOG.info(String.format("Creating %d file groups for partition %s with base fileId %s at instant time %s",
-        fileGroupCount, metadataPartition.getPartitionPath(), metadataPartition.getFileIdPrefix(), instantTime));
+    final List<Integer> list = new ArrayList<>(fileGroupCount);
     for (int i = 0; i < fileGroupCount; ++i) {
-      final String fileGroupFileId = String.format("%s%04d", metadataPartition.getFileIdPrefix(), i);
+      list.add(i);
+    }
+    engineContext.setJobStatus(this.getClass().getSimpleName(), "Initializing metadata table file groups: " + metadataPartition);
+    engineContext.parallelize(list, fileGroupCount).map(x -> {

Review Comment:
   Does IntStream.range(0, xxx) work here?
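
As a rough illustration of the suggestion above, the index list that the diff builds with an explicit loop could also be produced with `IntStream.range`. This is a minimal sketch in plain Java, not the code that was merged: the class name `FileGroupIndexSketch` and the prefix `"files-"` are hypothetical, and only the `"%s%04d"` naming pattern is taken from the removed line in the diff.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class, for illustration only; it is not part of Hudi.
class FileGroupIndexSketch {

  // Builds the list [0, fileGroupCount) without an explicit for loop.
  static List<Integer> fileGroupIndexes(int fileGroupCount) {
    return IntStream.range(0, fileGroupCount).boxed().collect(Collectors.toList());
  }

  // Same naming pattern as the removed line in the diff: prefix plus zero-padded index.
  static String fileGroupFileId(String fileIdPrefix, int index) {
    return String.format("%s%04d", fileIdPrefix, index);
  }

  public static void main(String[] args) {
    // "files-" is an assumed prefix used only for this example.
    System.out.println(fileGroupIndexes(4));            // prints [0, 1, 2, 3]
    System.out.println(fileGroupFileId("files-", 3));   // prints files-0003
  }
}
```

The resulting list could then be handed to `engineContext.parallelize(list, fileGroupCount)` exactly as the loop-built list is in the diff.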





[GitHub] [hudi] hudi-bot commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1442825587

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202",
       "triggerID" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15374",
       "triggerID" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a3c0dc7bddb55332966676136a55d9cd59dd6bb6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15374) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7962:
URL: https://github.com/apache/hudi/pull/7962#discussion_r1129076540


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -721,17 +721,22 @@ public void initializeMetadataPartitions(HoodieTableMetaClient dataMetaClient, L
    */
   private void initializeFileGroups(HoodieTableMetaClient dataMetaClient, MetadataPartitionType metadataPartition, String instantTime,
                                     int fileGroupCount) throws IOException {
-    final HashMap<HeaderMetadataType, String> blockHeader = new HashMap<>();
-    blockHeader.put(HeaderMetadataType.INSTANT_TIME, instantTime);
-    // Archival of data table has a dependency on compaction(base files) in metadata table.
-    // It is assumed that as of time Tx of base instant (/compaction time) in metadata table,
-    // all commits in data table is in sync with metadata table. So, we always start with log file for any fileGroup.
-    final HoodieDeleteBlock block = new HoodieDeleteBlock(new DeleteRecord[0], blockHeader);
-
-    LOG.info(String.format("Creating %d file groups for partition %s with base fileId %s at instant time %s",
-        fileGroupCount, metadataPartition.getPartitionPath(), metadataPartition.getFileIdPrefix(), instantTime));
+    final List<Integer> list = new ArrayList<>(fileGroupCount);
     for (int i = 0; i < fileGroupCount; ++i) {
-      final String fileGroupFileId = String.format("%s%04d", metadataPartition.getFileIdPrefix(), i);
+      list.add(i);
+    }
+    engineContext.setJobStatus(this.getClass().getSimpleName(), "Initializing metadata table file groups: " + metadataPartition);
+    engineContext.parallelize(list, fileGroupCount).map(x -> {

Review Comment:
   Can we fix it, then?





[GitHub] [hudi] hudi-bot commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1431033376

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202",
       "triggerID" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * bd715641ef0532c50771d1ae02fdeb5f39e6a52c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] loukey-lj commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "loukey-lj (via GitHub)" <gi...@apache.org>.
loukey-lj commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1458047185

   > How much can we gain? The metadata file group initialization should be lightweight enough, right?
   
   Yes. Today the file group count is usually small, but it will be large with the record-level index. That case can be optimized by creating the file groups in parallel.




[GitHub] [hudi] bvaradar commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "bvaradar (via GitHub)" <gi...@apache.org>.
bvaradar commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1495324171

   @loukey-lj: Have you seen slowness in metadata table initialization in practice? For cases like the PARTITION_NAME_FILES metadata, the number of file groups is 1. Running under the engine context would add overhead in such cases.
   cc @nsivabalan 
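
One possible way to address the single-file-group concern is to fall back to a sequential loop below some cut-off and parallelize only above it. The sketch below is an assumption, not what the PR implements: `PARALLEL_THRESHOLD` is a made-up constant rather than a Hudi config, and a plain Java parallel stream stands in for `engineContext.parallelize`.

```java
import java.util.List;
import java.util.function.IntFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical sketch, for illustration only; it is not part of Hudi.
class InitStrategySketch {

  // Assumed cut-off below which the parallel path is skipped.
  static final int PARALLEL_THRESHOLD = 16;

  static boolean shouldParallelize(int fileGroupCount) {
    return fileGroupCount > PARALLEL_THRESHOLD;
  }

  // Runs createFileGroup once per index, sequentially for small counts and
  // on a parallel stream for large ones. Results keep index order either way.
  static <T> List<T> initialize(int fileGroupCount, IntFunction<T> createFileGroup) {
    IntStream indexes = IntStream.range(0, fileGroupCount);
    if (shouldParallelize(fileGroupCount)) {
      indexes = indexes.parallel();
    }
    return indexes.mapToObj(createFileGroup).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // A single file group takes the sequential path.
    System.out.println(initialize(1, i -> "fg-" + i));
    // Thousands of file groups take the parallel path.
    System.out.println(initialize(5000, i -> "fg-" + i).size());
  }
}
```

With a guard of this kind, a partition with one file group would not pay the job-submission cost, while a partition with thousands of file groups would still be initialized in parallel.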




[GitHub] [hudi] hudi-bot commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1431664630

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202",
       "triggerID" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * bd715641ef0532c50771d1ae02fdeb5f39e6a52c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1458125374

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15202",
       "triggerID" : "bd715641ef0532c50771d1ae02fdeb5f39e6a52c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15374",
       "triggerID" : "a3c0dc7bddb55332966676136a55d9cd59dd6bb6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "0933c7e5b0ac4343a94d2b5e7d86536566db1c9b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15608",
       "triggerID" : "0933c7e5b0ac4343a94d2b5e7d86536566db1c9b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a3c0dc7bddb55332966676136a55d9cd59dd6bb6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15374) 
   * 0933c7e5b0ac4343a94d2b5e7d86536566db1c9b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15608) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] loukey-lj commented on pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

Posted by "loukey-lj (via GitHub)" <gi...@apache.org>.
loukey-lj commented on PR #7962:
URL: https://github.com/apache/hudi/pull/7962#issuecomment-1495623468

   > > How much can we gain? The metadata file group initialization should be lightweight enough, right?
   > 
   > Yes. Today the file group count is usually small, but it will be large with the record-level index. That case can be optimized by creating the file groups in parallel.
   
   The initialization method is not called often, and the number of file groups in the metadata table is usually small. In the record-level index scenario, however, a large table will have thousands of file groups, and creating them single-threaded on the driver is very slow. Because the method is rarely called, parallelizing it has little impact on a small metadata table, while the benefit over single-threaded initialization is much greater when there are many file groups.
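
The trade-off argued above can be put into back-of-the-envelope form. The sketch below uses a deliberately simple cost model, and every number in it (50 ms per file group, 2000 ms of job overhead, parallelism of 100) is a hypothetical placeholder rather than a measurement from this PR.

```java
// Hypothetical cost model, for illustration only; the inputs are assumptions.
class InitCostSketch {

  // Driver-side loop: one file group after another.
  static long sequentialMillis(int fileGroupCount, long perFileGroupMillis) {
    return (long) fileGroupCount * perFileGroupMillis;
  }

  // Parallel job: a fixed submission overhead plus one "wave" of work
  // for every `parallelism` file groups (rounded up).
  static long parallelMillis(int fileGroupCount, long perFileGroupMillis,
                             int parallelism, long jobOverheadMillis) {
    long waves = (fileGroupCount + parallelism - 1) / parallelism;
    return jobOverheadMillis + waves * perFileGroupMillis;
  }

  public static void main(String[] args) {
    // One file group: the sequential loop wins under these assumptions.
    System.out.println(sequentialMillis(1, 50));              // prints 50
    System.out.println(parallelMillis(1, 50, 100, 2000));     // prints 2050
    // Five thousand file groups: the parallel job wins under these assumptions.
    System.out.println(sequentialMillis(5000, 50));           // prints 250000
    System.out.println(parallelMillis(5000, 50, 100, 2000));  // prints 4500
  }
}
```

Under this model the fixed overhead dominates for a handful of file groups and becomes negligible for thousands, which matches the shape of the argument in the comment, though real timings would need to be measured.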

