Posted to commits@hudi.apache.org by "danny0405 (via GitHub)" <gi...@apache.org> on 2023/03/08 07:53:52 UTC

[GitHub] [hudi] danny0405 commented on a diff in pull request #7962: [HUDI-5801] Speed metaTable initializeFileGroups

danny0405 commented on code in PR #7962:
URL: https://github.com/apache/hudi/pull/7962#discussion_r1129076540


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -721,17 +721,22 @@ public void initializeMetadataPartitions(HoodieTableMetaClient dataMetaClient, L
    */
   private void initializeFileGroups(HoodieTableMetaClient dataMetaClient, MetadataPartitionType metadataPartition, String instantTime,
                                     int fileGroupCount) throws IOException {
-    final HashMap<HeaderMetadataType, String> blockHeader = new HashMap<>();
-    blockHeader.put(HeaderMetadataType.INSTANT_TIME, instantTime);
-    // Archival of data table has a dependency on compaction(base files) in metadata table.
-    // It is assumed that as of time Tx of base instant (/compaction time) in metadata table,
-    // all commits in data table is in sync with metadata table. So, we always start with log file for any fileGroup.
-    final HoodieDeleteBlock block = new HoodieDeleteBlock(new DeleteRecord[0], blockHeader);
-
-    LOG.info(String.format("Creating %d file groups for partition %s with base fileId %s at instant time %s",
-        fileGroupCount, metadataPartition.getPartitionPath(), metadataPartition.getFileIdPrefix(), instantTime));
+    final List<Integer> list = new ArrayList<>(fileGroupCount);
     for (int i = 0; i < fileGroupCount; ++i) {
-      final String fileGroupFileId = String.format("%s%04d", metadataPartition.getFileIdPrefix(), i);
+      list.add(i);
+    }
+    engineContext.setJobStatus(this.getClass().getSimpleName(), "Initializing metadata table file groups: " + metadataPartition);
+    engineContext.parallelize(list, fileGroupCount).map(x -> {

Review Comment:
   Can we fix it then?



