Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/11/25 08:46:40 UTC

[GitHub] [hudi] manojpec commented on a change in pull request #4067: [HUDI-2763] Metadata table records key deduplication

manojpec commented on a change in pull request #4067:
URL: https://github.com/apache/hudi/pull/4067#discussion_r756679137



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java
##########
@@ -259,6 +254,11 @@ private HoodieLogBlock readBlock() throws IOException {
               contentPosition, contentLength, blockEndPos, readerSchema, header, footer, keyField);
         }
       case HFILE_DATA_BLOCK:
+        if (HoodieTableMetadata.isMetadataTable(logFile) && keyDeDuplication) {
+          return new HoodieMetadataHFileDataBlock(logFile, inputStream, Option.ofNullable(content), readBlockLazily,

Review comment:
       Right, we want to avoid any on-disk format or type changes, such as a new block type code that gets persisted in the log block.
   
   ```
     public enum HoodieLogBlockType {
       COMMAND_BLOCK, DELETE_BLOCK, CORRUPT_BLOCK, AVRO_DATA_BLOCK, HFILE_DATA_BLOCK
     }
   ```
   
   This PR uses only the existing HFILE_DATA_BLOCK type. The new classes in the PR are purely code organization: they let the metadata HFile reader and block override the base HFile reader/block wherever the metadata table needs different behavior.
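
To illustrate the point, here is a minimal, self-contained sketch of the idea, not the actual Hudi code. It assumes the persisted type code identifies one of the existing enum constants, so choosing a subclass at read time leaves what is written to disk unchanged. The class names `BlockTypeSketch`, `HFileDataBlock`, and `MetadataHFileDataBlock`, and the boolean parameters of `readBlock`, are hypothetical stand-ins for the classes touched by the PR.

```java
public class BlockTypeSketch {

    // Mirrors the enum quoted above; no new constant is added.
    enum HoodieLogBlockType {
        COMMAND_BLOCK, DELETE_BLOCK, CORRUPT_BLOCK, AVRO_DATA_BLOCK, HFILE_DATA_BLOCK
    }

    // Hypothetical stand-in for the base HFile data block.
    static class HFileDataBlock {
        HoodieLogBlockType getBlockType() {
            return HoodieLogBlockType.HFILE_DATA_BLOCK;
        }

        String describe() {
            return "base HFile block";
        }
    }

    // Hypothetical stand-in for the metadata-table block. It overrides behavior
    // only and inherits getBlockType(), so it reports the same block type.
    static class MetadataHFileDataBlock extends HFileDataBlock {
        @Override
        String describe() {
            return "metadata HFile block with key de-duplication";
        }
    }

    // Assumed dispatch: the block type read from the log is unchanged, and the
    // in-memory class is chosen from reader-side context.
    static HFileDataBlock readBlock(HoodieLogBlockType type,
                                    boolean isMetadataTable,
                                    boolean keyDeDuplication) {
        switch (type) {
            case HFILE_DATA_BLOCK:
                if (isMetadataTable && keyDeDuplication) {
                    return new MetadataHFileDataBlock();
                }
                return new HFileDataBlock();
            default:
                throw new IllegalArgumentException("Not covered by this sketch: " + type);
        }
    }

    public static void main(String[] args) {
        HFileDataBlock block = readBlock(HoodieLogBlockType.HFILE_DATA_BLOCK, true, true);
        System.out.println(block.getBlockType() + " -> " + block.describe());
    }
}
```

Because both classes report `HFILE_DATA_BLOCK`, a reader that only knows the base class would still recognize the block type in this sketch; only the in-memory handling differs.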
   
   



