You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/03/21 12:24:15 UTC

[GitHub] [hudi] vinothchandar commented on a change in pull request #4428: [HUDI-44] Adding support to preserve commit metadata for compaction

vinothchandar commented on a change in pull request #4428:
URL: https://github.com/apache/hudi/pull/4428#discussion_r831043757



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java
##########
@@ -224,6 +224,12 @@
       .withDocumentation("Used by org.apache.hudi.io.compact.strategy.DayBasedCompactionStrategy to denote the number of "
           + "latest partitions to compact during a compaction run.");
 
+  public static final ConfigProperty<Boolean> PRESERVE_COMMIT_METADATA = ConfigProperty
+      .key("hoodie.compaction.preserve.commit.metadata")
+      .defaultValue(false)

Review comment:
       Compaction should not change the original `_hoodie_commit_time` or `_hoodie_commit_seqno` values at all. So we should look into making that the default behavior as @YannByron suggested.

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/strategy/PartitionAwareClusteringPlanStrategy.java
##########
@@ -101,7 +101,7 @@ public PartitionAwareClusteringPlanStrategy(HoodieTable table, HoodieEngineConte
         .setInputGroups(clusteringGroups)
         .setExtraMetadata(getExtraMetadata())
         .setVersion(getPlanVersion())
-        .setPreserveHoodieMetadata(getWriteConfig().isPreserveHoodieCommitMetadata())
+        .setPreserveHoodieMetadata(getWriteConfig().isPreserveHoodieCommitMetadataForClustering())

Review comment:
       clustering should not change the commit time either.

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java
##########
@@ -291,7 +293,11 @@ protected boolean writeRecord(HoodieRecord<T> hoodieRecord, Option<IndexedRecord
       if (indexedRecord.isPresent() && !isDelete) {
         // Convert GenericRecord to GenericRecord with hoodie commit metadata in schema
         IndexedRecord recordWithMetadataInSchema = rewriteRecord((GenericRecord) indexedRecord.get());
-        fileWriter.writeAvroWithMetadata(recordWithMetadataInSchema, hoodieRecord);
+        if (preserveMetadata) {

Review comment:
       let's see. `_hoodie_file_name` could technically change to the base file? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org