Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/01 16:53:40 UTC

[GitHub] [hudi] nsivabalan commented on a change in pull request #1819: [HUDI-1058] Make delete marker configurable

nsivabalan commented on a change in pull request #1819:
URL: https://github.com/apache/hudi/pull/1819#discussion_r463979191



##########
File path: hudi-client/src/main/java/org/apache/hudi/table/HoodieTimelineArchiveLog.java
##########
@@ -267,15 +268,15 @@ private boolean deleteAllInstantsOlderorEqualsInAuxMetaFolder(HoodieInstant thre
     return success;
   }
 
-  public void archive(List<HoodieInstant> instants) throws HoodieCommitException {
+  public void archive(JavaSparkContext jsc, List<HoodieInstant> instants) throws HoodieCommitException {
     try {
       HoodieTimeline commitTimeline = metaClient.getActiveTimeline().getAllCommitsTimeline().filterCompletedInstants();
       Schema wrapperSchema = HoodieArchivedMetaEntry.getClassSchema();
       LOG.info("Wrapper schema " + wrapperSchema.toString());
       List<IndexedRecord> records = new ArrayList<>();
       for (HoodieInstant hoodieInstant : instants) {
         try {
-          deleteAnyLeftOverMarkerFiles(hoodieInstant);
+          deleteAnyLeftOverMarkerFiles(jsc, hoodieInstant);

Review comment:
       Sorry, why are these changes being made in this PR? This PR is meant for the delete marker field. Are these changes related to the user-defined delete marker field?

##########
File path: hudi-common/src/test/java/org/apache/hudi/common/model/TestOverwriteWithLatestAvroPayload.java
##########
@@ -54,13 +57,15 @@ public void testActiveRecords() throws IOException {
     record1.put("id", "1");
     record1.put("partition", "partition0");
     record1.put("ts", 0L);
-    record1.put("_hoodie_is_deleted", false);
+    record1.put(defaultDeleteMarkerField, false);

Review comment:
       Actually, we could test it this way (not sure if you already do that).
   Set the default marker field value to true and the user-defined marker field value to false. If OverwriteWithLatestAvroPayload is instantiated without any marker field, the record should be deleted. If it is instantiated with the user-defined marker field, the record should be considered active. The reverse should hold as well. All tests in this class could be done this way to ensure that the other column is treated as just another user data column that Hudi does not care about.
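
   A minimal sketch of the test idea above, using a plain Map in place of an Avro record so it runs without Hudi on the classpath. The `isDeleted` helper and the `user_is_deleted` column name are hypothetical stand-ins for illustration; they are not Hudi's actual payload API, and the real test would go through OverwriteWithLatestAvroPayload instead.

```java
import java.util.HashMap;
import java.util.Map;

class DeleteMarkerSketch {
    // Default marker column name, as it appears in the diff above.
    static final String DEFAULT_MARKER = "_hoodie_is_deleted";
    // Hypothetical user-defined marker column, chosen only for this sketch.
    static final String USER_MARKER = "user_is_deleted";

    // Hypothetical stand-in for the payload's delete check (not Hudi's API):
    // a record counts as deleted only if the configured marker field is true.
    static boolean isDeleted(Map<String, Object> record, String markerField) {
        return Boolean.TRUE.equals(record.get(markerField));
    }

    // Builds a record that carries both marker columns with the given values.
    static Map<String, Object> recordWith(boolean defaultFlag, boolean userFlag) {
        Map<String, Object> record = new HashMap<>();
        record.put("id", "1");
        record.put("partition", "partition0");
        record.put("ts", 0L);
        record.put(DEFAULT_MARKER, defaultFlag);
        record.put(USER_MARKER, userFlag);
        return record;
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        // Default marker true, user-defined marker false.
        Map<String, Object> record = recordWith(true, false);
        check(isDeleted(record, DEFAULT_MARKER), "default marker should delete the record");
        check(!isDeleted(record, USER_MARKER), "user marker should keep the record active");

        // Vice versa: default marker false, user-defined marker true.
        Map<String, Object> flipped = recordWith(false, true);
        check(!isDeleted(flipped, DEFAULT_MARKER), "default marker should keep the record active");
        check(isDeleted(flipped, USER_MARKER), "user marker should delete the record");

        System.out.println("all marker checks passed");
    }
}
```

   In both cases the column that is not the configured marker is ignored, which is the property the suggested tests are meant to pin down.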




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org