Posted to dev@gobblin.apache.org by "vikrambohra (via GitHub)" <gi...@apache.org> on 2023/06/10 01:37:00 UTC

[GitHub] [gobblin] vikrambohra commented on a diff in pull request #3701: [GOBBLIN-1838] Introduce total count based completion watermark

vikrambohra commented on code in PR #3701:
URL: https://github.com/apache/gobblin/pull/3701#discussion_r1224949023


##########
gobblin-iceberg/src/main/java/org/apache/gobblin/iceberg/writer/IcebergMetadataWriter.java:
##########
@@ -891,94 +901,45 @@ public void flush(String dbName, String tableName) throws IOException {
     }
   }
 
-  @Override
-  public void reset(String dbName, String tableName) throws IOException {
-    this.tableMetadataMap.remove(TableIdentifier.of(dbName, tableName));
+  private AbstractCompletenessWatermarkUpdater getWatermarkUpdater(String topicName, TableMetadata tableMetadata,
+      Map<String, String> propsToUpdate, boolean isTotalCountCompleteness) {
+    return isTotalCountCompleteness

Review Comment:
   Is this correct? The branches selected on `isTotalCountCompleteness` look reversed to me.



##########
gobblin-iceberg/src/main/java/org/apache/gobblin/iceberg/writer/IcebergMetadataWriter.java:
##########
@@ -836,15 +851,10 @@ public void flush(String dbName, String tableName) throws IOException {
         // The logic is to check the window [currentHour-1,currentHour] and update the watermark if there are no audit counts
         if(!tableMetadata.appendFiles.isPresent() && !tableMetadata.deleteFiles.isPresent()
             && tableMetadata.completenessEnabled) {
-          if (tableMetadata.completionWatermark > DEFAULT_COMPLETION_WATERMARK) {
-            log.info(String.format("Checking kafka audit for %s on change_property ", topicName));
-            SortedSet<ZonedDateTime> timestamps = new TreeSet<>();
-            ZonedDateTime dtAtBeginningOfHour = ZonedDateTime.now(ZoneId.of(this.timeZone)).truncatedTo(ChronoUnit.HOURS);
-            timestamps.add(dtAtBeginningOfHour);
-            checkAndUpdateCompletenessWatermark(tableMetadata, topicName, timestamps, props);
-          } else {
-            log.info(String.format("Need valid watermark, current watermark is %s, Not checking kafka audit for %s",
-                tableMetadata.completionWatermark, topicName));
+          updateWatermarkWithEmptyFilesRegistered(topicName, tableMetadata, props, false);
+
+          if (tableMetadata.totalCountCompletenessEnabled) {
+            updateWatermarkWithEmptyFilesRegistered(topicName, tableMetadata, props, true);

Review Comment:
   Rename to `updateWatermarkWithNoFilesRegistered`?
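
   For context, the code comment in this hunk describes checking the window [currentHour-1,currentHour] when no files are registered. A minimal, standalone sketch of that window computation follows. It assumes the upper bound is the start of the current hour, as in the removed `truncatedTo(ChronoUnit.HOURS)` line. The class name `HourWindowSketch`, the method `previousHourWindow`, and the "UTC" zone are hypothetical placeholders, not part of the PR.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not from the PR: illustrates the [currentHour-1, currentHour] window.
public class HourWindowSketch {

  /** Returns {windowStart, windowEnd}, where windowEnd is the start of the hour containing 'now'. */
  public static ZonedDateTime[] previousHourWindow(ZonedDateTime now) {
    // Same truncation the removed code applied to obtain the beginning of the hour.
    ZonedDateTime windowEnd = now.truncatedTo(ChronoUnit.HOURS);
    // Assumption: the lower bound is exactly one hour earlier.
    ZonedDateTime windowStart = windowEnd.minusHours(1);
    return new ZonedDateTime[] {windowStart, windowEnd};
  }

  public static void main(String[] args) {
    // "UTC" stands in for the writer's configured time zone (this.timeZone in the diff).
    ZonedDateTime now = ZonedDateTime.now(ZoneId.of("UTC"));
    ZonedDateTime[] window = previousHourWindow(now);
    System.out.println("Audit window: [" + window[0] + ", " + window[1] + "]");
  }
}
```

   Passing `now` in as a parameter, rather than calling `ZonedDateTime.now` inside the helper, keeps the boundary logic easy to unit test.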


