Posted to issues@iceberg.apache.org by "stevenzwu (via GitHub)" <gi...@apache.org> on 2023/04/13 14:54:07 UTC

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7338: Flink: use correct scan mode when in TABLE_SCAN_THEN_INCREMENTAL mode

stevenzwu commented on code in PR #7338:
URL: https://github.com/apache/iceberg/pull/7338#discussion_r1165658208


##########
flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/source/FlinkSplitPlanner.java:
##########
@@ -136,12 +137,21 @@ static CloseableIterable<CombinedScanTask> planTasks(
     }
   }
 
-  private enum ScanMode {
+  @VisibleForTesting
+  enum ScanMode {
     BATCH,
     INCREMENTAL_APPEND_SCAN
   }
 
-  private static ScanMode checkScanMode(ScanContext context) {
+  @VisibleForTesting
+  static ScanMode checkScanMode(ScanContext context) {

Review Comment:
   @chenjunjiedada thanks for catching the bug and creating the fix PR.
   
   For the conditions here, is there simpler logic? E.g., is it enough to just remove the `context.isStreaming()` condition from the original if clause?
   
   Also, I think it is safer and clearer to construct a new `ScanContext` object and set `useSnapshotId` on it.
   ```java
       if (scanContext.streamingStartingStrategy()
           == StreamingStartingStrategy.TABLE_SCAN_THEN_INCREMENTAL) {
         // do a batch table scan first
         splits = FlinkSplitPlanner.planIcebergSourceSplits(table, scanContext, workerPool);
         LOG.info(
             "Discovered {} splits from initial batch table scan with snapshot Id {}",
             splits.size(),
             startSnapshot.snapshotId());
         // For TABLE_SCAN_THEN_INCREMENTAL, incremental mode starts exclusive from the startSnapshot
         toPosition =
             IcebergEnumeratorPosition.of(startSnapshot.snapshotId(), startSnapshot.timestampMillis());
   ```
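
The two suggestions above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Iceberg code: `ScanModeSketch`, `Context`, and `withUseSnapshotId` are hypothetical stand-ins for `FlinkSplitPlanner`, `ScanContext`, and whatever copy mechanism `ScanContext` provides. It also assumes the original if clause combined `isStreaming()` with start/end snapshot checks. Whether dropping the streaming condition alone is sufficient in the real planner is still the open question.

```java
// Simplified stand-ins for illustration only; these are NOT the real Iceberg classes.
final class ScanModeSketch {

  enum ScanMode {
    BATCH,
    INCREMENTAL_APPEND_SCAN
  }

  /** Hypothetical immutable stand-in for ScanContext. */
  static final class Context {
    final boolean streaming;
    final Long startSnapshotId; // null when unset
    final Long endSnapshotId; // null when unset
    final Long useSnapshotId; // null when unset

    Context(boolean streaming, Long startSnapshotId, Long endSnapshotId, Long useSnapshotId) {
      this.streaming = streaming;
      this.startSnapshotId = startSnapshotId;
      this.endSnapshotId = endSnapshotId;
      this.useSnapshotId = useSnapshotId;
    }

    /** Returns a new context pinned to one snapshot; the receiver is left unchanged. */
    Context withUseSnapshotId(long snapshotId) {
      return new Context(streaming, null, null, snapshotId);
    }
  }

  /** Decides the scan mode from the snapshot range only, ignoring the streaming flag. */
  static ScanMode checkScanMode(Context context) {
    if (context.startSnapshotId != null || context.endSnapshotId != null) {
      return ScanMode.INCREMENTAL_APPEND_SCAN;
    }
    return ScanMode.BATCH;
  }

  public static void main(String[] args) {
    // A streaming context with no snapshot range configured yet.
    Context streamingContext = new Context(true, null, null, null);

    // Initial table scan: a fresh copy pinned to the start snapshot (42 is an arbitrary id).
    Context initialScan = streamingContext.withUseSnapshotId(42L);
    System.out.println(checkScanMode(initialScan)); // prints BATCH

    // Later discovery cycles carry an explicit snapshot range.
    Context incremental = new Context(true, 42L, 43L, null);
    System.out.println(checkScanMode(incremental)); // prints INCREMENTAL_APPEND_SCAN
  }
}
```

Building a fresh context for the initial scan keeps the caller's context untouched, so later incremental discovery cannot accidentally inherit the pinned snapshot.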



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

