You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/07/25 15:04:04 UTC

[GitHub] [iceberg] Reo-LEI commented on a diff in pull request #5301: Core: Support building custom tasks in ManifestGroup

Reo-LEI commented on code in PR #5301:
URL: https://github.com/apache/iceberg/pull/5301#discussion_r928991682


##########
core/src/main/java/org/apache/iceberg/ManifestGroup.java:
##########
@@ -279,4 +287,84 @@ public void close() throws IOException {
           }
         });
   }
+
+  abstract static class ScanTaskFactory<T extends ScanTask> {
+    private final String schemaAsString;
+    private final String specAsString;
+    private final DeleteFileIndex deletes;
+    private final ResidualEvaluator residuals;
+    private final boolean dropStats;
+
+    ScanTaskFactory(PartitionSpec spec, DeleteFileIndex deletes, ResidualEvaluator residuals, boolean dropStats) {
+      this.schemaAsString = SchemaParser.toJson(spec.schema());
+      this.specAsString = PartitionSpecParser.toJson(spec);
+      this.deletes = deletes;
+      this.residuals = residuals;
+      this.dropStats = dropStats;
+    }
+
+    abstract CloseableIterable<T> createTasks(CloseableIterable<ManifestEntry<DataFile>> entries);
+
+    String schemaAsString() {
+      return schemaAsString;
+    }
+
+    String specAsString() {
+      return specAsString;
+    }
+
+    DeleteFileIndex deletes() {
+      return deletes;
+    }
+
+    ResidualEvaluator residuals() {
+      return residuals;
+    }
+
+    boolean shouldKeepStats() {
+      return !dropStats;
+    }
+
+    abstract static class Builder<T extends ScanTask> {

Review Comment:
   +1 for yufei approach. Or we construct the cache for `schemaString` per spec and pass the `schemaString` from cache to `createTask(CloseableIterable<ManifestEntry<DataFile>> entries, String schemaString)`. But I prefer the former.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org