Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/01/28 01:21:09 UTC

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #315: [WIP] Incremental processing implementation

URL: https://github.com/apache/incubator-iceberg/pull/315#discussion_r371569916
 
 

 ##########
 File path: core/src/main/java/org/apache/iceberg/ManifestGroup.java
 ##########
 @@ -104,6 +129,29 @@ ManifestGroup caseSensitive(boolean newCaseSensitive) {
   }
 
   /**
+   * Returns an iterable of scan tasks. It is safe to add entries of this iterable
+   * to a collection as {@link DataFile} in each {@link FileScanTask} is defensively
+   * copied.
+   * @return a {@link CloseableIterable} of {@link FileScanTask}
+   */
+  public CloseableIterable<FileScanTask> tasks() {
+    Iterable<CloseableIterable<FileScanTask>> tasks = entries((manifest, entries) -> {
+      PartitionSpec spec = specsById.get(manifest.partitionSpecId());
+      String schemaString = SchemaParser.toJson(spec.schema());
+      String specString = PartitionSpecParser.toJson(spec);
+      ResidualEvaluator residuals = ResidualEvaluator.of(spec, dataFilter, caseSensitive);
+      return CloseableIterable.transform(entries, e -> new BaseFileScanTask(
+          e.copy().file(), schemaString, specString, residuals));
+    });
+
+    if (PLAN_SCANS_WITH_WORKER_POOL && manifests.size() > 1) {
 
 Review comment:
   Note: `manifests.size() > 1` can be `true` even when, because of the `manifestPredicate`, the number of matching manifest files is `<= 1`. I couldn't find a good way to check whether more than one manifest remains after filtering, but I'd imagine this is more of an optimization, so it doesn't hurt in cases where `manifests.size() > 1` but the filtered manifests are `<= 1`.
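
   For illustration only, here is a rough sketch of what such a check could look like, assuming the predicate can be applied eagerly to the manifest list before planning. The class `FilteredManifestCheck` and the helper `hasMoreThanOne` are hypothetical names, and plain strings stand in for Iceberg's manifest type; this is not how `ManifestGroup` is implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in sketch: plain strings play the role of manifest files.
final class FilteredManifestCheck {
  private FilteredManifestCheck() {
  }

  // Returns true if more than one item matches the predicate.
  // limit(2) lets a sequential stream stop once a second match is found,
  // so the whole list is not necessarily scanned.
  static <T> boolean hasMoreThanOne(List<T> items, Predicate<T> predicate) {
    return items.stream().filter(predicate).limit(2).count() > 1;
  }

  public static void main(String[] args) {
    List<String> manifests = Arrays.asList("m1.avro", "m2.avro", "other.avro");
    Predicate<String> manifestPredicate = name -> name.startsWith("m1");

    // The unfiltered size is greater than one ...
    System.out.println(manifests.size() > 1);
    // ... but only a single manifest survives the predicate.
    System.out.println(hasMoreThanOne(manifests, manifestPredicate));
  }
}
```

   The trade-off is that the predicate runs once for this check and again during planning, which may or may not be worth it for what is only an optimization.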
