Posted to issues@iceberg.apache.org by "RussellSpitzer (via GitHub)" <gi...@apache.org> on 2023/03/21 18:15:38 UTC

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

RussellSpitzer commented on code in PR #6624:
URL: https://github.com/apache/iceberg/pull/6624#discussion_r1143821714


##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/actions/MigrateTableSparkAction.java:
##########
@@ -57,6 +58,8 @@ public class MigrateTableSparkAction extends BaseTableCreationSparkAction<Migrat
   private final StagingTableCatalog destCatalog;
   private final Identifier destTableIdent;
   private final Identifier backupIdent;
+  // Max number of concurrent files to read per partition while indexing table
+  private int readDatafileParallelism = 1;

Review Comment:
   This differs from the default in SparkTableUtil. Is that intentional? I thought we were going to use the worker pool size as the default everywhere.
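
   To illustrate the suggestion, here is a minimal sketch of seeding the field from one shared default instead of hard-coding `1`. All names in it (`ParallelismDefaults`, `DEFAULT_WORKER_POOL_SIZE`, `MigrateActionSketch`, `withParallelReads`) are hypothetical and are not taken from the PR, and the pool-size formula is an assumption, not necessarily what SparkTableUtil uses.

```java
// Sketch only: every name here is hypothetical and not part of the PR.
class ParallelismDefaults {
  // Assumed shared default; stands in for whatever worker pool size the project agrees on.
  static final int DEFAULT_WORKER_POOL_SIZE =
      Math.max(2, Runtime.getRuntime().availableProcessors());

  // Mirrors the field from the diff, but initialized from the shared default.
  static class MigrateActionSketch {
    // Max number of concurrent files to read per partition while indexing the table
    private int readDatafileParallelism = DEFAULT_WORKER_POOL_SIZE;

    // Lets callers override the default; rejects non-positive values.
    MigrateActionSketch withParallelReads(int numReaders) {
      if (numReaders <= 0) {
        throw new IllegalArgumentException("Parallelism must be positive: " + numReaders);
      }
      this.readDatafileParallelism = numReaders;
      return this;
    }

    int readDatafileParallelism() {
      return readDatafileParallelism;
    }
  }
}
```

   With a single constant, every action and utility that reads data files would start from the same value, and callers could still override it per invocation.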



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

