Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/09/10 21:48:02 UTC

[GitHub] [iceberg] rdblue commented on a change in pull request #3073: Core: Enhance weightFunc of bin-packing to adapt to V2Format

rdblue commented on a change in pull request #3073:
URL: https://github.com/apache/iceberg/pull/3073#discussion_r706487848



##########
File path: core/src/main/java/org/apache/iceberg/actions/BinPackStrategy.java
##########
@@ -211,7 +220,13 @@ protected long writeMaxFileSize() {
   }
 
   private long sizeOfInputFiles(List<FileScanTask> group) {
-    return group.stream().mapToLong(FileScanTask::length).sum();
+    return group.stream().mapToLong(this::sizeOfInputFile).sum();
+  }
+
+  private long sizeOfInputFile(FileScanTask file) {
+    // For V2Format, we should check the size of delete file as well to avoid unbalanced bin-packing
+    return Math.max(file.length() + file.deletes().stream().mapToLong(ContentFile::fileSizeInBytes).sum(),

Review comment:
       Can you wrap this so that both arguments to `max` start on a new line?
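
For illustration, the requested wrapping might look like the sketch below. The quoted hunk cuts off before the second argument to `max`, so `ASSUMED_MIN_WEIGHT` is a hypothetical placeholder rather than the value used in the PR. `Task` and `DeleteFile` are likewise hypothetical stand-ins for `FileScanTask` and its delete files, included only so the snippet compiles without Iceberg on the classpath.

```java
import java.util.List;

class WrappedMaxSketch {
  // Hypothetical stand-in for an Iceberg delete file; only the size accessor is modeled.
  static final class DeleteFile {
    private final long sizeInBytes;

    DeleteFile(long sizeInBytes) {
      this.sizeInBytes = sizeInBytes;
    }

    long fileSizeInBytes() {
      return sizeInBytes;
    }
  }

  // Hypothetical stand-in for FileScanTask: a data file length plus its delete files.
  static final class Task {
    private final long length;
    private final List<DeleteFile> deletes;

    Task(long length, List<DeleteFile> deletes) {
      this.length = length;
      this.deletes = deletes;
    }

    long length() {
      return length;
    }

    List<DeleteFile> deletes() {
      return deletes;
    }
  }

  // Assumption: the real second argument is not visible in the quoted diff.
  static final long ASSUMED_MIN_WEIGHT = 1L;

  static long sizeOfInputFile(Task file) {
    // Both arguments to Math.max start on their own line, as the review asks.
    return Math.max(
        file.length() + file.deletes().stream().mapToLong(DeleteFile::fileSizeInBytes).sum(),
        ASSUMED_MIN_WEIGHT);
  }

  public static void main(String[] args) {
    Task task = new Task(100L, List.of(new DeleteFile(20L), new DeleteFile(5L)));
    System.out.println(sizeOfInputFile(task));
  }
}
```

Breaking right after the opening parenthesis keeps the two arguments visually parallel, which makes the floor value easy to spot next to the longer stream expression.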




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


