Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/01/16 02:08:50 UTC

[GitHub] [hudi] n3nash commented on a change in pull request #2451: [HUDI-1529] Add block size to the FileStatus objects returned from metadata table to avoid too many file splits

n3nash commented on a change in pull request #2451:
URL: https://github.com/apache/hudi/pull/2451#discussion_r558762513



##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java
##########
@@ -177,10 +179,12 @@ public HoodieMetadataPayload preCombine(HoodieMetadataPayload previousRecord) {
   /**
    * Returns the files added as part of this record.
    */
-  public FileStatus[] getFileStatuses(Path partitionPath) {
+  public FileStatus[] getFileStatuses(Configuration hadoopConf, Path partitionPath) throws IOException {
+    FileSystem fs = partitionPath.getFileSystem(hadoopConf);
+    long blockSize = fs.getDefaultBlockSize(partitionPath);
     return filterFileInfoEntries(false)
-        .map(e -> new FileStatus(e.getValue().getSize(), false, 0, 0, 0, 0, null, null, null,
-            new Path(partitionPath, e.getKey())))
+        .map(e -> new FileStatus(e.getValue().getSize(), false, 0, blockSize, 0, 0,

Review comment:
       @umehrot2 Since we are changing the block size argument from `0` to `blockSize`, just curious: are the other `0` arguments valid, or should they also be populated with dynamic values? We don't have to fix them as part of this PR, but it would be great if you could take a look.
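
       For context on why the block size argument in particular matters, here is a minimal, self-contained sketch of the split-size formula used by Hadoop's `FileInputFormat.computeSplitSize`. The class name `SplitSizeSketch`, the helper `approxSplitCount`, the 512 MiB file length, and the min/max split sizes are all assumptions for illustration and are not taken from this PR; engines that compute splits differently will not behave exactly like this.

```java
// Illustrative sketch only: mirrors the max(minSize, min(maxSize, blockSize))
// formula from Hadoop's FileInputFormat, without depending on Hadoop itself.
final class SplitSizeSketch {

    // Split size is the block size clamped between the configured min and max.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Hypothetical helper: approximate split count via ceiling division.
    // Real split planning also applies a small slop factor, ignored here.
    static long approxSplitCount(long fileLength, long splitSize) {
        return (fileLength + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long fileLength = 512L * 1024 * 1024; // assumed 512 MiB file
        long minSize = 1L;                    // assumed minimum split size
        long maxSize = Long.MAX_VALUE;        // assumed maximum split size

        // A FileStatus reporting blockSize = 0 collapses the split size to minSize.
        long withZeroBlock = computeSplitSize(0L, minSize, maxSize);
        // A FileStatus reporting a 128 MiB block keeps splits block-sized.
        long withRealBlock = computeSplitSize(128L * 1024 * 1024, minSize, maxSize);

        System.out.println("blockSize=0      -> splitSize=" + withZeroBlock
            + ", splits=" + approxSplitCount(fileLength, withZeroBlock));
        System.out.println("blockSize=128MiB -> splitSize=" + withRealBlock
            + ", splits=" + approxSplitCount(fileLength, withRealBlock));
    }
}
```

       Under these assumed settings, a zero block size drives the split size down to the minimum, which matches the "too many file splits" symptom in the PR title. Whether the remaining `0` arguments (replication, modification time, access time) have a comparable effect depends on which readers consume them.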




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org