Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/13 07:29:13 UTC

[GitHub] [hudi] vinothchandar commented on a change in pull request #2157: [HUDI-1330] handle prefix filtering at directory level

vinothchandar commented on a change in pull request #2157:
URL: https://github.com/apache/hudi/pull/2157#discussion_r503723966



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/DFSPathSelector.java
##########
@@ -119,4 +103,25 @@ public DFSPathSelector(TypedProperties props, Configuration hadoopConf) {
       throw new HoodieIOException("Unable to read from source from checkpoint: " + lastCheckpointStr, ioe);
     }
   }
+
+  /**
+   * List files recursively, filtering out ineligible files/directories while doing so.
+   */
+  private List<FileStatus> listEligibleFiles(FileSystem fs, Path path, long lastCheckpointTime) throws IOException {
+    // skip files/dirs whose names start with (_, ., etc)
+    FileStatus[] statuses = fs.listStatus(path, file ->

Review comment:
       What's the advantage of walking the tree recursively ourselves, instead of calling the recursive fs.listFiles(<path>, true) like before? In the interest of keeping the PR minimal, could this change just the prefix filtering?
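One possible answer to the question above, going by the PR title ("handle prefix filtering at directory level"), is that a manual walk can skip an ignored directory without ever listing its contents, whereas a flat recursive listing only sees file names after the fact. The sketch below illustrates that difference with java.nio rather than the Hadoop FileSystem API, so it is an analogy and not the PR's code. The class name, the helper names, and the choice of "_" and "." as ignored prefixes are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PrefixFilterSketch {

  // Hypothetical rule: names starting with "_" or "." are ineligible.
  static boolean isIgnored(Path p) {
    Path name = p.getFileName();
    if (name == null) {
      return false;
    }
    String s = name.toString();
    return s.startsWith("_") || s.startsWith(".");
  }

  // Manual walk: an ignored directory is pruned, so its children are never listed.
  static List<Path> listEligibleFiles(Path root) throws IOException {
    List<Path> result = new ArrayList<>();
    Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
        // The root itself is always visited, whatever its name is.
        if (!dir.equals(root) && isIgnored(dir)) {
          return FileVisitResult.SKIP_SUBTREE;
        }
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        if (!isIgnored(file)) {
          result.add(file);
        }
        return FileVisitResult.CONTINUE;
      }
    });
    return result;
  }

  // Flat recursive listing that filters on the file name only:
  // files under an ignored directory such as "_tmp" still come back.
  static List<Path> listFlatFilterByFileName(Path root) throws IOException {
    try (Stream<Path> paths = Files.walk(root)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> !isIgnored(p))
          .collect(Collectors.toList());
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("prefix-sketch");
    Files.createDirectories(root.resolve("data"));
    Files.createDirectories(root.resolve("_tmp"));
    Files.createFile(root.resolve("data").resolve("part-0001.parquet"));
    Files.createFile(root.resolve("data").resolve(".part-0001.crc"));
    Files.createFile(root.resolve("_tmp").resolve("part-0002.parquet"));

    System.out.println("pruned walk:  " + listEligibleFiles(root).size() + " file(s)");
    System.out.println("flat listing: " + listFlatFilterByFileName(root).size() + " file(s)");
  }
}
```

In this toy layout the pruned walk returns only data/part-0001.parquet, while the flat listing also returns the file under _tmp. A flat listing could reach the same result by checking every path component, but it would still have enumerated the ignored subtree first, which is the cost a directory-level filter avoids.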



