You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2020/10/05 16:31:55 UTC

[GitHub] [hadoop] steveloughran commented on a change in pull request #2354: HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem

steveloughran commented on a change in pull request #2354:
URL: https://github.com/apache/hadoop/pull/2354#discussion_r499720582



##########
File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractTestUtils.java
##########
@@ -1453,6 +1453,52 @@ public static TreeScanResults treeWalk(FileSystem fs, Path path)
     return list;
   }
 
+  /**
+   * Convert a remote iterator over file status results into a list.
+   * The utility equivalents in commons collection and guava cannot be
+   * used here, as this is a different interface, one whose operators
+   * can throw IOEs.
+   * @param iterator input iterator
+   * @return the file status entries as a list.
+   * @throws IOException
+   */
+  public static List<? extends FileStatus> iteratorToList(
+          RemoteIterator<? extends FileStatus> iterator) throws IOException {
+    ArrayList<FileStatus> list = new ArrayList<>();
+    while (iterator.hasNext()) {
+      list.add(iterator.next());
+    }
+    return list;
+  }
+
+
+  /**
+   * Convert a remote iterator over file status results into a list.
+   * This uses {@link RemoteIterator#next()} calls only, expecting
+   * a raised {@link NoSuchElementException} exception to indicate that
+   * the end of the listing has been reached. This iteration strategy is
+   * designed to verify that the implementation of the remote iterator
+   * generates results and terminates consistently with the {@code hasNext/next}
+   * iteration. More succinctly "verifies that the {@code next()} operator
+   * isn't relying on {@code hasNext()} to always be called during an iteration.
+   * @param iterator input iterator
+   * @return the status entries as a list.
+   * @throws IOException IO problems
+   */
+  @SuppressWarnings("InfiniteLoopStatement")
+  public static List<? extends FileStatus> iteratorToListThroughNextCallsAlone(
+          RemoteIterator<? extends FileStatus> iterator) throws IOException {
+    ArrayList<FileStatus> list = new ArrayList<>();

Review comment:
       same: `List` as variable type
   
   Actually, it may make sense to type this method 
   ```
   RemoteIterator<T extends FileStatus>
   ```
   and have 
   ```
   List<T>
   ```
   and
   ```
   RemoteIterator<T>
   ```
    where appropriate

##########
File path: hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md
##########
@@ -294,6 +294,20 @@ any optimizations.
 The atomicity and consistency constraints are as for
 `listStatus(Path, PathFilter)`.
 
+### `RemoteIterator<FileStatus> listStatusIterator(Path p)`
+Return an iterator enumerating the `FileStatus` entries under
+a path. This is similar to `listStatus(Path)` except the fact that
+rather than returning an entire list, an iterator is returned.
+The result is exactly the same as listStatus, provided no other caller 

Review comment:
       use `listStatus()` with backticks

##########
File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractRootDirectoryTest.java
##########
@@ -242,6 +243,13 @@ public void testSimpleRootListing() throws IOException {
             + "listStatus = " + listStatusResult
             + "listFiles = " + listFilesResult,
         fileList.size() <= statuses.length);
+    List<FileStatus> statusList = (List<FileStatus>) iteratorToList(
+            fs.listStatusIterator(root));
+    String listStatusItrRes = join(statusList, "\n");

Review comment:
       if you can have both sets of listing as collections, use Assertions.assertThat() as you've done earlier

##########
File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractGetFileStatusTest.java
##########
@@ -322,6 +355,22 @@ public void testListStatusFile() throws Throwable {
     verifyStatusArrayMatchesFile(f, getFileSystem().listStatus(f));
   }
 
+  @Test
+  public void testListStatusIteratorFile() throws Throwable {
+    describe("test the listStatusIterator(path) on a file");
+    Path f = touchf("listStItrFile");
+    List<FileStatus> statusList = (List<FileStatus>) iteratorToList(
+            getFileSystem().listStatusIterator(f));
+    assertEquals("size of file list returned", 1, statusList.size());
+    assertIsNamedFile(f, statusList.get(0));
+    List<FileStatus> statusList2 =
+            (List<FileStatus>) iteratorToListThroughNextCallsAlone(
+                    getFileSystem().listStatusIterator(f));
+    assertEquals("size of file list returned through next() calls",

Review comment:
       AssertionsAssertThat to assert that the list size == 1




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org