You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/02/23 12:36:18 UTC

[GitHub] [ozone] rakeshadr opened a new pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

rakeshadr opened a new pull request #1954:
URL: https://github.com/apache/ozone/pull/1954


   ## What changes were proposed in this pull request?
   
   This task is to perform #listKeys by searching the startKey, keyPrefix in Dir & File Tables.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4683
   
   ## How was this patch tested?
   
   Added only one UT now.  I will add more test cases. I'm thinking something like large dirs and sub-dirs/files.
   
   Also, there is an open point: #listKeys("key-a-"), basically partial keyName and startKey. For example, actual keyName stored in DB are "key-a-10", "key-a-11", "key-a-12"..."key-z-10", "key-z-11", "key-z-12".
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r581154703



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,273 @@ public OzoneKey next() {
      * @param prevKey
      * @return {@code List<OzoneKey>}
      */
-    private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws
         IOException {
       return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
           listCacheSize);
     }
   }
+
+
+  /**
+   * An Iterator to iterate over {@link OzoneKey} list.
+     *               buck-1
+     *                |
+     *                a
+     *                |
+     *      ---------------------------
+     *     |           |               |
+     *     b1          b2              b3
+     *   -----       --------        ----------
+     *   |    |      |    |   |     |    |     |
+     *  c1   c2     d1   d2  d3     e1   e2   e3
+     *                   |          |
+     *                d21.txt    e11.txt
+   *
+   * Will do Depth-First-Traversal and visit node in this fashion:
+   *
+   * a -> b1 -> c1 -> c2 -> b2 -> d1 -> d2 -> d21.txt -> d3 -> b3 -> e1 ->
+   * e11.txt -> e2 -> e3
+   *
+   * Note: there is no order guarantee.
+   */
+  private class KeyIteratorV1 extends KeyIterator{
+
+    private Stack<String> stack;
+    private List<OzoneKey> pendingItemsToBeBatched;
+    private boolean addedKeyPrefix;
+
+    /**
+     * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+     * If prevKey is null it iterates from the first key in the bucket.
+     * The returned keys match key prefix.
+     *
+     * @param keyPrefix
+     * @param prevKey
+     */
+    KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+      super(keyPrefix, prevKey);
+      addedKeyPrefix = true;
+    }
+
+    @Override
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+      if (stack == null) {
+        stack = new Stack();
+        pendingItemsToBeBatched = new ArrayList<>();
+      }
+
+      // normalize paths
+      if (!addedKeyPrefix) {
+        prevKey = OmUtils.normalizeKey(prevKey, true);
+        String keyPrefixName = "";
+        if (StringUtils.isNotBlank(getKeyPrefix())) {
+          keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+        }
+        setKeyprefix(keyPrefixName);
+      }
+
+      // Get immediate children
+      List<OzoneKey> keysResultList = new ArrayList<>();
+      listChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+      // TODO: Back and Forth seek all the files & dirs, starting from
+      //  startKey till keyPrefix.
+
+      return keysResultList;
+    }
+
+    /**
+     * List children under the given keyPrefix path. This doesn't guarantee
+     * order. Internally, it does recursive #listStatus calls to list all the
+     * sub-keysResultList.

Review comment:
       >This doesn't guarantee order.
   
   The listStatus result list returned should already in order I think. Why here say doesn't guarantee order?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582042055



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,285 @@ public OzoneKey next() {
      * @param prevKey
      * @return {@code List<OzoneKey>}
      */
-    private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws
         IOException {
       return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
           listCacheSize);
     }
   }
+
+
+  /**
+   * An Iterator to iterate over {@link OzoneKey} list.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |           |                       |
+   *     b1          b2                      b3
+   *   -----       --------               ----------
+   *   |    |      |    |   |             |    |     |
+   *  c1   c2     d1   d2  d3             e1   e2   e3
+   *                   |                  |
+   *               --------               |
+   *              |        |              |
+   *           d21.txt   d22.txt        e11.txt
+   *
+   * Say, keyPrefix="a" and prevKey="", then will do Depth-First-Traversal and
+   * visit node to getChildren in below fashion:-
+   * 1. getChildren("a/")  2. getChildren("a/b1")  3. getChildren("a/b1/c1")
+   * 4. getChildren("a/b1/c2")  5. getChildren("a/b2/d1")
+   * 6. getChildren("a/b2/d2")  7. getChildren("a/b2/d3")
+   * 8. getChildren("a/b3/e1")  9. getChildren("a/b3/e2")
+   * 10. getChildren("a/b3/e3")
+   *
+   * Note: Does not guarantee to return the list of keys in a sorted order.
+   */
+  private class KeyIteratorV1 extends KeyIterator{
+
+    private Stack<String> stack;
+    private List<OzoneKey> pendingItemsToBeBatched;
+    private boolean addedKeyPrefix;
+
+    /**
+     * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+     * If prevKey is null it iterates from the first key in the bucket.
+     * The returned keys match key prefix.
+     *
+     * @param keyPrefix
+     * @param prevKey
+     */
+    KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+      super(keyPrefix, prevKey);
+      addedKeyPrefix = true;
+    }
+
+    @Override
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+      if (stack == null) {
+        stack = new Stack();
+        pendingItemsToBeBatched = new ArrayList<>();
+      }
+
+      // normalize paths
+      if (!addedKeyPrefix) {
+        prevKey = OmUtils.normalizeKey(prevKey, true);
+        String keyPrefixName = "";
+        if (StringUtils.isNotBlank(getKeyPrefix())) {
+          keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+        }
+        setKeyprefix(keyPrefixName);
+      }
+
+      // Get immediate children
+      List<OzoneKey> keysResultList = new ArrayList<>();
+      getChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+      // TODO: Back and Forth seek all the files & dirs, starting from
+      //  startKey till keyPrefix.
+
+      return keysResultList;
+    }
+
+    /**
+     * List children under the given keyPrefix and startKey path. It does
+     * recursive #listStatus calls to list all the sub-keys resultList.
+     *
+     *                  buck-1
+     *                    |
+     *                    a
+     *                    |
+     *      -----------------------------------
+     *     |           |                       |
+     *     b1          b2                      b3
+     *   -----       --------               ----------
+     *   |    |      |    |   |             |    |     |
+     *  c1   c2     d1   d2  d3             e1   e2   e3
+     *                   |                  |
+     *               --------               |
+     *              |        |              |
+     *           d21.txt   d22.txt        e11.txt
+     *
+     * Say, KeyPrefix = "a" and startKey = null;
+     *
+     * Iteration-1) RPC call proxy#listStatus("a").
+     *              Add b3, b2 and b1 to stack.
+     * Iteration-2) pop b1 and do RPC call proxy#listStatus("b1")
+     *              Add c2, c1 to stack.
+     * Iteration-3) pop c1 and do RPC call proxy#listStatus("c1"). Empty list.
+     * Iteration-4) pop c2 and do RPC call proxy#listStatus("c2"). Empty list.
+     * Iteration-5) pop b2 and do RPC call proxy#listStatus("b2")
+     *              Add d3, d2 and d1 to stack.
+     *              ..........
+     *              ..........
+     * Iteration-n) pop e3 and do RPC call proxy#listStatus("e2")
+     *              Add d3, d2 and d1 to stack.
+     *
+     * @param keyPrefix
+     * @param startKey
+     * @param keysResultList
+     * @return true represents it reached limit batch size, false otherwise.
+     * @throws IOException
+     */
+    private boolean getChildrenKeys(String keyPrefix, String startKey,
+        List<OzoneKey> keysResultList) throws IOException {
+
+      // listStatus API expects a not null 'startKey' value
+      startKey = startKey == null ? "" : startKey;
+
+      // 1. Add pending items to the user key resultList
+      if (addAllPendingItemsToResultList(keysResultList)) {
+        // reached limit batch size.
+        return true;
+      }
+
+      // 2. Get immediate children of keyPrefix, starting with startKey
+      List<OzoneFileStatus> statuses = proxy.listStatus(volumeName, name,
+              keyPrefix, false, startKey, listCacheSize);
+
+      // 3. Special case: ListKey expects keyPrefix element should present in
+      // the resultList, only if startKey is blank. If startKey is not blank
+      // then resultList shouldn't contain the startKey element.
+      // Since proxy#listStatus API won't return keyPrefix element in the
+      // resultList. So, this is to add user given keyPrefix to the return list.
+      addKeyPrefixInfoToResultList(keyPrefix, startKey, keysResultList);

Review comment:
       Can you share the code link related with this case? I didn't actually catch this point, the code I am seeing is [KeyManagerImpl.java#L2173](https://github.com/apache/ozone/blob/HDDS-2939/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L2173). But seems the code I am checking is not the place that match the above proxy.listStatus call.

##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestObjectStoreV1.java
##########
@@ -275,6 +285,246 @@ public void testLookupKey() throws Exception {
             dirPathC.getObjectID(), true);
   }
 
+  /**
+   * Verify listKeys at different levels.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |              |                       |
+   *     b1             b2                      b3
+   *    -----           --------               ----------
+   *   |      |        |    |   |             |    |     |
+   *  c1     c2        d1   d2  d3             e1   e2   e3
+   *  |      |         |    |   |              |    |    |
+   * c1.tx  c2.tx   d11.tx  | d31.tx           |    |    e31.tx
+   *                      --------             |   e21.tx
+   *                     |        |            |
+   *                   d21.tx   d22.tx        e11.tx
+   *
+   * Above is the FS tree structure.
+   */
+  @Test
+  public void testListKeysAtDifferentLevels() throws Exception {
+    OzoneClient client = cluster.getClient();
+
+    ObjectStore objectStore = client.getObjectStore();
+    OzoneVolume ozoneVolume = objectStore.getVolume(volumeName);
+    Assert.assertTrue(ozoneVolume.getName().equals(volumeName));
+    OzoneBucket ozoneBucket = ozoneVolume.getBucket(bucketName);
+    Assert.assertTrue(ozoneBucket.getName().equals(bucketName));
+
+    String keyc1 = "/a/b1/c1/c1.tx";
+    String keyc2 = "/a/b1/c2/c2.tx";
+
+    String keyd13 = "/a/b2/d1/d11.tx";
+    String keyd21 = "/a/b2/d2/d21.tx";
+    String keyd22 = "/a/b2/d2/d22.tx";
+    String keyd31 = "/a/b2/d3/d31.tx";
+
+    String keye11 = "/a/b3/e1/e11.tx";
+    String keye21 = "/a/b3/e2/e21.tx";
+    String keye31 = "/a/b3/e3/e31.tx";
+
+    LinkedList<String> keys = new LinkedList<>();
+    keys.add(keyc1);
+    keys.add(keyc2);
+
+    keys.add(keyd13);
+    keys.add(keyd21);
+    keys.add(keyd22);
+    keys.add(keyd31);
+
+    keys.add(keye11);
+    keys.add(keye21);
+    keys.add(keye31);
+
+    int length = 10;
+    byte[] input = new byte[length];
+    Arrays.fill(input, (byte)96);
+
+    createKeys(ozoneBucket, keys);
+
+    // Root level listing keys
+    Iterator<? extends OzoneKey> ozoneKeyIterator =
+        ozoneBucket.listKeys("/a", null);
+
+    LinkedList<String> expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/");
+    expectedKeys.add("a/b1/");
+    expectedKeys.add("a/b2/");
+    expectedKeys.add("a/b3/");
+    expectedKeys.add("a/b1/c1/");
+    expectedKeys.add("a/b1/c2/");
+    expectedKeys.add("a/b1/c1/c1.tx");
+    expectedKeys.add("a/b1/c2/c2.tx");
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d2/");
+    expectedKeys.add("a/b2/d3/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    expectedKeys.add("a/b2/d2/d21.tx");
+    expectedKeys.add("a/b2/d2/d22.tx");
+    expectedKeys.add("a/b2/d3/d31.tx");
+    expectedKeys.add("a/b3/e1/");
+    expectedKeys.add("a/b3/e2/");
+    expectedKeys.add("a/b3/e3/");
+    expectedKeys.add("a/b3/e1/e11.tx");
+    expectedKeys.add("a/b3/e2/e21.tx");
+    expectedKeys.add("a/b3/e3/e31.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Intermediate level keyPrefix - 2nd level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a///b2///", null);
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/");
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d2/");
+    expectedKeys.add("a/b2/d3/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    expectedKeys.add("a/b2/d2/d21.tx");
+    expectedKeys.add("a/b2/d2/d22.tx");
+    expectedKeys.add("a/b2/d3/d31.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Intermediate level keyPrefix - 3rd level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b2/d1", null);
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Boundary of a level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b2/d2", "a/b2/d2/d21.tx");
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/d2/d22.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Boundary case - last node in the depth-first-traversal
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b3/e3", "a/b3/e3/e31.tx");
+    expectedKeys = new LinkedList<>();
+    checkKeyList(ozoneKeyIterator, expectedKeys);

Review comment:
       Can you add one more case for that keyPrefix is null meanwhile startKey is not  null, like listKeys(null,"xxx");




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582880826



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestObjectStoreV1.java
##########
@@ -275,6 +285,259 @@ public void testLookupKey() throws Exception {
             dirPathC.getObjectID(), true);
   }
 
+  /**
+   * Verify listKeys at different levels.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |              |                       |
+   *     b1             b2                      b3
+   *    -----           --------               ----------
+   *   |      |        |    |   |             |    |     |
+   *  c1     c2        d1   d2  d3             e1   e2   e3
+   *  |      |         |    |   |              |    |    |
+   * c1.tx  c2.tx   d11.tx  | d31.tx           |    |    e31.tx
+   *                      --------             |   e21.tx
+   *                     |        |            |
+   *                   d21.tx   d22.tx        e11.tx
+   *
+   * Above is the FS tree structure.
+   */
+  @Test
+  public void testListKeysAtDifferentLevels() throws Exception {
+    OzoneClient client = cluster.getClient();
+
+    ObjectStore objectStore = client.getObjectStore();
+    OzoneVolume ozoneVolume = objectStore.getVolume(volumeName);
+    Assert.assertTrue(ozoneVolume.getName().equals(volumeName));
+    OzoneBucket ozoneBucket = ozoneVolume.getBucket(bucketName);
+    Assert.assertTrue(ozoneBucket.getName().equals(bucketName));
+
+    String keyc1 = "/a/b1/c1/c1.tx";
+    String keyc2 = "/a/b1/c2/c2.tx";
+
+    String keyd13 = "/a/b2/d1/d11.tx";
+    String keyd21 = "/a/b2/d2/d21.tx";
+    String keyd22 = "/a/b2/d2/d22.tx";
+    String keyd31 = "/a/b2/d3/d31.tx";
+
+    String keye11 = "/a/b3/e1/e11.tx";
+    String keye21 = "/a/b3/e2/e21.tx";
+    String keye31 = "/a/b3/e3/e31.tx";
+
+    LinkedList<String> keys = new LinkedList<>();
+    keys.add(keyc1);
+    keys.add(keyc2);
+
+    keys.add(keyd13);
+    keys.add(keyd21);
+    keys.add(keyd22);
+    keys.add(keyd31);
+
+    keys.add(keye11);
+    keys.add(keye21);
+    keys.add(keye31);
+
+    int length = 10;
+    byte[] input = new byte[length];
+    Arrays.fill(input, (byte)96);
+
+    createKeys(ozoneBucket, keys);
+
+    // Root level listing keys
+    Iterator<? extends OzoneKey> ozoneKeyIterator =
+        ozoneBucket.listKeys(null, null);
+    verifyFullTreeStructure(ozoneKeyIterator);
+
+    ozoneKeyIterator =
+        ozoneBucket.listKeys(null, null);
+    verifyFullTreeStructure(ozoneKeyIterator);

Review comment:
       Duplicated two lines.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r581604059



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,273 @@ public OzoneKey next() {
      * @param prevKey
      * @return {@code List<OzoneKey>}
      */
-    private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws
         IOException {
       return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
           listCacheSize);
     }
   }
+
+
+  /**
+   * An Iterator to iterate over {@link OzoneKey} list.
+     *               buck-1
+     *                |
+     *                a
+     *                |
+     *      ---------------------------
+     *     |           |               |
+     *     b1          b2              b3
+     *   -----       --------        ----------
+     *   |    |      |    |   |     |    |     |
+     *  c1   c2     d1   d2  d3     e1   e2   e3
+     *                   |          |
+     *                d21.txt    e11.txt
+   *
+   * Will do Depth-First-Traversal and visit node in this fashion:
+   *
+   * a -> b1 -> c1 -> c2 -> b2 -> d1 -> d2 -> d21.txt -> d3 -> b3 -> e1 ->
+   * e11.txt -> e2 -> e3
+   *
+   * Note: there is no order guarantee.
+   */
+  private class KeyIteratorV1 extends KeyIterator{
+
+    private Stack<String> stack;
+    private List<OzoneKey> pendingItemsToBeBatched;
+    private boolean addedKeyPrefix;
+
+    /**
+     * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+     * If prevKey is null it iterates from the first key in the bucket.
+     * The returned keys match key prefix.
+     *
+     * @param keyPrefix
+     * @param prevKey
+     */
+    KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+      super(keyPrefix, prevKey);
+      addedKeyPrefix = true;
+    }
+
+    @Override
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+      if (stack == null) {
+        stack = new Stack();
+        pendingItemsToBeBatched = new ArrayList<>();
+      }
+
+      // normalize paths
+      if (!addedKeyPrefix) {
+        prevKey = OmUtils.normalizeKey(prevKey, true);
+        String keyPrefixName = "";
+        if (StringUtils.isNotBlank(getKeyPrefix())) {
+          keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+        }
+        setKeyprefix(keyPrefixName);
+      }
+
+      // Get immediate children
+      List<OzoneKey> keysResultList = new ArrayList<>();
+      listChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+      // TODO: Back and Forth seek all the files & dirs, starting from
+      //  startKey till keyPrefix.
+
+      return keysResultList;
+    }
+
+    /**
+     * List children under the given keyPrefix path. This doesn't guarantee
+     * order. Internally, it does recursive #listStatus calls to list all the
+     * sub-keysResultList.

Review comment:
       Thanks a lot @linyiqun for the reviews and your comments greatly help to push this feature fast. 
   
   **[FileSystem#listStatus](https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1890)** its not guaranteeing sorted order. `ozone#listStatus` user final status list is also not maintaining sorted order. 
   
   We are sorting files and dirs list [separately in OM KeyManagerImpl](https://github.com/apache/ozone/blob/HDDS-2939/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L2353), this is done for batch-wise key iteration. OM maintains only seek order to avoid infinite looping - first it will seek files(in Filetable) and then dirs(in DirTable). If the `startKey` is a directory that means, all the files has been fetched in previous iteration and it can stop seeking after fetching all the dirs from DirTable.
   
   
   `Ozone#listKeys `also not expects sorted order and existing javadoc didn't have anything specific about the ordering. But user can experience a sorted list in V0 code because RocksDB maintains order. So, I thought of explicitly mention about the order part in V1 code base.
   
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582577737



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestObjectStoreV1.java
##########
@@ -275,6 +285,246 @@ public void testLookupKey() throws Exception {
             dirPathC.getObjectID(), true);
   }
 
+  /**
+   * Verify listKeys at different levels.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |              |                       |
+   *     b1             b2                      b3
+   *    -----           --------               ----------
+   *   |      |        |    |   |             |    |     |
+   *  c1     c2        d1   d2  d3             e1   e2   e3
+   *  |      |         |    |   |              |    |    |
+   * c1.tx  c2.tx   d11.tx  | d31.tx           |    |    e31.tx
+   *                      --------             |   e21.tx
+   *                     |        |            |
+   *                   d21.tx   d22.tx        e11.tx
+   *
+   * Above is the FS tree structure.
+   */
+  @Test
+  public void testListKeysAtDifferentLevels() throws Exception {
+    OzoneClient client = cluster.getClient();
+
+    ObjectStore objectStore = client.getObjectStore();
+    OzoneVolume ozoneVolume = objectStore.getVolume(volumeName);
+    Assert.assertTrue(ozoneVolume.getName().equals(volumeName));
+    OzoneBucket ozoneBucket = ozoneVolume.getBucket(bucketName);
+    Assert.assertTrue(ozoneBucket.getName().equals(bucketName));
+
+    String keyc1 = "/a/b1/c1/c1.tx";
+    String keyc2 = "/a/b1/c2/c2.tx";
+
+    String keyd13 = "/a/b2/d1/d11.tx";
+    String keyd21 = "/a/b2/d2/d21.tx";
+    String keyd22 = "/a/b2/d2/d22.tx";
+    String keyd31 = "/a/b2/d3/d31.tx";
+
+    String keye11 = "/a/b3/e1/e11.tx";
+    String keye21 = "/a/b3/e2/e21.tx";
+    String keye31 = "/a/b3/e3/e31.tx";
+
+    LinkedList<String> keys = new LinkedList<>();
+    keys.add(keyc1);
+    keys.add(keyc2);
+
+    keys.add(keyd13);
+    keys.add(keyd21);
+    keys.add(keyd22);
+    keys.add(keyd31);
+
+    keys.add(keye11);
+    keys.add(keye21);
+    keys.add(keye31);
+
+    int length = 10;
+    byte[] input = new byte[length];
+    Arrays.fill(input, (byte)96);
+
+    createKeys(ozoneBucket, keys);
+
+    // Root level listing keys
+    Iterator<? extends OzoneKey> ozoneKeyIterator =
+        ozoneBucket.listKeys("/a", null);
+
+    LinkedList<String> expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/");
+    expectedKeys.add("a/b1/");
+    expectedKeys.add("a/b2/");
+    expectedKeys.add("a/b3/");
+    expectedKeys.add("a/b1/c1/");
+    expectedKeys.add("a/b1/c2/");
+    expectedKeys.add("a/b1/c1/c1.tx");
+    expectedKeys.add("a/b1/c2/c2.tx");
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d2/");
+    expectedKeys.add("a/b2/d3/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    expectedKeys.add("a/b2/d2/d21.tx");
+    expectedKeys.add("a/b2/d2/d22.tx");
+    expectedKeys.add("a/b2/d3/d31.tx");
+    expectedKeys.add("a/b3/e1/");
+    expectedKeys.add("a/b3/e2/");
+    expectedKeys.add("a/b3/e3/");
+    expectedKeys.add("a/b3/e1/e11.tx");
+    expectedKeys.add("a/b3/e2/e21.tx");
+    expectedKeys.add("a/b3/e3/e31.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Intermediate level keyPrefix - 2nd level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a///b2///", null);
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/");
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d2/");
+    expectedKeys.add("a/b2/d3/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    expectedKeys.add("a/b2/d2/d21.tx");
+    expectedKeys.add("a/b2/d2/d22.tx");
+    expectedKeys.add("a/b2/d3/d31.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Intermediate level keyPrefix - 3rd level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b2/d1", null);
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Boundary of a level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b2/d2", "a/b2/d2/d21.tx");
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/d2/d22.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Boundary case - last node in the depth-first-traversal
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b3/e3", "a/b3/e3/e31.tx");
+    expectedKeys = new LinkedList<>();
+    checkKeyList(ozoneKeyIterator, expectedKeys);

Review comment:
       Thanks for bringing this case. I'm adding UTs for below case-1 and case-2. But **case-3** requires HDDS-4859 logic.
   Case-1)   
               ` Iterator<? extends OzoneKey> ozoneKeyIterator = ozoneBucket.listKeys(null, null);`
   
   Case-2) 
      ` Iterator<? extends OzoneKey> ozoneKeyIterator = ozoneBucket.listKeys("a/", null);`
        
   Case-3) I've verified this in master and the expected list would be like:
    ```
    ozoneKeyIterator = ozoneBucket.listKeys(null, "a/b2/d2/d21.tx");
   
    ExpectedList [a/b2/d2/d22.tx, a/b2/d3/, a/b2/d3/d31.tx, a/b3/, a/b3/e1/, a/b3/e1/e11.tx, a/b3/e2/, a/b3/e2/e21.tx, a/b3/e3/, a/b3/e3/e31.tx]
   ```
   HDDS-4859 is raised to handle the startKey cases. I will comment in that jira to take care this case.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582883736



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,286 @@ public OzoneKey next() {
      * @param prevKey
      * @return {@code List<OzoneKey>}
      */
-    private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws
         IOException {
       return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
           listCacheSize);
     }
   }
+
+
+  /**
+   * An Iterator to iterate over {@link OzoneKey} list.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |           |                       |
+   *     b1          b2                      b3
+   *   -----       --------               ----------
+   *   |    |      |    |   |             |    |     |
+   *  c1   c2     d1   d2  d3             e1   e2   e3
+   *                   |                  |
+   *               --------               |
+   *              |        |              |
+   *           d21.txt   d22.txt        e11.txt
+   *
+   * Say, keyPrefix="a" and prevKey="", then will do Depth-First-Traversal and
+   * visit node to getChildren in below fashion:-
+   * 1. getChildren("a/")  2. getChildren("a/b1")  3. getChildren("a/b1/c1")
+   * 4. getChildren("a/b1/c2")  5. getChildren("a/b2/d1")
+   * 6. getChildren("a/b2/d2")  7. getChildren("a/b2/d3")
+   * 8. getChildren("a/b3/e1")  9. getChildren("a/b3/e2")
+   * 10. getChildren("a/b3/e3")
+   *
+   * Note: Does not guarantee to return the list of keys in a sorted order.
+   */
+  private class KeyIteratorV1 extends KeyIterator{
+
+    private Stack<String> stack;
+    private List<OzoneKey> pendingItemsToBeBatched;
+    private boolean addedKeyPrefix;
+
+    /**
+     * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+     * If prevKey is null it iterates from the first key in the bucket.
+     * The returned keys match key prefix.
+     *
+     * @param keyPrefix
+     * @param prevKey
+     */
+    KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+      super(keyPrefix, prevKey);
+    }
+
+    @Override
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+      if (stack == null) {
+        stack = new Stack();
+        pendingItemsToBeBatched = new ArrayList<>();
+      }
+
+      // normalize paths
+      if (!addedKeyPrefix) {
+        prevKey = OmUtils.normalizeKey(prevKey, true);
+        String keyPrefixName = "";
+        if (StringUtils.isNotBlank(getKeyPrefix())) {
+          keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+        }
+        setKeyprefix(keyPrefixName);
+      }
+
+      // Get immediate children
+      List<OzoneKey> keysResultList = new ArrayList<>();
+      getChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+      // TODO: Back and Forth seek all the files & dirs, starting from
+      //  startKey till keyPrefix.
+
+      return keysResultList;
+    }
+
+    /**
+     * List children under the given keyPrefix and startKey path. It does
+     * recursive #listStatus calls to list all the sub-keys resultList.
+     *
+     *                  buck-1
+     *                    |
+     *                    a
+     *                    |
+     *      -----------------------------------
+     *     |           |                       |
+     *     b1          b2                      b3
+     *   -----       --------               ----------
+     *   |    |      |    |   |             |    |     |
+     *  c1   c2     d1   d2  d3             e1   e2   e3
+     *                   |                  |
+     *               --------               |
+     *              |        |              |
+     *           d21.txt   d22.txt        e11.txt
+     *
+     * Say, KeyPrefix = "a" and startKey = null;
+     *
+     * Iteration-1) RPC call proxy#listStatus("a").
+     *              Add b3, b2 and b1 to stack.
+     * Iteration-2) pop b1 and do RPC call proxy#listStatus("b1")
+     *              Add c2, c1 to stack.
+     * Iteration-3) pop c1 and do RPC call proxy#listStatus("c1"). Empty list.
+     * Iteration-4) pop c2 and do RPC call proxy#listStatus("c2"). Empty list.
+     * Iteration-5) pop b2 and do RPC call proxy#listStatus("b2")
+     *              Add d3, d2 and d1 to stack.
+     *              ..........
+     *              ..........
+     * Iteration-n) pop e3 and do RPC call proxy#listStatus("e3")
+     *              Reached end of the FS tree.
+     *
+     * @param keyPrefix
+     * @param startKey
+     * @param keysResultList
+     * @return true represents it reached limit batch size, false otherwise.
+     * @throws IOException
+     */
+    private boolean getChildrenKeys(String keyPrefix, String startKey,
+        List<OzoneKey> keysResultList) throws IOException {
+
+      // listStatus API expects a not null 'startKey' value
+      startKey = startKey == null ? "" : startKey;
+
+      // 1. Add pending items to the user key resultList
+      if (addAllPendingItemsToResultList(keysResultList)) {
+        // reached limit batch size.
+        return true;
+      }
+
+      // 2. Get immediate children of keyPrefix, starting with startKey
+      List<OzoneFileStatus> statuses = proxy.listStatus(volumeName, name,
+              keyPrefix, false, startKey, listCacheSize);
+
+      // 3. Special case: ListKey expects keyPrefix element should present in
+      // the resultList, only if startKey is blank. If startKey is not blank
+      // then resultList shouldn't contain the startKey element.
+      // Since proxy#listStatus API won't return keyPrefix element in the
+      // resultList. So, this is to add user given keyPrefix to the return list.
+      addKeyPrefixInfoToResultList(keyPrefix, startKey, keysResultList);
+
+      // 4. Special case: ListKey expects startKey shouldn't present in the
+      // resultList. Since proxy#listStatus API returns startKey element to
+      // the returnList, this function is to remove the startKey element.
+      removeStartKeyIfExistsInStatusList(startKey, statuses);
+
+      boolean reachedLimitCacheSize = false;
+      // This dirList is used to store paths elements in left-to-right order.
+      List<String> dirList = new ArrayList<>();
+
+      // 5. Iterating over the resultStatuses list and add each key to the
+      // resultList. If the listCacheSize reaches then it will add the rest
+      // of the statuses to pendingItemsToBeBatched
+      for (int indx = 0; indx < statuses.size(); indx++) {
+        OzoneFileStatus status = statuses.get(indx);
+        OmKeyInfo keyInfo = status.getKeyInfo();
+        String keyName = keyInfo.getKeyName();
+
+        // Add dir to the dirList
+        if (status.isDirectory()) {
+          dirList.add(keyInfo.getKeyName());
+          // add trailing slash to represent directory
+          keyName = OzoneFSUtils.addTrailingSlashIfNeeded(keyName);
+        }
+
+        OzoneKey ozoneKey = new OzoneKey(keyInfo.getVolumeName(),
+                keyInfo.getBucketName(), keyName,
+                keyInfo.getDataSize(), keyInfo.getCreationTime(),
+                keyInfo.getModificationTime(),
+                ReplicationType.valueOf(keyInfo.getType().toString()),
+                keyInfo.getFactor().getNumber());
+
+        // 5.1) Add to the resultList till it reaches limit batch size.
+        // Once it reaches limit, then add rest of the items to
+        // pendingItemsToBeBatched and this will picked in next batch iteration
+        if (!reachedLimitCacheSize && listCacheSize > keysResultList.size()) {
+          keysResultList.add(ozoneKey);
+          reachedLimitCacheSize = listCacheSize <= keysResultList.size();
+        } else {
+          pendingItemsToBeBatched.add(ozoneKey);
+        }
+      }
+
+      // 6. Push elements in reverse order so that the FS tree traversal will
+      // occur in left-to-right fashion.
+      for (int indx = dirList.size() - 1; indx >= 0; indx--) {
+        String dirPathComponent = dirList.get(indx);
+        stack.push(dirPathComponent);
+      }
+
+      if (reachedLimitCacheSize) {
+        return true;
+      }
+
+      // 7. Pop element and seek for its sub-child path(s). Basically moving
+      // seek pointer to next level(depth) in FS tree.
+      while (!stack.isEmpty()) {
+        keyPrefix = stack.pop();
+        if (getChildrenKeys(keyPrefix, "", keysResultList)) {
+          // reached limit batch size.
+          return true;
+        }
+      }
+
+      return false;
+    }
+
+    private void removeStartKeyIfExistsInStatusList(String startKey,
+        List<OzoneFileStatus> statuses) {
+
+      if (StringUtils.isNotBlank(startKey) && !statuses.isEmpty()) {
+        String startKeyPath = startKey;
+        if (startKey.endsWith(OZONE_URI_DELIMITER)) {
+          startKeyPath = OzoneFSUtils.removeTrailingSlashIfNeeded(startKey);
+        }
+        if (StringUtils.equals(statuses.get(0).getKeyInfo().getKeyName(),
+                startKeyPath)) {
+          // remove the duplicateKey from the list.
+          statuses.remove(0);
+        }
+      }
+    }
+
+    private boolean addAllPendingItemsToResultList(List<OzoneKey> keys) {
+
+      Iterator<OzoneKey> ozoneKeyItr = pendingItemsToBeBatched.iterator();
+      while (ozoneKeyItr.hasNext()) {
+        if (listCacheSize <= keys.size()) {
+          // reached limit batch size.
+          return true;
+        }
+        keys.add(ozoneKeyItr.next());
+        ozoneKeyItr.remove();
+      }
+      return false;
+    }
+
+    private void addKeyPrefixInfoToResultList(String keyPrefix,
+        String startKey, List<OzoneKey> keysResultList) throws IOException {
+
+      if (addedKeyPrefix) {
+        return;
+      }
+
+      // setting flag to true.
+      addedKeyPrefix = true;
+
+      // not required to addKeyPrefix
+      // case-1) if keyPrefix is null or empty
+      // case-2) if startKey is not null or empty
+      if (StringUtils.isBlank(keyPrefix) || StringUtils.isNotBlank(startKey)) {
+        return;
+      }
+
+      // TODO: Handle a case, where startKey not started with keyPrefix.

Review comment:
       Not that jira. Since listKeys internally using listStatus, HDDS-4364 will take care both listKeys & listStatus.
   I will include jira ID in TODO.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582889400



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,286 @@ public OzoneKey next() {
      * @param prevKey
      * @return {@code List<OzoneKey>}
      */
-    private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws
         IOException {
       return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
           listCacheSize);
     }
   }
+
+
+  /**
+   * An Iterator to iterate over {@link OzoneKey} list.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |           |                       |
+   *     b1          b2                      b3
+   *   -----       --------               ----------
+   *   |    |      |    |   |             |    |     |
+   *  c1   c2     d1   d2  d3             e1   e2   e3
+   *                   |                  |
+   *               --------               |
+   *              |        |              |
+   *           d21.txt   d22.txt        e11.txt
+   *
+   * Say, keyPrefix="a" and prevKey="", then will do Depth-First-Traversal and
+   * visit node to getChildren in below fashion:-
+   * 1. getChildren("a/")  2. getChildren("a/b1")  3. getChildren("a/b1/c1")
+   * 4. getChildren("a/b1/c2")  5. getChildren("a/b2/d1")
+   * 6. getChildren("a/b2/d2")  7. getChildren("a/b2/d3")
+   * 8. getChildren("a/b3/e1")  9. getChildren("a/b3/e2")
+   * 10. getChildren("a/b3/e3")
+   *
+   * Note: Does not guarantee to return the list of keys in a sorted order.
+   */
+  private class KeyIteratorV1 extends KeyIterator{
+
+    private Stack<String> stack;
+    private List<OzoneKey> pendingItemsToBeBatched;
+    private boolean addedKeyPrefix;
+
+    /**
+     * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+     * If prevKey is null it iterates from the first key in the bucket.
+     * The returned keys match key prefix.
+     *
+     * @param keyPrefix
+     * @param prevKey
+     */
+    KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+      super(keyPrefix, prevKey);
+    }
+
+    @Override
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+      if (stack == null) {
+        stack = new Stack();
+        pendingItemsToBeBatched = new ArrayList<>();
+      }
+
+      // normalize paths
+      if (!addedKeyPrefix) {
+        prevKey = OmUtils.normalizeKey(prevKey, true);
+        String keyPrefixName = "";
+        if (StringUtils.isNotBlank(getKeyPrefix())) {
+          keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+        }
+        setKeyprefix(keyPrefixName);
+      }
+
+      // Get immediate children
+      List<OzoneKey> keysResultList = new ArrayList<>();
+      getChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+      // TODO: Back and Forth seek all the files & dirs, starting from
+      //  startKey till keyPrefix.
+
+      return keysResultList;
+    }
+
+    /**
+     * List children under the given keyPrefix and startKey path. It does
+     * recursive #listStatus calls to list all the sub-keys resultList.
+     *
+     *                  buck-1
+     *                    |
+     *                    a
+     *                    |
+     *      -----------------------------------
+     *     |           |                       |
+     *     b1          b2                      b3
+     *   -----       --------               ----------
+     *   |    |      |    |   |             |    |     |
+     *  c1   c2     d1   d2  d3             e1   e2   e3
+     *                   |                  |
+     *               --------               |
+     *              |        |              |
+     *           d21.txt   d22.txt        e11.txt
+     *
+     * Say, KeyPrefix = "a" and startKey = null;
+     *
+     * Iteration-1) RPC call proxy#listStatus("a").
+     *              Add b3, b2 and b1 to stack.
+     * Iteration-2) pop b1 and do RPC call proxy#listStatus("b1")
+     *              Add c2, c1 to stack.
+     * Iteration-3) pop c1 and do RPC call proxy#listStatus("c1"). Empty list.
+     * Iteration-4) pop c2 and do RPC call proxy#listStatus("c2"). Empty list.
+     * Iteration-5) pop b2 and do RPC call proxy#listStatus("b2")
+     *              Add d3, d2 and d1 to stack.
+     *              ..........
+     *              ..........
+     * Iteration-n) pop e3 and do RPC call proxy#listStatus("e3")
+     *              Reached end of the FS tree.
+     *
+     * @param keyPrefix
+     * @param startKey
+     * @param keysResultList
+     * @return true represents it reached limit batch size, false otherwise.
+     * @throws IOException
+     */
+    private boolean getChildrenKeys(String keyPrefix, String startKey,
+        List<OzoneKey> keysResultList) throws IOException {
+
+      // listStatus API expects a not null 'startKey' value
+      startKey = startKey == null ? "" : startKey;
+
+      // 1. Add pending items to the user key resultList
+      if (addAllPendingItemsToResultList(keysResultList)) {
+        // reached limit batch size.
+        return true;
+      }
+
+      // 2. Get immediate children of keyPrefix, starting with startKey
+      List<OzoneFileStatus> statuses = proxy.listStatus(volumeName, name,
+              keyPrefix, false, startKey, listCacheSize);
+
+      // 3. Special case: ListKey expects keyPrefix element should present in
+      // the resultList, only if startKey is blank. If startKey is not blank
+      // then resultList shouldn't contain the startKey element.
+      // Since proxy#listStatus API won't return keyPrefix element in the
+      // resultList. So, this is to add user given keyPrefix to the return list.
+      addKeyPrefixInfoToResultList(keyPrefix, startKey, keysResultList);
+
+      // 4. Special case: ListKey expects startKey shouldn't present in the
+      // resultList. Since proxy#listStatus API returns startKey element to
+      // the returnList, this function is to remove the startKey element.
+      removeStartKeyIfExistsInStatusList(startKey, statuses);
+
+      boolean reachedLimitCacheSize = false;
+      // This dirList is used to store paths elements in left-to-right order.
+      List<String> dirList = new ArrayList<>();
+
+      // 5. Iterating over the resultStatuses list and add each key to the
+      // resultList. If the listCacheSize reaches then it will add the rest
+      // of the statuses to pendingItemsToBeBatched
+      for (int indx = 0; indx < statuses.size(); indx++) {
+        OzoneFileStatus status = statuses.get(indx);
+        OmKeyInfo keyInfo = status.getKeyInfo();
+        String keyName = keyInfo.getKeyName();
+
+        // Add dir to the dirList
+        if (status.isDirectory()) {
+          dirList.add(keyInfo.getKeyName());
+          // add trailing slash to represent directory
+          keyName = OzoneFSUtils.addTrailingSlashIfNeeded(keyName);
+        }
+
+        OzoneKey ozoneKey = new OzoneKey(keyInfo.getVolumeName(),
+                keyInfo.getBucketName(), keyName,
+                keyInfo.getDataSize(), keyInfo.getCreationTime(),
+                keyInfo.getModificationTime(),
+                ReplicationType.valueOf(keyInfo.getType().toString()),
+                keyInfo.getFactor().getNumber());
+
+        // 5.1) Add to the resultList till it reaches limit batch size.
+        // Once it reaches limit, then add rest of the items to
+        // pendingItemsToBeBatched and this will picked in next batch iteration
+        if (!reachedLimitCacheSize && listCacheSize > keysResultList.size()) {
+          keysResultList.add(ozoneKey);
+          reachedLimitCacheSize = listCacheSize <= keysResultList.size();
+        } else {
+          pendingItemsToBeBatched.add(ozoneKey);
+        }
+      }
+
+      // 6. Push elements in reverse order so that the FS tree traversal will
+      // occur in left-to-right fashion.
+      for (int indx = dirList.size() - 1; indx >= 0; indx--) {
+        String dirPathComponent = dirList.get(indx);
+        stack.push(dirPathComponent);
+      }
+
+      if (reachedLimitCacheSize) {
+        return true;
+      }
+
+      // 7. Pop element and seek for its sub-child path(s). Basically moving
+      // seek pointer to next level(depth) in FS tree.
+      while (!stack.isEmpty()) {
+        keyPrefix = stack.pop();
+        if (getChildrenKeys(keyPrefix, "", keysResultList)) {
+          // reached limit batch size.
+          return true;
+        }
+      }
+
+      return false;
+    }
+
+    private void removeStartKeyIfExistsInStatusList(String startKey,
+        List<OzoneFileStatus> statuses) {
+
+      if (StringUtils.isNotBlank(startKey) && !statuses.isEmpty()) {
+        String startKeyPath = startKey;
+        if (startKey.endsWith(OZONE_URI_DELIMITER)) {
+          startKeyPath = OzoneFSUtils.removeTrailingSlashIfNeeded(startKey);
+        }
+        if (StringUtils.equals(statuses.get(0).getKeyInfo().getKeyName(),
+                startKeyPath)) {
+          // remove the duplicateKey from the list.
+          statuses.remove(0);
+        }
+      }
+    }
+
+    private boolean addAllPendingItemsToResultList(List<OzoneKey> keys) {
+
+      Iterator<OzoneKey> ozoneKeyItr = pendingItemsToBeBatched.iterator();
+      while (ozoneKeyItr.hasNext()) {
+        if (listCacheSize <= keys.size()) {
+          // reached limit batch size.
+          return true;
+        }
+        keys.add(ozoneKeyItr.next());
+        ozoneKeyItr.remove();
+      }
+      return false;
+    }
+
+    private void addKeyPrefixInfoToResultList(String keyPrefix,
+        String startKey, List<OzoneKey> keysResultList) throws IOException {
+
+      if (addedKeyPrefix) {
+        return;
+      }
+
+      // setting flag to true.
+      addedKeyPrefix = true;
+
+      // not required to addKeyPrefix
+      // case-1) if keyPrefix is null or empty
+      // case-2) if startKey is not null or empty
+      if (StringUtils.isBlank(keyPrefix) || StringUtils.isNotBlank(startKey)) {
+        return;
+      }
+
+      // TODO: Handle a case, where startKey not started with keyPrefix.

Review comment:
       Sorry for the confusion. Yes, you are right will consider this case along with HDDS-4859. I will add a comment. There we can focus on the startKey cases.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582926829



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestObjectStoreV1.java
##########
@@ -275,6 +285,259 @@ public void testLookupKey() throws Exception {
             dirPathC.getObjectID(), true);
   }
 
+  /**
+   * Verify listKeys at different levels.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |              |                       |
+   *     b1             b2                      b3
+   *    -----           --------               ----------
+   *   |      |        |    |   |             |    |     |
+   *  c1     c2        d1   d2  d3             e1   e2   e3
+   *  |      |         |    |   |              |    |    |
+   * c1.tx  c2.tx   d11.tx  | d31.tx           |    |    e31.tx
+   *                      --------             |   e21.tx
+   *                     |        |            |
+   *                   d21.tx   d22.tx        e11.tx
+   *
+   * Above is the FS tree structure.
+   */
+  @Test
+  public void testListKeysAtDifferentLevels() throws Exception {
+    OzoneClient client = cluster.getClient();
+
+    ObjectStore objectStore = client.getObjectStore();
+    OzoneVolume ozoneVolume = objectStore.getVolume(volumeName);
+    Assert.assertTrue(ozoneVolume.getName().equals(volumeName));
+    OzoneBucket ozoneBucket = ozoneVolume.getBucket(bucketName);
+    Assert.assertTrue(ozoneBucket.getName().equals(bucketName));
+
+    String keyc1 = "/a/b1/c1/c1.tx";
+    String keyc2 = "/a/b1/c2/c2.tx";
+
+    String keyd13 = "/a/b2/d1/d11.tx";
+    String keyd21 = "/a/b2/d2/d21.tx";
+    String keyd22 = "/a/b2/d2/d22.tx";
+    String keyd31 = "/a/b2/d3/d31.tx";
+
+    String keye11 = "/a/b3/e1/e11.tx";
+    String keye21 = "/a/b3/e2/e21.tx";
+    String keye31 = "/a/b3/e3/e31.tx";
+
+    LinkedList<String> keys = new LinkedList<>();
+    keys.add(keyc1);
+    keys.add(keyc2);
+
+    keys.add(keyd13);
+    keys.add(keyd21);
+    keys.add(keyd22);
+    keys.add(keyd31);
+
+    keys.add(keye11);
+    keys.add(keye21);
+    keys.add(keye31);
+
+    int length = 10;
+    byte[] input = new byte[length];
+    Arrays.fill(input, (byte)96);
+
+    createKeys(ozoneBucket, keys);
+
+    // Root level listing keys
+    Iterator<? extends OzoneKey> ozoneKeyIterator =
+        ozoneBucket.listKeys(null, null);
+    verifyFullTreeStructure(ozoneKeyIterator);
+
+    ozoneKeyIterator =
+        ozoneBucket.listKeys(null, null);
+    verifyFullTreeStructure(ozoneKeyIterator);

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582577737



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestObjectStoreV1.java
##########
@@ -275,6 +285,246 @@ public void testLookupKey() throws Exception {
             dirPathC.getObjectID(), true);
   }
 
+  /**
+   * Verify listKeys at different levels.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |              |                       |
+   *     b1             b2                      b3
+   *    -----           --------               ----------
+   *   |      |        |    |   |             |    |     |
+   *  c1     c2        d1   d2  d3             e1   e2   e3
+   *  |      |         |    |   |              |    |    |
+   * c1.tx  c2.tx   d11.tx  | d31.tx           |    |    e31.tx
+   *                      --------             |   e21.tx
+   *                     |        |            |
+   *                   d21.tx   d22.tx        e11.tx
+   *
+   * Above is the FS tree structure.
+   */
+  @Test
+  public void testListKeysAtDifferentLevels() throws Exception {
+    OzoneClient client = cluster.getClient();
+
+    ObjectStore objectStore = client.getObjectStore();
+    OzoneVolume ozoneVolume = objectStore.getVolume(volumeName);
+    Assert.assertTrue(ozoneVolume.getName().equals(volumeName));
+    OzoneBucket ozoneBucket = ozoneVolume.getBucket(bucketName);
+    Assert.assertTrue(ozoneBucket.getName().equals(bucketName));
+
+    String keyc1 = "/a/b1/c1/c1.tx";
+    String keyc2 = "/a/b1/c2/c2.tx";
+
+    String keyd13 = "/a/b2/d1/d11.tx";
+    String keyd21 = "/a/b2/d2/d21.tx";
+    String keyd22 = "/a/b2/d2/d22.tx";
+    String keyd31 = "/a/b2/d3/d31.tx";
+
+    String keye11 = "/a/b3/e1/e11.tx";
+    String keye21 = "/a/b3/e2/e21.tx";
+    String keye31 = "/a/b3/e3/e31.tx";
+
+    LinkedList<String> keys = new LinkedList<>();
+    keys.add(keyc1);
+    keys.add(keyc2);
+
+    keys.add(keyd13);
+    keys.add(keyd21);
+    keys.add(keyd22);
+    keys.add(keyd31);
+
+    keys.add(keye11);
+    keys.add(keye21);
+    keys.add(keye31);
+
+    int length = 10;
+    byte[] input = new byte[length];
+    Arrays.fill(input, (byte)96);
+
+    createKeys(ozoneBucket, keys);
+
+    // Root level listing keys
+    Iterator<? extends OzoneKey> ozoneKeyIterator =
+        ozoneBucket.listKeys("/a", null);
+
+    LinkedList<String> expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/");
+    expectedKeys.add("a/b1/");
+    expectedKeys.add("a/b2/");
+    expectedKeys.add("a/b3/");
+    expectedKeys.add("a/b1/c1/");
+    expectedKeys.add("a/b1/c2/");
+    expectedKeys.add("a/b1/c1/c1.tx");
+    expectedKeys.add("a/b1/c2/c2.tx");
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d2/");
+    expectedKeys.add("a/b2/d3/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    expectedKeys.add("a/b2/d2/d21.tx");
+    expectedKeys.add("a/b2/d2/d22.tx");
+    expectedKeys.add("a/b2/d3/d31.tx");
+    expectedKeys.add("a/b3/e1/");
+    expectedKeys.add("a/b3/e2/");
+    expectedKeys.add("a/b3/e3/");
+    expectedKeys.add("a/b3/e1/e11.tx");
+    expectedKeys.add("a/b3/e2/e21.tx");
+    expectedKeys.add("a/b3/e3/e31.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Intermediate level keyPrefix - 2nd level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a///b2///", null);
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/");
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d2/");
+    expectedKeys.add("a/b2/d3/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    expectedKeys.add("a/b2/d2/d21.tx");
+    expectedKeys.add("a/b2/d2/d22.tx");
+    expectedKeys.add("a/b2/d3/d31.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Intermediate level keyPrefix - 3rd level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b2/d1", null);
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Boundary of a level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b2/d2", "a/b2/d2/d21.tx");
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/d2/d22.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Boundary case - last node in the depth-first-traversal
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b3/e3", "a/b3/e3/e31.tx");
+    expectedKeys = new LinkedList<>();
+    checkKeyList(ozoneKeyIterator, expectedKeys);

Review comment:
       Thanks for bringing this case. I've added UTs for case-1 and case-2. But case-3 requires HDDS-4859.
   Case-1)   
               ` Iterator<? extends OzoneKey> ozoneKeyIterator = ozoneBucket.listKeys(null, null);`
   
   Case-2) 
      ` Iterator<? extends OzoneKey> ozoneKeyIterator = ozoneBucket.listKeys("a/", null);`
        
   Case-3) I've verified this in master and the expected list would be like:
    ```
    ozoneKeyIterator = ozoneBucket.listKeys(null, "a/b2/d2/d21.tx");
   
    ExpectedList [a/b2/d2/d22.tx, a/b2/d3/, a/b2/d3/d31.tx, a/b3/, a/b3/e1/, a/b3/e1/e11.tx, a/b3/e2/, a/b3/e2/e21.tx, a/b3/e3/, a/b3/e3/e31.tx]
   ```
   HDDS-4859 is raised to handle the startKey cases. I will comment in that jira to take care this case.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582880047



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,286 @@ public OzoneKey next() {
      * @param prevKey
      * @return {@code List<OzoneKey>}
      */
-    private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws
         IOException {
       return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
           listCacheSize);
     }
   }
+
+
+  /**
+   * An Iterator to iterate over {@link OzoneKey} list.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |           |                       |
+   *     b1          b2                      b3
+   *   -----       --------               ----------
+   *   |    |      |    |   |             |    |     |
+   *  c1   c2     d1   d2  d3             e1   e2   e3
+   *                   |                  |
+   *               --------               |
+   *              |        |              |
+   *           d21.txt   d22.txt        e11.txt
+   *
+   * Say, keyPrefix="a" and prevKey="", then will do Depth-First-Traversal and
+   * visit node to getChildren in below fashion:-
+   * 1. getChildren("a/")  2. getChildren("a/b1")  3. getChildren("a/b1/c1")
+   * 4. getChildren("a/b1/c2")  5. getChildren("a/b2/d1")
+   * 6. getChildren("a/b2/d2")  7. getChildren("a/b2/d3")
+   * 8. getChildren("a/b3/e1")  9. getChildren("a/b3/e2")
+   * 10. getChildren("a/b3/e3")
+   *
+   * Note: Does not guarantee to return the list of keys in a sorted order.
+   */
+  private class KeyIteratorV1 extends KeyIterator{
+
+    private Stack<String> stack;
+    private List<OzoneKey> pendingItemsToBeBatched;
+    private boolean addedKeyPrefix;
+
+    /**
+     * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+     * If prevKey is null it iterates from the first key in the bucket.
+     * The returned keys match key prefix.
+     *
+     * @param keyPrefix
+     * @param prevKey
+     */
+    KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+      super(keyPrefix, prevKey);
+    }
+
+    @Override
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+      if (stack == null) {
+        stack = new Stack();
+        pendingItemsToBeBatched = new ArrayList<>();
+      }
+
+      // normalize paths
+      if (!addedKeyPrefix) {
+        prevKey = OmUtils.normalizeKey(prevKey, true);
+        String keyPrefixName = "";
+        if (StringUtils.isNotBlank(getKeyPrefix())) {
+          keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+        }
+        setKeyprefix(keyPrefixName);
+      }
+
+      // Get immediate children
+      List<OzoneKey> keysResultList = new ArrayList<>();
+      getChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+      // TODO: Back and Forth seek all the files & dirs, starting from
+      //  startKey till keyPrefix.
+
+      return keysResultList;
+    }
+
+    /**
+     * List children under the given keyPrefix and startKey path. It does
+     * recursive #listStatus calls to list all the sub-keys resultList.
+     *
+     *                  buck-1
+     *                    |
+     *                    a
+     *                    |
+     *      -----------------------------------
+     *     |           |                       |
+     *     b1          b2                      b3
+     *   -----       --------               ----------
+     *   |    |      |    |   |             |    |     |
+     *  c1   c2     d1   d2  d3             e1   e2   e3
+     *                   |                  |
+     *               --------               |
+     *              |        |              |
+     *           d21.txt   d22.txt        e11.txt
+     *
+     * Say, KeyPrefix = "a" and startKey = null;
+     *
+     * Iteration-1) RPC call proxy#listStatus("a").
+     *              Add b3, b2 and b1 to stack.
+     * Iteration-2) pop b1 and do RPC call proxy#listStatus("b1")
+     *              Add c2, c1 to stack.
+     * Iteration-3) pop c1 and do RPC call proxy#listStatus("c1"). Empty list.
+     * Iteration-4) pop c2 and do RPC call proxy#listStatus("c2"). Empty list.
+     * Iteration-5) pop b2 and do RPC call proxy#listStatus("b2")
+     *              Add d3, d2 and d1 to stack.
+     *              ..........
+     *              ..........
+     * Iteration-n) pop e3 and do RPC call proxy#listStatus("e3")
+     *              Reached end of the FS tree.
+     *
+     * @param keyPrefix
+     * @param startKey
+     * @param keysResultList
+     * @return true represents it reached limit batch size, false otherwise.
+     * @throws IOException
+     */
+    private boolean getChildrenKeys(String keyPrefix, String startKey,
+        List<OzoneKey> keysResultList) throws IOException {
+
+      // listStatus API expects a not null 'startKey' value
+      startKey = startKey == null ? "" : startKey;
+
+      // 1. Add pending items to the user key resultList
+      if (addAllPendingItemsToResultList(keysResultList)) {
+        // reached limit batch size.
+        return true;
+      }
+
+      // 2. Get immediate children of keyPrefix, starting with startKey
+      List<OzoneFileStatus> statuses = proxy.listStatus(volumeName, name,
+              keyPrefix, false, startKey, listCacheSize);
+
+      // 3. Special case: ListKey expects keyPrefix element should present in
+      // the resultList, only if startKey is blank. If startKey is not blank
+      // then resultList shouldn't contain the startKey element.
+      // Since proxy#listStatus API won't return keyPrefix element in the
+      // resultList. So, this is to add user given keyPrefix to the return list.
+      addKeyPrefixInfoToResultList(keyPrefix, startKey, keysResultList);
+
+      // 4. Special case: ListKey expects startKey shouldn't present in the
+      // resultList. Since proxy#listStatus API returns startKey element to
+      // the returnList, this function is to remove the startKey element.
+      removeStartKeyIfExistsInStatusList(startKey, statuses);
+
+      boolean reachedLimitCacheSize = false;
+      // This dirList is used to store paths elements in left-to-right order.
+      List<String> dirList = new ArrayList<>();
+
+      // 5. Iterating over the resultStatuses list and add each key to the
+      // resultList. If the listCacheSize reaches then it will add the rest
+      // of the statuses to pendingItemsToBeBatched
+      for (int indx = 0; indx < statuses.size(); indx++) {
+        OzoneFileStatus status = statuses.get(indx);
+        OmKeyInfo keyInfo = status.getKeyInfo();
+        String keyName = keyInfo.getKeyName();
+
+        // Add dir to the dirList
+        if (status.isDirectory()) {
+          dirList.add(keyInfo.getKeyName());
+          // add trailing slash to represent directory
+          keyName = OzoneFSUtils.addTrailingSlashIfNeeded(keyName);
+        }
+
+        OzoneKey ozoneKey = new OzoneKey(keyInfo.getVolumeName(),
+                keyInfo.getBucketName(), keyName,
+                keyInfo.getDataSize(), keyInfo.getCreationTime(),
+                keyInfo.getModificationTime(),
+                ReplicationType.valueOf(keyInfo.getType().toString()),
+                keyInfo.getFactor().getNumber());
+
+        // 5.1) Add to the resultList till it reaches limit batch size.
+        // Once it reaches limit, then add rest of the items to
+        // pendingItemsToBeBatched and this will picked in next batch iteration
+        if (!reachedLimitCacheSize && listCacheSize > keysResultList.size()) {
+          keysResultList.add(ozoneKey);
+          reachedLimitCacheSize = listCacheSize <= keysResultList.size();
+        } else {
+          pendingItemsToBeBatched.add(ozoneKey);
+        }
+      }
+
+      // 6. Push elements in reverse order so that the FS tree traversal will
+      // occur in left-to-right fashion.
+      for (int indx = dirList.size() - 1; indx >= 0; indx--) {
+        String dirPathComponent = dirList.get(indx);
+        stack.push(dirPathComponent);
+      }
+
+      if (reachedLimitCacheSize) {
+        return true;
+      }
+
+      // 7. Pop element and seek for its sub-child path(s). Basically moving
+      // seek pointer to next level(depth) in FS tree.
+      while (!stack.isEmpty()) {
+        keyPrefix = stack.pop();
+        if (getChildrenKeys(keyPrefix, "", keysResultList)) {
+          // reached limit batch size.
+          return true;
+        }
+      }
+
+      return false;
+    }
+
+    private void removeStartKeyIfExistsInStatusList(String startKey,
+        List<OzoneFileStatus> statuses) {
+
+      if (StringUtils.isNotBlank(startKey) && !statuses.isEmpty()) {
+        String startKeyPath = startKey;
+        if (startKey.endsWith(OZONE_URI_DELIMITER)) {
+          startKeyPath = OzoneFSUtils.removeTrailingSlashIfNeeded(startKey);
+        }
+        if (StringUtils.equals(statuses.get(0).getKeyInfo().getKeyName(),
+                startKeyPath)) {
+          // remove the duplicateKey from the list.
+          statuses.remove(0);
+        }
+      }
+    }
+
+    private boolean addAllPendingItemsToResultList(List<OzoneKey> keys) {
+
+      Iterator<OzoneKey> ozoneKeyItr = pendingItemsToBeBatched.iterator();
+      while (ozoneKeyItr.hasNext()) {
+        if (listCacheSize <= keys.size()) {
+          // reached limit batch size.
+          return true;
+        }
+        keys.add(ozoneKeyItr.next());
+        ozoneKeyItr.remove();
+      }
+      return false;
+    }
+
+    private void addKeyPrefixInfoToResultList(String keyPrefix,
+        String startKey, List<OzoneKey> keysResultList) throws IOException {
+
+      if (addedKeyPrefix) {
+        return;
+      }
+
+      // setting flag to true.
+      addedKeyPrefix = true;
+
+      // not required to addKeyPrefix
+      // case-1) if keyPrefix is null or empty
+      // case-2) if startKey is not null or empty
+      if (StringUtils.isBlank(keyPrefix) || StringUtils.isNotBlank(startKey)) {
+        return;
+      }
+
+      // TODO: Handle a case, where startKey not started with keyPrefix.

Review comment:
       Will we also addressed on this case in HDDS-4859?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582883736



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,286 @@ public OzoneKey next() {
      * @param prevKey
      * @return {@code List<OzoneKey>}
      */
-    private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws
         IOException {
       return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
           listCacheSize);
     }
   }
+
+
+  /**
+   * An Iterator to iterate over {@link OzoneKey} list.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |           |                       |
+   *     b1          b2                      b3
+   *   -----       --------               ----------
+   *   |    |      |    |   |             |    |     |
+   *  c1   c2     d1   d2  d3             e1   e2   e3
+   *                   |                  |
+   *               --------               |
+   *              |        |              |
+   *           d21.txt   d22.txt        e11.txt
+   *
+   * Say, keyPrefix="a" and prevKey="", then will do Depth-First-Traversal and
+   * visit node to getChildren in below fashion:-
+   * 1. getChildren("a/")  2. getChildren("a/b1")  3. getChildren("a/b1/c1")
+   * 4. getChildren("a/b1/c2")  5. getChildren("a/b2/d1")
+   * 6. getChildren("a/b2/d2")  7. getChildren("a/b2/d3")
+   * 8. getChildren("a/b3/e1")  9. getChildren("a/b3/e2")
+   * 10. getChildren("a/b3/e3")
+   *
+   * Note: Does not guarantee to return the list of keys in a sorted order.
+   */
+  private class KeyIteratorV1 extends KeyIterator{
+
+    private Stack<String> stack;
+    private List<OzoneKey> pendingItemsToBeBatched;
+    private boolean addedKeyPrefix;
+
+    /**
+     * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+     * If prevKey is null it iterates from the first key in the bucket.
+     * The returned keys match key prefix.
+     *
+     * @param keyPrefix
+     * @param prevKey
+     */
+    KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+      super(keyPrefix, prevKey);
+    }
+
+    @Override
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+      if (stack == null) {
+        stack = new Stack();
+        pendingItemsToBeBatched = new ArrayList<>();
+      }
+
+      // normalize paths
+      if (!addedKeyPrefix) {
+        prevKey = OmUtils.normalizeKey(prevKey, true);
+        String keyPrefixName = "";
+        if (StringUtils.isNotBlank(getKeyPrefix())) {
+          keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+        }
+        setKeyprefix(keyPrefixName);
+      }
+
+      // Get immediate children
+      List<OzoneKey> keysResultList = new ArrayList<>();
+      getChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+      // TODO: Back and Forth seek all the files & dirs, starting from
+      //  startKey till keyPrefix.
+
+      return keysResultList;
+    }
+
+    /**
+     * List children under the given keyPrefix and startKey path. It does
+     * recursive #listStatus calls to list all the sub-keys resultList.
+     *
+     *                  buck-1
+     *                    |
+     *                    a
+     *                    |
+     *      -----------------------------------
+     *     |           |                       |
+     *     b1          b2                      b3
+     *   -----       --------               ----------
+     *   |    |      |    |   |             |    |     |
+     *  c1   c2     d1   d2  d3             e1   e2   e3
+     *                   |                  |
+     *               --------               |
+     *              |        |              |
+     *           d21.txt   d22.txt        e11.txt
+     *
+     * Say, KeyPrefix = "a" and startKey = null;
+     *
+     * Iteration-1) RPC call proxy#listStatus("a").
+     *              Add b3, b2 and b1 to stack.
+     * Iteration-2) pop b1 and do RPC call proxy#listStatus("b1")
+     *              Add c2, c1 to stack.
+     * Iteration-3) pop c1 and do RPC call proxy#listStatus("c1"). Empty list.
+     * Iteration-4) pop c2 and do RPC call proxy#listStatus("c2"). Empty list.
+     * Iteration-5) pop b2 and do RPC call proxy#listStatus("b2")
+     *              Add d3, d2 and d1 to stack.
+     *              ..........
+     *              ..........
+     * Iteration-n) pop e3 and do RPC call proxy#listStatus("e3")
+     *              Reached end of the FS tree.
+     *
+     * @param keyPrefix
+     * @param startKey
+     * @param keysResultList
+     * @return true represents it reached limit batch size, false otherwise.
+     * @throws IOException
+     */
+    private boolean getChildrenKeys(String keyPrefix, String startKey,
+        List<OzoneKey> keysResultList) throws IOException {
+
+      // listStatus API expects a not null 'startKey' value
+      startKey = startKey == null ? "" : startKey;
+
+      // 1. Add pending items to the user key resultList
+      if (addAllPendingItemsToResultList(keysResultList)) {
+        // reached limit batch size.
+        return true;
+      }
+
+      // 2. Get immediate children of keyPrefix, starting with startKey
+      List<OzoneFileStatus> statuses = proxy.listStatus(volumeName, name,
+              keyPrefix, false, startKey, listCacheSize);
+
+      // 3. Special case: ListKey expects keyPrefix element should present in
+      // the resultList, only if startKey is blank. If startKey is not blank
+      // then resultList shouldn't contain the startKey element.
+      // Since proxy#listStatus API won't return keyPrefix element in the
+      // resultList. So, this is to add user given keyPrefix to the return list.
+      addKeyPrefixInfoToResultList(keyPrefix, startKey, keysResultList);
+
+      // 4. Special case: ListKey expects startKey shouldn't present in the
+      // resultList. Since proxy#listStatus API returns startKey element to
+      // the returnList, this function is to remove the startKey element.
+      removeStartKeyIfExistsInStatusList(startKey, statuses);
+
+      boolean reachedLimitCacheSize = false;
+      // This dirList is used to store paths elements in left-to-right order.
+      List<String> dirList = new ArrayList<>();
+
+      // 5. Iterating over the resultStatuses list and add each key to the
+      // resultList. If the listCacheSize reaches then it will add the rest
+      // of the statuses to pendingItemsToBeBatched
+      for (int indx = 0; indx < statuses.size(); indx++) {
+        OzoneFileStatus status = statuses.get(indx);
+        OmKeyInfo keyInfo = status.getKeyInfo();
+        String keyName = keyInfo.getKeyName();
+
+        // Add dir to the dirList
+        if (status.isDirectory()) {
+          dirList.add(keyInfo.getKeyName());
+          // add trailing slash to represent directory
+          keyName = OzoneFSUtils.addTrailingSlashIfNeeded(keyName);
+        }
+
+        OzoneKey ozoneKey = new OzoneKey(keyInfo.getVolumeName(),
+                keyInfo.getBucketName(), keyName,
+                keyInfo.getDataSize(), keyInfo.getCreationTime(),
+                keyInfo.getModificationTime(),
+                ReplicationType.valueOf(keyInfo.getType().toString()),
+                keyInfo.getFactor().getNumber());
+
+        // 5.1) Add to the resultList till it reaches limit batch size.
+        // Once it reaches limit, then add rest of the items to
+        // pendingItemsToBeBatched and this will picked in next batch iteration
+        if (!reachedLimitCacheSize && listCacheSize > keysResultList.size()) {
+          keysResultList.add(ozoneKey);
+          reachedLimitCacheSize = listCacheSize <= keysResultList.size();
+        } else {
+          pendingItemsToBeBatched.add(ozoneKey);
+        }
+      }
+
+      // 6. Push elements in reverse order so that the FS tree traversal will
+      // occur in left-to-right fashion.
+      for (int indx = dirList.size() - 1; indx >= 0; indx--) {
+        String dirPathComponent = dirList.get(indx);
+        stack.push(dirPathComponent);
+      }
+
+      if (reachedLimitCacheSize) {
+        return true;
+      }
+
+      // 7. Pop element and seek for its sub-child path(s). Basically moving
+      // seek pointer to next level(depth) in FS tree.
+      while (!stack.isEmpty()) {
+        keyPrefix = stack.pop();
+        if (getChildrenKeys(keyPrefix, "", keysResultList)) {
+          // reached limit batch size.
+          return true;
+        }
+      }
+
+      return false;
+    }
+
+    private void removeStartKeyIfExistsInStatusList(String startKey,
+        List<OzoneFileStatus> statuses) {
+
+      if (StringUtils.isNotBlank(startKey) && !statuses.isEmpty()) {
+        String startKeyPath = startKey;
+        if (startKey.endsWith(OZONE_URI_DELIMITER)) {
+          startKeyPath = OzoneFSUtils.removeTrailingSlashIfNeeded(startKey);
+        }
+        if (StringUtils.equals(statuses.get(0).getKeyInfo().getKeyName(),
+                startKeyPath)) {
+          // remove the duplicateKey from the list.
+          statuses.remove(0);
+        }
+      }
+    }
+
+    private boolean addAllPendingItemsToResultList(List<OzoneKey> keys) {
+
+      Iterator<OzoneKey> ozoneKeyItr = pendingItemsToBeBatched.iterator();
+      while (ozoneKeyItr.hasNext()) {
+        if (listCacheSize <= keys.size()) {
+          // reached limit batch size.
+          return true;
+        }
+        keys.add(ozoneKeyItr.next());
+        ozoneKeyItr.remove();
+      }
+      return false;
+    }
+
+    private void addKeyPrefixInfoToResultList(String keyPrefix,
+        String startKey, List<OzoneKey> keysResultList) throws IOException {
+
+      if (addedKeyPrefix) {
+        return;
+      }
+
+      // setting flag to true.
+      addedKeyPrefix = true;
+
+      // not required to addKeyPrefix
+      // case-1) if keyPrefix is null or empty
+      // case-2) if startKey is not null or empty
+      if (StringUtils.isBlank(keyPrefix) || StringUtils.isNotBlank(startKey)) {
+        return;
+      }
+
+      // TODO: Handle a case, where startKey not started with keyPrefix.

Review comment:
       Yes. Since listKeys internally using listStatus, HDDS-4364 will take care both listKeys & listStatus.
   I will include jira ID in TODO.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582046521



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,273 @@ public OzoneKey next() {
      * @param prevKey
      * @return {@code List<OzoneKey>}
      */
-    private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws
         IOException {
       return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
           listCacheSize);
     }
   }
+
+
+  /**
+   * An Iterator to iterate over {@link OzoneKey} list.
+     *               buck-1
+     *                |
+     *                a
+     *                |
+     *      ---------------------------
+     *     |           |               |
+     *     b1          b2              b3
+     *   -----       --------        ----------
+     *   |    |      |    |   |     |    |     |
+     *  c1   c2     d1   d2  d3     e1   e2   e3
+     *                   |          |
+     *                d21.txt    e11.txt
+   *
+   * Will do Depth-First-Traversal and visit node in this fashion:
+   *
+   * a -> b1 -> c1 -> c2 -> b2 -> d1 -> d2 -> d21.txt -> d3 -> b3 -> e1 ->
+   * e11.txt -> e2 -> e3
+   *
+   * Note: there is no order guarantee.
+   */
+  private class KeyIteratorV1 extends KeyIterator{
+
+    private Stack<String> stack;
+    private List<OzoneKey> pendingItemsToBeBatched;
+    private boolean addedKeyPrefix;
+
+    /**
+     * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+     * If prevKey is null it iterates from the first key in the bucket.
+     * The returned keys match key prefix.
+     *
+     * @param keyPrefix
+     * @param prevKey
+     */
+    KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+      super(keyPrefix, prevKey);
+      addedKeyPrefix = true;
+    }
+
+    @Override
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+      if (stack == null) {
+        stack = new Stack();
+        pendingItemsToBeBatched = new ArrayList<>();
+      }
+
+      // normalize paths
+      if (!addedKeyPrefix) {
+        prevKey = OmUtils.normalizeKey(prevKey, true);
+        String keyPrefixName = "";
+        if (StringUtils.isNotBlank(getKeyPrefix())) {
+          keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+        }
+        setKeyprefix(keyPrefixName);
+      }
+
+      // Get immediate children
+      List<OzoneKey> keysResultList = new ArrayList<>();
+      listChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+      // TODO: Back and Forth seek all the files & dirs, starting from
+      //  startKey till keyPrefix.
+
+      return keysResultList;
+    }
+
+    /**
+     * List children under the given keyPrefix path. This doesn't guarantee
+     * order. Internally, it does recursive #listStatus calls to list all the
+     * sub-keysResultList.

Review comment:
       Okay, get it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#issuecomment-785985983


   > @rakeshadr , the PR looks good to me now. Please check above minor comment.
   
   Thanks a lot @linyiqun for the review comments. I will wait for QA report and then merge it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582891141



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestObjectStoreV1.java
##########
@@ -275,6 +285,259 @@ public void testLookupKey() throws Exception {
             dirPathC.getObjectID(), true);
   }
 
+  /**
+   * Verify listKeys at different levels.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |              |                       |
+   *     b1             b2                      b3
+   *    -----           --------               ----------
+   *   |      |        |    |   |             |    |     |
+   *  c1     c2        d1   d2  d3             e1   e2   e3
+   *  |      |         |    |   |              |    |    |
+   * c1.tx  c2.tx   d11.tx  | d31.tx           |    |    e31.tx
+   *                      --------             |   e21.tx
+   *                     |        |            |
+   *                   d21.tx   d22.tx        e11.tx
+   *
+   * Above is the FS tree structure.
+   */
+  @Test
+  public void testListKeysAtDifferentLevels() throws Exception {
+    OzoneClient client = cluster.getClient();
+
+    ObjectStore objectStore = client.getObjectStore();
+    OzoneVolume ozoneVolume = objectStore.getVolume(volumeName);
+    Assert.assertTrue(ozoneVolume.getName().equals(volumeName));
+    OzoneBucket ozoneBucket = ozoneVolume.getBucket(bucketName);
+    Assert.assertTrue(ozoneBucket.getName().equals(bucketName));
+
+    String keyc1 = "/a/b1/c1/c1.tx";
+    String keyc2 = "/a/b1/c2/c2.tx";
+
+    String keyd13 = "/a/b2/d1/d11.tx";
+    String keyd21 = "/a/b2/d2/d21.tx";
+    String keyd22 = "/a/b2/d2/d22.tx";
+    String keyd31 = "/a/b2/d3/d31.tx";
+
+    String keye11 = "/a/b3/e1/e11.tx";
+    String keye21 = "/a/b3/e2/e21.tx";
+    String keye31 = "/a/b3/e3/e31.tx";
+
+    LinkedList<String> keys = new LinkedList<>();
+    keys.add(keyc1);
+    keys.add(keyc2);
+
+    keys.add(keyd13);
+    keys.add(keyd21);
+    keys.add(keyd22);
+    keys.add(keyd31);
+
+    keys.add(keye11);
+    keys.add(keye21);
+    keys.add(keye31);
+
+    int length = 10;
+    byte[] input = new byte[length];
+    Arrays.fill(input, (byte)96);
+
+    createKeys(ozoneBucket, keys);
+
+    // Root level listing keys
+    Iterator<? extends OzoneKey> ozoneKeyIterator =
+        ozoneBucket.listKeys(null, null);
+    verifyFullTreeStructure(ozoneKeyIterator);
+
+    ozoneKeyIterator =
+        ozoneBucket.listKeys(null, null);
+    verifyFullTreeStructure(ozoneKeyIterator);

Review comment:
       Sure, I will remove it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
linyiqun commented on pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#issuecomment-785941183


   @rakeshadr , the PR looks good to me now. Please check above minor comment.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#issuecomment-784822738


   FYI, raised HDDS-4859 jira to handle the back and forth search.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr merged pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr merged pull request #1954:
URL: https://github.com/apache/ozone/pull/1954


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582577681



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,285 @@ public OzoneKey next() {
      * @param prevKey
      * @return {@code List<OzoneKey>}
      */
-    private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws
         IOException {
       return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
           listCacheSize);
     }
   }
+
+
+  /**
+   * An Iterator to iterate over {@link OzoneKey} list.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |           |                       |
+   *     b1          b2                      b3
+   *   -----       --------               ----------
+   *   |    |      |    |   |             |    |     |
+   *  c1   c2     d1   d2  d3             e1   e2   e3
+   *                   |                  |
+   *               --------               |
+   *              |        |              |
+   *           d21.txt   d22.txt        e11.txt
+   *
+   * Say, keyPrefix="a" and prevKey="", then will do Depth-First-Traversal and
+   * visit node to getChildren in below fashion:-
+   * 1. getChildren("a/")  2. getChildren("a/b1")  3. getChildren("a/b1/c1")
+   * 4. getChildren("a/b1/c2")  5. getChildren("a/b2/d1")
+   * 6. getChildren("a/b2/d2")  7. getChildren("a/b2/d3")
+   * 8. getChildren("a/b3/e1")  9. getChildren("a/b3/e2")
+   * 10. getChildren("a/b3/e3")
+   *
+   * Note: Does not guarantee to return the list of keys in a sorted order.
+   */
+  private class KeyIteratorV1 extends KeyIterator{
+
+    private Stack<String> stack;
+    private List<OzoneKey> pendingItemsToBeBatched;
+    private boolean addedKeyPrefix;
+
+    /**
+     * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+     * If prevKey is null it iterates from the first key in the bucket.
+     * The returned keys match key prefix.
+     *
+     * @param keyPrefix
+     * @param prevKey
+     */
+    KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+      super(keyPrefix, prevKey);
+      addedKeyPrefix = true;
+    }
+
+    @Override
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+      if (stack == null) {
+        stack = new Stack();
+        pendingItemsToBeBatched = new ArrayList<>();
+      }
+
+      // normalize paths
+      if (!addedKeyPrefix) {
+        prevKey = OmUtils.normalizeKey(prevKey, true);
+        String keyPrefixName = "";
+        if (StringUtils.isNotBlank(getKeyPrefix())) {
+          keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+        }
+        setKeyprefix(keyPrefixName);
+      }
+
+      // Get immediate children
+      List<OzoneKey> keysResultList = new ArrayList<>();
+      getChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+      // TODO: Back and Forth seek all the files & dirs, starting from
+      //  startKey till keyPrefix.
+
+      return keysResultList;
+    }
+
+    /**
+     * List children under the given keyPrefix and startKey path. It does
+     * recursive #listStatus calls to list all the sub-keys resultList.
+     *
+     *                  buck-1
+     *                    |
+     *                    a
+     *                    |
+     *      -----------------------------------
+     *     |           |                       |
+     *     b1          b2                      b3
+     *   -----       --------               ----------
+     *   |    |      |    |   |             |    |     |
+     *  c1   c2     d1   d2  d3             e1   e2   e3
+     *                   |                  |
+     *               --------               |
+     *              |        |              |
+     *           d21.txt   d22.txt        e11.txt
+     *
+     * Say, KeyPrefix = "a" and startKey = null;
+     *
+     * Iteration-1) RPC call proxy#listStatus("a").
+     *              Add b3, b2 and b1 to stack.
+     * Iteration-2) pop b1 and do RPC call proxy#listStatus("b1")
+     *              Add c2, c1 to stack.
+     * Iteration-3) pop c1 and do RPC call proxy#listStatus("c1"). Empty list.
+     * Iteration-4) pop c2 and do RPC call proxy#listStatus("c2"). Empty list.
+     * Iteration-5) pop b2 and do RPC call proxy#listStatus("b2")
+     *              Add d3, d2 and d1 to stack.
+     *              ..........
+     *              ..........
+     * Iteration-n) pop e3 and do RPC call proxy#listStatus("e2")
+     *              Add d3, d2 and d1 to stack.
+     *
+     * @param keyPrefix
+     * @param startKey
+     * @param keysResultList
+     * @return true represents it reached limit batch size, false otherwise.
+     * @throws IOException
+     */
+    private boolean getChildrenKeys(String keyPrefix, String startKey,
+        List<OzoneKey> keysResultList) throws IOException {
+
+      // listStatus API expects a not null 'startKey' value
+      startKey = startKey == null ? "" : startKey;
+
+      // 1. Add pending items to the user key resultList
+      if (addAllPendingItemsToResultList(keysResultList)) {
+        // reached limit batch size.
+        return true;
+      }
+
+      // 2. Get immediate children of keyPrefix, starting with startKey
+      List<OzoneFileStatus> statuses = proxy.listStatus(volumeName, name,
+              keyPrefix, false, startKey, listCacheSize);
+
+      // 3. Special case: ListKey expects keyPrefix element should present in
+      // the resultList, only if startKey is blank. If startKey is not blank
+      // then resultList shouldn't contain the startKey element.
+      // Since proxy#listStatus API won't return keyPrefix element in the
+      // resultList. So, this is to add user given keyPrefix to the return list.
+      addKeyPrefixInfoToResultList(keyPrefix, startKey, keysResultList);

Review comment:
       #listKeys API expects the user given `keyPrefix` element should be there in the `returnKeysList`. Since client is calling #listStatus API, it won't include the `keyPrefix` element in returnKeysList. Please refer [KeyManagerImpl code](https://github.com/apache/ozone/blob/HDDS-2939/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L2229) where OM is skipping the keyPrefix element. So, I have made an explicit call to getFileStatus("keyPrefix") and add to the `returnKeysList`.
   
   Please refer [existing test case for the existence of keyPrefix](https://github.com/apache/ozone/blob/master/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestOzoneFSWithObjectStoreCreate.java#L376) in the listKeys return values.

##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestObjectStoreV1.java
##########
@@ -275,6 +285,246 @@ public void testLookupKey() throws Exception {
             dirPathC.getObjectID(), true);
   }
 
+  /**
+   * Verify listKeys at different levels.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |              |                       |
+   *     b1             b2                      b3
+   *    -----           --------               ----------
+   *   |      |        |    |   |             |    |     |
+   *  c1     c2        d1   d2  d3             e1   e2   e3
+   *  |      |         |    |   |              |    |    |
+   * c1.tx  c2.tx   d11.tx  | d31.tx           |    |    e31.tx
+   *                      --------             |   e21.tx
+   *                     |        |            |
+   *                   d21.tx   d22.tx        e11.tx
+   *
+   * Above is the FS tree structure.
+   */
+  @Test
+  public void testListKeysAtDifferentLevels() throws Exception {
+    OzoneClient client = cluster.getClient();
+
+    ObjectStore objectStore = client.getObjectStore();
+    OzoneVolume ozoneVolume = objectStore.getVolume(volumeName);
+    Assert.assertTrue(ozoneVolume.getName().equals(volumeName));
+    OzoneBucket ozoneBucket = ozoneVolume.getBucket(bucketName);
+    Assert.assertTrue(ozoneBucket.getName().equals(bucketName));
+
+    String keyc1 = "/a/b1/c1/c1.tx";
+    String keyc2 = "/a/b1/c2/c2.tx";
+
+    String keyd13 = "/a/b2/d1/d11.tx";
+    String keyd21 = "/a/b2/d2/d21.tx";
+    String keyd22 = "/a/b2/d2/d22.tx";
+    String keyd31 = "/a/b2/d3/d31.tx";
+
+    String keye11 = "/a/b3/e1/e11.tx";
+    String keye21 = "/a/b3/e2/e21.tx";
+    String keye31 = "/a/b3/e3/e31.tx";
+
+    LinkedList<String> keys = new LinkedList<>();
+    keys.add(keyc1);
+    keys.add(keyc2);
+
+    keys.add(keyd13);
+    keys.add(keyd21);
+    keys.add(keyd22);
+    keys.add(keyd31);
+
+    keys.add(keye11);
+    keys.add(keye21);
+    keys.add(keye31);
+
+    int length = 10;
+    byte[] input = new byte[length];
+    Arrays.fill(input, (byte)96);
+
+    createKeys(ozoneBucket, keys);
+
+    // Root level listing keys
+    Iterator<? extends OzoneKey> ozoneKeyIterator =
+        ozoneBucket.listKeys("/a", null);
+
+    LinkedList<String> expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/");
+    expectedKeys.add("a/b1/");
+    expectedKeys.add("a/b2/");
+    expectedKeys.add("a/b3/");
+    expectedKeys.add("a/b1/c1/");
+    expectedKeys.add("a/b1/c2/");
+    expectedKeys.add("a/b1/c1/c1.tx");
+    expectedKeys.add("a/b1/c2/c2.tx");
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d2/");
+    expectedKeys.add("a/b2/d3/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    expectedKeys.add("a/b2/d2/d21.tx");
+    expectedKeys.add("a/b2/d2/d22.tx");
+    expectedKeys.add("a/b2/d3/d31.tx");
+    expectedKeys.add("a/b3/e1/");
+    expectedKeys.add("a/b3/e2/");
+    expectedKeys.add("a/b3/e3/");
+    expectedKeys.add("a/b3/e1/e11.tx");
+    expectedKeys.add("a/b3/e2/e21.tx");
+    expectedKeys.add("a/b3/e3/e31.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Intermediate level keyPrefix - 2nd level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a///b2///", null);
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/");
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d2/");
+    expectedKeys.add("a/b2/d3/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    expectedKeys.add("a/b2/d2/d21.tx");
+    expectedKeys.add("a/b2/d2/d22.tx");
+    expectedKeys.add("a/b2/d3/d31.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Intermediate level keyPrefix - 3rd level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b2/d1", null);
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Boundary of a level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b2/d2", "a/b2/d2/d21.tx");
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/d2/d22.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Boundary case - last node in the depth-first-traversal
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b3/e3", "a/b3/e3/e31.tx");
+    expectedKeys = new LinkedList<>();
+    checkKeyList(ozoneKeyIterator, expectedKeys);

Review comment:
       Sure




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582877455



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestObjectStoreV1.java
##########
@@ -275,6 +285,246 @@ public void testLookupKey() throws Exception {
             dirPathC.getObjectID(), true);
   }
 
+  /**
+   * Verify listKeys at different levels.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |              |                       |
+   *     b1             b2                      b3
+   *    -----           --------               ----------
+   *   |      |        |    |   |             |    |     |
+   *  c1     c2        d1   d2  d3             e1   e2   e3
+   *  |      |         |    |   |              |    |    |
+   * c1.tx  c2.tx   d11.tx  | d31.tx           |    |    e31.tx
+   *                      --------             |   e21.tx
+   *                     |        |            |
+   *                   d21.tx   d22.tx        e11.tx
+   *
+   * Above is the FS tree structure.
+   */
+  @Test
+  public void testListKeysAtDifferentLevels() throws Exception {
+    OzoneClient client = cluster.getClient();
+
+    ObjectStore objectStore = client.getObjectStore();
+    OzoneVolume ozoneVolume = objectStore.getVolume(volumeName);
+    Assert.assertTrue(ozoneVolume.getName().equals(volumeName));
+    OzoneBucket ozoneBucket = ozoneVolume.getBucket(bucketName);
+    Assert.assertTrue(ozoneBucket.getName().equals(bucketName));
+
+    String keyc1 = "/a/b1/c1/c1.tx";
+    String keyc2 = "/a/b1/c2/c2.tx";
+
+    String keyd13 = "/a/b2/d1/d11.tx";
+    String keyd21 = "/a/b2/d2/d21.tx";
+    String keyd22 = "/a/b2/d2/d22.tx";
+    String keyd31 = "/a/b2/d3/d31.tx";
+
+    String keye11 = "/a/b3/e1/e11.tx";
+    String keye21 = "/a/b3/e2/e21.tx";
+    String keye31 = "/a/b3/e3/e31.tx";
+
+    LinkedList<String> keys = new LinkedList<>();
+    keys.add(keyc1);
+    keys.add(keyc2);
+
+    keys.add(keyd13);
+    keys.add(keyd21);
+    keys.add(keyd22);
+    keys.add(keyd31);
+
+    keys.add(keye11);
+    keys.add(keye21);
+    keys.add(keye31);
+
+    int length = 10;
+    byte[] input = new byte[length];
+    Arrays.fill(input, (byte)96);
+
+    createKeys(ozoneBucket, keys);
+
+    // Root level listing keys
+    Iterator<? extends OzoneKey> ozoneKeyIterator =
+        ozoneBucket.listKeys("/a", null);
+
+    LinkedList<String> expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/");
+    expectedKeys.add("a/b1/");
+    expectedKeys.add("a/b2/");
+    expectedKeys.add("a/b3/");
+    expectedKeys.add("a/b1/c1/");
+    expectedKeys.add("a/b1/c2/");
+    expectedKeys.add("a/b1/c1/c1.tx");
+    expectedKeys.add("a/b1/c2/c2.tx");
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d2/");
+    expectedKeys.add("a/b2/d3/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    expectedKeys.add("a/b2/d2/d21.tx");
+    expectedKeys.add("a/b2/d2/d22.tx");
+    expectedKeys.add("a/b2/d3/d31.tx");
+    expectedKeys.add("a/b3/e1/");
+    expectedKeys.add("a/b3/e2/");
+    expectedKeys.add("a/b3/e3/");
+    expectedKeys.add("a/b3/e1/e11.tx");
+    expectedKeys.add("a/b3/e2/e21.tx");
+    expectedKeys.add("a/b3/e3/e31.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Intermediate level keyPrefix - 2nd level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a///b2///", null);
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/");
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d2/");
+    expectedKeys.add("a/b2/d3/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    expectedKeys.add("a/b2/d2/d21.tx");
+    expectedKeys.add("a/b2/d2/d22.tx");
+    expectedKeys.add("a/b2/d3/d31.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Intermediate level keyPrefix - 3rd level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b2/d1", null);
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/d1/");
+    expectedKeys.add("a/b2/d1/d11.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Boundary of a level
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b2/d2", "a/b2/d2/d21.tx");
+    expectedKeys = new LinkedList<>();
+    expectedKeys.add("a/b2/d2/d22.tx");
+    checkKeyList(ozoneKeyIterator, expectedKeys);
+
+    // Boundary case - last node in the depth-first-traversal
+    ozoneKeyIterator =
+        ozoneBucket.listKeys("a/b3/e3", "a/b3/e3/e31.tx");
+    expectedKeys = new LinkedList<>();
+    checkKeyList(ozoneKeyIterator, expectedKeys);

Review comment:
       Makes sense to me.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582877837



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##########
@@ -828,10 +846,285 @@ public OzoneKey next() {
      * @param prevKey
      * @return {@code List<OzoneKey>}
      */
-    private List<OzoneKey> getNextListOfKeys(String prevKey) throws
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws
         IOException {
       return proxy.listKeys(volumeName, name, keyPrefix, prevKey,
           listCacheSize);
     }
   }
+
+
+  /**
+   * An Iterator to iterate over {@link OzoneKey} list.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |           |                       |
+   *     b1          b2                      b3
+   *   -----       --------               ----------
+   *   |    |      |    |   |             |    |     |
+   *  c1   c2     d1   d2  d3             e1   e2   e3
+   *                   |                  |
+   *               --------               |
+   *              |        |              |
+   *           d21.txt   d22.txt        e11.txt
+   *
+   * Say, keyPrefix="a" and prevKey="", then will do Depth-First-Traversal and
+   * visit node to getChildren in below fashion:-
+   * 1. getChildren("a/")  2. getChildren("a/b1")  3. getChildren("a/b1/c1")
+   * 4. getChildren("a/b1/c2")  5. getChildren("a/b2/d1")
+   * 6. getChildren("a/b2/d2")  7. getChildren("a/b2/d3")
+   * 8. getChildren("a/b3/e1")  9. getChildren("a/b3/e2")
+   * 10. getChildren("a/b3/e3")
+   *
+   * Note: Does not guarantee to return the list of keys in a sorted order.
+   */
+  private class KeyIteratorV1 extends KeyIterator{
+
+    private Stack<String> stack;
+    private List<OzoneKey> pendingItemsToBeBatched;
+    private boolean addedKeyPrefix;
+
+    /**
+     * Creates an Iterator to iterate over all keys after prevKey in the bucket.
+     * If prevKey is null it iterates from the first key in the bucket.
+     * The returned keys match key prefix.
+     *
+     * @param keyPrefix
+     * @param prevKey
+     */
+    KeyIteratorV1(String keyPrefix, String prevKey) throws IOException {
+      super(keyPrefix, prevKey);
+      addedKeyPrefix = true;
+    }
+
+    @Override
+    List<OzoneKey> getNextListOfKeys(String prevKey) throws IOException {
+      if (stack == null) {
+        stack = new Stack();
+        pendingItemsToBeBatched = new ArrayList<>();
+      }
+
+      // normalize paths
+      if (!addedKeyPrefix) {
+        prevKey = OmUtils.normalizeKey(prevKey, true);
+        String keyPrefixName = "";
+        if (StringUtils.isNotBlank(getKeyPrefix())) {
+          keyPrefixName = OmUtils.normalizeKey(getKeyPrefix(), true);
+        }
+        setKeyprefix(keyPrefixName);
+      }
+
+      // Get immediate children
+      List<OzoneKey> keysResultList = new ArrayList<>();
+      getChildrenKeys(getKeyPrefix(), prevKey, keysResultList);
+
+      // TODO: Back and Forth seek all the files & dirs, starting from
+      //  startKey till keyPrefix.
+
+      return keysResultList;
+    }
+
+    /**
+     * List children under the given keyPrefix and startKey path. It does
+     * recursive #listStatus calls to list all the sub-keys resultList.
+     *
+     *                  buck-1
+     *                    |
+     *                    a
+     *                    |
+     *      -----------------------------------
+     *     |           |                       |
+     *     b1          b2                      b3
+     *   -----       --------               ----------
+     *   |    |      |    |   |             |    |     |
+     *  c1   c2     d1   d2  d3             e1   e2   e3
+     *                   |                  |
+     *               --------               |
+     *              |        |              |
+     *           d21.txt   d22.txt        e11.txt
+     *
+     * Say, KeyPrefix = "a" and startKey = null;
+     *
+     * Iteration-1) RPC call proxy#listStatus("a").
+     *              Add b3, b2 and b1 to stack.
+     * Iteration-2) pop b1 and do RPC call proxy#listStatus("b1")
+     *              Add c2, c1 to stack.
+     * Iteration-3) pop c1 and do RPC call proxy#listStatus("c1"). Empty list.
+     * Iteration-4) pop c2 and do RPC call proxy#listStatus("c2"). Empty list.
+     * Iteration-5) pop b2 and do RPC call proxy#listStatus("b2")
+     *              Add d3, d2 and d1 to stack.
+     *              ..........
+     *              ..........
+     * Iteration-n) pop e3 and do RPC call proxy#listStatus("e2")
+     *              Add d3, d2 and d1 to stack.
+     *
+     * @param keyPrefix
+     * @param startKey
+     * @param keysResultList
+     * @return true represents it reached limit batch size, false otherwise.
+     * @throws IOException
+     */
+    private boolean getChildrenKeys(String keyPrefix, String startKey,
+        List<OzoneKey> keysResultList) throws IOException {
+
+      // listStatus API expects a not null 'startKey' value
+      startKey = startKey == null ? "" : startKey;
+
+      // 1. Add pending items to the user key resultList
+      if (addAllPendingItemsToResultList(keysResultList)) {
+        // reached limit batch size.
+        return true;
+      }
+
+      // 2. Get immediate children of keyPrefix, starting with startKey
+      List<OzoneFileStatus> statuses = proxy.listStatus(volumeName, name,
+              keyPrefix, false, startKey, listCacheSize);
+
+      // 3. Special case: ListKey expects keyPrefix element should present in
+      // the resultList, only if startKey is blank. If startKey is not blank
+      // then resultList shouldn't contain the startKey element.
+      // Since proxy#listStatus API won't return keyPrefix element in the
+      // resultList. So, this is to add user given keyPrefix to the return list.
+      addKeyPrefixInfoToResultList(keyPrefix, startKey, keysResultList);

Review comment:
       Get it, thanks @rakeshadr .




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1954: HDDS-4683. [FSO]ListKeys: do lookup in dir and file tables

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1954:
URL: https://github.com/apache/ozone/pull/1954#discussion_r582880826



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestObjectStoreV1.java
##########
@@ -275,6 +285,259 @@ public void testLookupKey() throws Exception {
             dirPathC.getObjectID(), true);
   }
 
+  /**
+   * Verify listKeys at different levels.
+   *
+   *                  buck-1
+   *                    |
+   *                    a
+   *                    |
+   *      -----------------------------------
+   *     |              |                       |
+   *     b1             b2                      b3
+   *    -----           --------               ----------
+   *   |      |        |    |   |             |    |     |
+   *  c1     c2        d1   d2  d3             e1   e2   e3
+   *  |      |         |    |   |              |    |    |
+   * c1.tx  c2.tx   d11.tx  | d31.tx           |    |    e31.tx
+   *                      --------             |   e21.tx
+   *                     |        |            |
+   *                   d21.tx   d22.tx        e11.tx
+   *
+   * Above is the FS tree structure.
+   */
+  @Test
+  public void testListKeysAtDifferentLevels() throws Exception {
+    OzoneClient client = cluster.getClient();
+
+    ObjectStore objectStore = client.getObjectStore();
+    OzoneVolume ozoneVolume = objectStore.getVolume(volumeName);
+    Assert.assertTrue(ozoneVolume.getName().equals(volumeName));
+    OzoneBucket ozoneBucket = ozoneVolume.getBucket(bucketName);
+    Assert.assertTrue(ozoneBucket.getName().equals(bucketName));
+
+    String keyc1 = "/a/b1/c1/c1.tx";
+    String keyc2 = "/a/b1/c2/c2.tx";
+
+    String keyd13 = "/a/b2/d1/d11.tx";
+    String keyd21 = "/a/b2/d2/d21.tx";
+    String keyd22 = "/a/b2/d2/d22.tx";
+    String keyd31 = "/a/b2/d3/d31.tx";
+
+    String keye11 = "/a/b3/e1/e11.tx";
+    String keye21 = "/a/b3/e2/e21.tx";
+    String keye31 = "/a/b3/e3/e31.tx";
+
+    LinkedList<String> keys = new LinkedList<>();
+    keys.add(keyc1);
+    keys.add(keyc2);
+
+    keys.add(keyd13);
+    keys.add(keyd21);
+    keys.add(keyd22);
+    keys.add(keyd31);
+
+    keys.add(keye11);
+    keys.add(keye21);
+    keys.add(keye31);
+
+    int length = 10;
+    byte[] input = new byte[length];
+    Arrays.fill(input, (byte)96);
+
+    createKeys(ozoneBucket, keys);
+
+    // Root level listing keys
+    Iterator<? extends OzoneKey> ozoneKeyIterator =
+        ozoneBucket.listKeys(null, null);
+    verifyFullTreeStructure(ozoneKeyIterator);
+
+    ozoneKeyIterator =
+        ozoneBucket.listKeys(null, null);
+    verifyFullTreeStructure(ozoneKeyIterator);

Review comment:
       Two duplicated lines.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org