You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/10/28 23:50:18 UTC

[GitHub] [ozone] bharatviswa504 commented on a change in pull request #1503: HDDS-4332: ListFileStatus - do lookup in directory and file tables

bharatviswa504 commented on a change in pull request #1503:
URL: https://github.com/apache/ozone/pull/1503#discussion_r513799166



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -2205,6 +2276,318 @@ private void listStatusFindKeyInTableCache(
     return fileStatusList;
   }
 
+  public List<OzoneFileStatus> listStatusV1(OmKeyArgs args, boolean recursive,
+      String startKey, long numEntries, String clientAddress)
+          throws IOException {
+    Preconditions.checkNotNull(args, "Key args can not be null");
+
+    // unsorted OMKeyInfo list contains combine results from TableCache and DB.
+    List<OzoneFileStatus> fileStatusFinalList = new ArrayList<>();
+    LinkedHashSet<OzoneFileStatus> fileStatusList = new LinkedHashSet<>();
+    if (numEntries <= 0) {
+      return fileStatusFinalList;
+    }
+
+    String volumeName = args.getVolumeName();
+    String bucketName = args.getBucketName();
+    String keyName = args.getKeyName();
+    String seekFileInDB;
+    String seekDirInDB;
+    long prefixKeyInDB;
+    String prefixPath = keyName;
+
+    int countEntries = 0;
+
+    metadataManager.getLock().acquireReadLock(BUCKET_LOCK, volumeName,
+            bucketName);
+    try {
+      if (Strings.isNullOrEmpty(startKey)) {
+        OzoneFileStatus fileStatus = getFileStatus(args, clientAddress);
+        if (fileStatus.isFile()) {
+          return Collections.singletonList(fileStatus);
+        }
+
+        // Not required to search in DeletedTable because all the deleted
+        // keys will be marked directly in dirTable or in keyTable by
+        // breaking the pointer to its sub-dirs. So, there is no issue of
+        // inconsistency.
+
+        /*
+         * keyName is a directory.
+         * Say, "/a" is the dir name and its objectID is 1024, then seek
+         * will be doing with "1024/" to get all immediate descendants.
+         */
+        if (fileStatus.getKeyInfo() != null) {
+          prefixKeyInDB = fileStatus.getKeyInfo().getObjectID();
+        } else {
+          // list root directory.
+          String bucketKey = metadataManager.getBucketKey(volumeName,
+                  bucketName);
+          OmBucketInfo omBucketInfo =
+                  metadataManager.getBucketTable().get(bucketKey);
+          prefixKeyInDB = omBucketInfo.getObjectID();
+        }
+        seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+        seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        // Order of seek -> (1)Seek dirs in dirTable (2)Seek files in fileTable
+        // 1. Seek the given key in key table.
+        countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                prefixPath, prefixKeyInDB, startKey, countEntries, numEntries);
+        // 2. Seek the given key in dir table.
+        getDirectories(recursive, startKey, numEntries, fileStatusList,
+                volumeName, bucketName, seekDirInDB, prefixKeyInDB,
+                prefixPath, countEntries);
+      } else {
+        /*
+         * startKey will be used in iterator seek and sets the beginning point
+         * for key traversal.
+         *
+         * key name will be used as parentID where the user has requested to
+         * list the keys from.
+         *
+         * When recursive flag=false, parentID won't change between two pages.
+         * For example: OM has a namespace like,
+         *    /a/1...1M files and /a/b/1...1M files.
+         *    /a/1...1M directories and /a/b/1...1M directories.
+         * Listing "/a", will always have the parentID as "a" irrespective of
+         * the startKey value.
+         */
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        OzoneFileStatus fileStatusInfo = getOzoneFileStatusV1(volumeName,
+                bucketName, startKey, false, null, true);
+
+        if (fileStatusInfo != null) {
+          prefixKeyInDB = fileStatusInfo.getKeyInfo().getParentObjectID();
+          if(fileStatusInfo.isDirectory()){
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+
+            // Order of seek -> (1) Seek dirs in dirTable. In OM, always the
+            // order of search is, first seek into fileTable and then dirTable.
+            // So, its not required to search again into the fileTable.
+
+            // Seek the given key in dirTable.
+            getDirectories(recursive, startKey, numEntries,
+                    fileStatusList, volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+
+          } else {
+            seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+            // begins from the first sub-dir under the parent dir
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+            // Seek the given key in key table.
+            countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                    prefixPath, prefixKeyInDB, startKey, countEntries,
+                    numEntries);
+            // Seek the given key in dir table.
+            getDirectories(recursive, startKey, numEntries, fileStatusList,
+                    volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+          }
+        } else {
+          // No key exists for the given startKey.
+          // TODO: HDDS-4364: startKey can be a non-existed key
+          return Collections.emptyList();
+        }
+      }
+    } finally {
+      metadataManager.getLock().releaseReadLock(BUCKET_LOCK, volumeName,
+              bucketName);
+    }
+
+    // Convert results in cacheKeyMap to List
+    for (OzoneFileStatus fileStatus : fileStatusList) {
+      // refreshPipeline flag check has been removed as part of
+      // https://issues.apache.org/jira/browse/HDDS-3658.
+      // Please refer this jira for more details.
+      refreshPipeline(fileStatus.getKeyInfo());
+
+      // No need to check if a key is deleted or not here, this is handled
+      // when adding entries to cacheKeyMap from DB.
+      if (args.getSortDatanodes()) {
+        sortDatanodeInPipeline(fileStatus.getKeyInfo(), clientAddress);
+      }
+    }
+    fileStatusFinalList.addAll(fileStatusList);
+    return fileStatusFinalList;
+  }
+
+  @SuppressWarnings("parameternumber")
+  protected int getDirectories(boolean recursive, String startKey,
+      long numEntries, Set<OzoneFileStatus> fileStatusList,
+      String volumeName, String bucketName, String seekDirInDB,
+      long prefixKeyInDB, String prefixPath,
+      int countEntries) throws IOException {
+
+    Table dirTable = metadataManager.getDirectoryTable();
+    listStatusFindDirsInTableCache(fileStatusList, dirTable, prefixKeyInDB,

Review comment:
       We need to consider count from this during total count when iterating from cache.
   
   We are using that countEntries but listStatusFindDirsInTableCache is not returning that

##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -2205,6 +2276,318 @@ private void listStatusFindKeyInTableCache(
     return fileStatusList;
   }
 
+  public List<OzoneFileStatus> listStatusV1(OmKeyArgs args, boolean recursive,
+      String startKey, long numEntries, String clientAddress)
+          throws IOException {
+    Preconditions.checkNotNull(args, "Key args can not be null");
+
+    // unsorted OMKeyInfo list contains combine results from TableCache and DB.
+    List<OzoneFileStatus> fileStatusFinalList = new ArrayList<>();
+    LinkedHashSet<OzoneFileStatus> fileStatusList = new LinkedHashSet<>();
+    if (numEntries <= 0) {
+      return fileStatusFinalList;
+    }
+
+    String volumeName = args.getVolumeName();
+    String bucketName = args.getBucketName();
+    String keyName = args.getKeyName();
+    String seekFileInDB;
+    String seekDirInDB;
+    long prefixKeyInDB;
+    String prefixPath = keyName;
+
+    int countEntries = 0;
+
+    metadataManager.getLock().acquireReadLock(BUCKET_LOCK, volumeName,
+            bucketName);
+    try {
+      if (Strings.isNullOrEmpty(startKey)) {
+        OzoneFileStatus fileStatus = getFileStatus(args, clientAddress);
+        if (fileStatus.isFile()) {
+          return Collections.singletonList(fileStatus);
+        }
+
+        // Not required to search in DeletedTable because all the deleted
+        // keys will be marked directly in dirTable or in keyTable by
+        // breaking the pointer to its sub-dirs. So, there is no issue of
+        // inconsistency.
+
+        /*
+         * keyName is a directory.
+         * Say, "/a" is the dir name and its objectID is 1024, then seek
+         * will be doing with "1024/" to get all immediate descendants.
+         */
+        if (fileStatus.getKeyInfo() != null) {
+          prefixKeyInDB = fileStatus.getKeyInfo().getObjectID();
+        } else {
+          // list root directory.
+          String bucketKey = metadataManager.getBucketKey(volumeName,
+                  bucketName);
+          OmBucketInfo omBucketInfo =
+                  metadataManager.getBucketTable().get(bucketKey);
+          prefixKeyInDB = omBucketInfo.getObjectID();
+        }
+        seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+        seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        // Order of seek -> (1)Seek dirs in dirTable (2)Seek files in fileTable
+        // 1. Seek the given key in key table.
+        countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                prefixPath, prefixKeyInDB, startKey, countEntries, numEntries);
+        // 2. Seek the given key in dir table.
+        getDirectories(recursive, startKey, numEntries, fileStatusList,
+                volumeName, bucketName, seekDirInDB, prefixKeyInDB,
+                prefixPath, countEntries);
+      } else {
+        /*
+         * startKey will be used in iterator seek and sets the beginning point
+         * for key traversal.
+         *
+         * key name will be used as parentID where the user has requested to
+         * list the keys from.
+         *
+         * When recursive flag=false, parentID won't change between two pages.
+         * For example: OM has a namespace like,
+         *    /a/1...1M files and /a/b/1...1M files.
+         *    /a/1...1M directories and /a/b/1...1M directories.
+         * Listing "/a", will always have the parentID as "a" irrespective of
+         * the startKey value.
+         */
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        OzoneFileStatus fileStatusInfo = getOzoneFileStatusV1(volumeName,
+                bucketName, startKey, false, null, true);
+
+        if (fileStatusInfo != null) {
+          prefixKeyInDB = fileStatusInfo.getKeyInfo().getParentObjectID();
+          if(fileStatusInfo.isDirectory()){
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+
+            // Order of seek -> (1) Seek dirs in dirTable. In OM, always the
+            // order of search is, first seek into fileTable and then dirTable.
+            // So, its not required to search again into the fileTable.
+
+            // Seek the given key in dirTable.
+            getDirectories(recursive, startKey, numEntries,
+                    fileStatusList, volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+
+          } else {
+            seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+            // begins from the first sub-dir under the parent dir
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+            // Seek the given key in key table.
+            countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                    prefixPath, prefixKeyInDB, startKey, countEntries,
+                    numEntries);
+            // Seek the given key in dir table.
+            getDirectories(recursive, startKey, numEntries, fileStatusList,

Review comment:
       Minor NIT: Can we maintain similar order for parameters for getFilesFromDirectories and getDirectories it will be easy to follow while reading code, as it has many parameters.

##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java
##########
@@ -583,4 +588,112 @@ public static OmKeyInfo getOmKeyInfoFromFileTable(boolean openFileTable,
     return dbOmKeyInfo;
   }
 
+  /**
+   * Gets OmKeyInfo if exists for the given key name in the DB.
+   *
+   * @param omMetadataMgr metadata manager
+   * @param volumeName    volume name
+   * @param bucketName    bucket name
+   * @param keyName       key name
+   * @param scmBlockSize  scm block size
+   * @return OzoneFileStatus
+   * @throws IOException DB failure
+   */
+  @Nullable
+  public static OzoneFileStatus getOMKeyInfoIfExists(

Review comment:
       Question: This should be done under bucket lock right?
   

##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -2205,6 +2276,318 @@ private void listStatusFindKeyInTableCache(
     return fileStatusList;
   }
 
+  public List<OzoneFileStatus> listStatusV1(OmKeyArgs args, boolean recursive,
+      String startKey, long numEntries, String clientAddress)
+          throws IOException {
+    Preconditions.checkNotNull(args, "Key args can not be null");
+
+    // unsorted OMKeyInfo list contains combine results from TableCache and DB.
+    List<OzoneFileStatus> fileStatusFinalList = new ArrayList<>();
+    LinkedHashSet<OzoneFileStatus> fileStatusList = new LinkedHashSet<>();
+    if (numEntries <= 0) {
+      return fileStatusFinalList;
+    }
+
+    String volumeName = args.getVolumeName();
+    String bucketName = args.getBucketName();
+    String keyName = args.getKeyName();
+    String seekFileInDB;
+    String seekDirInDB;
+    long prefixKeyInDB;
+    String prefixPath = keyName;
+
+    int countEntries = 0;
+
+    metadataManager.getLock().acquireReadLock(BUCKET_LOCK, volumeName,
+            bucketName);
+    try {
+      if (Strings.isNullOrEmpty(startKey)) {
+        OzoneFileStatus fileStatus = getFileStatus(args, clientAddress);
+        if (fileStatus.isFile()) {
+          return Collections.singletonList(fileStatus);
+        }
+
+        // Not required to search in DeletedTable because all the deleted
+        // keys will be marked directly in dirTable or in keyTable by
+        // breaking the pointer to its sub-dirs. So, there is no issue of
+        // inconsistency.
+
+        /*
+         * keyName is a directory.
+         * Say, "/a" is the dir name and its objectID is 1024, then seek
+         * will be doing with "1024/" to get all immediate descendants.
+         */
+        if (fileStatus.getKeyInfo() != null) {
+          prefixKeyInDB = fileStatus.getKeyInfo().getObjectID();
+        } else {
+          // list root directory.
+          String bucketKey = metadataManager.getBucketKey(volumeName,
+                  bucketName);
+          OmBucketInfo omBucketInfo =
+                  metadataManager.getBucketTable().get(bucketKey);
+          prefixKeyInDB = omBucketInfo.getObjectID();
+        }
+        seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+        seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        // Order of seek -> (1)Seek dirs in dirTable (2)Seek files in fileTable
+        // 1. Seek the given key in key table.
+        countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                prefixPath, prefixKeyInDB, startKey, countEntries, numEntries);
+        // 2. Seek the given key in dir table.
+        getDirectories(recursive, startKey, numEntries, fileStatusList,
+                volumeName, bucketName, seekDirInDB, prefixKeyInDB,
+                prefixPath, countEntries);
+      } else {
+        /*
+         * startKey will be used in iterator seek and sets the beginning point
+         * for key traversal.
+         *
+         * key name will be used as parentID where the user has requested to
+         * list the keys from.
+         *
+         * When recursive flag=false, parentID won't change between two pages.
+         * For example: OM has a namespace like,
+         *    /a/1...1M files and /a/b/1...1M files.
+         *    /a/1...1M directories and /a/b/1...1M directories.
+         * Listing "/a", will always have the parentID as "a" irrespective of
+         * the startKey value.
+         */
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        OzoneFileStatus fileStatusInfo = getOzoneFileStatusV1(volumeName,
+                bucketName, startKey, false, null, true);
+
+        if (fileStatusInfo != null) {
+          prefixKeyInDB = fileStatusInfo.getKeyInfo().getParentObjectID();
+          if(fileStatusInfo.isDirectory()){
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+
+            // Order of seek -> (1) Seek dirs in dirTable. In OM, always the
+            // order of search is, first seek into fileTable and then dirTable.
+            // So, its not required to search again into the fileTable.
+
+            // Seek the given key in dirTable.
+            getDirectories(recursive, startKey, numEntries,
+                    fileStatusList, volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+
+          } else {
+            seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+            // begins from the first sub-dir under the parent dir
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+            // Seek the given key in key table.
+            countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                    prefixPath, prefixKeyInDB, startKey, countEntries,
+                    numEntries);
+            // Seek the given key in dir table.
+            getDirectories(recursive, startKey, numEntries, fileStatusList,
+                    volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+          }
+        } else {
+          // No key exists for the given startKey.
+          // TODO: HDDS-4364: startKey can be a non-existed key
+          return Collections.emptyList();
+        }
+      }
+    } finally {
+      metadataManager.getLock().releaseReadLock(BUCKET_LOCK, volumeName,
+              bucketName);
+    }
+
+    // Convert results in cacheKeyMap to List
+    for (OzoneFileStatus fileStatus : fileStatusList) {
+      // refreshPipeline flag check has been removed as part of
+      // https://issues.apache.org/jira/browse/HDDS-3658.
+      // Please refer this jira for more details.
+      refreshPipeline(fileStatus.getKeyInfo());
+
+      // No need to check if a key is deleted or not here, this is handled
+      // when adding entries to cacheKeyMap from DB.
+      if (args.getSortDatanodes()) {
+        sortDatanodeInPipeline(fileStatus.getKeyInfo(), clientAddress);
+      }
+    }
+    fileStatusFinalList.addAll(fileStatusList);
+    return fileStatusFinalList;
+  }
+
+  @SuppressWarnings("parameternumber")
+  protected int getDirectories(boolean recursive, String startKey,
+      long numEntries, Set<OzoneFileStatus> fileStatusList,
+      String volumeName, String bucketName, String seekDirInDB,
+      long prefixKeyInDB, String prefixPath,
+      int countEntries) throws IOException {
+
+    Table dirTable = metadataManager.getDirectoryTable();
+    listStatusFindDirsInTableCache(fileStatusList, dirTable, prefixKeyInDB,
+            seekDirInDB, prefixPath, startKey, volumeName, bucketName,
+            countEntries, numEntries);
+    TableIterator<String, ? extends Table.KeyValue<String, OmDirectoryInfo>>
+            iterator = dirTable.iterator();
+
+    iterator.seek(seekDirInDB);
+
+    while (iterator.hasNext() && numEntries - countEntries > 0) {
+      OmDirectoryInfo dirInfo = iterator.value().getValue();
+      if (!isImmediateChild(dirInfo.getParentObjectID(), prefixKeyInDB)) {
+        break;
+      }
+
+      // TODO: recursive list will be handled in HDDS-4360 jira.
+      if (!recursive) {
+        String dirName = OMFileRequest.getAbsolutePath(prefixPath,
+                dirInfo.getName());
+        OmKeyInfo omKeyInfo = OMFileRequest.getOmKeyInfo(volumeName,
+                bucketName, dirInfo, dirName);
+        fileStatusList.add(new OzoneFileStatus(omKeyInfo, scmBlockSize,
+                true));
+        countEntries++;
+      }
+      // move to next entry in the DirTable
+      iterator.next();
+    }
+
+    return countEntries;
+  }
+
+  private int getFilesFromDirectory(Set<OzoneFileStatus> fileStatusList,
+      String seekKeyInDB, String prefixKeyPath, long prefixKeyInDB,
+      String startKey, int countEntries, long numEntries) throws IOException {
+
+    Table<String, OmKeyInfo> keyTable = metadataManager.getKeyTable();
+    listStatusFindFilesInTableCache(fileStatusList, keyTable, prefixKeyInDB,

Review comment:
       We need to consider count from this during total count when iterating from cache.
   
   We are using that countEntries but listStatusFindFilesInTableCache is not returning that.
   

##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -2205,6 +2276,318 @@ private void listStatusFindKeyInTableCache(
     return fileStatusList;
   }
 
+  public List<OzoneFileStatus> listStatusV1(OmKeyArgs args, boolean recursive,
+      String startKey, long numEntries, String clientAddress)
+          throws IOException {
+    Preconditions.checkNotNull(args, "Key args can not be null");
+
+    // unsorted OMKeyInfo list contains combine results from TableCache and DB.
+    List<OzoneFileStatus> fileStatusFinalList = new ArrayList<>();
+    LinkedHashSet<OzoneFileStatus> fileStatusList = new LinkedHashSet<>();
+    if (numEntries <= 0) {
+      return fileStatusFinalList;
+    }
+
+    String volumeName = args.getVolumeName();
+    String bucketName = args.getBucketName();
+    String keyName = args.getKeyName();
+    String seekFileInDB;
+    String seekDirInDB;
+    long prefixKeyInDB;
+    String prefixPath = keyName;
+
+    int countEntries = 0;
+
+    metadataManager.getLock().acquireReadLock(BUCKET_LOCK, volumeName,
+            bucketName);
+    try {
+      if (Strings.isNullOrEmpty(startKey)) {
+        OzoneFileStatus fileStatus = getFileStatus(args, clientAddress);
+        if (fileStatus.isFile()) {
+          return Collections.singletonList(fileStatus);
+        }
+
+        // Not required to search in DeletedTable because all the deleted
+        // keys will be marked directly in dirTable or in keyTable by
+        // breaking the pointer to its sub-dirs. So, there is no issue of
+        // inconsistency.
+
+        /*
+         * keyName is a directory.
+         * Say, "/a" is the dir name and its objectID is 1024, then seek
+         * will be doing with "1024/" to get all immediate descendants.
+         */
+        if (fileStatus.getKeyInfo() != null) {
+          prefixKeyInDB = fileStatus.getKeyInfo().getObjectID();
+        } else {
+          // list root directory.
+          String bucketKey = metadataManager.getBucketKey(volumeName,
+                  bucketName);
+          OmBucketInfo omBucketInfo =
+                  metadataManager.getBucketTable().get(bucketKey);
+          prefixKeyInDB = omBucketInfo.getObjectID();
+        }
+        seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+        seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        // Order of seek -> (1)Seek dirs in dirTable (2)Seek files in fileTable
+        // 1. Seek the given key in key table.
+        countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                prefixPath, prefixKeyInDB, startKey, countEntries, numEntries);
+        // 2. Seek the given key in dir table.
+        getDirectories(recursive, startKey, numEntries, fileStatusList,
+                volumeName, bucketName, seekDirInDB, prefixKeyInDB,
+                prefixPath, countEntries);
+      } else {
+        /*
+         * startKey will be used in iterator seek and sets the beginning point
+         * for key traversal.
+         *
+         * key name will be used as parentID where the user has requested to
+         * list the keys from.
+         *
+         * When recursive flag=false, parentID won't change between two pages.
+         * For example: OM has a namespace like,
+         *    /a/1...1M files and /a/b/1...1M files.
+         *    /a/1...1M directories and /a/b/1...1M directories.
+         * Listing "/a", will always have the parentID as "a" irrespective of
+         * the startKey value.
+         */
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        OzoneFileStatus fileStatusInfo = getOzoneFileStatusV1(volumeName,
+                bucketName, startKey, false, null, true);
+
+        if (fileStatusInfo != null) {
+          prefixKeyInDB = fileStatusInfo.getKeyInfo().getParentObjectID();
+          if(fileStatusInfo.isDirectory()){
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+
+            // Order of seek -> (1) Seek dirs in dirTable. In OM, always the
+            // order of search is, first seek into fileTable and then dirTable.
+            // So, its not required to search again into the fileTable.
+
+            // Seek the given key in dirTable.
+            getDirectories(recursive, startKey, numEntries,
+                    fileStatusList, volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+
+          } else {
+            seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+            // begins from the first sub-dir under the parent dir
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+            // Seek the given key in key table.
+            countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                    prefixPath, prefixKeyInDB, startKey, countEntries,
+                    numEntries);
+            // Seek the given key in dir table.
+            getDirectories(recursive, startKey, numEntries, fileStatusList,
+                    volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+          }
+        } else {
+          // No key exists for the given startKey.
+          // TODO: HDDS-4364: startKey can be a non-existed key
+          return Collections.emptyList();
+        }
+      }
+    } finally {
+      metadataManager.getLock().releaseReadLock(BUCKET_LOCK, volumeName,
+              bucketName);
+    }
+
+    // Convert results in cacheKeyMap to List
+    for (OzoneFileStatus fileStatus : fileStatusList) {
+      // refreshPipeline flag check has been removed as part of
+      // https://issues.apache.org/jira/browse/HDDS-3658.
+      // Please refer this jira for more details.
+      refreshPipeline(fileStatus.getKeyInfo());
+
+      // No need to check if a key is deleted or not here, this is handled
+      // when adding entries to cacheKeyMap from DB.
+      if (args.getSortDatanodes()) {
+        sortDatanodeInPipeline(fileStatus.getKeyInfo(), clientAddress);
+      }
+    }
+    fileStatusFinalList.addAll(fileStatusList);

Review comment:
       When startKey is passed, we seek to that and to construct keyName we used keyArgs keyName as keyName and construct the paths.
   
   But let's say
   
   KeyArgs keyName /a/b
   startkey is /f/d
   
   Then we have used parentObjectID and seek and then return all keys in directory /f, but we construct keyNames with prefix as "/a/b".
   
   Do you think we need to have a check when startKey is passed is the parentObjectID is same when recursive is false. To avoid this kind of corner case scenarios?
   

##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -2205,6 +2276,318 @@ private void listStatusFindKeyInTableCache(
     return fileStatusList;
   }
 
+  public List<OzoneFileStatus> listStatusV1(OmKeyArgs args, boolean recursive,
+      String startKey, long numEntries, String clientAddress)
+          throws IOException {
+    Preconditions.checkNotNull(args, "Key args can not be null");
+
+    // unsorted OMKeyInfo list contains combine results from TableCache and DB.
+    List<OzoneFileStatus> fileStatusFinalList = new ArrayList<>();
+    LinkedHashSet<OzoneFileStatus> fileStatusList = new LinkedHashSet<>();
+    if (numEntries <= 0) {
+      return fileStatusFinalList;
+    }
+
+    String volumeName = args.getVolumeName();
+    String bucketName = args.getBucketName();
+    String keyName = args.getKeyName();
+    String seekFileInDB;
+    String seekDirInDB;
+    long prefixKeyInDB;
+    String prefixPath = keyName;
+
+    int countEntries = 0;
+
+    metadataManager.getLock().acquireReadLock(BUCKET_LOCK, volumeName,
+            bucketName);
+    try {
+      if (Strings.isNullOrEmpty(startKey)) {
+        OzoneFileStatus fileStatus = getFileStatus(args, clientAddress);
+        if (fileStatus.isFile()) {
+          return Collections.singletonList(fileStatus);
+        }
+
+        // Not required to search in DeletedTable because all the deleted
+        // keys will be marked directly in dirTable or in keyTable by
+        // breaking the pointer to its sub-dirs. So, there is no issue of
+        // inconsistency.
+
+        /*
+         * keyName is a directory.
+         * Say, "/a" is the dir name and its objectID is 1024, then seek
+         * will be doing with "1024/" to get all immediate descendants.
+         */
+        if (fileStatus.getKeyInfo() != null) {
+          prefixKeyInDB = fileStatus.getKeyInfo().getObjectID();
+        } else {
+          // list root directory.
+          String bucketKey = metadataManager.getBucketKey(volumeName,
+                  bucketName);
+          OmBucketInfo omBucketInfo =
+                  metadataManager.getBucketTable().get(bucketKey);
+          prefixKeyInDB = omBucketInfo.getObjectID();
+        }
+        seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+        seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        // Order of seek -> (1)Seek dirs in dirTable (2)Seek files in fileTable
+        // 1. Seek the given key in key table.
+        countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                prefixPath, prefixKeyInDB, startKey, countEntries, numEntries);
+        // 2. Seek the given key in dir table.
+        getDirectories(recursive, startKey, numEntries, fileStatusList,
+                volumeName, bucketName, seekDirInDB, prefixKeyInDB,
+                prefixPath, countEntries);
+      } else {
+        /*
+         * startKey will be used in iterator seek and sets the beginning point
+         * for key traversal.
+         *
+         * key name will be used as parentID where the user has requested to
+         * list the keys from.
+         *
+         * When recursive flag=false, parentID won't change between two pages.
+         * For example: OM has a namespace like,
+         *    /a/1...1M files and /a/b/1...1M files.
+         *    /a/1...1M directories and /a/b/1...1M directories.
+         * Listing "/a", will always have the parentID as "a" irrespective of
+         * the startKey value.
+         */
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        OzoneFileStatus fileStatusInfo = getOzoneFileStatusV1(volumeName,
+                bucketName, startKey, false, null, true);
+
+        if (fileStatusInfo != null) {
+          prefixKeyInDB = fileStatusInfo.getKeyInfo().getParentObjectID();
+          if(fileStatusInfo.isDirectory()){
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+
+            // Order of seek -> (1) Seek dirs in dirTable. In OM, always the
+            // order of search is, first seek into fileTable and then dirTable.
+            // So, its not required to search again into the fileTable.
+
+            // Seek the given key in dirTable.
+            getDirectories(recursive, startKey, numEntries,
+                    fileStatusList, volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+
+          } else {
+            seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+            // begins from the first sub-dir under the parent dir
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+            // Seek the given key in key table.
+            countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                    prefixPath, prefixKeyInDB, startKey, countEntries,
+                    numEntries);
+            // Seek the given key in dir table.
+            getDirectories(recursive, startKey, numEntries, fileStatusList,
+                    volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+          }
+        } else {
+          // No key exists for the given startKey.
+          // TODO: HDDS-4364: startKey can be a non-existed key
+          return Collections.emptyList();
+        }
+      }
+    } finally {
+      metadataManager.getLock().releaseReadLock(BUCKET_LOCK, volumeName,
+              bucketName);
+    }
+
+    // Convert results in cacheKeyMap to List
+    for (OzoneFileStatus fileStatus : fileStatusList) {
+      // refreshPipeline flag check has been removed as part of
+      // https://issues.apache.org/jira/browse/HDDS-3658.
+      // Please refer this jira for more details.
+      refreshPipeline(fileStatus.getKeyInfo());
+
+      // No need to check if a key is deleted or not here, this is handled
+      // when adding entries to cacheKeyMap from DB.
+      if (args.getSortDatanodes()) {
+        sortDatanodeInPipeline(fileStatus.getKeyInfo(), clientAddress);
+      }
+    }
+    fileStatusFinalList.addAll(fileStatusList);
+    return fileStatusFinalList;
+  }
+
+  @SuppressWarnings("parameternumber")
+  protected int getDirectories(boolean recursive, String startKey,
+      long numEntries, Set<OzoneFileStatus> fileStatusList,
+      String volumeName, String bucketName, String seekDirInDB,
+      long prefixKeyInDB, String prefixPath,
+      int countEntries) throws IOException {
+
+    Table dirTable = metadataManager.getDirectoryTable();
+    listStatusFindDirsInTableCache(fileStatusList, dirTable, prefixKeyInDB,
+            seekDirInDB, prefixPath, startKey, volumeName, bucketName,
+            countEntries, numEntries);
+    TableIterator<String, ? extends Table.KeyValue<String, OmDirectoryInfo>>
+            iterator = dirTable.iterator();
+
+    iterator.seek(seekDirInDB);
+
+    while (iterator.hasNext() && numEntries - countEntries > 0) {
+      OmDirectoryInfo dirInfo = iterator.value().getValue();
+      if (!isImmediateChild(dirInfo.getParentObjectID(), prefixKeyInDB)) {
+        break;
+      }
+
+      // TODO: recursive list will be handled in HDDS-4360 jira.
+      if (!recursive) {
+        String dirName = OMFileRequest.getAbsolutePath(prefixPath,
+                dirInfo.getName());
+        OmKeyInfo omKeyInfo = OMFileRequest.getOmKeyInfo(volumeName,
+                bucketName, dirInfo, dirName);
+        fileStatusList.add(new OzoneFileStatus(omKeyInfo, scmBlockSize,
+                true));
+        countEntries++;
+      }
+      // move to next entry in the DirTable
+      iterator.next();
+    }
+
+    return countEntries;
+  }
+
+  private int getFilesFromDirectory(Set<OzoneFileStatus> fileStatusList,
+      String seekKeyInDB, String prefixKeyPath, long prefixKeyInDB,
+      String startKey, int countEntries, long numEntries) throws IOException {
+
+    Table<String, OmKeyInfo> keyTable = metadataManager.getKeyTable();
+    listStatusFindFilesInTableCache(fileStatusList, keyTable, prefixKeyInDB,
+            seekKeyInDB, prefixKeyPath, startKey, countEntries, numEntries);
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+    iterator.seek(seekKeyInDB);
+    while (iterator.hasNext() && numEntries - countEntries > 0) {
+      OmKeyInfo keyInfo = iterator.value().getValue();
+
+      if (!isImmediateChild(keyInfo.getParentObjectID(), prefixKeyInDB)) {
+        break;
+      }
+
+      keyInfo.setFileName(keyInfo.getKeyName());
+      String fullKeyPath = OMFileRequest.getAbsolutePath(prefixKeyPath,
+              keyInfo.getKeyName());
+      keyInfo.setKeyName(fullKeyPath);
+      fileStatusList.add(new OzoneFileStatus(keyInfo, scmBlockSize, false));
+      countEntries++;
+      iterator.next(); // move to next entry in the table
+    }
+    return countEntries;
+  }
+
+  private boolean isImmediateChild(long parentId, long ancestorId) {
+    return parentId == ancestorId;
+  }
+
+  /**
+   * Helper function for listStatus to find key in FileTableCache.
+   */
+  @SuppressWarnings("parameternumber")
+  private void listStatusFindFilesInTableCache(
+          Set<OzoneFileStatus> fileStatusList, Table<String,
+          OmKeyInfo> keyTable, long prefixKeyInDB, String seekKeyInDB,
+          String prefixKeyPath, String startKey, int countEntries,
+          long numEntries) {
+
+    Iterator<Map.Entry<CacheKey<String>, CacheValue<OmKeyInfo>>>
+            cacheIter = keyTable.cacheIterator();
+    // seekKeyInDB will have two type of values.
+    // 1. "1024/"   -> startKey is null or empty
+    // 2. "1024/b"  -> startKey exists
+    // TODO: recursive list will be handled in HDDS-4360 jira.
+    while (cacheIter.hasNext() && numEntries - countEntries > 0) {
+      Map.Entry<CacheKey<String>, CacheValue<OmKeyInfo>> entry =
+              cacheIter.next();
+      String cacheKey = entry.getKey().getCacheKey();
+      OmKeyInfo cacheOmKeyInfo = entry.getValue().getCacheValue();
+      if(cacheOmKeyInfo == null){
+        continue;
+      }
+
+      cacheOmKeyInfo.setFileName(cacheOmKeyInfo.getKeyName());
+      String fullKeyPath = OMFileRequest.getAbsolutePath(prefixKeyPath,
+              cacheOmKeyInfo.getKeyName());
+      cacheOmKeyInfo.setKeyName(fullKeyPath);
+
+      if (StringUtils.isBlank(startKey)) {
+        // cacheOmKeyInfo is null if an entry is deleted in cache
+        if (cacheKey.startsWith(seekKeyInDB) &&
+                cacheKey.compareTo(seekKeyInDB) >= 0) {

Review comment:
       Question: Do we need this second check when startKey is blank?

##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
##########
@@ -2205,6 +2276,318 @@ private void listStatusFindKeyInTableCache(
     return fileStatusList;
   }
 
+  public List<OzoneFileStatus> listStatusV1(OmKeyArgs args, boolean recursive,
+      String startKey, long numEntries, String clientAddress)
+          throws IOException {
+    Preconditions.checkNotNull(args, "Key args can not be null");
+
+    // unsorted OMKeyInfo list contains combine results from TableCache and DB.
+    List<OzoneFileStatus> fileStatusFinalList = new ArrayList<>();
+    LinkedHashSet<OzoneFileStatus> fileStatusList = new LinkedHashSet<>();
+    if (numEntries <= 0) {
+      return fileStatusFinalList;
+    }
+
+    String volumeName = args.getVolumeName();
+    String bucketName = args.getBucketName();
+    String keyName = args.getKeyName();
+    String seekFileInDB;
+    String seekDirInDB;
+    long prefixKeyInDB;
+    String prefixPath = keyName;
+
+    int countEntries = 0;
+
+    metadataManager.getLock().acquireReadLock(BUCKET_LOCK, volumeName,
+            bucketName);
+    try {
+      if (Strings.isNullOrEmpty(startKey)) {
+        OzoneFileStatus fileStatus = getFileStatus(args, clientAddress);
+        if (fileStatus.isFile()) {
+          return Collections.singletonList(fileStatus);
+        }
+
+        // Not required to search in DeletedTable because all the deleted
+        // keys will be marked directly in dirTable or in keyTable by
+        // breaking the pointer to its sub-dirs. So, there is no issue of
+        // inconsistency.
+
+        /*
+         * keyName is a directory.
+         * Say, "/a" is the dir name and its objectID is 1024, then seek
+         * will be doing with "1024/" to get all immediate descendants.
+         */
+        if (fileStatus.getKeyInfo() != null) {
+          prefixKeyInDB = fileStatus.getKeyInfo().getObjectID();
+        } else {
+          // list root directory.
+          String bucketKey = metadataManager.getBucketKey(volumeName,
+                  bucketName);
+          OmBucketInfo omBucketInfo =
+                  metadataManager.getBucketTable().get(bucketKey);
+          prefixKeyInDB = omBucketInfo.getObjectID();
+        }
+        seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+        seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        // Order of seek -> (1)Seek dirs in dirTable (2)Seek files in fileTable
+        // 1. Seek the given key in key table.
+        countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                prefixPath, prefixKeyInDB, startKey, countEntries, numEntries);
+        // 2. Seek the given key in dir table.
+        getDirectories(recursive, startKey, numEntries, fileStatusList,
+                volumeName, bucketName, seekDirInDB, prefixKeyInDB,
+                prefixPath, countEntries);
+      } else {
+        /*
+         * startKey will be used in iterator seek and sets the beginning point
+         * for key traversal.
+         *
+         * key name will be used as parentID where the user has requested to
+         * list the keys from.
+         *
+         * When recursive flag=false, parentID won't change between two pages.
+         * For example: OM has a namespace like,
+         *    /a/1...1M files and /a/b/1...1M files.
+         *    /a/1...1M directories and /a/b/1...1M directories.
+         * Listing "/a", will always have the parentID as "a" irrespective of
+         * the startKey value.
+         */
+        // TODO: recursive flag=true will be handled in HDDS-4360 jira.
+        OzoneFileStatus fileStatusInfo = getOzoneFileStatusV1(volumeName,
+                bucketName, startKey, false, null, true);
+
+        if (fileStatusInfo != null) {
+          prefixKeyInDB = fileStatusInfo.getKeyInfo().getParentObjectID();
+          if(fileStatusInfo.isDirectory()){
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+
+            // Order of seek -> (1) Seek dirs in dirTable. In OM, always the
+            // order of search is, first seek into fileTable and then dirTable.
+            // So, its not required to search again into the fileTable.
+
+            // Seek the given key in dirTable.
+            getDirectories(recursive, startKey, numEntries,
+                    fileStatusList, volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+
+          } else {
+            seekFileInDB = metadataManager.getOzonePathKey(prefixKeyInDB,
+                    fileStatusInfo.getKeyInfo().getFileName());
+            // begins from the first sub-dir under the parent dir
+            seekDirInDB = metadataManager.getOzonePathKey(prefixKeyInDB, "");
+
+            // Seek the given key in key table.
+            countEntries = getFilesFromDirectory(fileStatusList, seekFileInDB,
+                    prefixPath, prefixKeyInDB, startKey, countEntries,
+                    numEntries);
+            // Seek the given key in dir table.
+            getDirectories(recursive, startKey, numEntries, fileStatusList,
+                    volumeName, bucketName, seekDirInDB,
+                    prefixKeyInDB, prefixPath, countEntries);
+          }
+        } else {
+          // No key exists for the given startKey.
+          // TODO: HDDS-4364: startKey can be a non-existed key
+          return Collections.emptyList();
+        }
+      }
+    } finally {
+      metadataManager.getLock().releaseReadLock(BUCKET_LOCK, volumeName,
+              bucketName);
+    }
+
+    // Convert results in cacheKeyMap to List
+    for (OzoneFileStatus fileStatus : fileStatusList) {
+      // refreshPipeline flag check has been removed as part of
+      // https://issues.apache.org/jira/browse/HDDS-3658.
+      // Please refer this jira for more details.
+      refreshPipeline(fileStatus.getKeyInfo());

Review comment:
       Can we avoid the call to refreshPipeline for directories and also for sortDatanodeInPipeline.
   
   ```
   if (fileStatus.isFile()) {
   refreshPipeline(fileStatus.getKeyInfo());
   if (args.getSortDatanodes()) {
           sortDatanodeInPipeline(fileStatus.getKeyInfo(), clientAddress);
         }
       }
   }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org