You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/08/03 18:15:14 UTC

[GitHub] [ozone] yuangu002 opened a new pull request #2489: Recon namespace summary 'du' information should return replicated size of a key

yuangu002 opened a new pull request #2489:
URL: https://github.com/apache/ozone/pull/2489


   ## What changes were proposed in this pull request?
   
   In addition to actual key size, Recon 'du' should also return the replicated size of a key.
   For example, for a 3 way replicated key of size = 1MB, Recon 'du' should return '1MB' as the size of the key, and '3MB' as the replicated size if the key is fully replicated.
   
   [Refactoring]
   1. DU response now shows a "structure"
   2. DU key response shouldn't have subpaths
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-5449
   
   ## How was this patch tested?
   
   Since this is an improvement/refactoring feature on top of HDDS-5378, no additional unit test is implemented. I tested remotely on the cluster endpoint with the following configs:
   ![Screen Shot 2021-08-03 at 1 46 36 PM](https://user-images.githubusercontent.com/53324985/128062467-ff763bc6-07a0-4a5a-9eab-857b8643523a.png)
   All keys are three-way replicated except `key3`
   
   [Root DU]
   http://localhost:9888/api/v1/namespace/du?path=/&replica=true
   ![root du -rep](https://user-images.githubusercontent.com/53324985/128065104-1e4fdbdd-9b84-4c75-9b14-cf70fb7ea5c7.png)
   
   [Volume DU]
   http://localhost:9888/api/v1/namespace/du?path=/vol&replica=true
   ![vol du -rep](https://user-images.githubusercontent.com/53324985/128065167-758ffd63-e89f-4120-a03b-3f7acea11715.png)
   
   [Bucket DU] `&files=true` enabled
   http://localhost:9888/api/v1/namespace/du?path=/vol/bucket&files=true&replica=true
   ![bucket du -rep](https://user-images.githubusercontent.com/53324985/128065213-9049b519-27a6-443c-b387-450b9b9f2993.png)
   
   [Directory DU] `&files=true` enabled
   http://localhost:9888/api/v1/namespace/du?path=/vol/bucket/dir1&files=true&replica=true
   ![dir1 du -rep](https://user-images.githubusercontent.com/53324985/128065242-e6b9efbd-532d-42e7-b2ac-9d913b8e8cff.png)
   (Key3 is one-way replicated.)
   
   [Nested Directory DU]
   http://localhost:9888/api/v1/namespace/du?path=/vol/bucket/dir1/dir2&files=true&replica=true
   ![dir2 du -rep](https://user-images.githubusercontent.com/53324985/128065272-15b0af28-b79e-4f98-9666-f4c22ec9bf53.png)
   
   [Key DU]
   http://localhost:9888/api/v1/namespace/du?path=/vol/bucket/dir1/dir2/dir3/key6&files=true&replica=true
   ![key du -rep](https://user-images.githubusercontent.com/53324985/128065331-8f0f3023-f50a-47a7-84f1-65774524708d.png)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
smengcl commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r685515347



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -991,7 +947,7 @@ private long getKeySizeWithReplication(OmKeyInfo keyInfo) {
         long blockSize = location.getLength() * replicationFactor;
         du += blockSize;
       } catch (ContainerNotFoundException cnfe) {
-        LOG.warn("Cannot find container");
+        LOG.debug("Cannot find container {}", block.getContainerID(), cnfe);

Review comment:
       If this is a **rare** exception that should not have happened in most cases, we should leave this at `warn`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
smengcl commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r685515347



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -991,7 +947,7 @@ private long getKeySizeWithReplication(OmKeyInfo keyInfo) {
         long blockSize = location.getLength() * replicationFactor;
         du += blockSize;
       } catch (ContainerNotFoundException cnfe) {
-        LOG.warn("Cannot find container");
+        LOG.debug("Cannot find container {}", block.getContainerID(), cnfe);

Review comment:
       If this is a rare exception, we should leave this at `warn`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] yuangu002 commented on pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
yuangu002 commented on pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#issuecomment-894593288


   [Fix]
   1. Addressed comments.
   2. Optimized listKeys by seek() and DFS. No full iteration on FileTable except root level
   3. Added a unit test for the case that a key spanning across multiple blocks and some of containers are under/over-replicated
   
   [Future works]
   1. https://github.com/yuangu002/ozone/blob/HDDS-5449/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java#L228
   2. https://github.com/yuangu002/ozone/blob/HDDS-5449/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java#L907
   3. Change `numOfFiles` in `NSSummary` from `int` to `long`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
smengcl commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r683759950



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -267,26 +368,69 @@ public Response getDiskUsage(@QueryParam("path") String path)
         // reformat the response
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        dirDataSize += dataSize;
+
+        if (withReplica) {
+          long subdirDU = 0;
+          List<OmKeyInfo> keys = listKeysUnderDirectory(subdirObjectId);
+          for (OmKeyInfo keyInfo: keys) {
+            subdirDU += getKeySizeWithReplication(keyInfo);
+          }
+          diskUsage.setSizeWithReplica(subdirDU);
+          dirDataSizeWithReplica += subdirDU;
+        }
+
         diskUsage.setSize(dataSize);
         subdirDUData.add(diskUsage);
       }
+
+      // handle direct keys under directory
+      if (listFile || withReplica) {
+        List<OmKeyInfo> directKeys = listDirectKeys(dirObjectId);

Review comment:
       Same here. We can calculate the replica size directly without gathering all `OmKeyInfo`s




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] yuangu002 edited a comment on pull request #2489: Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
yuangu002 edited a comment on pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#issuecomment-892075318


   cc @smengcl @avijayanhwx. Please review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
smengcl commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r685517294



##########
File path: hadoop-ozone/recon/src/test/java/org/apache/hadoop/ozone/recon/api/TestNSSummaryEndpoint.java
##########
@@ -51,16 +57,13 @@
 
 import java.io.File;
 import java.io.IOException;
-import java.util.Collections;
-import java.util.Comparator;
-import java.util.List;
+import java.util.*;
 
+import static org.apache.hadoop.hdds.protocol.MockDatanodeDetails.randomDatanodeDetails;
 import static org.apache.hadoop.ozone.om.OMConfigKeys.OZONE_OM_DB_DIRS;
-import static org.apache.hadoop.ozone.recon.OMMetadataManagerTestUtils.getMockOzoneManagerServiceProviderWithFSO;
-import static org.apache.hadoop.ozone.recon.OMMetadataManagerTestUtils.getTestReconOmMetadataManager;
-import static org.apache.hadoop.ozone.recon.OMMetadataManagerTestUtils.writeDirToOm;
-import static org.apache.hadoop.ozone.recon.OMMetadataManagerTestUtils.writeKeyToOm;
+import static org.apache.hadoop.ozone.recon.OMMetadataManagerTestUtils.*;

Review comment:
       And this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r682832116



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -267,26 +368,69 @@ public Response getDiskUsage(@QueryParam("path") String path)
         // reformat the response
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        dirDataSize += dataSize;
+
+        if (withReplica) {
+          long subdirDU = 0;
+          List<OmKeyInfo> keys = listKeysUnderDirectory(subdirObjectId);
+          for (OmKeyInfo keyInfo: keys) {
+            subdirDU += getKeySizeWithReplication(keyInfo);
+          }
+          diskUsage.setSizeWithReplica(subdirDU);
+          dirDataSizeWithReplica += subdirDU;
+        }
+
         diskUsage.setSize(dataSize);
         subdirDUData.add(diskUsage);
       }
+
+      // handle direct keys under directory
+      if (listFile || withReplica) {
+        List<OmKeyInfo> directKeys = listDirectKeys(dirObjectId);
+
+        for (OmKeyInfo directKey: directKeys) {
+          DUResponse.DiskUsage diskUsage = new DUResponse.DiskUsage();
+          String subpath = buildSubpath(normalizedPath,
+                  directKey.getFileName());
+          diskUsage.setSubpath(subpath);
+          diskUsage.setSize(directKey.getDataSize());
+
+          if (withReplica) {
+            long keyDU = getKeySizeWithReplication(directKey);
+            dirDataSizeWithReplica += keyDU;
+            diskUsage.setSizeWithReplica(keyDU);
+          }
+          // list the key as a subpath
+          if (listFile) {
+            subdirDUData.add(diskUsage);
+          }
+        }
+      }
+
+      if (withReplica) {
+        duResponse.setSizeWithReplica(dirDataSizeWithReplica);
+      }
+      duResponse.setCount(subdirDUData.size());
+      duResponse.setSize(dirDataSize);
       duResponse.setDuData(subdirDUData);
       break;
     case KEY:
-      // DU for key is the data size
-      duResponse.setCount(1);
-      DUResponse.DiskUsage keyDU = new DUResponse.DiskUsage();
+      // DU for key doesn't have subpaths
+      duResponse.setCount(0);
       // The object ID for the directory that the key is directly in
       long parentObjectId = getDirObjectId(names, names.length - 1);
       String fileName = names[names.length - 1];
       String ozoneKey =
               omMetadataManager.getOzonePathKey(parentObjectId, fileName);
       OmKeyInfo keyInfo =
               omMetadataManager.getFileTable().getSkipCache(ozoneKey);
-      String subpath = buildSubpath(normalizedPath, null);
-      keyDU.setSubpath(subpath);
-      keyDU.setSize(keyInfo.getDataSize());
-      duResponse.setDuData(Collections.singletonList(keyDU));
+      duResponse.setSize(keyInfo.getDataSize());
+      if (withReplica) {
+        long keySizeWithReplica = getKeySizeWithReplication(keyInfo);
+        duResponse.setSizeWithReplica(keySizeWithReplica);
+      }
+      // give an empty list to avoid NPE
+      duResponse.setDuData(new ArrayList<>());

Review comment:
       We could init the duData list to empty list in the DuResponse constructor.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -724,6 +868,135 @@ private long getTotalSize(long objectId) throws IOException {
     return result;
   }
 
+  private List<OmKeyInfo> listKeysUnderVolume(String volumeName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(volumeName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (volumeName.equals(keyInfo.getVolumeName())) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private List<OmKeyInfo> listKeysUnderBucket(String bucketName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(bucketName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (bucketName.equals(keyInfo.getBucketName())) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private List<OmKeyInfo> listKeysUnderDirectory(long dirObjectId)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null
+              && isFileUnderDir(dirObjectId, keyInfo.getParentObjectID())) {
+        result.add(keyInfo);
+      }
+    }
+    return result;
+  }
+
+  private boolean isFileUnderDir(long dirObjectId,
+                                 long keyParentId) throws IOException {
+    if (dirObjectId == keyParentId) {
+      return true;
+    } else {
+      NSSummary nsSummary =
+              reconNamespaceSummaryManager.getNSSummary(dirObjectId);
+      for (long subdirId: nsSummary.getChildDir()) {
+        if (isFileUnderDir(subdirId, keyParentId)) {
+          return true;
+        }
+      }
+      return false;
+    }
+  }
+
+  private List<OmKeyInfo> listDirectKeys(long parentId) throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (parentId == keyInfo.getParentObjectID()) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private long getKeySizeWithReplication(OmKeyInfo keyInfo) {
+    OmKeyLocationInfoGroup locationGroup = keyInfo.getLatestVersionLocations();
+    List<OmKeyLocationInfo> keyLocations =
+            locationGroup.getBlocksLatestVersionOnly();
+    long du = keyInfo.getDataSize();
+    // a key could be too large to fit in one single container
+    for (OmKeyLocationInfo location: keyLocations) {
+      BlockID block = location.getBlockID();
+      ContainerID containerId = new ContainerID(block.getContainerID());
+      try {
+        int replicationFactor =
+                containerManager.getContainerReplicas(containerId).size();
+        du *= replicationFactor;
+        break;
+      } catch (ContainerNotFoundException cnfe) {
+        LOG.warn("Cannot find container");

Review comment:
       Can be debug, along with the ContainerID being printed out.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -724,6 +868,135 @@ private long getTotalSize(long objectId) throws IOException {
     return result;
   }
 
+  private List<OmKeyInfo> listKeysUnderVolume(String volumeName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(volumeName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();

Review comment:
       Can we use the iterator.seek() API that let's us jump to the first key with that the volume name in prefix here?

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -724,6 +868,135 @@ private long getTotalSize(long objectId) throws IOException {
     return result;
   }
 
+  private List<OmKeyInfo> listKeysUnderVolume(String volumeName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(volumeName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (volumeName.equals(keyInfo.getVolumeName())) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private List<OmKeyInfo> listKeysUnderBucket(String bucketName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(bucketName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();

Review comment:
       Same as above.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -724,6 +868,135 @@ private long getTotalSize(long objectId) throws IOException {
     return result;
   }
 
+  private List<OmKeyInfo> listKeysUnderVolume(String volumeName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(volumeName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (volumeName.equals(keyInfo.getVolumeName())) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private List<OmKeyInfo> listKeysUnderBucket(String bucketName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(bucketName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (bucketName.equals(keyInfo.getBucketName())) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private List<OmKeyInfo> listKeysUnderDirectory(long dirObjectId)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {

Review comment:
       Same as above.

##########
File path: hadoop-ozone/recon/src/test/java/org/apache/hadoop/ozone/recon/api/TestNSSummaryEndpoint.java
##########
@@ -299,13 +311,15 @@ public void testDiskUsage() throws Exception {
     Assert.assertEquals(KEY_SIX_SIZE, duDir4.getSize());
 
     // key level DU
-    Response keyResponse = nsSummaryEndpoint.getDiskUsage(KEY_PATH);
+    Response keyResponse = nsSummaryEndpoint.getDiskUsage(KEY_PATH,
+            false, false);

Review comment:
       Can we add tests for replica size? There are a few cases that need to be verified (key spanning into multiple blocks, container replica count < desired replica count etc) 

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -62,17 +72,25 @@
 @Path("/namespace")
 @Produces(MediaType.APPLICATION_JSON)
 public class NSSummaryEndpoint {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+          NSSummaryEndpoint.class);
   @Inject
   private ReconNamespaceSummaryManager reconNamespaceSummaryManager;
 
   @Inject
   private ReconOMMetadataManager omMetadataManager;
 
+  private ReconContainerManager containerManager;
+
   @Inject
   public NSSummaryEndpoint(ReconNamespaceSummaryManager namespaceSummaryManager,
-                           ReconOMMetadataManager omMetadataManager) {
+                           ReconOMMetadataManager omMetadataManager,
+                           OzoneStorageContainerManager reconSCM) {
     this.reconNamespaceSummaryManager = namespaceSummaryManager;
     this.omMetadataManager = omMetadataManager;
+    this.containerManager =
+            (ReconContainerManager) reconSCM.getContainerManager();

Review comment:
       We don't need the cast to RCM here. We should be OK with the return type being 'ContainerManagerV2' since we are using only interface method.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -163,12 +181,19 @@ public Response getBasicInfo(
   /**
    * DU endpoint to return datasize for subdirectory (bucket for volume).
    * @param path request path
+   * @param listFile show subpath/disk usage for each key
+   * @param withReplica count actual DU with replication
    * @return DU response
    * @throws IOException
    */
   @GET
   @Path("/du")
-  public Response getDiskUsage(@QueryParam("path") String path)
+  @SuppressWarnings("methodlength")
+  public Response getDiskUsage(@QueryParam("path") String path,
+                               @DefaultValue("false")
+                               @QueryParam("files") boolean listFile,
+                               @DefaultValue("false")
+                               @QueryParam("replica") boolean withReplica)

Review comment:
       Can we set Default value for replica?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
smengcl commented on pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#issuecomment-896570550


   Merged. Thanks @yuangu002 for the patch. Thanks @avijayanhwx for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
smengcl commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r683759950



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -267,26 +368,69 @@ public Response getDiskUsage(@QueryParam("path") String path)
         // reformat the response
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        dirDataSize += dataSize;
+
+        if (withReplica) {
+          long subdirDU = 0;
+          List<OmKeyInfo> keys = listKeysUnderDirectory(subdirObjectId);
+          for (OmKeyInfo keyInfo: keys) {
+            subdirDU += getKeySizeWithReplication(keyInfo);
+          }
+          diskUsage.setSizeWithReplica(subdirDU);
+          dirDataSizeWithReplica += subdirDU;
+        }
+
         diskUsage.setSize(dataSize);
         subdirDUData.add(diskUsage);
       }
+
+      // handle direct keys under directory
+      if (listFile || withReplica) {
+        List<OmKeyInfo> directKeys = listDirectKeys(dirObjectId);

Review comment:
       Same here. We can calculate the replica directly here without gathering all `OmKeyInfo`s

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -267,26 +368,69 @@ public Response getDiskUsage(@QueryParam("path") String path)
         // reformat the response
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        dirDataSize += dataSize;
+
+        if (withReplica) {
+          long subdirDU = 0;
+          List<OmKeyInfo> keys = listKeysUnderDirectory(subdirObjectId);
+          for (OmKeyInfo keyInfo: keys) {
+            subdirDU += getKeySizeWithReplication(keyInfo);
+          }
+          diskUsage.setSizeWithReplica(subdirDU);
+          dirDataSizeWithReplica += subdirDU;
+        }
+
         diskUsage.setSize(dataSize);
         subdirDUData.add(diskUsage);
       }
+
+      // handle direct keys under directory
+      if (listFile || withReplica) {
+        List<OmKeyInfo> directKeys = listDirectKeys(dirObjectId);
+
+        for (OmKeyInfo directKey: directKeys) {

Review comment:
       ```suggestion
           for (OmKeyInfo directKey : directKeys) {
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] yuangu002 commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
yuangu002 commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r682857427



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -163,12 +181,19 @@ public Response getBasicInfo(
   /**
    * DU endpoint to return datasize for subdirectory (bucket for volume).
    * @param path request path
+   * @param listFile show subpath/disk usage for each key
+   * @param withReplica count actual DU with replication
    * @return DU response
    * @throws IOException
    */
   @GET
   @Path("/du")
-  public Response getDiskUsage(@QueryParam("path") String path)
+  @SuppressWarnings("methodlength")
+  public Response getDiskUsage(@QueryParam("path") String path,
+                               @DefaultValue("false")
+                               @QueryParam("files") boolean listFile,
+                               @DefaultValue("false")
+                               @QueryParam("replica") boolean withReplica)

Review comment:
       shouldn't `@DefaultValue("false")` suffice?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] yuangu002 edited a comment on pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
yuangu002 edited a comment on pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#issuecomment-894593288


   [Fix]
   1. Addressed comments.
   2. Optimized listKeys by `seek()` and DFS. No full iteration on `FileTable` except root level
   3. Added a unit test for the case that a key spanning across multiple blocks and some of containers are under/over-replicated
   
   [Future works]
   1. https://github.com/yuangu002/ozone/blob/HDDS-5449/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java#L228
   2. https://github.com/yuangu002/ozone/blob/HDDS-5449/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java#L907
   3. Change `numOfFiles` in `NSSummary` from `int` to `long`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl merged pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
smengcl merged pull request #2489:
URL: https://github.com/apache/ozone/pull/2489


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
smengcl commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r683675489



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -179,12 +204,15 @@ public Response getDiskUsage(@QueryParam("path") String path)
     EntityType type = getEntityType(normalizedPath, names);
 
     DUResponse duResponse = new DUResponse();
+    duResponse.setPath(normalizedPath);
     switch (type) {
     case ROOT:
       List<OmVolumeArgs> volumes = listVolumes();
       duResponse.setCount(volumes.size());
 
       List<DUResponse.DiskUsage> volumeDuData = new ArrayList<>();
+      long totalDataSize = 0;
+      long totalDataSizeWithReplica = 0;

Review comment:
       nit
   ```suggestion
         long totalDataSize = 0L;
         long totalDataSizeWithReplica = 0L;
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -208,16 +252,32 @@ public Response getDiskUsage(@QueryParam("path") String path)
 
       // List of DiskUsage data for all buckets
       List<DUResponse.DiskUsage> bucketDuData = new ArrayList<>();
-      for (OmBucketInfo bucket : buckets) {
+      long volDataSize = 0;
+      long volDataSizeWithReplica = 0;
+      for (OmBucketInfo bucket: buckets) {
         String bucketName = bucket.getBucketName();
         long bucketObjectID = bucket.getObjectID();
         String subpath = omMetadataManager.getBucketKey(volName, bucketName);
         DUResponse.DiskUsage diskUsage = new DUResponse.DiskUsage();
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(bucketObjectID);
+        volDataSize += dataSize;
+        if (withReplica) {
+          long bucketDU = 0;

Review comment:
       nit
   ```suggestion
             long bucketDU = 0L;
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -242,9 +303,48 @@ public Response getDiskUsage(@QueryParam("path") String path)
         DUResponse.DiskUsage diskUsage = new DUResponse.DiskUsage();
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        bucketDataSize += dataSize;
+
+        if (withReplica) {
+          long dirDU = 0;

Review comment:
       nit
   ```suggestion
             long dirDU = 0L;
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -253,8 +353,9 @@ public Response getDiskUsage(@QueryParam("path") String path)
               reconNamespaceSummaryManager.getNSSummary(dirObjectId);
       Set<Long> subdirs = dirNSSummary.getChildDir();
 
-      duResponse.setCount(subdirs.size());
       duResponse.setKeySize(dirNSSummary.getSizeOfFiles());
+      long dirDataSize = duResponse.getKeySize();
+      long dirDataSizeWithReplica = 0;

Review comment:
       nit
   ```suggestion
         long dirDataSizeWithReplica = 0L;
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -267,26 +368,69 @@ public Response getDiskUsage(@QueryParam("path") String path)
         // reformat the response
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        dirDataSize += dataSize;
+
+        if (withReplica) {
+          long subdirDU = 0;

Review comment:
       nit
   ```suggestion
             long subdirDU = 0L;
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -196,9 +224,25 @@ public Response getDiskUsage(@QueryParam("path") String path)
           long bucketObjectID = bucket.getObjectID();
           dataSize += getTotalSize(bucketObjectID);
         }
+        totalDataSize += dataSize;
+
+        // count replicas
+        if (withReplica) {
+          long volumeDU = 0;
+          List<OmKeyInfo> keys = listKeysUnderVolume(volumeName);

Review comment:
       Caveat: if the volume has billions of keys, the jvm heap can be exhausted and Recon can crash due to OOM.
   
   Is it possible to count replica size without getting all keys under one batch? this might require pagination.
   
   Or even better: can we count replica size without even retaining the batches of OmKeyInfo in a variable at all? This can avoid pagination, simplifying the logic.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -242,9 +303,48 @@ public Response getDiskUsage(@QueryParam("path") String path)
         DUResponse.DiskUsage diskUsage = new DUResponse.DiskUsage();
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        bucketDataSize += dataSize;
+
+        if (withReplica) {
+          long dirDU = 0;
+
+          List<OmKeyInfo> keys = listKeysUnderDirectory(subdirObjectId);
+          for (OmKeyInfo keyInfo: keys) {
+            dirDU += getKeySizeWithReplication(keyInfo);
+          }
+          diskUsage.setSizeWithReplica(dirDU);
+          bucketDataSizeWithReplica += dirDU;
+        }
         diskUsage.setSize(dataSize);
         dirDUData.add(diskUsage);
       }
+      // Either listFile or withReplica is enabled, we need the directKeys info
+      if (listFile || withReplica) {
+        List<OmKeyInfo> directKeys = listDirectKeys(bucketObjectId);
+
+        for (OmKeyInfo directKey: directKeys) {

Review comment:
       nit
   ```suggestion
           for (OmKeyInfo directKey : directKeys) {
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -208,16 +252,32 @@ public Response getDiskUsage(@QueryParam("path") String path)
 
       // List of DiskUsage data for all buckets
       List<DUResponse.DiskUsage> bucketDuData = new ArrayList<>();
-      for (OmBucketInfo bucket : buckets) {
+      long volDataSize = 0;
+      long volDataSizeWithReplica = 0;

Review comment:
       nits
   ```suggestion
         long volDataSize = 0L;
         long volDataSizeWithReplica = 0L;
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -227,9 +287,10 @@ public Response getDiskUsage(@QueryParam("path") String path)
 
       // get object IDs for all its subdirectories
       Set<Long> bucketSubdirs = bucketNSSummary.getChildDir();
-      duResponse.setCount(bucketSubdirs.size());
       duResponse.setKeySize(bucketNSSummary.getSizeOfFiles());
       List<DUResponse.DiskUsage> dirDUData = new ArrayList<>();
+      long bucketDataSize = duResponse.getKeySize();
+      long bucketDataSizeWithReplica = 0;

Review comment:
       nit
   ```suggestion
         long bucketDataSizeWithReplica = 0L;
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -196,9 +224,25 @@ public Response getDiskUsage(@QueryParam("path") String path)
           long bucketObjectID = bucket.getObjectID();
           dataSize += getTotalSize(bucketObjectID);
         }
+        totalDataSize += dataSize;
+
+        // count replicas
+        if (withReplica) {
+          long volumeDU = 0;

Review comment:
       nit
   ```suggestion
             long volumeDU = 0L;
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -242,9 +303,48 @@ public Response getDiskUsage(@QueryParam("path") String path)
         DUResponse.DiskUsage diskUsage = new DUResponse.DiskUsage();
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        bucketDataSize += dataSize;
+
+        if (withReplica) {
+          long dirDU = 0;
+
+          List<OmKeyInfo> keys = listKeysUnderDirectory(subdirObjectId);
+          for (OmKeyInfo keyInfo: keys) {

Review comment:
       nit
   ```suggestion
             for (OmKeyInfo keyInfo : keys) {
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -267,26 +368,69 @@ public Response getDiskUsage(@QueryParam("path") String path)
         // reformat the response
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        dirDataSize += dataSize;
+
+        if (withReplica) {
+          long subdirDU = 0;
+          List<OmKeyInfo> keys = listKeysUnderDirectory(subdirObjectId);
+          for (OmKeyInfo keyInfo: keys) {

Review comment:
       ```suggestion
             for (OmKeyInfo keyInfo : keys) {
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -242,9 +303,48 @@ public Response getDiskUsage(@QueryParam("path") String path)
         DUResponse.DiskUsage diskUsage = new DUResponse.DiskUsage();
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        bucketDataSize += dataSize;
+
+        if (withReplica) {
+          long dirDU = 0;
+
+          List<OmKeyInfo> keys = listKeysUnderDirectory(subdirObjectId);

Review comment:
       Same question here.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -208,16 +252,32 @@ public Response getDiskUsage(@QueryParam("path") String path)
 
       // List of DiskUsage data for all buckets
       List<DUResponse.DiskUsage> bucketDuData = new ArrayList<>();
-      for (OmBucketInfo bucket : buckets) {
+      long volDataSize = 0;
+      long volDataSizeWithReplica = 0;
+      for (OmBucketInfo bucket: buckets) {
         String bucketName = bucket.getBucketName();
         long bucketObjectID = bucket.getObjectID();
         String subpath = omMetadataManager.getBucketKey(volName, bucketName);
         DUResponse.DiskUsage diskUsage = new DUResponse.DiskUsage();
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(bucketObjectID);
+        volDataSize += dataSize;
+        if (withReplica) {
+          long bucketDU = 0;
+          List<OmKeyInfo> keys = listKeysUnderBucket(bucketName);
+          for (OmKeyInfo keyInfo: keys) {

Review comment:
       nit
   ```suggestion
             for (OmKeyInfo keyInfo : keys) {
   ```

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -242,9 +303,48 @@ public Response getDiskUsage(@QueryParam("path") String path)
         DUResponse.DiskUsage diskUsage = new DUResponse.DiskUsage();
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        bucketDataSize += dataSize;
+
+        if (withReplica) {
+          long dirDU = 0;
+
+          List<OmKeyInfo> keys = listKeysUnderDirectory(subdirObjectId);
+          for (OmKeyInfo keyInfo: keys) {
+            dirDU += getKeySizeWithReplication(keyInfo);
+          }
+          diskUsage.setSizeWithReplica(dirDU);
+          bucketDataSizeWithReplica += dirDU;
+        }
         diskUsage.setSize(dataSize);
         dirDUData.add(diskUsage);
       }
+      // Either listFile or withReplica is enabled, we need the directKeys info
+      if (listFile || withReplica) {
+        List<OmKeyInfo> directKeys = listDirectKeys(bucketObjectId);

Review comment:
       And here.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -62,17 +72,25 @@
 @Path("/namespace")
 @Produces(MediaType.APPLICATION_JSON)
 public class NSSummaryEndpoint {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+          NSSummaryEndpoint.class);
   @Inject
   private ReconNamespaceSummaryManager reconNamespaceSummaryManager;
 
   @Inject
   private ReconOMMetadataManager omMetadataManager;
 
+  private ReconContainerManager containerManager;
+
   @Inject
   public NSSummaryEndpoint(ReconNamespaceSummaryManager namespaceSummaryManager,
-                           ReconOMMetadataManager omMetadataManager) {
+                           ReconOMMetadataManager omMetadataManager,
+                           OzoneStorageContainerManager reconSCM) {
     this.reconNamespaceSummaryManager = namespaceSummaryManager;
     this.omMetadataManager = omMetadataManager;
+    this.containerManager =
+            (ReconContainerManager) reconSCM.getContainerManager();

Review comment:
       ```suggestion
               reconSCM.getContainerManager();
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] yuangu002 commented on pull request #2489: Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
yuangu002 commented on pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#issuecomment-892075318


   cc @smengcl @avijayanhwx. Please review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] yuangu002 commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
yuangu002 commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r682857427



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -163,12 +181,19 @@ public Response getBasicInfo(
   /**
    * DU endpoint to return datasize for subdirectory (bucket for volume).
    * @param path request path
+   * @param listFile show subpath/disk usage for each key
+   * @param withReplica count actual DU with replication
    * @return DU response
    * @throws IOException
    */
   @GET
   @Path("/du")
-  public Response getDiskUsage(@QueryParam("path") String path)
+  @SuppressWarnings("methodlength")
+  public Response getDiskUsage(@QueryParam("path") String path,
+                               @DefaultValue("false")
+                               @QueryParam("files") boolean listFile,
+                               @DefaultValue("false")
+                               @QueryParam("replica") boolean withReplica)

Review comment:
       shouldn't `@DefaultValue("false")` suffice?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r682832116



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -267,26 +368,69 @@ public Response getDiskUsage(@QueryParam("path") String path)
         // reformat the response
         diskUsage.setSubpath(subpath);
         long dataSize = getTotalSize(subdirObjectId);
+        dirDataSize += dataSize;
+
+        if (withReplica) {
+          long subdirDU = 0;
+          List<OmKeyInfo> keys = listKeysUnderDirectory(subdirObjectId);
+          for (OmKeyInfo keyInfo: keys) {
+            subdirDU += getKeySizeWithReplication(keyInfo);
+          }
+          diskUsage.setSizeWithReplica(subdirDU);
+          dirDataSizeWithReplica += subdirDU;
+        }
+
         diskUsage.setSize(dataSize);
         subdirDUData.add(diskUsage);
       }
+
+      // handle direct keys under directory
+      if (listFile || withReplica) {
+        List<OmKeyInfo> directKeys = listDirectKeys(dirObjectId);
+
+        for (OmKeyInfo directKey: directKeys) {
+          DUResponse.DiskUsage diskUsage = new DUResponse.DiskUsage();
+          String subpath = buildSubpath(normalizedPath,
+                  directKey.getFileName());
+          diskUsage.setSubpath(subpath);
+          diskUsage.setSize(directKey.getDataSize());
+
+          if (withReplica) {
+            long keyDU = getKeySizeWithReplication(directKey);
+            dirDataSizeWithReplica += keyDU;
+            diskUsage.setSizeWithReplica(keyDU);
+          }
+          // list the key as a subpath
+          if (listFile) {
+            subdirDUData.add(diskUsage);
+          }
+        }
+      }
+
+      if (withReplica) {
+        duResponse.setSizeWithReplica(dirDataSizeWithReplica);
+      }
+      duResponse.setCount(subdirDUData.size());
+      duResponse.setSize(dirDataSize);
       duResponse.setDuData(subdirDUData);
       break;
     case KEY:
-      // DU for key is the data size
-      duResponse.setCount(1);
-      DUResponse.DiskUsage keyDU = new DUResponse.DiskUsage();
+      // DU for key doesn't have subpaths
+      duResponse.setCount(0);
       // The object ID for the directory that the key is directly in
       long parentObjectId = getDirObjectId(names, names.length - 1);
       String fileName = names[names.length - 1];
       String ozoneKey =
               omMetadataManager.getOzonePathKey(parentObjectId, fileName);
       OmKeyInfo keyInfo =
               omMetadataManager.getFileTable().getSkipCache(ozoneKey);
-      String subpath = buildSubpath(normalizedPath, null);
-      keyDU.setSubpath(subpath);
-      keyDU.setSize(keyInfo.getDataSize());
-      duResponse.setDuData(Collections.singletonList(keyDU));
+      duResponse.setSize(keyInfo.getDataSize());
+      if (withReplica) {
+        long keySizeWithReplica = getKeySizeWithReplication(keyInfo);
+        duResponse.setSizeWithReplica(keySizeWithReplica);
+      }
+      // give an empty list to avoid NPE
+      duResponse.setDuData(new ArrayList<>());

Review comment:
       We could init the duData list to empty list in the DuResponse constructor.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -724,6 +868,135 @@ private long getTotalSize(long objectId) throws IOException {
     return result;
   }
 
+  private List<OmKeyInfo> listKeysUnderVolume(String volumeName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(volumeName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (volumeName.equals(keyInfo.getVolumeName())) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private List<OmKeyInfo> listKeysUnderBucket(String bucketName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(bucketName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (bucketName.equals(keyInfo.getBucketName())) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private List<OmKeyInfo> listKeysUnderDirectory(long dirObjectId)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null
+              && isFileUnderDir(dirObjectId, keyInfo.getParentObjectID())) {
+        result.add(keyInfo);
+      }
+    }
+    return result;
+  }
+
+  private boolean isFileUnderDir(long dirObjectId,
+                                 long keyParentId) throws IOException {
+    if (dirObjectId == keyParentId) {
+      return true;
+    } else {
+      NSSummary nsSummary =
+              reconNamespaceSummaryManager.getNSSummary(dirObjectId);
+      for (long subdirId: nsSummary.getChildDir()) {
+        if (isFileUnderDir(subdirId, keyParentId)) {
+          return true;
+        }
+      }
+      return false;
+    }
+  }
+
+  private List<OmKeyInfo> listDirectKeys(long parentId) throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (parentId == keyInfo.getParentObjectID()) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private long getKeySizeWithReplication(OmKeyInfo keyInfo) {
+    OmKeyLocationInfoGroup locationGroup = keyInfo.getLatestVersionLocations();
+    List<OmKeyLocationInfo> keyLocations =
+            locationGroup.getBlocksLatestVersionOnly();
+    long du = keyInfo.getDataSize();
+    // a key could be too large to fit in one single container
+    for (OmKeyLocationInfo location: keyLocations) {
+      BlockID block = location.getBlockID();
+      ContainerID containerId = new ContainerID(block.getContainerID());
+      try {
+        int replicationFactor =
+                containerManager.getContainerReplicas(containerId).size();
+        du *= replicationFactor;
+        break;
+      } catch (ContainerNotFoundException cnfe) {
+        LOG.warn("Cannot find container");

Review comment:
       Can be debug, along with the ContainerID being printed out.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -724,6 +868,135 @@ private long getTotalSize(long objectId) throws IOException {
     return result;
   }
 
+  private List<OmKeyInfo> listKeysUnderVolume(String volumeName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(volumeName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();

Review comment:
       Can we use the iterator.seek() API that let's us jump to the first key with that the volume name in prefix here?

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -724,6 +868,135 @@ private long getTotalSize(long objectId) throws IOException {
     return result;
   }
 
+  private List<OmKeyInfo> listKeysUnderVolume(String volumeName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(volumeName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (volumeName.equals(keyInfo.getVolumeName())) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private List<OmKeyInfo> listKeysUnderBucket(String bucketName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(bucketName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();

Review comment:
       Same as above.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -724,6 +868,135 @@ private long getTotalSize(long objectId) throws IOException {
     return result;
   }
 
+  private List<OmKeyInfo> listKeysUnderVolume(String volumeName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(volumeName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (volumeName.equals(keyInfo.getVolumeName())) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private List<OmKeyInfo> listKeysUnderBucket(String bucketName)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+    if (Strings.isBlank(bucketName)) {
+      return result;
+    }
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {
+      Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
+      OmKeyInfo keyInfo = kv.getValue();
+
+      if (keyInfo != null) {
+        if (bucketName.equals(keyInfo.getBucketName())) {
+          result.add(keyInfo);
+        }
+      }
+    }
+    return result;
+  }
+
+  private List<OmKeyInfo> listKeysUnderDirectory(long dirObjectId)
+          throws IOException {
+    List<OmKeyInfo> result = new ArrayList<>();
+
+    Table keyTable = omMetadataManager.getFileTable();
+
+    TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
+            iterator = keyTable.iterator();
+
+    while (iterator.hasNext()) {

Review comment:
       Same as above.

##########
File path: hadoop-ozone/recon/src/test/java/org/apache/hadoop/ozone/recon/api/TestNSSummaryEndpoint.java
##########
@@ -299,13 +311,15 @@ public void testDiskUsage() throws Exception {
     Assert.assertEquals(KEY_SIX_SIZE, duDir4.getSize());
 
     // key level DU
-    Response keyResponse = nsSummaryEndpoint.getDiskUsage(KEY_PATH);
+    Response keyResponse = nsSummaryEndpoint.getDiskUsage(KEY_PATH,
+            false, false);

Review comment:
       Can we add tests for replica size? There are a few cases that need to be verified (key spanning into multiple blocks, container replica count < desired replica count etc) 

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -62,17 +72,25 @@
 @Path("/namespace")
 @Produces(MediaType.APPLICATION_JSON)
 public class NSSummaryEndpoint {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+          NSSummaryEndpoint.class);
   @Inject
   private ReconNamespaceSummaryManager reconNamespaceSummaryManager;
 
   @Inject
   private ReconOMMetadataManager omMetadataManager;
 
+  private ReconContainerManager containerManager;
+
   @Inject
   public NSSummaryEndpoint(ReconNamespaceSummaryManager namespaceSummaryManager,
-                           ReconOMMetadataManager omMetadataManager) {
+                           ReconOMMetadataManager omMetadataManager,
+                           OzoneStorageContainerManager reconSCM) {
     this.reconNamespaceSummaryManager = namespaceSummaryManager;
     this.omMetadataManager = omMetadataManager;
+    this.containerManager =
+            (ReconContainerManager) reconSCM.getContainerManager();

Review comment:
       We don't need the cast to RCM here. We should be OK with the return type being 'ContainerManagerV2' since we are using only interface method.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
##########
@@ -163,12 +181,19 @@ public Response getBasicInfo(
   /**
    * DU endpoint to return datasize for subdirectory (bucket for volume).
    * @param path request path
+   * @param listFile show subpath/disk usage for each key
+   * @param withReplica count actual DU with replication
    * @return DU response
    * @throws IOException
    */
   @GET
   @Path("/du")
-  public Response getDiskUsage(@QueryParam("path") String path)
+  @SuppressWarnings("methodlength")
+  public Response getDiskUsage(@QueryParam("path") String path,
+                               @DefaultValue("false")
+                               @QueryParam("files") boolean listFile,
+                               @DefaultValue("false")
+                               @QueryParam("replica") boolean withReplica)

Review comment:
       Can we set Default value for replica?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a change in pull request #2489: HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key

Posted by GitBox <gi...@apache.org>.
smengcl commented on a change in pull request #2489:
URL: https://github.com/apache/ozone/pull/2489#discussion_r685516940



##########
File path: hadoop-ozone/recon/src/test/java/org/apache/hadoop/ozone/recon/api/TestNSSummaryEndpoint.java
##########
@@ -51,16 +57,13 @@
 
 import java.io.File;
 import java.io.IOException;
-import java.util.Collections;
-import java.util.Comparator;
-import java.util.List;
+import java.util.*;

Review comment:
       Let's change this back.
   
   You can disable intellij's wildcard imports feature: https://www.jetbrains.com/help/idea/creating-and-optimizing-imports.html#disable-wildcards-for-class

##########
File path: hadoop-ozone/recon/src/test/java/org/apache/hadoop/ozone/recon/api/TestNSSummaryEndpoint.java
##########
@@ -18,13 +18,19 @@
 
 package org.apache.hadoop.ozone.recon.api;
 
+import org.apache.hadoop.hdds.client.BlockID;
 import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReplicaProto.State;
+import org.apache.hadoop.hdds.scm.container.ContainerID;
+import org.apache.hadoop.hdds.scm.container.ContainerManagerV2;
+import org.apache.hadoop.hdds.scm.container.ContainerNotFoundException;
+import org.apache.hadoop.hdds.scm.container.ContainerReplica;
 import org.apache.hadoop.hdds.scm.server.OzoneStorageContainerManager;
 import org.apache.hadoop.ozone.OzoneConsts;
 import org.apache.hadoop.ozone.om.OMMetadataManager;
 import org.apache.hadoop.ozone.om.OmMetadataManagerImpl;
-import org.apache.hadoop.ozone.om.helpers.OmBucketInfo;
-import org.apache.hadoop.ozone.om.helpers.OmVolumeArgs;
+import org.apache.hadoop.ozone.om.helpers.*;

Review comment:
       This as well.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org