You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/06/27 07:56:58 UTC

[GitHub] [ozone] tanvipenumudy opened a new pull request, #3554: HDDS-6951. Replace bucket.listKeys() with bucket.listStatus() in OmBucketReadWriteKeyOps

tanvipenumudy opened a new pull request, #3554:
URL: https://github.com/apache/ozone/pull/3554

   ## What changes were proposed in this pull request?
   
   Replace `bucket.listKeys()` with the `bucket.listStatus()` method for simulating synthetic read operation workload in `OmBucketReadWriteKeyOps` Freon class since `bucket.listStatus()` uses the `OzoneManagerLock.acquireReadLock()` method which would hence serve more useful for the sake of benchmarking the lock performance.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6951
   
   ## How was this patch tested?
   
   Existing integration tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ayushtkn commented on a diff in pull request #3554: HDDS-6951. Replace bucket.listKeys() with bucket.listStatus() in OmBucketReadWriteKeyOps

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on code in PR #3554:
URL: https://github.com/apache/ozone/pull/3554#discussion_r935027796


##########
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/OmBucketReadWriteKeyOps.java:
##########
@@ -128,12 +128,10 @@ protected String createPath(String path) {
   @Override
   protected int getReadCount(int readCount, String readPath)
       throws IOException {
-    Iterator<? extends OzoneKey> ozoneKeyIterator = bucket.listKeys(
-        OzoneConsts.OM_KEY_PREFIX + readPath + OzoneConsts.OM_KEY_PREFIX);
-    while (ozoneKeyIterator.hasNext()) {
-      ozoneKeyIterator.next();
-      ++readCount;
-    }
+    List<OzoneFileStatus> ozoneFileStatusList = bucket.listStatus(
+        OzoneConsts.OM_KEY_PREFIX + readPath + OzoneConsts.OM_KEY_PREFIX, true,
+        "/", keyCountForRead);

Review Comment:
   On a quick look I think here client is controlling the number of entries via ``keyCountForRead``.
   
   In the FileSystem code as well, it seems to be controlled at the client side:
   https://github.com/apache/ozone/blob/a32549cbc1227291d6ebd972ce32db043a5c51d0/hadoop-ozone/ozonefs-common/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneFileSystem.java#L608
   
   In HDFS, AFAIK the Namenode does impose such limits and the client can't explicitly demand the number of entries:
   https://github.com/apache/hadoop/blob/123d1aa8846a2099b04fd8e8ed7f2bd6db4f36b5/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java#L364-L366
   
   And the use case here seems to be just getting the ``readCount``, which actually doesn't require ``OzoneFileStatus``, it can easily be looped using the ``LISTING_PAGE_SIZE`` or similar numbers, If you go at a scale of around a ~1 Million entries or beyond you would be utilising a lot of memory storing the ``FileStatus`` which you actually don't need, If I remember correct it started taking around ~1.5 Gigs/million, when I tried similar stuff with HDFSFileStatus in DistCp which sadly builds up the FileStatus tree in Breadth First Traversal approach. It would be less for Ozone for sure as compared to HDFS....
   
   I couldn't spare time to dig in much, Please feel free to proceed if you are convinced.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ayushtkn commented on a diff in pull request #3554: HDDS-6951. Replace bucket.listKeys() with bucket.listStatus() in OmBucketReadWriteKeyOps

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on code in PR #3554:
URL: https://github.com/apache/ozone/pull/3554#discussion_r934002979


##########
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/OmBucketReadWriteKeyOps.java:
##########
@@ -128,12 +128,10 @@ protected String createPath(String path) {
   @Override
   protected int getReadCount(int readCount, String readPath)
       throws IOException {
-    Iterator<? extends OzoneKey> ozoneKeyIterator = bucket.listKeys(
-        OzoneConsts.OM_KEY_PREFIX + readPath + OzoneConsts.OM_KEY_PREFIX);
-    while (ozoneKeyIterator.hasNext()) {
-      ozoneKeyIterator.next();
-      ++readCount;
-    }
+    List<OzoneFileStatus> ozoneFileStatusList = bucket.listStatus(
+        OzoneConsts.OM_KEY_PREFIX + readPath + OzoneConsts.OM_KEY_PREFIX, true,
+        "/", keyCountForRead);

Review Comment:
   If ``keyCountForRead`` is huge, in that case does it go and try to fetch all entries in one go and store it in ``List<OzoneFileStatus> ozoneFileStatusList``, in that case for larger values of ``keyCountForRead``, this might start giving OOM.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on a diff in pull request #3554: HDDS-6951. Replace bucket.listKeys() with bucket.listStatus() in OmBucketReadWriteKeyOps

Posted by GitBox <gi...@apache.org>.
kerneltime commented on code in PR #3554:
URL: https://github.com/apache/ozone/pull/3554#discussion_r935102116


##########
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/OmBucketReadWriteKeyOps.java:
##########
@@ -128,12 +128,10 @@ protected String createPath(String path) {
   @Override
   protected int getReadCount(int readCount, String readPath)
       throws IOException {
-    Iterator<? extends OzoneKey> ozoneKeyIterator = bucket.listKeys(
-        OzoneConsts.OM_KEY_PREFIX + readPath + OzoneConsts.OM_KEY_PREFIX);
-    while (ozoneKeyIterator.hasNext()) {
-      ozoneKeyIterator.next();
-      ++readCount;
-    }
+    List<OzoneFileStatus> ozoneFileStatusList = bucket.listStatus(
+        OzoneConsts.OM_KEY_PREFIX + readPath + OzoneConsts.OM_KEY_PREFIX, true,
+        "/", keyCountForRead);

Review Comment:
   @ayushtkn what you point out makes sense. The reason it is not relevant here is that freon is a CLI extension used to generate load for testing. The user might not want to have a limit as they might be profiling memory or CPU. The CLI invocation has a [default value that can be overridden ](https://github.com/apache/ozone/pull/3554/files#diff-1fa33049586e9221ab06c5c11fba5033a8ca68a28b8fe5ccefe4b676ef4f9c0fR62-R65)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on a diff in pull request #3554: HDDS-6951. Replace bucket.listKeys() with bucket.listStatus() in OmBucketReadWriteKeyOps

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on code in PR #3554:
URL: https://github.com/apache/ozone/pull/3554#discussion_r936385625


##########
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/OmBucketReadWriteKeyOps.java:
##########
@@ -128,12 +128,10 @@ protected String createPath(String path) {
   @Override
   protected int getReadCount(int readCount, String readPath)
       throws IOException {
-    Iterator<? extends OzoneKey> ozoneKeyIterator = bucket.listKeys(
-        OzoneConsts.OM_KEY_PREFIX + readPath + OzoneConsts.OM_KEY_PREFIX);
-    while (ozoneKeyIterator.hasNext()) {
-      ozoneKeyIterator.next();
-      ++readCount;
-    }
+    List<OzoneFileStatus> ozoneFileStatusList = bucket.listStatus(
+        OzoneConsts.OM_KEY_PREFIX + readPath + OzoneConsts.OM_KEY_PREFIX, true,
+        "/", keyCountForRead);

Review Comment:
   > @ayushtkn what you point out makes sense. The reason it is not relevant here is that freon is a CLI extension used to generate load for testing. The user might not want to have a limit as they might be profiling memory or CPU. The CLI invocation has a [default value that can be overridden ](https://github.com/apache/ozone/pull/3554/files#diff-1fa33049586e9221ab06c5c11fba5033a8ca68a28b8fe5ccefe4b676ef4f9c0fR62-R65)
   
   Thanks a lot @ayushtkn for the reviews. 
   
   Agreed with @kerneltime. Since its a stress/benchmarking tool, I am more inclined to keep this value open to the users and they can tune this based on their cluster resource capabilities.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on a diff in pull request #3554: HDDS-6951. Replace bucket.listKeys() with bucket.listStatus() in OmBucketReadWriteKeyOps

Posted by GitBox <gi...@apache.org>.
kerneltime commented on code in PR #3554:
URL: https://github.com/apache/ozone/pull/3554#discussion_r934194822


##########
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/OmBucketReadWriteKeyOps.java:
##########
@@ -128,12 +128,10 @@ protected String createPath(String path) {
   @Override
   protected int getReadCount(int readCount, String readPath)
       throws IOException {
-    Iterator<? extends OzoneKey> ozoneKeyIterator = bucket.listKeys(
-        OzoneConsts.OM_KEY_PREFIX + readPath + OzoneConsts.OM_KEY_PREFIX);
-    while (ozoneKeyIterator.hasNext()) {
-      ozoneKeyIterator.next();
-      ++readCount;
-    }
+    List<OzoneFileStatus> ozoneFileStatusList = bucket.listStatus(
+        OzoneConsts.OM_KEY_PREFIX + readPath + OzoneConsts.OM_KEY_PREFIX, true,
+        "/", keyCountForRead);

Review Comment:
   Ideally the server should limit the max number of entries returned. This is a CLI side change that should not be implementing such caps.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr commented on pull request #3554: HDDS-6951. Replace bucket.listKeys() with bucket.listStatus() in OmBucketReadWriteKeyOps

Posted by GitBox <gi...@apache.org>.
rakeshadr commented on PR #3554:
URL: https://github.com/apache/ozone/pull/3554#issuecomment-1203652641

   Thanks @tanvipenumudy for the contribution!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] rakeshadr merged pull request #3554: HDDS-6951. Replace bucket.listKeys() with bucket.listStatus() in OmBucketReadWriteKeyOps

Posted by GitBox <gi...@apache.org>.
rakeshadr merged PR #3554:
URL: https://github.com/apache/ozone/pull/3554


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org