Posted to issues@ozone.apache.org by "xichen01 (via GitHub)" <gi...@apache.org> on 2023/12/14 07:39:09 UTC

[PR] HDDS-9913. EC client Reduces duplicate loading of configurations [ozone]

xichen01 opened a new pull request, #5789:
URL: https://github.com/apache/ozone/pull/5789

   ## What changes were proposed in this pull request?
   Improve EC client performance by eliminating redundant configuration reads.
   
   ### The Root cause
   `conf.getObject(OzoneClientConfig.class)` is an expensive operation: the `for` loop in `injectConfigurationToObject` may iterate many times, which results in a lot of configuration parsing.
   https://github.com/apache/ozone/blob/8b25c554cbb7a74120f6cf6390ea127bb23f64e2/hadoop-hdds/config/src/main/java/org/apache/hadoop/hdds/conf/ConfigurationReflectionUtil.java#L89-L99
   
   - Similar performance issues may exist elsewhere in Ozone, because `getObject` is used heavily.
   - Caching the computed configuration objects might solve this.
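   
   The caching idea in the last bullet can be sketched roughly as follows. This is a minimal illustration, not Ozone code: `ConfigObjectCache` and `SampleClientConfig` are hypothetical names, and the expensive reflective load is represented by a plain `Supplier`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical config class; a stand-in, not the real OzoneClientConfig.
final class SampleClientConfig {
}

// Hypothetical cache: runs the expensive loader at most once per config class.
final class ConfigObjectCache {
  private final Map<Class<?>, Object> cache = new ConcurrentHashMap<>();

  <T> T get(Class<T> type, Supplier<? extends T> expensiveLoader) {
    // computeIfAbsent invokes the loader only when no entry exists for this class.
    return type.cast(cache.computeIfAbsent(type, key -> expensiveLoader.get()));
  }
}

final class ConfigObjectCacheDemo {
  public static void main(String[] args) {
    AtomicInteger loads = new AtomicInteger();
    ConfigObjectCache cache = new ConfigObjectCache();
    for (int i = 0; i < 1000; i++) {
      cache.get(SampleClientConfig.class, () -> {
        loads.incrementAndGet(); // counts how often the expensive path runs
        return new SampleClientConfig();
      });
    }
    System.out.println("expensive loads: " + loads.get());
  }
}
```

   A shared cached instance is only safe if callers treat it as read-only; any caller that mutates the object would affect every other user of the cache.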
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-9913
   
   ## How was this patch tested?
   Existing tests.
   Performance test results:
   https://issues.apache.org/jira/browse/HDDS-9911
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9913. EC client Reduces duplicate loading of configurations [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5789:
URL: https://github.com/apache/ozone/pull/5789#discussion_r1426354012


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -1854,7 +1855,7 @@ public OzoneDataStreamOutput createMultipartStreamKey(
             .setMultipartUploadID(uploadID)
             .setIsMultipartKey(true)
             .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
-            .setConfig(conf.getObject(OzoneClientConfig.class))
+            .setConfig(clientConfig)

Review Comment:
   This one and the other two similar items create new configs intentionally.  Please see HDDS-8216.
   
   To keep config loading to a minimum while still using separate configs, we could cache and reuse one `OzoneClientConfig` instance per EC chunk size.
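   
   A rough sketch of that per-chunk-size cache, under stated assumptions: `TunableClientConfig` and `PerChunkSizeConfigCache` are hypothetical names, the `Supplier` stands in for the expensive `getObject` call, and the `setBufferSize` adjustment is only a placeholder for whatever per-chunk-size change the real code makes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for a client config; not the real OzoneClientConfig.
final class TunableClientConfig {
  private int bufferSize;

  int getBufferSize() {
    return bufferSize;
  }

  void setBufferSize(int bufferSize) {
    this.bufferSize = bufferSize;
  }
}

// Hypothetical cache holding one adjusted config instance per EC chunk size.
final class PerChunkSizeConfigCache {
  private final Map<Integer, TunableClientConfig> byChunkSize = new ConcurrentHashMap<>();
  private final Supplier<TunableClientConfig> freshLoader;

  PerChunkSizeConfigCache(Supplier<TunableClientConfig> freshLoader) {
    this.freshLoader = freshLoader;
  }

  TunableClientConfig forChunkSize(int ecChunkSize) {
    return byChunkSize.computeIfAbsent(ecChunkSize, size -> {
      // The expensive load runs once per distinct chunk size.
      TunableClientConfig config = freshLoader.get();
      // Placeholder adjustment (an assumption made for this sketch).
      config.setBufferSize(size);
      return config;
    });
  }
}

final class PerChunkSizeConfigCacheDemo {
  public static void main(String[] args) {
    PerChunkSizeConfigCache cache = new PerChunkSizeConfigCache(TunableClientConfig::new);
    TunableClientConfig first = cache.forChunkSize(1024 * 1024);
    TunableClientConfig second = cache.forChunkSize(1024 * 1024);
    System.out.println("same instance reused: " + (first == second));
  }
}
```

   Keys with the same chunk size share one instance, so the number of loads is bounded by the number of distinct chunk sizes rather than the number of keys written.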





Re: [PR] HDDS-9913. EC client Reduces duplicate loading of configurations [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on code in PR #5789:
URL: https://github.com/apache/ozone/pull/5789#discussion_r1428812568


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -1854,7 +1855,7 @@ public OzoneDataStreamOutput createMultipartStreamKey(
             .setMultipartUploadID(uploadID)
             .setIsMultipartKey(true)
             .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
-            .setConfig(conf.getObject(OzoneClientConfig.class))
+            .setConfig(clientConfig)

Review Comment:
   I understand, but here we are using `Configuration` to pass parameters, which shouldn't be the role of `Configuration`. If a parameter needs to be computed at runtime, it should not be obtained directly from `Configuration`. Therefore, I added a `StreamBufferArgs` class to modify and pass the StreamBuffer-related parameters.
   
   IMO, Configuration should ideally not provide a direct `set` method. Modifications to Configuration should only occur in "test code" and during "reconfiguration". Otherwise, Configuration should be read-only.
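   
   The separation described here can be illustrated with a small sketch. The names `ReadOnlyClientConfig` and `BufferArgs` are hypothetical and only loosely modelled on the idea; this is not the actual `StreamBufferArgs` from the PR.

```java
// Hypothetical read-only config: values are fixed at construction, no setters.
final class ReadOnlyClientConfig {
  private final int streamBufferSize;

  ReadOnlyClientConfig(int streamBufferSize) {
    this.streamBufferSize = streamBufferSize;
  }

  int getStreamBufferSize() {
    return streamBufferSize;
  }
}

// Hypothetical per-stream arguments: copied from the config, adjustable at runtime.
final class BufferArgs {
  private int streamBufferSize;

  private BufferArgs(int streamBufferSize) {
    this.streamBufferSize = streamBufferSize;
  }

  // Copies the defaults so later changes never touch the shared config.
  static BufferArgs from(ReadOnlyClientConfig config) {
    return new BufferArgs(config.getStreamBufferSize());
  }

  int getStreamBufferSize() {
    return streamBufferSize;
  }

  void setStreamBufferSize(int streamBufferSize) {
    this.streamBufferSize = streamBufferSize;
  }
}

final class BufferArgsDemo {
  public static void main(String[] args) {
    ReadOnlyClientConfig shared = new ReadOnlyClientConfig(4096);
    BufferArgs perStream = BufferArgs.from(shared);
    perStream.setStreamBufferSize(1024); // runtime-computed value for one stream
    System.out.println(shared.getStreamBufferSize() + " " + perStream.getStreamBufferSize());
  }
}
```

   Because runtime-computed values live in the per-stream object, a single config instance can be loaded once and shared without any caller needing to mutate it.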





Re: [PR] HDDS-9913. Reduce number of times configuration is loaded in Ozone client [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai merged PR #5789:
URL: https://github.com/apache/ozone/pull/5789

