You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by "xichen01 (via GitHub)" <gi...@apache.org> on 2023/12/14 09:11:01 UTC

[PR] HDDS-9912. EC client Reusing thread resources [ozone]

xichen01 opened a new pull request, #5791:
URL: https://github.com/apache/ozone/pull/5791

   ## What changes were proposed in this pull request?
   When writing an EC key, multiple temporary threads are created, creating and destroying these threads will cause performance degradation, this PR improves the performance of writing an EC key by reusing thread resources.
   
   ### The Root cause
   For a 3 + 2 EC key 1 `flushExecutor` thread and 5 `responseExecutors` are created.
   
   For a key in replica mode, only one `responseExecutor` is created, because replica mode only needs to write once and the rest of the replicas are synchronized by Ratis, whereas EC mode requires the Client to write all the strips to DNs.
   
   So temporary threads have less impact on performance in replica mode buckets, and more impact in EC buckets.
   
   https://github.com/apache/ozone/blob/8b25c554cbb7a74120f6cf6390ea127bb23f64e2/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockDataStreamOutput.java#L183
   https://github.com/apache/ozone/blob/8b25c554cbb7a74120f6cf6390ea127bb23f64e2/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java#L159
   
   
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-9912
   
   
   ## How was this patch tested?
   Existing test
   Performance testing result
   https://issues.apache.org/jira/browse/HDDS-9911
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1471521535


##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/OzoneClientConfig.java:
##########
@@ -201,6 +201,21 @@ public enum ChecksumCombineMode {
   // 3 concurrent stripe read should be enough.
   private int ecReconstructStripeReadPoolLimit = 10 * 3;
 
+  @Config(key = "ec.reconstruct.stripe.write.pool.limit",
+      defaultValue = "30",
+      description = "Thread pool max size for parallelly write" +
+          " available ec chunks to reconstruct the whole stripe.",
+      tags = ConfigTag.CLIENT)
+  private int ecReconstructStripeWritePoolLimit = 10 * 3;
+
+  @Config(key = "ec.client.write.pool.limit",
+      defaultValue = "200",
+      description = "Maximum number of threads in the pool for handling client-side" +
+          " write operations in erasure coding. This setting controls the concurrency " +
+          "level for writing data blocks, ensuring efficient data processing and" +
+          " throughput while managing resource utilization.",
+      tags = ConfigTag.CLIENT)
+  private int ecClientWritePoolLimit = 200;

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on PR #5791:
URL: https://github.com/apache/ozone/pull/5791#issuecomment-1929290462

   > @xichen01 , thanks a lot for working on this. The idea of reusing threads is great! The change here is quite big. How about separating this into at least two JIRAs?
   > 
   > 1. changing both the common code
   > 2. changing the ec code.
   
   I understand, but although the code seems to have changed a lot, most of the changes were actually made to pass parameters, so it may not be very easy to split, but if you think it's necessary, I'll give it a try. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1471498647


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2480,26 +2488,45 @@ public void setTimes(OzoneObj obj, String keyName, long mtime, long atime)
     ozoneManagerClient.setTimes(builder.build(), mtime, atime);
   }
 
+  private ExecutorService createThreadPoolExecutor(
+      int corePoolSize, int maximumPoolSize, String threadNameFormat) {
+    return new ThreadPoolExecutor(corePoolSize, maximumPoolSize,
+        60, TimeUnit.SECONDS, new SynchronousQueue<>(),
+        new ThreadFactoryBuilder()
+            .setNameFormat(threadNameFormat)
+            .build(),
+        new ThreadPoolExecutor.CallerRunsPolicy());
+  }
+
   public ExecutorService getECReconstructExecutor() {
-    // local ref to a volatile to ensure access
-    // to a completed initialized object
-    ExecutorService executor = ecReconstructExecutor;
-    if (executor == null) {
+    ExecutorService localRef = ecReconstructExecutor;
+    if (localRef == null) {
       synchronized (this) {
-        executor = ecReconstructExecutor;
-        if (executor == null) {
-          ecReconstructExecutor = new ThreadPoolExecutor(
-              EC_RECONSTRUCT_STRIPE_READ_POOL_MIN_SIZE,
+        localRef = ecReconstructExecutor;
+        if (localRef == null) {
+          localRef = createThreadPoolExecutor(EC_RECONSTRUCT_STRIPE_READ_POOL_MIN_SIZE,
               clientConfig.getEcReconstructStripeReadPoolLimit(),
-              60, TimeUnit.SECONDS, new SynchronousQueue<>(),
-              new ThreadFactoryBuilder()
-                  .setNameFormat("ec-reconstruct-reader-TID-%d")
-                  .build(),
-              new ThreadPoolExecutor.CallerRunsPolicy());
-          executor = ecReconstructExecutor;
+              "ec-reconstruct-reader-TID-%d");
+          ecReconstructExecutor = localRef;
+        }
+      }
+    }
+    return localRef;
+  }
+
+  public ExecutorService getECWriteThreadPool() {
+    ExecutorService localRef = ecWriteExecutor;
+    if (localRef == null) {
+      synchronized (this) {
+        localRef = ecWriteExecutor;
+        if (localRef == null) {
+          localRef = createThreadPoolExecutor(EC_WRITE_POOL_MIN_SIZE,
+              clientConfig.getEcClientWritePoolLimit(),

Review Comment:
   Do you means `Executors.newCachedThreadPool()`? But an unlimited threads likely to consume too much client-side performance; after all, it is possible that a user's client is not necessarily configured to be very strong



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo commented on PR #5791:
URL: https://github.com/apache/ozone/pull/5791#issuecomment-1930493579

   > I understand, but although the code seems to have changed a lot, most of the changes were actually made to pass parameters, so it may not be very easy to split, but if you think it's necessary, I'll give it a try.
   
   I tried to review this but not able to reason about the changes.  In particular, I like to understand why it needs the new class `BlockOutputStreamResourceProvider` and also the synchronization change.
   
   If the changes are just some simply, mechanical changes, it is not hard to review even the change size is large.  However, it is hard to review when some complicated changes (e.g. synchronization) is mixed in a lot of simple changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1477057036


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   Thanks a lot @xichen01 for updating the patch.  Most of the updates look good, but I think this part needs to improved.  I think it would be better to replace the `LinkedList` in `AbstractCommitWatcher` with a `CopyOnWriteArrayList`, rather than synchronizing `thenApplyAsync` completely.
   
   CC @szetszwo



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1475533377


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2480,26 +2488,45 @@ public void setTimes(OzoneObj obj, String keyName, long mtime, long atime)
     ozoneManagerClient.setTimes(builder.build(), mtime, atime);
   }
 
+  private ExecutorService createThreadPoolExecutor(
+      int corePoolSize, int maximumPoolSize, String threadNameFormat) {
+    return new ThreadPoolExecutor(corePoolSize, maximumPoolSize,
+        60, TimeUnit.SECONDS, new SynchronousQueue<>(),
+        new ThreadFactoryBuilder()
+            .setNameFormat(threadNameFormat)
+            .build(),
+        new ThreadPoolExecutor.CallerRunsPolicy());
+  }
+
   public ExecutorService getECReconstructExecutor() {
-    // local ref to a volatile to ensure access
-    // to a completed initialized object
-    ExecutorService executor = ecReconstructExecutor;
-    if (executor == null) {
+    ExecutorService localRef = ecReconstructExecutor;
+    if (localRef == null) {
       synchronized (this) {
-        executor = ecReconstructExecutor;
-        if (executor == null) {
-          ecReconstructExecutor = new ThreadPoolExecutor(
-              EC_RECONSTRUCT_STRIPE_READ_POOL_MIN_SIZE,
+        localRef = ecReconstructExecutor;
+        if (localRef == null) {
+          localRef = createThreadPoolExecutor(EC_RECONSTRUCT_STRIPE_READ_POOL_MIN_SIZE,
               clientConfig.getEcReconstructStripeReadPoolLimit(),
-              60, TimeUnit.SECONDS, new SynchronousQueue<>(),
-              new ThreadFactoryBuilder()
-                  .setNameFormat("ec-reconstruct-reader-TID-%d")
-                  .build(),
-              new ThreadPoolExecutor.CallerRunsPolicy());
-          executor = ecReconstructExecutor;
+              "ec-reconstruct-reader-TID-%d");
+          ecReconstructExecutor = localRef;
+        }
+      }
+    }
+    return localRef;
+  }
+
+  public ExecutorService getECWriteThreadPool() {
+    ExecutorService localRef = ecWriteExecutor;
+    if (localRef == null) {
+      synchronized (this) {
+        localRef = ecWriteExecutor;
+        if (localRef == null) {
+          localRef = createThreadPoolExecutor(EC_WRITE_POOL_MIN_SIZE,
+              clientConfig.getEcClientWritePoolLimit(),

Review Comment:
   I was thinking it's the client's decision to start few or many parallel writes.  With the new config, the responsibility is still at the client-side, but now it needs to be configured explicitly.
   
   I don't know if the config helps in practice or not.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on PR #5791:
URL: https://github.com/apache/ozone/pull/5791#issuecomment-1935735705

   > > I understand, but although the code seems to have changed a lot, most of the changes were actually made to pass parameters, so it may not be very easy to split, but if you think it's necessary, I'll give it a try.
   > 
   > I tried to review this but not able to reason about the changes. In particular, I like to understand why it needs the new class `BlockOutputStreamResourceProvider` and also the synchronization change.
   > 
   > If the changes are just some simply, mechanical changes, it is not hard to review even the change size is large. However, it is hard to review when some complicated changes (e.g. synchronization) is mixed in a lot of simple changes.
   
   OK, thank for you explain, I will try to split this into two JIRAs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1451735218


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   Maybe I'm missing something, but it seems the same `ecWriteExecutor` is used for both EC and Ratis keys.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1451735218


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   Maybe I'm missing something, but it seems the same `ecWriteExecutor` is used for both EC and Ratis keys:
   
   https://github.com/apache/ozone/blob/f695a374420a49728e2be352534cbdbe48ece0d9/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java#L319-L320
   
   @xichen01 Shouldn't we use a different resource provider for Ratis keys?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "guohao-rosicky (via GitHub)" <gi...@apache.org>.
guohao-rosicky commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1469494041


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   Hi @adoroszlai @xichen01 , Thanks work on this, I think the reason for the current SingleThreadExecutor is because it uses asynchrony in an OutputStream and needs to guarantee the order of execution. If that's the case, is switching to a multi-threaded Executor still a guarantee?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1470783056


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   > Hi @adoroszlai @xichen01 , Thanks work on this, I think the reason for the current SingleThreadExecutor is because it uses asynchrony in an OutputStream and needs to guarantee the order of execution. If that's the case, is switching to a multi-threaded Executor still a guarantee?
   
   @guohao-rosicky. I think it's okay here, because `ecWriteExecutor` is just used to wait for an RPC response, not to send a request. the thread that sends request was always a fixed single thread.
   @adoroszlai how do you think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1469927805


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   Thanks for the clarification.  I agree, we should rename it if it's intended for both.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1471098799


##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/OzoneClientConfig.java:
##########
@@ -201,6 +201,21 @@ public enum ChecksumCombineMode {
   // 3 concurrent stripe read should be enough.
   private int ecReconstructStripeReadPoolLimit = 10 * 3;
 
+  @Config(key = "ec.reconstruct.stripe.write.pool.limit",
+      defaultValue = "30",
+      description = "Thread pool max size for parallelly write" +
+          " available ec chunks to reconstruct the whole stripe.",
+      tags = ConfigTag.CLIENT)
+  private int ecReconstructStripeWritePoolLimit = 10 * 3;
+
+  @Config(key = "ec.client.write.pool.limit",
+      defaultValue = "200",
+      description = "Maximum number of threads in the pool for handling client-side" +
+          " write operations in erasure coding. This setting controls the concurrency " +
+          "level for writing data blocks, ensuring efficient data processing and" +
+          " throughput while managing resource utilization.",
+      tags = ConfigTag.CLIENT)
+  private int ecClientWritePoolLimit = 200;

Review Comment:
   Since this applies to all writes, not just EC, let's rename it.



##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2480,26 +2488,45 @@ public void setTimes(OzoneObj obj, String keyName, long mtime, long atime)
     ozoneManagerClient.setTimes(builder.build(), mtime, atime);
   }
 
+  private ExecutorService createThreadPoolExecutor(
+      int corePoolSize, int maximumPoolSize, String threadNameFormat) {
+    return new ThreadPoolExecutor(corePoolSize, maximumPoolSize,
+        60, TimeUnit.SECONDS, new SynchronousQueue<>(),
+        new ThreadFactoryBuilder()
+            .setNameFormat(threadNameFormat)
+            .build(),
+        new ThreadPoolExecutor.CallerRunsPolicy());
+  }
+
   public ExecutorService getECReconstructExecutor() {
-    // local ref to a volatile to ensure access
-    // to a completed initialized object
-    ExecutorService executor = ecReconstructExecutor;
-    if (executor == null) {
+    ExecutorService localRef = ecReconstructExecutor;
+    if (localRef == null) {
       synchronized (this) {
-        executor = ecReconstructExecutor;
-        if (executor == null) {
-          ecReconstructExecutor = new ThreadPoolExecutor(
-              EC_RECONSTRUCT_STRIPE_READ_POOL_MIN_SIZE,
+        localRef = ecReconstructExecutor;
+        if (localRef == null) {
+          localRef = createThreadPoolExecutor(EC_RECONSTRUCT_STRIPE_READ_POOL_MIN_SIZE,
               clientConfig.getEcReconstructStripeReadPoolLimit(),
-              60, TimeUnit.SECONDS, new SynchronousQueue<>(),
-              new ThreadFactoryBuilder()
-                  .setNameFormat("ec-reconstruct-reader-TID-%d")
-                  .build(),
-              new ThreadPoolExecutor.CallerRunsPolicy());
-          executor = ecReconstructExecutor;
+              "ec-reconstruct-reader-TID-%d");
+          ecReconstructExecutor = localRef;
+        }
+      }
+    }
+    return localRef;
+  }
+
+  public ExecutorService getECWriteThreadPool() {
+    ExecutorService localRef = ecWriteExecutor;
+    if (localRef == null) {
+      synchronized (this) {
+        localRef = ecWriteExecutor;
+        if (localRef == null) {
+          localRef = createThreadPoolExecutor(EC_WRITE_POOL_MIN_SIZE,
+              clientConfig.getEcClientWritePoolLimit(),

Review Comment:
   One question: should we make this an unbounded cached thread pool?  This would help resource re-use (as opposed to current single thread per output stream), but also allow scaling further without config change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1475552873


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   @xichen01 This part of response processing may not be fully safe for multi-thread use:
   
   https://github.com/apache/ozone/blob/6f507c9bb1a66e667006f8609a3e900950fedd20/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockOutputStream.java#L490-L493
   
   https://github.com/apache/ozone/blob/6f507c9bb1a66e667006f8609a3e900950fedd20/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/RatisBlockOutputStream.java#L108-L109
   
   https://github.com/apache/ozone/blob/6f507c9bb1a66e667006f8609a3e900950fedd20/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/AbstractCommitWatcher.java#L76-L79
   
   due to the `LinkedList`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1478700539


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   Could the `synchronized` be moved to `AbstractCommitWatcher`?  It is a much smaller class.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on PR #5791:
URL: https://github.com/apache/ozone/pull/5791#issuecomment-1885767762

   @duongkame @jojochuang @kerneltime @umamaheswararao please review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1469914548


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   > Maybe I'm missing something, but it seems the same `ecWriteExecutor` is used for both EC and Ratis keys:
   > 
   > https://github.com/apache/ozone/blob/f695a374420a49728e2be352534cbdbe48ece0d9/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java#L319-L320
   > 
   > @xichen01 Shouldn't we use a different resource provider for Ratis keys?
   
   @adoroszlai Yes, since Ratis Key and EC Key reuse class `BlockOutputStream`, the EC Key and Ratis Key both use this `ecWriteExecutor`. I think that's acceptable, since `ecWriteExecutor` just provides thread resources, not logic. And the Ratis key also benefits from not having to create a new thread for each Block. But mybe we need rename the `ecWriteExecutor` to a more normal name.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1484182126


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   `synchronized` has been moved to `AbstractCommitWatcher`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 closed pull request #5791: HDDS-9912. EC client Reusing thread resources
URL: https://github.com/apache/ozone/pull/5791


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1477006838


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2480,26 +2488,45 @@ public void setTimes(OzoneObj obj, String keyName, long mtime, long atime)
     ozoneManagerClient.setTimes(builder.build(), mtime, atime);
   }
 
+  private ExecutorService createThreadPoolExecutor(
+      int corePoolSize, int maximumPoolSize, String threadNameFormat) {
+    return new ThreadPoolExecutor(corePoolSize, maximumPoolSize,
+        60, TimeUnit.SECONDS, new SynchronousQueue<>(),
+        new ThreadFactoryBuilder()
+            .setNameFormat(threadNameFormat)
+            .build(),
+        new ThreadPoolExecutor.CallerRunsPolicy());
+  }
+
   public ExecutorService getECReconstructExecutor() {
-    // local ref to a volatile to ensure access
-    // to a completed initialized object
-    ExecutorService executor = ecReconstructExecutor;
-    if (executor == null) {
+    ExecutorService localRef = ecReconstructExecutor;
+    if (localRef == null) {
       synchronized (this) {
-        executor = ecReconstructExecutor;
-        if (executor == null) {
-          ecReconstructExecutor = new ThreadPoolExecutor(
-              EC_RECONSTRUCT_STRIPE_READ_POOL_MIN_SIZE,
+        localRef = ecReconstructExecutor;
+        if (localRef == null) {
+          localRef = createThreadPoolExecutor(EC_RECONSTRUCT_STRIPE_READ_POOL_MIN_SIZE,
               clientConfig.getEcReconstructStripeReadPoolLimit(),
-              60, TimeUnit.SECONDS, new SynchronousQueue<>(),
-              new ThreadFactoryBuilder()
-                  .setNameFormat("ec-reconstruct-reader-TID-%d")
-                  .build(),
-              new ThreadPoolExecutor.CallerRunsPolicy());
-          executor = ecReconstructExecutor;
+              "ec-reconstruct-reader-TID-%d");
+          ecReconstructExecutor = localRef;
+        }
+      }
+    }
+    return localRef;
+  }
+
+  public ExecutorService getECWriteThreadPool() {
+    ExecutorService localRef = ecWriteExecutor;
+    if (localRef == null) {
+      synchronized (this) {
+        localRef = ecWriteExecutor;
+        if (localRef == null) {
+          localRef = createThreadPoolExecutor(EC_WRITE_POOL_MIN_SIZE,
+              clientConfig.getEcClientWritePoolLimit(),

Review Comment:
   Understand, in fact, the original `BlockOutputStream` is equivalent to an unbounded thread because a new thread is created for each `BlockOutputStream` instance.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1469927805


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   Thanks for the clarification.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1480162258


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   > closure accesses a lot of external variables, which could be a potential thread-safety issue
   
   I have the following concerns:
   
   1. Are the same external variables protected by the same object elsewhere?  If not, there is no point in making this synchronized.
   2. `validateResponse` only accesses an atomic reference and the response object itself.  Neither of those need to be synced.
   3. Protobuf conversion (`BlockID.getFromProtobuf`) should not be performed while holding the lock.
   4. `(this.)blockID` is also atomic, does not need the lock.
   5. logging should not be performed while holding the lock.
   
   So in the end I think only `updateCommitInfo` needs to be protected.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1477006859


##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/ozone/client/io/BlockOutPutStreamResourceProvider.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.client.io;
+
+import java.util.concurrent.ExecutorService;
+import java.util.function.Supplier;
+import org.apache.hadoop.hdds.scm.ContainerClientMetrics;
+
+/**
+ * Provides resources for BlockOutputStream, including byte buffer pool,
+ * executor service, and client metrics.
+ */
+public final class BlockOutPutStreamResourceProvider {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1475542127


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java:
##########
@@ -149,17 +149,20 @@ private ECKeyOutputStream(Builder builder) {
             builder.isMultipartKey(),
             info, builder.isUnsafeByteBufferConversionEnabled(),
             builder.getXceiverManager(), builder.getOpenHandler().getId(),
-            builder.getClientMetrics(), builder.getStreamBufferArgs());
+            builder.getStreamBufferArgs(),
+            builder.getBlockOutPutStreamResourceProvider());
 
     this.writeOffset = 0;
     this.encoder = CodecUtil.createRawEncoderWithFallback(
         builder.getReplicationConfig());
-    this.flushExecutor = Executors.newSingleThreadExecutor();
     S3Auth s3Auth = builder.getS3CredentialsProvider().get();
     ThreadLocal<S3Auth> s3CredentialsProvider =
         builder.getS3CredentialsProvider();
-    flushExecutor.submit(() -> s3CredentialsProvider.set(s3Auth));
-    this.flushFuture = this.flushExecutor.submit(this::flushStripeFromQueue);
+    this.flushExecutor = builder.getBlockOutPutStreamResourceProvider().getThreadFactory();

Review Comment:
   `flushExecutor` member can be removed



##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/ozone/client/io/BlockOutPutStreamResourceProvider.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.client.io;
+
+import java.util.concurrent.ExecutorService;
+import java.util.function.Supplier;
+import org.apache.hadoop.hdds.scm.ContainerClientMetrics;
+
+/**
+ * Provides resources for BlockOutputStream, including byte buffer pool,
+ * executor service, and client metrics.
+ */
+public final class BlockOutPutStreamResourceProvider {

Review Comment:
   nit: please rename to `BlockOutputStreamResourceProvider` for consistency



##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/ozone/client/io/BlockOutPutStreamResourceProvider.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.client.io;
+
+import java.util.concurrent.ExecutorService;
+import java.util.function.Supplier;
+import org.apache.hadoop.hdds.scm.ContainerClientMetrics;
+
+/**
+ * Provides resources for BlockOutputStream, including byte buffer pool,
+ * executor service, and client metrics.
+ */
+public final class BlockOutPutStreamResourceProvider {
+  private final Supplier<ExecutorService> threadFactorySupplier;
+  private final ContainerClientMetrics clientMetrics;
+
+  /**
+   * Creates an instance of BlockOutPutStreamResourceProvider.
+   */
+  public static BlockOutPutStreamResourceProvider create(
+      Supplier<ExecutorService> threadFactorySupplier, ContainerClientMetrics clientMetrics) {
+    return new BlockOutPutStreamResourceProvider(threadFactorySupplier, clientMetrics);
+  }
+
+  private BlockOutPutStreamResourceProvider(Supplier<ExecutorService> threadFactorySupplier,
+      ContainerClientMetrics clientMetrics) {
+    this.threadFactorySupplier = threadFactorySupplier;
+    this.clientMetrics = clientMetrics;
+  }
+
+  /**
+   * Provides an ExecutorService, lazily initialized upon first request.
+   */
+  public ExecutorService getThreadFactory() {

Review Comment:
   nit:
   
   Please rename to `getExecutorService`.  `ThreadFactory` and `ExecutorService` are different classes.  (Also rename in factory/constructor parameters and member variable.)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1479595309


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   I noticed that the closure accesses a lot of external variables, which could be a potential thread-safety issue, so I just added a "big" `synchronized` here, which ensures that even if someone changes the code later on, there's probably no thread-safety issue, and the `thenApplyAsync` just sets/creates the variable, which is supposed to be lightweight.
   So, I think the current approach is acceptable, what do you think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo commented on PR #5791:
URL: https://github.com/apache/ozone/pull/5791#issuecomment-1927736283

   @xichen01 , thanks a lot for working on this.  The idea of reusing threads is great!  The change here is quite big.  How about separating this into at least two JIRAs?
   1. changing both the common code
   2. changing the ec code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on PR #5791:
URL: https://github.com/apache/ozone/pull/5791#issuecomment-1945768879

   Split this PR to:
   https://issues.apache.org/jira/browse/HDDS-10383
   https://issues.apache.org/jira/browse/HDDS-10384
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo commented on PR #5791:
URL: https://github.com/apache/ozone/pull/5791#issuecomment-1946500429

   @xichen01 , thank a lot!  Will review both JIRAs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "kerneltime (via GitHub)" <gi...@apache.org>.
kerneltime commented on PR #5791:
URL: https://github.com/apache/ozone/pull/5791#issuecomment-1861104884

   cc @tanvipenumudy 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1471098079


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   I'll take another look.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1469927805


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   Thanks for the clarification.  I agree, we should rename it if it intended for both.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


Re: [PR] HDDS-9912. EC client Reusing thread resources [ozone]

Posted by "xichen01 (via GitHub)" <gi...@apache.org>.
xichen01 commented on code in PR #5791:
URL: https://github.com/apache/ozone/pull/5791#discussion_r1477004996


##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -2358,7 +2366,7 @@ private KeyOutputStream.Builder createKeyOutputStream(
         .enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
         .setConfig(clientConfig)
         .setAtomicKeyCreation(isS3GRequest.get())
-        .setClientMetrics(clientMetrics)
+        .setBlockOutPutStreamResourceProvider(blockOutPutStreamResourceProvider)

Review Comment:
   Add a `synchronized` for `thenApplyAsync`. 



##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/ozone/client/io/BlockOutPutStreamResourceProvider.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.client.io;
+
+import java.util.concurrent.ExecutorService;
+import java.util.function.Supplier;
+import org.apache.hadoop.hdds.scm.ContainerClientMetrics;
+
+/**
+ * Provides resources for BlockOutputStream, including byte buffer pool,
+ * executor service, and client metrics.
+ */
+public final class BlockOutPutStreamResourceProvider {
+  private final Supplier<ExecutorService> threadFactorySupplier;
+  private final ContainerClientMetrics clientMetrics;
+
+  /**
+   * Creates an instance of BlockOutPutStreamResourceProvider.
+   */
+  public static BlockOutPutStreamResourceProvider create(
+      Supplier<ExecutorService> threadFactorySupplier, ContainerClientMetrics clientMetrics) {
+    return new BlockOutPutStreamResourceProvider(threadFactorySupplier, clientMetrics);
+  }
+
+  private BlockOutPutStreamResourceProvider(Supplier<ExecutorService> threadFactorySupplier,
+      ContainerClientMetrics clientMetrics) {
+    this.threadFactorySupplier = threadFactorySupplier;
+    this.clientMetrics = clientMetrics;
+  }
+
+  /**
+   * Provides an ExecutorService, lazily initialized upon first request.
+   */
+  public ExecutorService getThreadFactory() {

Review Comment:
   Done.



##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java:
##########
@@ -149,17 +149,20 @@ private ECKeyOutputStream(Builder builder) {
             builder.isMultipartKey(),
             info, builder.isUnsafeByteBufferConversionEnabled(),
             builder.getXceiverManager(), builder.getOpenHandler().getId(),
-            builder.getClientMetrics(), builder.getStreamBufferArgs());
+            builder.getStreamBufferArgs(),
+            builder.getBlockOutPutStreamResourceProvider());
 
     this.writeOffset = 0;
     this.encoder = CodecUtil.createRawEncoderWithFallback(
         builder.getReplicationConfig());
-    this.flushExecutor = Executors.newSingleThreadExecutor();
     S3Auth s3Auth = builder.getS3CredentialsProvider().get();
     ThreadLocal<S3Auth> s3CredentialsProvider =
         builder.getS3CredentialsProvider();
-    flushExecutor.submit(() -> s3CredentialsProvider.set(s3Auth));
-    this.flushFuture = this.flushExecutor.submit(this::flushStripeFromQueue);
+    this.flushExecutor = builder.getBlockOutPutStreamResourceProvider().getThreadFactory();

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org