You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by "adoroszlai (via GitHub)" <gi...@apache.org> on 2023/03/08 15:20:26 UTC

[GitHub] [ozone] adoroszlai opened a new pull request, #4363: HDDS-8108. gRPC channel created for replication client not shutdown properly

adoroszlai opened a new pull request, #4363:
URL: https://github.com/apache/ozone/pull/4363

   ## What changes were proposed in this pull request?
   
   Ensure `GrpcReplicationClient` is closed to fix leak.
   
   https://issues.apache.org/jira/browse/HDDS-8108
   
   ## How was this patch tested?
   
   Added unit tests.
   
   https://github.com/adoroszlai/hadoop-ozone/actions/runs/4364237406


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime merged pull request #4363: HDDS-8108. gRPC channel created for replication client not shutdown properly

Posted by "kerneltime (via GitHub)" <gi...@apache.org>.
kerneltime merged PR #4363:
URL: https://github.com/apache/ozone/pull/4363


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #4363: HDDS-8108. gRPC channel created for replication client not shutdown properly

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on PR #4363:
URL: https://github.com/apache/ozone/pull/4363#issuecomment-1460876737

   Thanks @duongkame, @kerneltime for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] duongkame commented on a diff in pull request #4363: HDDS-8108. gRPC channel created for replication client not shutdown properly

Posted by "duongkame (via GitHub)" <gi...@apache.org>.
duongkame commented on code in PR #4363:
URL: https://github.com/apache/ozone/pull/4363#discussion_r1129886385


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/replication/GrpcContainerUploader.java:
##########
@@ -53,18 +55,32 @@ public GrpcContainerUploader(
   public OutputStream startUpload(long containerId, DatanodeDetails target,
       CompletableFuture<Void> callback, CopyContainerCompression compression)
       throws IOException {
-    GrpcReplicationClient client =
-        new GrpcReplicationClient(target.getIpAddress(),
-            target.getPort(Port.Name.REPLICATION).getValue(),
-            securityConfig, certClient, compression);
-    StreamObserver<SendContainerRequest> requestStream = client.upload(
-        new SendContainerResponseStreamObserver(containerId, target, callback));
-    return new SendContainerOutputStream(requestStream, containerId,
-        GrpcReplicationService.BUFFER_SIZE, compression);
+    GrpcReplicationClient client = null;

Review Comment:
   Should the code around creating `client` and uploading be refactored to be done in one place to ease resource management, e.g. in `PushReplicator.replicate`.
   
   ```
   try (GrpcReplicationClient client  = grpcContainerUploader.createClient()) {
     try (OutputStream output : client.upload(....)) {
        source.copyData(containerID, output, compression);
        ...
     }
   }
   ```
   That way we wouldn't need to care about passing `client` around and closing it in multiple places.
   
   That's just a suggestion to simplify the model. Looks like in this PR you already take good care of closing resources. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on a diff in pull request #4363: HDDS-8108. gRPC channel created for replication client not shutdown properly

Posted by "kerneltime (via GitHub)" <gi...@apache.org>.
kerneltime commented on code in PR #4363:
URL: https://github.com/apache/ozone/pull/4363#discussion_r1129963806


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/replication/SimpleContainerDownloader.java:
##########
@@ -72,28 +72,36 @@ public Path getContainerDataFromReplicas(
         shuffleDatanodes(sourceDatanodes);
 
     for (DatanodeDetails datanode : shuffledDatanodes) {
+      GrpcReplicationClient client = null;
       try {
+        client = createReplicationClient(datanode, compression);
         CompletableFuture<Path> result =
-            downloadContainer(containerId, datanode, downloadDir, compression);
+            downloadContainer(client, containerId, downloadDir);
         return result.get();
-      } catch (ExecutionException | IOException e) {
-        LOG.error("Error on replicating container: {} from {}/{}", containerId,
-            datanode.getHostName(), datanode.getIpAddress(), e);
       } catch (InterruptedException e) {
+        logError(e, containerId, datanode);
         Thread.currentThread().interrupt();
-      } catch (Exception ex) {
-        LOG.error("Container {} download from datanode {} was unsuccessful. "
-                + "Trying the next datanode", containerId, datanode, ex);
+      } catch (Exception e) {
+        logError(e, containerId, datanode);
+      } finally {
+        IOUtils.close(LOG, client);
       }
     }
     LOG.error("Container {} could not be downloaded from any datanode",
         containerId);
     return null;
   }
 
+  private static void logError(Exception e,
+      long containerId, DatanodeDetails datanode) {
+    LOG.error("Error on replicating container: {} from {}", containerId,

Review Comment:
   Nit: Functions that log error should be inlined even if duplicated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on a diff in pull request #4363: HDDS-8108. gRPC channel created for replication client not shutdown properly

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #4363:
URL: https://github.com/apache/ozone/pull/4363#discussion_r1130027139


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/replication/GrpcContainerUploader.java:
##########
@@ -53,18 +55,32 @@ public GrpcContainerUploader(
   public OutputStream startUpload(long containerId, DatanodeDetails target,
       CompletableFuture<Void> callback, CopyContainerCompression compression)
       throws IOException {
-    GrpcReplicationClient client =
-        new GrpcReplicationClient(target.getIpAddress(),
-            target.getPort(Port.Name.REPLICATION).getValue(),
-            securityConfig, certClient, compression);
-    StreamObserver<SendContainerRequest> requestStream = client.upload(
-        new SendContainerResponseStreamObserver(containerId, target, callback));
-    return new SendContainerOutputStream(requestStream, containerId,
-        GrpcReplicationService.BUFFER_SIZE, compression);
+    GrpcReplicationClient client = null;

Review Comment:
   Thanks for the idea.  I prefer to keep `PushReplicator` independent of the gRPC implementation, similar to `DownloadAndImportReplicator` (pull replication).  But I agree that passing around `client` just to close it is a hack, created HDDS-8119 to get rid of it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] duongkame commented on a diff in pull request #4363: HDDS-8108. gRPC channel created for replication client not shutdown properly

Posted by "duongkame (via GitHub)" <gi...@apache.org>.
duongkame commented on code in PR #4363:
URL: https://github.com/apache/ozone/pull/4363#discussion_r1129886385


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/replication/GrpcContainerUploader.java:
##########
@@ -53,18 +55,32 @@ public GrpcContainerUploader(
   public OutputStream startUpload(long containerId, DatanodeDetails target,
       CompletableFuture<Void> callback, CopyContainerCompression compression)
       throws IOException {
-    GrpcReplicationClient client =
-        new GrpcReplicationClient(target.getIpAddress(),
-            target.getPort(Port.Name.REPLICATION).getValue(),
-            securityConfig, certClient, compression);
-    StreamObserver<SendContainerRequest> requestStream = client.upload(
-        new SendContainerResponseStreamObserver(containerId, target, callback));
-    return new SendContainerOutputStream(requestStream, containerId,
-        GrpcReplicationService.BUFFER_SIZE, compression);
+    GrpcReplicationClient client = null;

Review Comment:
   Should the code around creating `client` and uploading be refactored to be done in one place, e.g. in `PushReplicator.replicate`.
   
   ```
   try (GrpcReplicationClient client  = grpcContainerUploader.createClient()) {
     try (OutputStream output : client.upload(....)) {
        source.copyData(containerID, output, compression);
        ...
     }
   }
   ```
   That way we wouldn't need to care about passing `client` around and closing it in multiple places.
   
   That's just a suggestion to simplify the model. Looks like in this PR you already take good care of closing resources. 



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/replication/GrpcContainerUploader.java:
##########
@@ -53,18 +55,32 @@ public GrpcContainerUploader(
   public OutputStream startUpload(long containerId, DatanodeDetails target,
       CompletableFuture<Void> callback, CopyContainerCompression compression)
       throws IOException {
-    GrpcReplicationClient client =
-        new GrpcReplicationClient(target.getIpAddress(),
-            target.getPort(Port.Name.REPLICATION).getValue(),
-            securityConfig, certClient, compression);
-    StreamObserver<SendContainerRequest> requestStream = client.upload(
-        new SendContainerResponseStreamObserver(containerId, target, callback));
-    return new SendContainerOutputStream(requestStream, containerId,
-        GrpcReplicationService.BUFFER_SIZE, compression);
+    GrpcReplicationClient client = null;
+    try {
+      client = createReplicationClient(target, compression);
+      StreamObserver<SendContainerRequest> requestStream = client.upload(
+          new SendContainerResponseStreamObserver(containerId, target,
+              callback));
+      return new SendContainerOutputStream(client, requestStream, containerId,

Review Comment:
   nit: should `client.upload` return `SendContainerOutputStream` directly? Looks like there's no need to expose the details of StreamObserver to outside.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org