You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/04/06 07:16:14 UTC

[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3272: HDDS-6553. Incorrect timeout when checking balancer iteration result.

siddhantsangwan commented on code in PR #3272:
URL: https://github.com/apache/ozone/pull/3272#discussion_r843507455


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -462,49 +461,72 @@ private void checkIterationResults(boolean isMoveGenerated,
   private void checkIterationMoveResults(Set<DatanodeDetails> selectedTargets) {
     this.countDatanodesInvolvedPerIteration = 0;
     this.sizeMovedPerIteration = 0;
-    for (Map.Entry<ContainerMoveSelection,
-            CompletableFuture<ReplicationManager.MoveResult>>
-        futureEntry : moveSelectionToFutureMap.entrySet()) {
-      ContainerMoveSelection moveSelection = futureEntry.getKey();
-      CompletableFuture<ReplicationManager.MoveResult> future =
-          futureEntry.getValue();
+
+    CompletableFuture<Void> allFuturesResult = CompletableFuture.allOf(
+        moveSelectionToFutureMap.values()
+            .toArray(new CompletableFuture[moveSelectionToFutureMap.size()]));
+    try {
+      allFuturesResult.get(config.getMoveTimeout().toMillis(),
+          TimeUnit.MILLISECONDS);
+    } catch (Exception e) {
+      // Ignore the exception here.

Review Comment:
   If an `InterruptedException` is caught here, we will need to to set the interrupted status again since it would have been cleared.



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -462,49 +461,72 @@ private void checkIterationResults(boolean isMoveGenerated,
   private void checkIterationMoveResults(Set<DatanodeDetails> selectedTargets) {
     this.countDatanodesInvolvedPerIteration = 0;
     this.sizeMovedPerIteration = 0;
-    for (Map.Entry<ContainerMoveSelection,
-            CompletableFuture<ReplicationManager.MoveResult>>
-        futureEntry : moveSelectionToFutureMap.entrySet()) {
-      ContainerMoveSelection moveSelection = futureEntry.getKey();
-      CompletableFuture<ReplicationManager.MoveResult> future =
-          futureEntry.getValue();
+
+    CompletableFuture<Void> allFuturesResult = CompletableFuture.allOf(
+        moveSelectionToFutureMap.values()
+            .toArray(new CompletableFuture[moveSelectionToFutureMap.size()]));
+    try {
+      allFuturesResult.get(config.getMoveTimeout().toMillis(),
+          TimeUnit.MILLISECONDS);
+    } catch (Exception e) {
+      // Ignore the exception here.
+    }
+
+    List<ContainerMoveSelection> completeKeyList =
+        moveSelectionToFutureMap.keySet().stream()
+        .filter(key -> moveSelectionToFutureMap.get(key).isDone())
+        .collect(Collectors.toList());
+
+    // Handle completed futures
+    for (ContainerMoveSelection moveSelection : completeKeyList) {
       try {
-        ReplicationManager.MoveResult result = future.get(
-            config.getMoveTimeout().toMillis(), TimeUnit.MILLISECONDS);
-        if (result == ReplicationManager.MoveResult.COMPLETED) {
-          try {
-            ContainerInfo container =
-                containerManager.getContainer(moveSelection.getContainerID());
-            this.sizeMovedPerIteration += container.getUsedBytes();
-            metrics.incrementNumContainerMovesInLatestIteration(1);
-            LOG.info("Container move completed for container {} to target {}",
-                container.containerID(),
-                moveSelection.getTargetNode().getUuidString());
-          } catch (ContainerNotFoundException e) {
-            LOG.warn("Could not find Container {} while " +
-                    "checking move results in ContainerBalancer",
-                moveSelection.getContainerID(), e);
-          }
+        CompletableFuture<ReplicationManager.MoveResult> future =
+            moveSelectionToFutureMap.get(moveSelection);
+        if (future.isCompletedExceptionally()) {
+          LOG.info("Container move for container {} to target {} " +

Review Comment:
   Can we also specify what the exception was here?



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -462,49 +461,72 @@ private void checkIterationResults(boolean isMoveGenerated,
   private void checkIterationMoveResults(Set<DatanodeDetails> selectedTargets) {
     this.countDatanodesInvolvedPerIteration = 0;
     this.sizeMovedPerIteration = 0;
-    for (Map.Entry<ContainerMoveSelection,
-            CompletableFuture<ReplicationManager.MoveResult>>
-        futureEntry : moveSelectionToFutureMap.entrySet()) {
-      ContainerMoveSelection moveSelection = futureEntry.getKey();
-      CompletableFuture<ReplicationManager.MoveResult> future =
-          futureEntry.getValue();
+
+    CompletableFuture<Void> allFuturesResult = CompletableFuture.allOf(
+        moveSelectionToFutureMap.values()
+            .toArray(new CompletableFuture[moveSelectionToFutureMap.size()]));
+    try {
+      allFuturesResult.get(config.getMoveTimeout().toMillis(),
+          TimeUnit.MILLISECONDS);
+    } catch (Exception e) {
+      // Ignore the exception here.
+    }
+
+    List<ContainerMoveSelection> completeKeyList =
+        moveSelectionToFutureMap.keySet().stream()
+        .filter(key -> moveSelectionToFutureMap.get(key).isDone())
+        .collect(Collectors.toList());
+
+    // Handle completed futures
+    for (ContainerMoveSelection moveSelection : completeKeyList) {
       try {
-        ReplicationManager.MoveResult result = future.get(
-            config.getMoveTimeout().toMillis(), TimeUnit.MILLISECONDS);
-        if (result == ReplicationManager.MoveResult.COMPLETED) {
-          try {
-            ContainerInfo container =
-                containerManager.getContainer(moveSelection.getContainerID());
-            this.sizeMovedPerIteration += container.getUsedBytes();
-            metrics.incrementNumContainerMovesInLatestIteration(1);
-            LOG.info("Container move completed for container {} to target {}",
-                container.containerID(),
-                moveSelection.getTargetNode().getUuidString());
-          } catch (ContainerNotFoundException e) {
-            LOG.warn("Could not find Container {} while " +
-                    "checking move results in ContainerBalancer",
-                moveSelection.getContainerID(), e);
-          }
+        CompletableFuture<ReplicationManager.MoveResult> future =
+            moveSelectionToFutureMap.get(moveSelection);
+        if (future.isCompletedExceptionally()) {
+          LOG.info("Container move for container {} to target {} " +

Review Comment:
   Can we also specify what the exception was here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org