You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/02/25 10:24:27 UTC

[GitHub] [ozone] sodonnel commented on a change in pull request #3137: HDDS-6373. EC: Exclude pipeline upon container close instead of exclude DNs.

sodonnel commented on a change in pull request #3137:
URL: https://github.com/apache/ozone/pull/3137#discussion_r814650748



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -300,36 +305,51 @@ private StripeWriteStatus handleParityWrites(int parityCellSize,
   }
 
   private boolean hasWriteFailure() {
-    List<ECBlockOutputStream> failedStreams =
-        blockOutputStreamEntryPool.getCurrentStreamEntry()
-            .streamsWithWriteFailure();
-    // Since writes are async, let's check the failures once.
-    if (failedStreams.size() > 0) {
-      addToExcludeNodesList(failedStreams);
-      return true;
-    }
-    return false;
+    return !blockOutputStreamEntryPool.getCurrentStreamEntry()
+        .streamsWithWriteFailure().isEmpty();
   }
 
   private boolean hasPutBlockFailure() {
-    List<ECBlockOutputStream> failedStreams =
-        blockOutputStreamEntryPool.getCurrentStreamEntry()
-            .streamsWithPutBlockFailure();
-    // Since writes are async, let's check the failures once.
-    if (failedStreams.size() > 0) {
-      addToExcludeNodesList(failedStreams);
-      return true;
-    }
-    return false;
+    return !blockOutputStreamEntryPool.getCurrentStreamEntry()
+        .streamsWithPutBlockFailure().isEmpty();
   }
 
-  private void addToExcludeNodesList(List<ECBlockOutputStream> failedStreams) {
+  private void handleFailedStreams(boolean forPutBlock) {
+    ECBlockOutputStreamEntry currentStreamEntry =
+        blockOutputStreamEntryPool.getCurrentStreamEntry();
+    List<ECBlockOutputStream> failedStreams = forPutBlock
+        ? currentStreamEntry.streamsWithPutBlockFailure()
+        : currentStreamEntry.streamsWithWriteFailure();
+
+    // Since writes are async, let's check the failures once.
+    boolean containerToExcludeAll = true;
     for (ECBlockOutputStream failedStream : failedStreams) {
+      Throwable cause = HddsClientUtils.checkForException(
+          failedStream.getIoException());
+      Preconditions.checkNotNull(cause);
+      if (!checkIfContainerToExclude(cause)) {
+        blockOutputStreamEntryPool.getExcludeList()
+            .addDatanode(failedStream.getDatanodeDetails());
+        containerToExcludeAll = false;
+      }
+    }
+
+    // NOTE: For now, this is mainly for ContainerNotOpenException
+    // due to container full, but may also for those cases that
+    // a DN do respond but with one with certain failures.
+    // In such cases we don't treat the replied DNs as failed.
+    if (containerToExcludeAll) {
       blockOutputStreamEntryPool.getExcludeList()
-          .addDatanode(failedStream.getDatanodeDetails());
+          .addPipeline(currentStreamEntry.getPipeline().getId());
     }
   }
 
+  @Override
+  protected boolean checkIfContainerToExclude(Throwable t) {
+    return super.checkIfContainerToExclude(t)
+        && t instanceof ContainerNotOpenException;

Review comment:
       I don't know what the best approach is:
   
   It seems that in the existing Ratis code, there is already this checkIfContainerToExclude method, and it excludes the container rather than the DN if there is any exception returned from the DN. If the DN is able to return an exception, then it is still up. However I guess it could somehow be in a state where everything on it is failing (I don't know how that may happen).
   
   The existing code makes me think that we many not need the ContainerNotOpenException check here, and just use the existing Super method to handle things in the same way as for Ratis writes? That way  it is consistent and we will handle other potential errors from the DN too.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org