You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/06/30 18:07:05 UTC

[GitHub] [ozone] adoroszlai opened a new pull request, #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

adoroszlai opened a new pull request, #3573:
URL: https://github.com/apache/ozone/pull/3573

   ## What changes were proposed in this pull request?
   
   Extract a common interface for `ContainerReplicaCount` and `ECContainerReplicaCount` to allow `DatanodeAdminMonitorImpl` to handle both EC and non-EC containers.
   
   https://issues.apache.org/jira/browse/HDDS-6970
   
   ## How was this patch tested?
   
   Full CI:
   https://github.com/adoroszlai/hadoop-ozone/actions/runs/2590660510


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

sodonnel commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911768159


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReplicaCount.java:
##########
@@ -262,22 +60,24 @@ public boolean isOverReplicated() {
    *
    * @return true if the container is healthy, false otherwise
    */
-  public boolean isHealthy() {
-    return (container.getState() == HddsProtos.LifeCycleState.CLOSED
-        || container.getState() == HddsProtos.LifeCycleState.QUASI_CLOSED)
-        && replica.stream()
+  default boolean isHealthy() {
+    HddsProtos.LifeCycleState containerState = getContainer().getState();
+    return (containerState == HddsProtos.LifeCycleState.CLOSED
+        || containerState == HddsProtos.LifeCycleState.QUASI_CLOSED)
+        && getReplicas().stream()
         .filter(r -> r.getDatanodeDetails().getPersistedOpState() == IN_SERVICE)
         .allMatch(r -> LegacyReplicationManager.compareState(
-            container.getState(), r.getState()));
+            containerState, r.getState()));
+
   }
 
   /**
-   * Returns true is there are no replicas of a container available, ie the
-   * set of container replica passed in the constructor has zero entries.
+   * Returns true is there are no replicas of the container available, ie the
+   * set of container replicas has zero entries.
    *
    * @return true if there are no replicas, false otherwise.
    */
-  public boolean isMissing() {
-    return replica.size() == 0;
+  default boolean isMissing() {

Review Comment:
   For EC, we have a method `public boolean unRecoverable()`. The definition of missing for EC is a bit strange. For Ratis it is very clear - there are no replicas available at all. For EC, a container is effectively missing if there are not dataNum containers available. We probably need to override this in the EC class and had it return the the result of unRecoverable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

sodonnel commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911791756


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManager.java:
##########
@@ -493,6 +498,15 @@ public boolean isContainerReplicatingOrDeleting(ContainerID containerID) {
         .isContainerReplicatingOrDeleting(containerID);
   }
 
+  private ECContainerReplicaCount getECContainerReplicaCount(
+      ContainerInfo containerInfo) throws ContainerNotFoundException {
+    Set<ContainerReplica> replicas = containerManager.getContainerReplicas(
+        containerInfo.containerID());
+    List<ContainerReplicaOp> pendingOps =
+        containerReplicaPendingOps.getPendingOps(containerInfo.containerID());
+    return new ECContainerReplicaCount(containerInfo, replicas, pendingOps, 0);

Review Comment:
   The zero at the end of the parameters is definitely not correct, but I don't know yet what is the correct value to put here.
   
   Could you add a TODO - define maintenance redundancy for EC (HDDS-6975) here?
   
   We will need to fix this in a couple of places I think.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

sodonnel commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911815621


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReplicaCount.java:
##########
@@ -262,22 +60,24 @@ public boolean isOverReplicated() {
    *
    * @return true if the container is healthy, false otherwise
    */
-  public boolean isHealthy() {
-    return (container.getState() == HddsProtos.LifeCycleState.CLOSED
-        || container.getState() == HddsProtos.LifeCycleState.QUASI_CLOSED)
-        && replica.stream()
+  default boolean isHealthy() {
+    HddsProtos.LifeCycleState containerState = getContainer().getState();
+    return (containerState == HddsProtos.LifeCycleState.CLOSED
+        || containerState == HddsProtos.LifeCycleState.QUASI_CLOSED)
+        && getReplicas().stream()
         .filter(r -> r.getDatanodeDetails().getPersistedOpState() == IN_SERVICE)
         .allMatch(r -> LegacyReplicationManager.compareState(
-            container.getState(), r.getState()));
+            containerState, r.getState()));
+
   }
 
   /**
-   * Returns true is there are no replicas of a container available, ie the
-   * set of container replica passed in the constructor has zero entries.
+   * Returns true is there are no replicas of the container available, ie the
+   * set of container replicas has zero entries.
    *
    * @return true if there are no replicas, false otherwise.
    */
-  public boolean isMissing() {
-    return replica.size() == 0;
+  default boolean isMissing() {

Review Comment:
   Yea we could rename isMissing to unRecoverable. I think I originally has isMissing in the EC class, but Uma asked me to change it to unRecoverable. Would make sense to standardize on unRecoverable both classes I think.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] adoroszlai merged pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

adoroszlai merged PR #3573:
URL: https://github.com/apache/ozone/pull/3573


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

sodonnel commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911762253


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerIdenticalReplicaCount.java:
##########
@@ -0,0 +1,262 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdds.scm.container;
+
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+
+import java.util.Set;
+
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.DECOMMISSIONED;
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.DECOMMISSIONING;
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.ENTERING_MAINTENANCE;
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.IN_MAINTENANCE;
+
+/**
+ * Immutable object that is created with a set of ContainerReplica objects and
+ * the number of in flight replica add and deletes, the container replication
+ * factor and the min count which must be available for maintenance. This
+ * information can be used to determine if the container is over or under
+ * replicated and also how many additional replicas need created or removed.
+ */
+public class ContainerIdenticalReplicaCount implements ContainerReplicaCount {
+
+  private int healthyCount;
+  private int decommissionCount;
+  private int maintenanceCount;
+  private final int inFlightAdd;
+  private final int inFlightDel;
+  private final int repFactor;
+  private final int minHealthyForMaintenance;
+  private final ContainerInfo container;
+  private final Set<ContainerReplica> replica;
+
+  public ContainerIdenticalReplicaCount(ContainerInfo container,

Review Comment:
   I'm not sure about the name of the class. As the other one is called ECContainerReplicaCount, would this be better as RatisContainerReplicaCount, or ReplicatedContainerReplicaCount maybe?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] adoroszlai commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

adoroszlai commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911807095


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReplicaCount.java:
##########
@@ -22,238 +22,36 @@
 
 import java.util.Set;
 
-import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.DECOMMISSIONED;
-import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.DECOMMISSIONING;
-import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.ENTERING_MAINTENANCE;
-import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.IN_MAINTENANCE;
 import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.IN_SERVICE;
 
 /**
- * Immutable object that is created with a set of ContainerReplica objects and
- * the number of in flight replica add and deletes, the container replication
- * factor and the min count which must be available for maintenance. This
- * information can be used to determine if the container is over or under
- * replicated and also how many additional replicas need created or removed.
+ * Common interface for EC and non-EC container replica counts.
+ * TODO pull up more methods if needed
  */
-public class ContainerReplicaCount {
+public interface ContainerReplicaCount {
+  ContainerInfo getContainer();
 
-  private int healthyCount = 0;
-  private int decommissionCount = 0;
-  private int maintenanceCount = 0;
-  private int inFlightAdd = 0;
-  private int inFlightDel = 0;
-  private int repFactor;
-  private int minHealthyForMaintenance;
-  private ContainerInfo container;
-  private Set<ContainerReplica> replica;
+  Set<ContainerReplica> getReplicas();
 
-  public ContainerReplicaCount(ContainerInfo container,
-                               Set<ContainerReplica> replica, int inFlightAdd,
-                               int inFlightDelete, int replicationFactor,
-                               int minHealthyForMaintenance) {
-    this.healthyCount = 0;
-    this.decommissionCount = 0;
-    this.maintenanceCount = 0;
-    this.inFlightAdd = inFlightAdd;
-    this.inFlightDel = inFlightDelete;
-    this.repFactor = replicationFactor;
-    this.replica = replica;
-    this.minHealthyForMaintenance
-        = Math.min(this.repFactor, minHealthyForMaintenance);
-    this.container = container;
+  boolean isSufficientlyReplicated();
 
-    for (ContainerReplica cr : this.replica) {
-      HddsProtos.NodeOperationalState state =
-          cr.getDatanodeDetails().getPersistedOpState();
-      if (state == DECOMMISSIONED || state == DECOMMISSIONING) {
-        decommissionCount++;
-      } else if (state == IN_MAINTENANCE || state == ENTERING_MAINTENANCE) {
-        maintenanceCount++;
-      } else {
-        healthyCount++;
-      }
-    }
-  }
+  boolean isOverReplicated();
 
-  public int getHealthyCount() {
-    return healthyCount;
-  }
+  int getDecommissionCount();
 
-  public int getDecommissionCount() {
-    return decommissionCount;
-  }
-
-  public int getMaintenanceCount() {
-    return maintenanceCount;
-  }
-
-  public int getReplicationFactor() {
-    return repFactor;
-  }
-
-  public ContainerInfo getContainer() {
-    return container;
-  }
-
-  public Set<ContainerReplica> getReplica() {
-    return replica;
-  }
-
-  @Override
-  public String toString() {
-    return "Container State: " + container.getState() +
-        " Replica Count: " + replica.size() +
-        " Healthy Count: " + healthyCount +
-        " Decommission Count: " + decommissionCount +
-        " Maintenance Count: " + maintenanceCount +
-        " inFlightAdd Count: " + inFlightAdd +
-        " inFightDel Count: " + inFlightDel +
-        " ReplicationFactor: " + repFactor +
-        " minMaintenance Count: " + minHealthyForMaintenance;
-  }
+  int getMaintenanceCount();
 
   /**
-   * Calculates the the delta of replicas which need to be created or removed
+   * Calculates the delta of replicas which need to be created or removed
    * to ensure the container is correctly replicated when considered inflight
    * adds and deletes.
    *
-   * When considering inflight operations, it is assumed any operation will
-   * fail. However, to consider the worst case and avoid data loss, we always
-   * assume a delete will succeed and and add will fail. In this way, we will
-   * avoid scheduling too many deletes which could result in dataloss.
-   *
-   * Decisions around over-replication are made only on healthy replicas,
-   * ignoring any in maintenance and also any inflight adds. InFlight adds are
-   * ignored, as they may not complete, so if we have:
-   *
-   *     H, H, H, IN_FLIGHT_ADD
-   *
-   * And then schedule a delete, we could end up under-replicated (add fails,
-   * delete completes). It is better to let the inflight operations complete
-   * and then deal with any further over or under replication.
-   *
-   * For maintenance replicas, assuming replication factor 3, and minHealthy
-   * 2, it is possible for all 3 hosts to be put into maintenance, leaving the
-   * following (H = healthy, M = maintenance):
-   *
-   *     H, H, M, M, M
-   *
-   * Even though we are tracking 5 replicas, this is not over replicated as we
-   * ignore the maintenance copies. Later, the replicas could look like:
-   *
-   *     H, H, H, H, M
-   *
-   * At this stage, the container is over replicated by 1, so one replica can be
-   * removed.
-   *
-   * For containers which have replication factor healthy replica, we ignore any
-   * inflight add or deletes, as they may fail. Instead, wait for them to
-   * complete and then deal with any excess or deficit.
-   *
-   * For under replicated containers we do consider inflight add and delete to
-   * avoid scheduling more adds than needed. There is additional logic around
-   * containers with maintenance replica to ensure minHealthyForMaintenance
-   * replia are maintained.
-   *
    * @return Delta of replicas needed. Negative indicates over replication and
    *         containers should be removed. Positive indicates over replication
    *         and zero indicates the containers has replicationFactor healthy
    *         replica
    */
-  public int additionalReplicaNeeded() {
-    int delta = missingReplicas();
-
-    if (delta < 0) {
-      // Over replicated, so may need to remove a container. Do not consider
-      // inFlightAdds, as they may fail, but do consider inFlightDel which
-      // will reduce the over-replication if it completes.
-      // Note this could make the delta positive if there are too many in flight
-      // deletes, which will result in an additional being scheduled.
-      return delta + inFlightDel;
-    } else {
-      // May be under or perfectly replicated.
-      // We must consider in flight add and delete when calculating the new
-      // containers needed, but we bound the lower limit at zero to allow
-      // inflight operations to complete before handling any potential over
-      // replication
-      return Math.max(0, delta - inFlightAdd + inFlightDel);
-    }
-  }
-
-  /**
-   * Returns the count of replicas which need to be created or removed to
-   * ensure the container is perfectly replicate. Inflight operations are not
-   * considered here, but the logic to determine the missing or excess counts
-   * for maintenance is present.
-   *
-   * Decisions around over-replication are made only on healthy replicas,
-   * ignoring any in maintenance. For example, if we have:
-   *
-   *     H, H, H, M, M
-   *
-   * This will not be consider over replicated until one of the Maintenance
-   * replicas moves to Healthy.
-   *
-   * If the container is perfectly replicated, zero will be return.
-   *
-   * If it is under replicated a positive value will be returned, indicating
-   * how many replicas must be added.
-   *
-   * If it is over replicated a negative value will be returned, indicating now
-   * many replicas to remove.
-   *
-   * @return Zero if the container is perfectly replicated, a positive value
-   *         for under replicated and a negative value for over replicated.
-   */
-  private int missingReplicas() {
-    int delta = repFactor - healthyCount;
-
-    if (delta < 0) {
-      // Over replicated, so may need to remove a container.
-      return delta;
-    } else if (delta > 0) {
-      // May be under-replicated, depending on maintenance.
-      delta = Math.max(0, delta - maintenanceCount);
-      int neededHealthy =
-          Math.max(0, minHealthyForMaintenance - healthyCount);
-      delta = Math.max(neededHealthy, delta);
-      return delta;
-    } else { // delta == 0
-      // We have exactly the number of healthy replicas needed.
-      return delta;
-    }
-  }
-
-  /**
-   * Return true if the container is sufficiently replicated. Decommissioning
-   * and Decommissioned containers are ignored in this check, assuming they will
-   * eventually be removed from the cluster.
-   * This check ignores inflight additions, as those replicas have not yet been
-   * created and the create could fail for some reason.
-   * The check does consider inflight deletes as there may be 3 healthy replicas
-   * now, but once the delete completes it will reduce to 2.
-   * We also assume a replica in Maintenance state cannot be removed, so the
-   * pending delete would affect only the healthy replica count.
-   *
-   * @return True if the container is sufficiently replicated and False
-   *         otherwise.
-   */
-  public boolean isSufficientlyReplicated() {
-    return missingReplicas() + inFlightDel <= 0;
-  }
-
-  /**
-   * Return true is the container is over replicated. Decommission and
-   * maintenance containers are ignored for this check.
-   * The check ignores inflight additions, as they may fail, but it does
-   * consider inflight deletes, as they would reduce the over replication when
-   * they complete.
-   *
-   * @return True if the container is over replicated, false otherwise.
-   */
-  public boolean isOverReplicated() {
-    return missingReplicas() + inFlightDel < 0;
-  }
+  int additionalReplicaNeeded();

Review Comment:
   Yep, turns out it's not needed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

sodonnel commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911768159


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReplicaCount.java:
##########
@@ -262,22 +60,24 @@ public boolean isOverReplicated() {
    *
    * @return true if the container is healthy, false otherwise
    */
-  public boolean isHealthy() {
-    return (container.getState() == HddsProtos.LifeCycleState.CLOSED
-        || container.getState() == HddsProtos.LifeCycleState.QUASI_CLOSED)
-        && replica.stream()
+  default boolean isHealthy() {
+    HddsProtos.LifeCycleState containerState = getContainer().getState();
+    return (containerState == HddsProtos.LifeCycleState.CLOSED
+        || containerState == HddsProtos.LifeCycleState.QUASI_CLOSED)
+        && getReplicas().stream()
         .filter(r -> r.getDatanodeDetails().getPersistedOpState() == IN_SERVICE)
         .allMatch(r -> LegacyReplicationManager.compareState(
-            container.getState(), r.getState()));
+            containerState, r.getState()));
+
   }
 
   /**
-   * Returns true is there are no replicas of a container available, ie the
-   * set of container replica passed in the constructor has zero entries.
+   * Returns true is there are no replicas of the container available, ie the
+   * set of container replicas has zero entries.
    *
    * @return true if there are no replicas, false otherwise.
    */
-  public boolean isMissing() {
-    return replica.size() == 0;
+  default boolean isMissing() {

Review Comment:
   For EC, we have a method `public boolean unRecoverable()`. The definition of missing for EC is a bit strange. For Ratis it is very clear - there are no replicas available at all. For EC, a container is effectively missing if there are no dataNum containers available. We probably need to override this in the EC class and have it return the the result of unRecoverable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] adoroszlai commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

adoroszlai commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911810452


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReplicaCount.java:
##########
@@ -262,22 +60,24 @@ public boolean isOverReplicated() {
    *
    * @return true if the container is healthy, false otherwise
    */
-  public boolean isHealthy() {
-    return (container.getState() == HddsProtos.LifeCycleState.CLOSED
-        || container.getState() == HddsProtos.LifeCycleState.QUASI_CLOSED)
-        && replica.stream()
+  default boolean isHealthy() {
+    HddsProtos.LifeCycleState containerState = getContainer().getState();
+    return (containerState == HddsProtos.LifeCycleState.CLOSED
+        || containerState == HddsProtos.LifeCycleState.QUASI_CLOSED)
+        && getReplicas().stream()
         .filter(r -> r.getDatanodeDetails().getPersistedOpState() == IN_SERVICE)
         .allMatch(r -> LegacyReplicationManager.compareState(
-            container.getState(), r.getState()));
+            containerState, r.getState()));
+
   }
 
   /**
-   * Returns true is there are no replicas of a container available, ie the
-   * set of container replica passed in the constructor has zero entries.
+   * Returns true is there are no replicas of the container available, ie the
+   * set of container replicas has zero entries.
    *
    * @return true if there are no replicas, false otherwise.
    */
-  public boolean isMissing() {
-    return replica.size() == 0;
+  default boolean isMissing() {

Review Comment:
   Done, thanks.  What do you think about merging these two methods, e.g. renaming `isMissing` to `unRecoverable` (which seems more general) or vice versa?
   
   Note: I think `isUnrecoverable` would be a more standard name for `unRecoverable`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

sodonnel commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911773646


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReplicaCount.java:
##########
@@ -22,238 +22,36 @@
 
 import java.util.Set;
 
-import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.DECOMMISSIONED;
-import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.DECOMMISSIONING;
-import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.ENTERING_MAINTENANCE;
-import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.IN_MAINTENANCE;
 import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState.IN_SERVICE;
 
 /**
- * Immutable object that is created with a set of ContainerReplica objects and
- * the number of in flight replica add and deletes, the container replication
- * factor and the min count which must be available for maintenance. This
- * information can be used to determine if the container is over or under
- * replicated and also how many additional replicas need created or removed.
+ * Common interface for EC and non-EC container replica counts.
+ * TODO pull up more methods if needed
  */
-public class ContainerReplicaCount {
+public interface ContainerReplicaCount {
+  ContainerInfo getContainer();
 
-  private int healthyCount = 0;
-  private int decommissionCount = 0;
-  private int maintenanceCount = 0;
-  private int inFlightAdd = 0;
-  private int inFlightDel = 0;
-  private int repFactor;
-  private int minHealthyForMaintenance;
-  private ContainerInfo container;
-  private Set<ContainerReplica> replica;
+  Set<ContainerReplica> getReplicas();
 
-  public ContainerReplicaCount(ContainerInfo container,
-                               Set<ContainerReplica> replica, int inFlightAdd,
-                               int inFlightDelete, int replicationFactor,
-                               int minHealthyForMaintenance) {
-    this.healthyCount = 0;
-    this.decommissionCount = 0;
-    this.maintenanceCount = 0;
-    this.inFlightAdd = inFlightAdd;
-    this.inFlightDel = inFlightDelete;
-    this.repFactor = replicationFactor;
-    this.replica = replica;
-    this.minHealthyForMaintenance
-        = Math.min(this.repFactor, minHealthyForMaintenance);
-    this.container = container;
+  boolean isSufficientlyReplicated();
 
-    for (ContainerReplica cr : this.replica) {
-      HddsProtos.NodeOperationalState state =
-          cr.getDatanodeDetails().getPersistedOpState();
-      if (state == DECOMMISSIONED || state == DECOMMISSIONING) {
-        decommissionCount++;
-      } else if (state == IN_MAINTENANCE || state == ENTERING_MAINTENANCE) {
-        maintenanceCount++;
-      } else {
-        healthyCount++;
-      }
-    }
-  }
+  boolean isOverReplicated();
 
-  public int getHealthyCount() {
-    return healthyCount;
-  }
+  int getDecommissionCount();
 
-  public int getDecommissionCount() {
-    return decommissionCount;
-  }
-
-  public int getMaintenanceCount() {
-    return maintenanceCount;
-  }
-
-  public int getReplicationFactor() {
-    return repFactor;
-  }
-
-  public ContainerInfo getContainer() {
-    return container;
-  }
-
-  public Set<ContainerReplica> getReplica() {
-    return replica;
-  }
-
-  @Override
-  public String toString() {
-    return "Container State: " + container.getState() +
-        " Replica Count: " + replica.size() +
-        " Healthy Count: " + healthyCount +
-        " Decommission Count: " + decommissionCount +
-        " Maintenance Count: " + maintenanceCount +
-        " inFlightAdd Count: " + inFlightAdd +
-        " inFightDel Count: " + inFlightDel +
-        " ReplicationFactor: " + repFactor +
-        " minMaintenance Count: " + minHealthyForMaintenance;
-  }
+  int getMaintenanceCount();
 
   /**
-   * Calculates the the delta of replicas which need to be created or removed
+   * Calculates the delta of replicas which need to be created or removed
    * to ensure the container is correctly replicated when considered inflight
    * adds and deletes.
    *
-   * When considering inflight operations, it is assumed any operation will
-   * fail. However, to consider the worst case and avoid data loss, we always
-   * assume a delete will succeed and and add will fail. In this way, we will
-   * avoid scheduling too many deletes which could result in dataloss.
-   *
-   * Decisions around over-replication are made only on healthy replicas,
-   * ignoring any in maintenance and also any inflight adds. InFlight adds are
-   * ignored, as they may not complete, so if we have:
-   *
-   *     H, H, H, IN_FLIGHT_ADD
-   *
-   * And then schedule a delete, we could end up under-replicated (add fails,
-   * delete completes). It is better to let the inflight operations complete
-   * and then deal with any further over or under replication.
-   *
-   * For maintenance replicas, assuming replication factor 3, and minHealthy
-   * 2, it is possible for all 3 hosts to be put into maintenance, leaving the
-   * following (H = healthy, M = maintenance):
-   *
-   *     H, H, M, M, M
-   *
-   * Even though we are tracking 5 replicas, this is not over replicated as we
-   * ignore the maintenance copies. Later, the replicas could look like:
-   *
-   *     H, H, H, H, M
-   *
-   * At this stage, the container is over replicated by 1, so one replica can be
-   * removed.
-   *
-   * For containers which have replication factor healthy replica, we ignore any
-   * inflight add or deletes, as they may fail. Instead, wait for them to
-   * complete and then deal with any excess or deficit.
-   *
-   * For under replicated containers we do consider inflight add and delete to
-   * avoid scheduling more adds than needed. There is additional logic around
-   * containers with maintenance replica to ensure minHealthyForMaintenance
-   * replia are maintained.
-   *
    * @return Delta of replicas needed. Negative indicates over replication and
    *         containers should be removed. Positive indicates over replication
    *         and zero indicates the containers has replicationFactor healthy
    *         replica
    */
-  public int additionalReplicaNeeded() {
-    int delta = missingReplicas();
-
-    if (delta < 0) {
-      // Over replicated, so may need to remove a container. Do not consider
-      // inFlightAdds, as they may fail, but do consider inFlightDel which
-      // will reduce the over-replication if it completes.
-      // Note this could make the delta positive if there are too many in flight
-      // deletes, which will result in an additional being scheduled.
-      return delta + inFlightDel;
-    } else {
-      // May be under or perfectly replicated.
-      // We must consider in flight add and delete when calculating the new
-      // containers needed, but we bound the lower limit at zero to allow
-      // inflight operations to complete before handling any potential over
-      // replication
-      return Math.max(0, delta - inFlightAdd + inFlightDel);
-    }
-  }
-
-  /**
-   * Returns the count of replicas which need to be created or removed to
-   * ensure the container is perfectly replicate. Inflight operations are not
-   * considered here, but the logic to determine the missing or excess counts
-   * for maintenance is present.
-   *
-   * Decisions around over-replication are made only on healthy replicas,
-   * ignoring any in maintenance. For example, if we have:
-   *
-   *     H, H, H, M, M
-   *
-   * This will not be consider over replicated until one of the Maintenance
-   * replicas moves to Healthy.
-   *
-   * If the container is perfectly replicated, zero will be return.
-   *
-   * If it is under replicated a positive value will be returned, indicating
-   * how many replicas must be added.
-   *
-   * If it is over replicated a negative value will be returned, indicating now
-   * many replicas to remove.
-   *
-   * @return Zero if the container is perfectly replicated, a positive value
-   *         for under replicated and a negative value for over replicated.
-   */
-  private int missingReplicas() {
-    int delta = repFactor - healthyCount;
-
-    if (delta < 0) {
-      // Over replicated, so may need to remove a container.
-      return delta;
-    } else if (delta > 0) {
-      // May be under-replicated, depending on maintenance.
-      delta = Math.max(0, delta - maintenanceCount);
-      int neededHealthy =
-          Math.max(0, minHealthyForMaintenance - healthyCount);
-      delta = Math.max(neededHealthy, delta);
-      return delta;
-    } else { // delta == 0
-      // We have exactly the number of healthy replicas needed.
-      return delta;
-    }
-  }
-
-  /**
-   * Return true if the container is sufficiently replicated. Decommissioning
-   * and Decommissioned containers are ignored in this check, assuming they will
-   * eventually be removed from the cluster.
-   * This check ignores inflight additions, as those replicas have not yet been
-   * created and the create could fail for some reason.
-   * The check does consider inflight deletes as there may be 3 healthy replicas
-   * now, but once the delete completes it will reduce to 2.
-   * We also assume a replica in Maintenance state cannot be removed, so the
-   * pending delete would affect only the healthy replica count.
-   *
-   * @return True if the container is sufficiently replicated and False
-   *         otherwise.
-   */
-  public boolean isSufficientlyReplicated() {
-    return missingReplicas() + inFlightDel <= 0;
-  }
-
-  /**
-   * Return true is the container is over replicated. Decommission and
-   * maintenance containers are ignored for this check.
-   * The check ignores inflight additions, as they may fail, but it does
-   * consider inflight deletes, as they would reduce the over replication when
-   * they complete.
-   *
-   * @return True if the container is over replicated, false otherwise.
-   */
-  public boolean isOverReplicated() {
-    return missingReplicas() + inFlightDel < 0;
-  }
+  int additionalReplicaNeeded();

Review Comment:
   I wonder if we should omit this from the interface. For EC, its a tricky calculation and probably not that useful. It could say 2 additional replicas needed, but its not too helpful, as we don't know what indexes, or if its a reconstruction or an easy copy from a decommissioning node. It feels like this doesn't apply well to the common methods, and I don't think its used inside the decommission code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

sodonnel commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911856998


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReplicaCount.java:
##########
@@ -262,22 +60,24 @@ public boolean isOverReplicated() {
    *
    * @return true if the container is healthy, false otherwise
    */
-  public boolean isHealthy() {
-    return (container.getState() == HddsProtos.LifeCycleState.CLOSED
-        || container.getState() == HddsProtos.LifeCycleState.QUASI_CLOSED)
-        && replica.stream()
+  default boolean isHealthy() {
+    HddsProtos.LifeCycleState containerState = getContainer().getState();
+    return (containerState == HddsProtos.LifeCycleState.CLOSED
+        || containerState == HddsProtos.LifeCycleState.QUASI_CLOSED)
+        && getReplicas().stream()
         .filter(r -> r.getDatanodeDetails().getPersistedOpState() == IN_SERVICE)
         .allMatch(r -> LegacyReplicationManager.compareState(
-            container.getState(), r.getState()));
+            containerState, r.getState()));
+
   }
 
   /**
-   * Returns true is there are no replicas of a container available, ie the
-   * set of container replica passed in the constructor has zero entries.
+   * Returns true is there are no replicas of the container available, ie the
+   * set of container replicas has zero entries.
    *
    * @return true if there are no replicas, false otherwise.
    */
-  public boolean isMissing() {
-    return replica.size() == 0;
+  default boolean isMissing() {

Review Comment:
   Go ahead with isUnrecoverable - I think it is best.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

sodonnel commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911791756


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManager.java:
##########
@@ -493,6 +498,15 @@ public boolean isContainerReplicatingOrDeleting(ContainerID containerID) {
         .isContainerReplicatingOrDeleting(containerID);
   }
 
+  private ECContainerReplicaCount getECContainerReplicaCount(
+      ContainerInfo containerInfo) throws ContainerNotFoundException {
+    Set<ContainerReplica> replicas = containerManager.getContainerReplicas(
+        containerInfo.containerID());
+    List<ContainerReplicaOp> pendingOps =
+        containerReplicaPendingOps.getPendingOps(containerInfo.containerID());
+    return new ECContainerReplicaCount(containerInfo, replicas, pendingOps, 0);

Review Comment:
   The zero at the end of the parameters is definitely not correct, but I don't know yet what is the correct value to put here.
   
   Could you add a TODO - define maintenance redundancy for EC (HDDS-6975) here?
   
   We will need to fix this in a couple of places I think (not related to this PR).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a diff in pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

sodonnel commented on code in PR #3573:
URL: https://github.com/apache/ozone/pull/3573#discussion_r911856263


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManager.java:
##########
@@ -493,6 +498,16 @@ public boolean isContainerReplicatingOrDeleting(ContainerID containerID) {
         .isContainerReplicatingOrDeleting(containerID);
   }
 
+  private ECContainerReplicaCount getECContainerReplicaCount(
+      ContainerInfo containerInfo) throws ContainerNotFoundException {
+    Set<ContainerReplica> replicas = containerManager.getContainerReplicas(
+        containerInfo.containerID());
+    List<ContainerReplicaOp> pendingOps =
+        containerReplicaPendingOps.getPendingOps(containerInfo.containerID());
+    // TODO: define maintenance redundancy for EC

Review Comment:
   Is it worth adding the Jira number here too - ([HDDS-6975](https://issues.apache.org/jira/browse/HDDS-6975) ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] adoroszlai commented on pull request #3573: HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission

Posted by GitBox <gi...@apache.org>.

adoroszlai commented on PR #3573:
URL: https://github.com/apache/ozone/pull/3573#issuecomment-1172573193

   Thanks @sodonnel for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org