You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/09/27 12:54:25 UTC

[GitHub] [ozone] siddhantsangwan opened a new pull request, #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

siddhantsangwan opened a new pull request, #3785:
URL: https://github.com/apache/ozone/pull/3785

   ## What changes were proposed in this pull request?
   
   Added a handler for Quasi Closed containers to Replication Manager. This handler takes care of only Ratis containers. Most of the core logic has been picked up from the legacy RM.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-7263
   
   ## How was this patch tested?
   
   Added unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3785:
URL: https://github.com/apache/ozone/pull/3785#discussion_r981360925


##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/health/TestQuasiClosedContainerHandler.java:
##########
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.container.replication.health;
+
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.client.RatisReplicationConfig;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReplicaProto.State;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.container.ContainerReplica;
+import org.apache.hadoop.hdds.scm.container.ReplicationManagerReport;
+import org.apache.hadoop.hdds.scm.container.replication.ContainerCheckRequest;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationManager;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationTestUtil;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+import org.mockito.Mockito;
+
+import java.util.Collections;
+import java.util.Set;
+
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleState.OPEN;
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleState.QUASI_CLOSED;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.ArgumentMatchers.anyBoolean;
+import static org.mockito.Mockito.times;
+
+/**
+ * Tests for {@link QuasiClosedContainerHandler}. This handler is only meant
+ * to handle Ratis containers.
+ */
+public class TestQuasiClosedContainerHandler {
+  private ReplicationManager replicationManager;
+  private QuasiClosedContainerHandler quasiClosedContainerHandler;
+  private RatisReplicationConfig ratisReplicationConfig;
+
+  @BeforeEach
+  public void setup() {
+    ratisReplicationConfig = RatisReplicationConfig.getInstance(
+        HddsProtos.ReplicationFactor.THREE);
+    replicationManager = Mockito.mock(ReplicationManager.class);
+    quasiClosedContainerHandler =
+        new QuasiClosedContainerHandler(replicationManager);
+  }
+
+  @Test
+  public void testECContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        new ECReplicationConfig(3, 2), 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2, 3);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertFalse(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(0))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  @Test
+  public void testOpenContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, OPEN);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.OPEN, 1, 2, 3);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertFalse(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(0))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  /**
+   * When a container is QUASI_CLOSED, and it has greater than 50% of its
+   * replicas in QUASI_CLOSED state with unique origin node id,
+   * the handler should send force close commands to the replica(s) with
+   * highest BCSID.
+   */
+  @Test
+  public void testQuasiClosedWithQuorumReturnsTrue() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2);
+    ContainerReplica openReplica = ReplicationTestUtil.createContainerReplica(
+        containerInfo.containerID(), 3,
+        HddsProtos.NodeOperationalState.IN_SERVICE, State.OPEN);
+    containerReplicas.add(openReplica);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertTrue(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(2))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  /**
+   * The replicas are QUASI_CLOSED, but all of them have the same origin node
+   * id. Since a quorum with unique origin node ids (greater than 50% of
+   * replicas) is not formed, the handler should return false.
+   */
+  @Test
+  public void testHealthyQuasiClosedContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicasWithSameOrigin(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2, 3);

Review Comment:
   The Replica Index should always be zero for Ratis containers. Only for EC do we have the indexes 1 to N, indicating the position the container takes in the container group.
   
   It probably doesn't cause any problems with these tests, but we should probably fix it, and just pass "0, 0, 0" rather than "1, 2, 3" for any Ratis containers as they may be other places in the code where unexpected things happen if someone reused this code later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
sodonnel commented on PR #3785:
URL: https://github.com/apache/ozone/pull/3785#issuecomment-1261016829

   Thanks for the updated PR. LGTM, pending the green CI run. I will commit after it goes green.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel merged pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
sodonnel merged PR #3785:
URL: https://github.com/apache/ozone/pull/3785


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3785:
URL: https://github.com/apache/ozone/pull/3785#discussion_r981363546


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/QuasiClosedContainerHandler.java:
##########
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.container.replication.health;
+
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReplicaProto.State;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.container.ContainerReplica;
+import org.apache.hadoop.hdds.scm.container.ReplicationManagerReport;
+import org.apache.hadoop.hdds.scm.container.replication.ContainerCheckRequest;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Class for handling containers that are in QUASI_CLOSED state. This will
+ * send commands to Datanodes to force close these containers if they satisfy
+ * the requirements to be force closed.
+ */
+public class QuasiClosedContainerHandler extends AbstractCheck {
+  public static final Logger LOG =
+      LoggerFactory.getLogger(QuasiClosedContainerHandler.class);
+
+  private final ReplicationManager replicationManager;
+
+  public QuasiClosedContainerHandler(ReplicationManager replicationManager) {
+    this.replicationManager = replicationManager;
+  }
+
+  /**
+   * If possible, force closes the Ratis container in QUASI_CLOSED state.
+   * Replicas with the highest Sequence ID are selected to be closed.
+   * @param request ContainerCheckRequest object representing the container
+   * @return true if close commands were sent, otherwise false
+   */
+  @Override
+  public boolean handle(ContainerCheckRequest request) {
+    ContainerInfo containerInfo = request.getContainerInfo();
+    if (containerInfo.getReplicationType() == HddsProtos.ReplicationType.EC) {
+      return false;
+    }
+
+    if (containerInfo.getState() != HddsProtos.LifeCycleState.QUASI_CLOSED) {
+      return false;
+    }
+
+    Set<ContainerReplica> replicas = request.getContainerReplicas();
+    if (canForceCloseContainer(containerInfo, replicas)) {
+      forceCloseContainer(containerInfo, replicas);
+      return true;
+    } else {
+      request.getReport().incrementAndSample(
+          ReplicationManagerReport.HealthState.QUASI_CLOSED_STUCK,
+          containerInfo.containerID());
+    }
+    return false;
+  }
+
+  /**
+   * Returns true if more than 50% of the container replicas with unique
+   * originNodeId are in QUASI_CLOSED state.
+   *
+   * @param container Container to check
+   * @param replicas Set of ContainerReplicas
+   * @return true if we can force close the container, false otherwise
+   */
+  private boolean canForceCloseContainer(final ContainerInfo container,
+      final Set<ContainerReplica> replicas) {
+    final int replicationFactor =
+        container.getReplicationConfig().getRequiredNodes();
+    final long uniqueQuasiClosedReplicaCount = replicas.stream()
+        .filter(r -> r.getState() == State.QUASI_CLOSED)
+        .map(ContainerReplica::getOriginDatanodeId)
+        .distinct()
+        .count();
+    return uniqueQuasiClosedReplicaCount > (replicationFactor / 2);
+  }
+
+  /**
+   * Force close the container replica(s) with the highest Sequence ID.
+   *
+   * <p>
+   *   Note: We should force close the container only if >50% (quorum)
+   *   of replicas with unique originNodeId are in QUASI_CLOSED state.
+   * </p>
+   *
+   * @param container ContainerInfo
+   * @param replicas Set of ContainerReplicas
+   */
+  private void forceCloseContainer(final ContainerInfo container,
+      final Set<ContainerReplica> replicas) {
+    final List<ContainerReplica> quasiClosedReplicas = replicas.stream()
+        .filter(r -> r.getState() == State.QUASI_CLOSED)
+        .collect(Collectors.toList());
+
+    final Long sequenceId = quasiClosedReplicas.stream()
+        .map(ContainerReplica::getSequenceId)
+        .max(Long::compare)
+        .orElse(-1L);
+
+    LOG.info("Force closing container {} with BCSID {}, which is in " +
+            "QUASI_CLOSED state.", container.containerID(), sequenceId);
+
+    quasiClosedReplicas.stream()
+        .filter(r -> sequenceId != -1L)
+        .filter(replica -> replica.getSequenceId().equals(sequenceId))

Review Comment:
   I don't think we have a unit test to validate we only send close commands for Replicas matching the max sequence number. It would be good to add one, as this is an important piece of logic I think.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3785:
URL: https://github.com/apache/ozone/pull/3785#discussion_r982375267


##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/health/TestQuasiClosedContainerHandler.java:
##########
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.container.replication.health;
+
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.client.RatisReplicationConfig;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReplicaProto.State;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.container.ContainerReplica;
+import org.apache.hadoop.hdds.scm.container.ReplicationManagerReport;
+import org.apache.hadoop.hdds.scm.container.replication.ContainerCheckRequest;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationManager;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationTestUtil;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+import org.mockito.Mockito;
+
+import java.util.Collections;
+import java.util.Set;
+
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleState.OPEN;
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleState.QUASI_CLOSED;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.ArgumentMatchers.anyBoolean;
+import static org.mockito.Mockito.times;
+
+/**
+ * Tests for {@link QuasiClosedContainerHandler}. This handler is only meant
+ * to handle Ratis containers.
+ */
+public class TestQuasiClosedContainerHandler {
+  private ReplicationManager replicationManager;
+  private QuasiClosedContainerHandler quasiClosedContainerHandler;
+  private RatisReplicationConfig ratisReplicationConfig;
+
+  @BeforeEach
+  public void setup() {
+    ratisReplicationConfig = RatisReplicationConfig.getInstance(
+        HddsProtos.ReplicationFactor.THREE);
+    replicationManager = Mockito.mock(ReplicationManager.class);
+    quasiClosedContainerHandler =
+        new QuasiClosedContainerHandler(replicationManager);
+  }
+
+  @Test
+  public void testECContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        new ECReplicationConfig(3, 2), 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2, 3);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertFalse(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(0))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  @Test
+  public void testOpenContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, OPEN);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.OPEN, 1, 2, 3);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertFalse(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(0))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  /**
+   * When a container is QUASI_CLOSED, and it has greater than 50% of its
+   * replicas in QUASI_CLOSED state with unique origin node id,
+   * the handler should send force close commands to the replica(s) with
+   * highest BCSID.
+   */
+  @Test
+  public void testQuasiClosedWithQuorumReturnsTrue() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2);
+    ContainerReplica openReplica = ReplicationTestUtil.createContainerReplica(
+        containerInfo.containerID(), 3,
+        HddsProtos.NodeOperationalState.IN_SERVICE, State.OPEN);
+    containerReplicas.add(openReplica);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertTrue(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(2))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  /**
+   * The replicas are QUASI_CLOSED, but all of them have the same origin node
+   * id. Since a quorum with unique origin node ids (greater than 50% of
+   * replicas) is not formed, the handler should return false.
+   */
+  @Test
+  public void testHealthyQuasiClosedContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicasWithSameOrigin(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2, 3);

Review Comment:
   Created a jira for those other tests https://issues.apache.org/jira/browse/HDDS-7268



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3785:
URL: https://github.com/apache/ozone/pull/3785#discussion_r982164031


##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/health/TestQuasiClosedContainerHandler.java:
##########
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.container.replication.health;
+
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.client.RatisReplicationConfig;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReplicaProto.State;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.container.ContainerReplica;
+import org.apache.hadoop.hdds.scm.container.ReplicationManagerReport;
+import org.apache.hadoop.hdds.scm.container.replication.ContainerCheckRequest;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationManager;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationTestUtil;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+import org.mockito.Mockito;
+
+import java.util.Collections;
+import java.util.Set;
+
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleState.OPEN;
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleState.QUASI_CLOSED;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.ArgumentMatchers.anyBoolean;
+import static org.mockito.Mockito.times;
+
+/**
+ * Tests for {@link QuasiClosedContainerHandler}. This handler is only meant
+ * to handle Ratis containers.
+ */
+public class TestQuasiClosedContainerHandler {
+  private ReplicationManager replicationManager;
+  private QuasiClosedContainerHandler quasiClosedContainerHandler;
+  private RatisReplicationConfig ratisReplicationConfig;
+
+  @BeforeEach
+  public void setup() {
+    ratisReplicationConfig = RatisReplicationConfig.getInstance(
+        HddsProtos.ReplicationFactor.THREE);
+    replicationManager = Mockito.mock(ReplicationManager.class);
+    quasiClosedContainerHandler =
+        new QuasiClosedContainerHandler(replicationManager);
+  }
+
+  @Test
+  public void testECContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        new ECReplicationConfig(3, 2), 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2, 3);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertFalse(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(0))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  @Test
+  public void testOpenContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, OPEN);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.OPEN, 1, 2, 3);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertFalse(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(0))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  /**
+   * When a container is QUASI_CLOSED, and it has greater than 50% of its
+   * replicas in QUASI_CLOSED state with unique origin node id,
+   * the handler should send force close commands to the replica(s) with
+   * highest BCSID.
+   */
+  @Test
+  public void testQuasiClosedWithQuorumReturnsTrue() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2);
+    ContainerReplica openReplica = ReplicationTestUtil.createContainerReplica(
+        containerInfo.containerID(), 3,
+        HddsProtos.NodeOperationalState.IN_SERVICE, State.OPEN);
+    containerReplicas.add(openReplica);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertTrue(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(2))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  /**
+   * The replicas are QUASI_CLOSED, but all of them have the same origin node
+   * id. Since a quorum with unique origin node ids (greater than 50% of
+   * replicas) is not formed, the handler should return false.
+   */
+  @Test
+  public void testHealthyQuasiClosedContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicasWithSameOrigin(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2, 3);

Review Comment:
   Got it. Will also need to change this in the Ratis related tests for other handlers. I'll open another PR for that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3785:
URL: https://github.com/apache/ozone/pull/3785#discussion_r982223512


##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/health/TestQuasiClosedContainerHandler.java:
##########
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.container.replication.health;
+
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.client.RatisReplicationConfig;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReplicaProto.State;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.container.ContainerReplica;
+import org.apache.hadoop.hdds.scm.container.ReplicationManagerReport;
+import org.apache.hadoop.hdds.scm.container.replication.ContainerCheckRequest;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationManager;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationTestUtil;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+import org.mockito.Mockito;
+
+import java.util.Collections;
+import java.util.Set;
+
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleState.OPEN;
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleState.QUASI_CLOSED;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.ArgumentMatchers.anyBoolean;
+import static org.mockito.Mockito.times;
+
+/**
+ * Tests for {@link QuasiClosedContainerHandler}. This handler is only meant
+ * to handle Ratis containers.
+ */
+public class TestQuasiClosedContainerHandler {
+  private ReplicationManager replicationManager;
+  private QuasiClosedContainerHandler quasiClosedContainerHandler;
+  private RatisReplicationConfig ratisReplicationConfig;
+
+  @BeforeEach
+  public void setup() {
+    ratisReplicationConfig = RatisReplicationConfig.getInstance(
+        HddsProtos.ReplicationFactor.THREE);
+    replicationManager = Mockito.mock(ReplicationManager.class);
+    quasiClosedContainerHandler =
+        new QuasiClosedContainerHandler(replicationManager);
+  }
+
+  @Test
+  public void testECContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        new ECReplicationConfig(3, 2), 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2, 3);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertFalse(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(0))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  @Test
+  public void testOpenContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, OPEN);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.OPEN, 1, 2, 3);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertFalse(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(0))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  /**
+   * When a container is QUASI_CLOSED, and it has greater than 50% of its
+   * replicas in QUASI_CLOSED state with unique origin node id,
+   * the handler should send force close commands to the replica(s) with
+   * highest BCSID.
+   */
+  @Test
+  public void testQuasiClosedWithQuorumReturnsTrue() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2);
+    ContainerReplica openReplica = ReplicationTestUtil.createContainerReplica(
+        containerInfo.containerID(), 3,
+        HddsProtos.NodeOperationalState.IN_SERVICE, State.OPEN);
+    containerReplicas.add(openReplica);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertTrue(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(2))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  /**
+   * The replicas are QUASI_CLOSED, but all of them have the same origin node
+   * id. Since a quorum with unique origin node ids (greater than 50% of
+   * replicas) is not formed, the handler should return false.
+   */
+  @Test
+  public void testHealthyQuasiClosedContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicasWithSameOrigin(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2, 3);

Review Comment:
   Ah, I did not spot it then. But yea, it would be good to tidy it up there too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3785:
URL: https://github.com/apache/ozone/pull/3785#discussion_r981334425


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/QuasiClosedContainerHandler.java:
##########
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.container.replication.health;
+
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReplicaProto.State;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.container.ContainerReplica;
+import org.apache.hadoop.hdds.scm.container.ReplicationManagerReport;
+import org.apache.hadoop.hdds.scm.container.replication.ContainerCheckRequest;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Class for handling containers that are in QUASI_CLOSED state. This will
+ * send commands to Datanodes to force close these containers if they satisfy
+ * the requirements to be force closed.
+ */
+public class QuasiClosedContainerHandler extends AbstractCheck {
+  public static final Logger LOG =
+      LoggerFactory.getLogger(QuasiClosedContainerHandler.class);
+
+  private final ReplicationManager replicationManager;
+
+  public QuasiClosedContainerHandler(ReplicationManager replicationManager) {
+    this.replicationManager = replicationManager;
+  }
+
+  /**
+   * If possible, force closes the Ratis container in QUASI_CLOSED state.
+   * Replicas with the highest Sequence ID are selected to be closed.
+   * @param request ContainerCheckRequest object representing the container
+   * @return true if close commands were sent, otherwise false
+   */
+  @Override
+  public boolean handle(ContainerCheckRequest request) {
+    ContainerInfo containerInfo = request.getContainerInfo();
+    if (containerInfo.getReplicationType() == HddsProtos.ReplicationType.EC) {

Review Comment:
   I think it should be `!= RATIS` - quasi closed is really only a Ratis thing as far as I know.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3785:
URL: https://github.com/apache/ozone/pull/3785#discussion_r982365844


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/QuasiClosedContainerHandler.java:
##########
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.container.replication.health;
+
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReplicaProto.State;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.container.ContainerReplica;
+import org.apache.hadoop.hdds.scm.container.ReplicationManagerReport;
+import org.apache.hadoop.hdds.scm.container.replication.ContainerCheckRequest;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Class for handling containers that are in QUASI_CLOSED state. This will
+ * send commands to Datanodes to force close these containers if they satisfy
+ * the requirements to be force closed.
+ */
+public class QuasiClosedContainerHandler extends AbstractCheck {
+  public static final Logger LOG =
+      LoggerFactory.getLogger(QuasiClosedContainerHandler.class);
+
+  private final ReplicationManager replicationManager;
+
+  public QuasiClosedContainerHandler(ReplicationManager replicationManager) {
+    this.replicationManager = replicationManager;
+  }
+
+  /**
+   * If possible, force closes the Ratis container in QUASI_CLOSED state.
+   * Replicas with the highest Sequence ID are selected to be closed.
+   * @param request ContainerCheckRequest object representing the container
+   * @return true if close commands were sent, otherwise false
+   */
+  @Override
+  public boolean handle(ContainerCheckRequest request) {
+    ContainerInfo containerInfo = request.getContainerInfo();
+    if (containerInfo.getReplicationType() == HddsProtos.ReplicationType.EC) {
+      return false;
+    }
+
+    if (containerInfo.getState() != HddsProtos.LifeCycleState.QUASI_CLOSED) {
+      return false;
+    }
+
+    Set<ContainerReplica> replicas = request.getContainerReplicas();
+    if (canForceCloseContainer(containerInfo, replicas)) {
+      forceCloseContainer(containerInfo, replicas);
+      return true;
+    } else {
+      request.getReport().incrementAndSample(
+          ReplicationManagerReport.HealthState.QUASI_CLOSED_STUCK,
+          containerInfo.containerID());
+    }
+    return false;
+  }
+
+  /**
+   * Returns true if more than 50% of the container replicas with unique
+   * originNodeId are in QUASI_CLOSED state.
+   *
+   * @param container Container to check
+   * @param replicas Set of ContainerReplicas
+   * @return true if we can force close the container, false otherwise
+   */
+  private boolean canForceCloseContainer(final ContainerInfo container,
+      final Set<ContainerReplica> replicas) {
+    final int replicationFactor =
+        container.getReplicationConfig().getRequiredNodes();
+    final long uniqueQuasiClosedReplicaCount = replicas.stream()
+        .filter(r -> r.getState() == State.QUASI_CLOSED)
+        .map(ContainerReplica::getOriginDatanodeId)
+        .distinct()
+        .count();
+    return uniqueQuasiClosedReplicaCount > (replicationFactor / 2);
+  }
+
+  /**
+   * Force close the container replica(s) with the highest Sequence ID.
+   *
+   * <p>
+   *   Note: We should force close the container only if >50% (quorum)
+   *   of replicas with unique originNodeId are in QUASI_CLOSED state.
+   * </p>
+   *
+   * @param container ContainerInfo
+   * @param replicas Set of ContainerReplicas
+   */
+  private void forceCloseContainer(final ContainerInfo container,
+      final Set<ContainerReplica> replicas) {
+    final List<ContainerReplica> quasiClosedReplicas = replicas.stream()
+        .filter(r -> r.getState() == State.QUASI_CLOSED)
+        .collect(Collectors.toList());
+
+    final Long sequenceId = quasiClosedReplicas.stream()
+        .map(ContainerReplica::getSequenceId)
+        .max(Long::compare)
+        .orElse(-1L);
+
+    LOG.info("Force closing container {} with BCSID {}, which is in " +
+            "QUASI_CLOSED state.", container.containerID(), sequenceId);
+
+    quasiClosedReplicas.stream()
+        .filter(r -> sequenceId != -1L)
+        .filter(replica -> replica.getSequenceId().equals(sequenceId))

Review Comment:
   Added `testReplicasWithHighestBCSIDAreClosed` in the latest commit



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3785:
URL: https://github.com/apache/ozone/pull/3785#discussion_r982375267


##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/health/TestQuasiClosedContainerHandler.java:
##########
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.container.replication.health;
+
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.client.RatisReplicationConfig;
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReplicaProto.State;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.container.ContainerReplica;
+import org.apache.hadoop.hdds.scm.container.ReplicationManagerReport;
+import org.apache.hadoop.hdds.scm.container.replication.ContainerCheckRequest;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationManager;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationTestUtil;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+import org.mockito.Mockito;
+
+import java.util.Collections;
+import java.util.Set;
+
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleState.OPEN;
+import static org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleState.QUASI_CLOSED;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.ArgumentMatchers.anyBoolean;
+import static org.mockito.Mockito.times;
+
+/**
+ * Tests for {@link QuasiClosedContainerHandler}. This handler is only meant
+ * to handle Ratis containers.
+ */
+public class TestQuasiClosedContainerHandler {
+  private ReplicationManager replicationManager;
+  private QuasiClosedContainerHandler quasiClosedContainerHandler;
+  private RatisReplicationConfig ratisReplicationConfig;
+
+  @BeforeEach
+  public void setup() {
+    ratisReplicationConfig = RatisReplicationConfig.getInstance(
+        HddsProtos.ReplicationFactor.THREE);
+    replicationManager = Mockito.mock(ReplicationManager.class);
+    quasiClosedContainerHandler =
+        new QuasiClosedContainerHandler(replicationManager);
+  }
+
+  @Test
+  public void testECContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        new ECReplicationConfig(3, 2), 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2, 3);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertFalse(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(0))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  @Test
+  public void testOpenContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, OPEN);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.OPEN, 1, 2, 3);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertFalse(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(0))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  /**
+   * When a container is QUASI_CLOSED, and it has greater than 50% of its
+   * replicas in QUASI_CLOSED state with unique origin node id,
+   * the handler should send force close commands to the replica(s) with
+   * highest BCSID.
+   */
+  @Test
+  public void testQuasiClosedWithQuorumReturnsTrue() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicas(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2);
+    ContainerReplica openReplica = ReplicationTestUtil.createContainerReplica(
+        containerInfo.containerID(), 3,
+        HddsProtos.NodeOperationalState.IN_SERVICE, State.OPEN);
+    containerReplicas.add(openReplica);
+    ContainerCheckRequest request = new ContainerCheckRequest.Builder()
+        .setPendingOps(Collections.EMPTY_LIST)
+        .setReport(new ReplicationManagerReport())
+        .setContainerInfo(containerInfo)
+        .setContainerReplicas(containerReplicas)
+        .build();
+
+    Assertions.assertTrue(quasiClosedContainerHandler.handle(request));
+    Mockito.verify(replicationManager, times(2))
+        .sendCloseContainerReplicaCommand(any(), any(), anyBoolean());
+  }
+
+  /**
+   * The replicas are QUASI_CLOSED, but all of them have the same origin node
+   * id. Since a quorum with unique origin node ids (greater than 50% of
+   * replicas) is not formed, the handler should return false.
+   */
+  @Test
+  public void testHealthyQuasiClosedContainerReturnsFalse() {
+    ContainerInfo containerInfo = ReplicationTestUtil.createContainerInfo(
+        ratisReplicationConfig, 1, QUASI_CLOSED);
+    Set<ContainerReplica> containerReplicas = ReplicationTestUtil
+        .createReplicasWithSameOrigin(containerInfo.containerID(),
+            State.QUASI_CLOSED, 1, 2, 3);

Review Comment:
   Created a jira for this https://issues.apache.org/jira/browse/HDDS-7268



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3785:
URL: https://github.com/apache/ozone/pull/3785#discussion_r981209273


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/QuasiClosedContainerHandler.java:
##########
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.container.replication.health;
+
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReplicaProto.State;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.container.ContainerReplica;
+import org.apache.hadoop.hdds.scm.container.ReplicationManagerReport;
+import org.apache.hadoop.hdds.scm.container.replication.ContainerCheckRequest;
+import org.apache.hadoop.hdds.scm.container.replication.ReplicationManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Class for handling containers that are in QUASI_CLOSED state. This will
+ * send commands to Datanodes to force close these containers if they satisfy
+ * the requirements to be force closed.
+ */
+public class QuasiClosedContainerHandler extends AbstractCheck {
+  public static final Logger LOG =
+      LoggerFactory.getLogger(QuasiClosedContainerHandler.class);
+
+  private final ReplicationManager replicationManager;
+
+  public QuasiClosedContainerHandler(ReplicationManager replicationManager) {
+    this.replicationManager = replicationManager;
+  }
+
+  /**
+   * If possible, force closes the Ratis container in QUASI_CLOSED state.
+   * Replicas with the highest Sequence ID are selected to be closed.
+   * @param request ContainerCheckRequest object representing the container
+   * @return true if close commands were sent, otherwise false
+   */
+  @Override
+  public boolean handle(ContainerCheckRequest request) {
+    ContainerInfo containerInfo = request.getContainerInfo();
+    if (containerInfo.getReplicationType() == HddsProtos.ReplicationType.EC) {

Review Comment:
   Should this check be 
   ```
   if (containerInfo.getReplicationType() != HddsProtos.ReplicationType.RATIS) {
         return false;
   }
   ```
   I'm not sure about `HddsProtos.ReplicationType.STAND_ALONE`. Are we still using this replication type?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #3785: HDDS-7263. Add a handler for Quasi Closed containers to RM

Posted by GitBox <gi...@apache.org>.
sodonnel commented on PR #3785:
URL: https://github.com/apache/ozone/pull/3785#issuecomment-1259646409

   This PR looks mostly good. I think we just need to fix the replica indexes in the tests and also consider adding one more unit test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org