Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/07/29 07:42:33 UTC

[GitHub] [hadoop-ozone] prashantpogde commented on a change in pull request #1257: HDDS-3970. Enabling TestStorageContainerManager with all failures add…

prashantpogde commented on a change in pull request #1257:
URL: https://github.com/apache/hadoop-ozone/pull/1257#discussion_r461326748



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestStorageContainerManager.java
##########
@@ -525,7 +523,7 @@ public void testScmInfo() throws Exception {
   /**
    * Test datanode heartbeat well processed with a 4-layer network topology.
    */
-  @Test(timeout = 60000)
+  @Test(timeout = 180000)
   public void testScmProcessDatanodeHeartbeat() throws Exception {

Review comment:
       I tested it multiple times on my laptop, and the tests used to time out occasionally depending on what other applications were running. I set the timeout long enough that there should be no flakiness in the tests.

##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestStorageContainerManager.java
##########
@@ -604,8 +602,13 @@ public void testCloseContainerCommandOnRestart() throws Exception {
       // Stop processing HB
       scm.getDatanodeProtocolServer().stop();
 
-      scm.getContainerManager().updateContainerState(selectedContainer
-          .containerID(), HddsProtos.LifeCycleEvent.FINALIZE);
+      LoggerFactory.getLogger(TestStorageContainerManager.class).info(

Review comment:
       done

##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestStorageContainerManager.java
##########
@@ -604,8 +602,13 @@ public void testCloseContainerCommandOnRestart() throws Exception {
       // Stop processing HB
       scm.getDatanodeProtocolServer().stop();
 
-      scm.getContainerManager().updateContainerState(selectedContainer
-          .containerID(), HddsProtos.LifeCycleEvent.FINALIZE);
+      LoggerFactory.getLogger(TestStorageContainerManager.class).info(
+          "Current Container State is" + selectedContainer.getState());
+      if (selectedContainer.getState() == HddsProtos.LifeCycleState.OPEN) {
+        scm.getContainerManager().updateContainerState(selectedContainer
+            .containerID(), HddsProtos.LifeCycleEvent.FINALIZE);

Review comment:
       No, I couldn't find another thread closing the container. One theory: we stopped processing heartbeats from the datanode, and if it then assumes the datanode is dead, it could close the container.
   Yes, there is still a race condition unless the check and the update can be done atomically. I changed it to a try/catch to avoid the race.
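
   A minimal sketch of the try/catch idea, not the actual change in this PR. It assumes a hypothetical FakeContainer stand-in whose finalizeContainer() rejects the transition when the container is no longer OPEN; none of these names come from Ozone. The point is that attempting the transition and tolerating the failure removes the window between checking the state and updating it.

```java
import java.util.concurrent.atomic.AtomicReference;

public class FinalizeRaceSketch {
  enum State { OPEN, CLOSING, CLOSED }

  /** Hypothetical stand-in for a container whose state another thread may change. */
  static final class FakeContainer {
    private final AtomicReference<State> state =
        new AtomicReference<>(State.OPEN);

    State getState() {
      return state.get();
    }

    /** Moves OPEN -> CLOSING atomically; throws if the container is not OPEN. */
    void finalizeContainer() {
      if (!state.compareAndSet(State.OPEN, State.CLOSING)) {
        throw new IllegalStateException("cannot finalize from " + state.get());
      }
    }

    /** Simulates some other actor closing the container first. */
    void forceClose() {
      state.set(State.CLOSED);
    }
  }

  /**
   * Attempts the transition instead of checking the state first.
   * Returns true if this call performed it, false if the race was lost.
   */
  static boolean finalizeIfStillOpen(FakeContainer container) {
    try {
      container.finalizeContainer();
      return true;
    } catch (IllegalStateException e) {
      // Someone else already moved the container out of OPEN; that is acceptable here.
      return false;
    }
  }

  public static void main(String[] args) {
    FakeContainer open = new FakeContainer();
    System.out.println(finalizeIfStillOpen(open) + " " + open.getState());

    FakeContainer alreadyClosed = new FakeContainer();
    alreadyClosed.forceClose();
    System.out.println(finalizeIfStillOpen(alreadyClosed) + " " + alreadyClosed.getState());
  }
}
```

   In the real test, the caught exception type would be whatever updateContainerState throws on an invalid transition; IllegalStateException here is only a placeholder.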

##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestStorageContainerManager.java
##########
@@ -593,7 +591,7 @@ public void testCloseContainerCommandOnRestart() throws Exception {
           new TestStorageContainerManagerHelper(cluster, conf);
 
       helper.createKeys(10, 4096);
-      Thread.sleep(5000);
+      Thread.sleep(10000);

Review comment:
       done

##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestStorageContainerManager.java
##########
@@ -604,8 +602,13 @@ public void testCloseContainerCommandOnRestart() throws Exception {
       // Stop processing HB
       scm.getDatanodeProtocolServer().stop();
 
-      scm.getContainerManager().updateContainerState(selectedContainer
-          .containerID(), HddsProtos.LifeCycleEvent.FINALIZE);
+      LoggerFactory.getLogger(TestStorageContainerManager.class).info(
+          "Current Container State is" + selectedContainer.getState());

Review comment:
       done

##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestStorageContainerManager.java
##########
@@ -593,7 +591,7 @@ public void testCloseContainerCommandOnRestart() throws Exception {
           new TestStorageContainerManagerHelper(cluster, conf);
 
       helper.createKeys(10, 4096);
-      Thread.sleep(5000);
+      Thread.sleep(10000);

Review comment:
       I spent some time looking around in the code. Not sure how we can do this cleanly for the other sleep.

##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestStorageContainerManager.java
##########
@@ -593,7 +591,7 @@ public void testCloseContainerCommandOnRestart() throws Exception {
           new TestStorageContainerManagerHelper(cluster, conf);
 
       helper.createKeys(10, 4096);
-      Thread.sleep(5000);
+      Thread.sleep(10000);

Review comment:
       Changed this to:
    - wait until the replication manager comes up, using GenericTestUtils.waitFor, and then
    - wait some more to give it enough time to process containers
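
   A minimal sketch of the polling pattern described above, as an illustration only. The waitFor helper below is a self-contained stand-in written for this example, not Hadoop's GenericTestUtils.waitFor itself, and the readiness condition is a hypothetical timer rather than a real replication manager check.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class WaitForSketch {

  /**
   * Polls the condition every checkEveryMillis until it holds,
   * or throws TimeoutException once waitForMillis has elapsed.
   */
  static void waitFor(BooleanSupplier check, long checkEveryMillis,
      long waitForMillis) throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + waitForMillis * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException(
            "condition not met within " + waitForMillis + " ms");
      }
      Thread.sleep(checkEveryMillis);
    }
  }

  public static void main(String[] args) throws Exception {
    long start = System.currentTimeMillis();
    // Hypothetical readiness check: becomes true roughly 200 ms after start.
    waitFor(() -> System.currentTimeMillis() - start >= 200, 20, 5_000);
    System.out.println("condition met");
  }
}
```

   Compared with a fixed Thread.sleep, the wait ends as soon as the condition holds, and a condition that never holds fails with a clear timeout instead of a confusing assertion later in the test.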



