Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/04/06 11:20:59 UTC

[GitHub] [ozone] bshashikant opened a new pull request #2114: HDDS-5062. Add a config to bypass clusterId validation for bootstrapping SCM.

bshashikant opened a new pull request #2114:
URL: https://github.com/apache/ozone/pull/2114


   
   ## What changes were proposed in this pull request?
   In SCM HA, the primary node starts the Ratis server, while the other bootstrapping nodes get added to the Ratis group. If all the bootstrapping SCMs are then stopped, the primary node steps down from leadership because it loses its majority. When the bootstrapping nodes are bootstrapped again, each one first tries to validate its persisted cluster id against the cluster id of the leader SCM; since no leader exists, bootstrapping keeps failing and retrying until the node shuts down.
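
The loss of majority described above can be illustrated with a minimal sketch. It assumes a group of three SCMs (one primary plus two bootstrapped nodes) and the usual majority rule of more than half the group; the class and method names are hypothetical and are not part of Ozone or Ratis.

```java
// Illustrative sketch only: hypothetical names, not Ozone or Ratis code.
final class QuorumSketch {

    /** Smallest number of live peers that forms a majority of the group. */
    static int majority(int groupSize) {
        return groupSize / 2 + 1;
    }

    /** A leader can keep its leadership only while a majority is reachable. */
    static boolean hasMajority(int groupSize, int livePeers) {
        return livePeers >= majority(groupSize);
    }

    public static void main(String[] args) {
        int groupSize = 3; // assumption: one primary SCM plus two bootstrapped SCMs

        // All three nodes are up: the primary can stay leader.
        System.out.println(hasMajority(groupSize, 3));

        // Both bootstrapped nodes are stopped: only the primary is left,
        // which is below the majority of 2, so it has to step down.
        System.out.println(hasMajority(groupSize, 1));
    }
}
```

With the assumed group size, the first check succeeds and the second fails, which is the state in which a restarting node finds no leader to validate against.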
   
   The issue is easy to reproduce in Kubernetes deployments, where the bootstrap and init commands are run again on every restart.
   
   The Jira aims to bypass the cluster id validation if a bootstrapping node already has a cluster id.
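
A minimal sketch of the intended decision, for illustration only. It assumes a boolean setting that enables the bypass and models the cluster ids as optional strings; `BootstrapValidationSketch`, `Outcome`, and `validate` are hypothetical names, and the real logic in the patch may be structured differently.

```java
import java.util.Optional;

// Illustrative sketch only: hypothetical names, not the code in this patch.
final class BootstrapValidationSketch {

    /** What the bootstrapping node does next. */
    enum Outcome { PROCEED, RETRY }

    static Outcome validate(boolean skipValidation,
                            Optional<String> persistedClusterId,
                            Optional<String> leaderClusterId) {
        // Bypass: the node already has a cluster id and the (assumed) setting is on,
        // so it does not need to reach the leader at all.
        if (skipValidation && persistedClusterId.isPresent()) {
            return Outcome.PROCEED;
        }
        // Without the bypass, a missing leader means the check cannot complete.
        if (!leaderClusterId.isPresent()) {
            return Outcome.RETRY;
        }
        // A leader exists: a persisted id must match the leader's id.
        if (persistedClusterId.isPresent()
                && !persistedClusterId.get().equals(leaderClusterId.get())) {
            throw new IllegalStateException("clusterId does not match the leader SCM");
        }
        return Outcome.PROCEED;
    }

    public static void main(String[] args) {
        Optional<String> persisted = Optional.of("CID-example"); // made-up id
        Optional<String> noLeader = Optional.empty();

        // Validation enabled and no leader: the node keeps retrying.
        System.out.println(validate(false, persisted, noLeader));

        // Validation bypassed and a cluster id is already persisted: the node proceeds.
        System.out.println(validate(true, persisted, noLeader));
    }
}
```

In this sketch the bypass applies only when a cluster id is already persisted, so a node with no id still goes through the leader check.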
   ## What is the link to the Apache JIRA?
   https://issues.apache.org/jira/browse/HDDS-5062
   
   ## How was this patch tested?
   Added unit test
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on pull request #2114: HDDS-5062. Add a config to bypass clusterId validation for bootstrapping SCM.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #2114:
URL: https://github.com/apache/ozone/pull/2114#issuecomment-814736652


   The reported findbugs issue will be fixed by #2120. Merging it.




[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2114: HDDS-5062. Add a config to bypass clusterId validation for bootstrapping SCM.

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2114:
URL: https://github.com/apache/ozone/pull/2114#discussion_r607770831



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/scm/TestStorageContainerManagerHA.java
##########
@@ -228,4 +230,25 @@ public void testPrimordialSCM() throws Exception {
     Assert.assertTrue(
         StorageContainerManager.scmInit(conf2, scm2.getClusterId()));
   }
+
+  @Test
+  public void testBootStrapSCM() throws Exception {
+    StorageContainerManager scm1 = cluster.getStorageContainerManagers().get(0);
+    StorageContainerManager scm2 = cluster.getStorageContainerManagers().get(1);
+    OzoneConfiguration conf2 = scm2.getConfiguration();
+    conf2.set(ScmConfigKeys.OZONE_SCM_PRIMORDIAL_NODE_ID_KEY,

Review comment:
       Same line duplicated

##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/scm/TestStorageContainerManagerHA.java
##########
@@ -228,4 +230,25 @@ public void testPrimordialSCM() throws Exception {
     Assert.assertTrue(
         StorageContainerManager.scmInit(conf2, scm2.getClusterId()));
   }
+
+  @Test
+  public void testBootStrapSCM() throws Exception {
+    StorageContainerManager scm1 = cluster.getStorageContainerManagers().get(0);
+    StorageContainerManager scm2 = cluster.getStorageContainerManagers().get(1);
+    OzoneConfiguration conf2 = scm2.getConfiguration();
+    conf2.set(ScmConfigKeys.OZONE_SCM_PRIMORDIAL_NODE_ID_KEY,

Review comment:
       Also, do we need this setting here for this test case?

##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/scm/TestStorageContainerManagerHA.java
##########
@@ -228,4 +230,25 @@ public void testPrimordialSCM() throws Exception {
     Assert.assertTrue(
         StorageContainerManager.scmInit(conf2, scm2.getClusterId()));
   }
+
+  @Test
+  public void testBootStrapSCM() throws Exception {
+    StorageContainerManager scm1 = cluster.getStorageContainerManagers().get(0);
+    StorageContainerManager scm2 = cluster.getStorageContainerManagers().get(1);
+    OzoneConfiguration conf2 = scm2.getConfiguration();
+    conf2.set(ScmConfigKeys.OZONE_SCM_PRIMORDIAL_NODE_ID_KEY,
+        scm1.getSCMNodeId());
+    conf2.set(ScmConfigKeys.OZONE_SCM_PRIMORDIAL_NODE_ID_KEY,
+        scm1.getSCMNodeId());
+    boolean isDeleted = scm2.getScmStorageConfig().getVersionFile().delete();
+    Assert.assertTrue(isDeleted);
+    final SCMStorageConfig scmStorageConfig = new SCMStorageConfig(conf2);
+    scmStorageConfig.setClusterId(UUID.randomUUID().toString());
+    scmStorageConfig.getCurrentDir().delete();
+    scmStorageConfig.initialize();

Review comment:
       So the test here deletes the scm2 version file.
   L246-249 then create a new version file for SCM2 in the same location with a new clusterID.
   Are we now checking with/without OZONE_SCM_SKIP_BOOTSTRAP_VALIDATION_KEY?






[GitHub] [ozone] bshashikant merged pull request #2114: HDDS-5062. Add a config to bypass clusterId validation for bootstrapping SCM.

Posted by GitBox <gi...@apache.org>.
bshashikant merged pull request #2114:
URL: https://github.com/apache/ozone/pull/2114


   




[GitHub] [ozone] bshashikant commented on a change in pull request #2114: HDDS-5062. Add a config to bypass clusterId validation for bootstrapping SCM.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #2114:
URL: https://github.com/apache/ozone/pull/2114#discussion_r607806524



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/scm/TestStorageContainerManagerHA.java
##########
@@ -228,4 +230,25 @@ public void testPrimordialSCM() throws Exception {
     Assert.assertTrue(
         StorageContainerManager.scmInit(conf2, scm2.getClusterId()));
   }
+
+  @Test
+  public void testBootStrapSCM() throws Exception {
+    StorageContainerManager scm1 = cluster.getStorageContainerManagers().get(0);
+    StorageContainerManager scm2 = cluster.getStorageContainerManagers().get(1);
+    OzoneConfiguration conf2 = scm2.getConfiguration();
+    conf2.set(ScmConfigKeys.OZONE_SCM_PRIMORDIAL_NODE_ID_KEY,

Review comment:
       Addressed in latest patch.






[GitHub] [ozone] GlenGeng commented on pull request #2114: HDDS-5062. Add a config to bypass clusterId validation for bootstrapping SCM.

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on pull request #2114:
URL: https://github.com/apache/ozone/pull/2114#issuecomment-814576832


   LGTM. Waiting for CI to pass.




[GitHub] [ozone] bshashikant commented on a change in pull request #2114: HDDS-5062. Add a config to bypass clusterId validation for bootstrapping SCM.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #2114:
URL: https://github.com/apache/ozone/pull/2114#discussion_r607806340



##########
File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/scm/TestStorageContainerManagerHA.java
##########
@@ -228,4 +230,25 @@ public void testPrimordialSCM() throws Exception {
     Assert.assertTrue(
         StorageContainerManager.scmInit(conf2, scm2.getClusterId()));
   }
+
+  @Test
+  public void testBootStrapSCM() throws Exception {
+    StorageContainerManager scm1 = cluster.getStorageContainerManagers().get(0);
+    StorageContainerManager scm2 = cluster.getStorageContainerManagers().get(1);
+    OzoneConfiguration conf2 = scm2.getConfiguration();
+    conf2.set(ScmConfigKeys.OZONE_SCM_PRIMORDIAL_NODE_ID_KEY,
+        scm1.getSCMNodeId());
+    conf2.set(ScmConfigKeys.OZONE_SCM_PRIMORDIAL_NODE_ID_KEY,
+        scm1.getSCMNodeId());
+    boolean isDeleted = scm2.getScmStorageConfig().getVersionFile().delete();
+    Assert.assertTrue(isDeleted);
+    final SCMStorageConfig scmStorageConfig = new SCMStorageConfig(conf2);
+    scmStorageConfig.setClusterId(UUID.randomUUID().toString());
+    scmStorageConfig.getCurrentDir().delete();
+    scmStorageConfig.initialize();

Review comment:
       Yes



