Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/09/30 05:34:06 UTC

[GitHub] [ozone] hanishakoneru commented on a change in pull request #2491: HDDS-5534. Verify config is updated on all OMs before proceeding with Bootstrap

hanishakoneru commented on a change in pull request #2491:
URL: https://github.com/apache/ozone/pull/2491#discussion_r719071042



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##########
@@ -1373,6 +1411,64 @@ public void restart() throws IOException {
     omState = State.RUNNING;
   }
 
+  private void checkConfigBeforeBootstrap() throws IOException {
+    List<OMNodeDetails> omsWihtoutNewConfig = new ArrayList<>();
+    for (Map.Entry<String, OMNodeDetails> entry : peerNodesMap.entrySet()) {
+      String remoteNodeId = entry.getKey();
+      OMNodeDetails remoteNodeDetails = entry.getValue();
+      try (OMMetadataProtocolClientSideImpl omMetadataProtocolClient =
+               new OMMetadataProtocolClientSideImpl(configuration,
+                   getRemoteUser(), remoteNodeId,
+                   remoteNodeDetails.getRpcAddress())) {
+
+        OMMetadata remoteOMMetadata = omMetadataProtocolClient.getOMMetadata();
+        boolean exists = checkOMexistsInRemoteOMConfig(remoteOMMetadata);
+        if (!exists) {
+          LOG.error("Remote OM " + remoteNodeId + ":" +
+              remoteNodeDetails.getHostAddress() + " does not have the " +
+              "bootstrapping OM(" + getOMNodeId() + ") information on " +
+              "reloading configs.");
+          omsWihtoutNewConfig.add(remoteNodeDetails);
+        }
+      } catch (IOException ioe) {
+        LOG.error("Remote OM config check before bootstrap failed on OM {}",
+            remoteNodeId, ioe);
+        omsWihtoutNewConfig.add(remoteNodeDetails);
+      }
+    }
+    if (!omsWihtoutNewConfig.isEmpty()) {
+      StringBuilder errorMsgBuilder = new StringBuilder();
+      errorMsgBuilder.append("OM(s) [")
+          .append(omsWihtoutNewConfig.get(0).getOMPrintInfo());
+      for (int i = 1; i < omsWihtoutNewConfig.size(); i++) {
+        errorMsgBuilder.append(",")
+            .append(omsWihtoutNewConfig.get(i).getOMPrintInfo());
+      }
+      errorMsgBuilder.append("] do not have the bootstrapping OM information." +
+          " Update their ozone-site.xml with new node details before " +
+          "proceeding.");
+      shutdown(errorMsgBuilder.toString());
+    }
+  }
+
+  /**
+   * Check whether current OM information exists in the remote OM's reloaded
+   * configs.
+   */
+  private boolean checkOMexistsInRemoteOMConfig(OMMetadata remoteOMMetadata) {
+    List<OMNodeDetails> omNodesInNewConf = remoteOMMetadata
+        .getOmNodesInNewConf();
+    for (OMNodeDetails omNodeInRemoteOM : omNodesInNewConf) {

Review comment:
   There are 2 extra checks which I wanted to add but forgot:
   1. The new OM is not already present as part of the Ratis ring.
   2. A nodeId is not being reused.
   Also, we can use this method when removing an OM to verify that the OMs are in sync.
   But I see your point. Let me test what the implications would be for the 2 scenarios mentioned above. If it's harmless and results in the bootstrapping node shutting down, we should be fine. Otherwise, I will add those checks.
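
   For illustration only, the two extra checks could look roughly like the sketch below. This is not code from the PR. The class `BootstrapPrecheckSketch`, the method `findProblems`, and its parameters are hypothetical, and plain strings stand in for `OMNodeDetails`. It also assumes that "reused" means the same nodeId is already mapped to a different RPC address in the remote OM's config, which may not be the exact intent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names exist in Ozone.
final class BootstrapPrecheckSketch {

  private BootstrapPrecheckSketch() { }

  /**
   * Runs the two extra checks for a bootstrapping OM.
   *
   * @param ringPeerIds node IDs currently in the Ratis ring (assumed input)
   * @param knownAddressByNodeId node ID to RPC address, as seen in the
   *                             remote OM's config (assumed input)
   * @return human-readable problems; an empty list means both checks passed
   */
  static List<String> findProblems(Set<String> ringPeerIds,
      Map<String, String> knownAddressByNodeId,
      String newNodeId, String newRpcAddress) {
    List<String> problems = new ArrayList<>();

    // Check 1: the new OM must not already be a member of the ring.
    if (ringPeerIds.contains(newNodeId)) {
      problems.add("OM " + newNodeId + " is already part of the Ratis ring");
    }

    // Check 2 (assumed meaning of "reused"): the node ID must not already
    // be associated with a different address.
    String knownAddress = knownAddressByNodeId.get(newNodeId);
    if (knownAddress != null && !knownAddress.equals(newRpcAddress)) {
      problems.add("nodeId " + newNodeId + " is already used by "
          + knownAddress);
    }
    return problems;
  }
}
```

   A non-empty result would then be handled the same way as a missing config entry, i.e. the bootstrapping node logs the problems and shuts down.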




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


