Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/12/13 03:19:51 UTC

[GitHub] [ozone] linyiqun commented on a change in pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

linyiqun commented on a change in pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#discussion_r541831023



##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocol/OzoneManagerProtocol.java
##########
@@ -602,4 +604,12 @@ default long prepareOzoneManager(
       throws IOException {
     return -1;
   }
+
+  default PrepareStatusResponse getOzoneManagerPrepareStatus(long txnId)

Review comment:
       Please also add Javadoc for this method.
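
A possible shape for that Javadoc, as a sketch only. The wording is inferred from how PrepareSubCommand calls the method in this PR (it passes the transaction ID returned by prepareOzoneManager and reads a status and a current transaction index from the response). `PrepareStatusResponse` and `OzoneManagerProtocol` below are hypothetical stand-ins so the snippet compiles on its own, and the `throws IOException` clause and the placeholder body are assumptions because the hunk cuts off after the signature.

```java
import java.io.IOException;

final class PrepareStatusJavadocSketch {

  /** Hypothetical stand-in for the generated PrepareStatusResponse type. */
  static final class PrepareStatusResponse {
  }

  /** Hypothetical stand-in for the relevant slice of OzoneManagerProtocol. */
  interface OzoneManagerProtocol {

    /**
     * Checks whether this Ozone Manager has finished preparing, judged
     * against the given prepare transaction ID.
     *
     * @param txnId transaction ID returned by the earlier prepare request
     * @return the prepare status of this OM and its current transaction index
     * @throws IOException if the status request fails
     */
    default PrepareStatusResponse getOzoneManagerPrepareStatus(long txnId)
        throws IOException {
      // Placeholder body: the real default is not visible in this hunk.
      return null;
    }
  }

  public static void main(String[] args) throws IOException {
    OzoneManagerProtocol om = new OzoneManagerProtocol() { };
    System.out.println(om.getOzoneManagerPrepareStatus(1L));
  }
}
```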

##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/OzoneManagerProtocolClientSideTranslatorPB.java
##########
@@ -1574,6 +1574,17 @@ public long prepareOzoneManager(
     return prepareResponse.getTxnID();
   }
 
+  public PrepareStatusResponse getOzoneManagerPrepareStatus(long txnId)

Review comment:
       Can you add @Override here, so that it is clear this is a protocol method?
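
To illustrate why the annotation helps, here is a minimal sketch. `Protocol` and `Translator` are hypothetical stand-ins for OzoneManagerProtocol and OzoneManagerProtocolClientSideTranslatorPB, and the method is simplified to return a long. With @Override in place, the compiler rejects the implementation if its signature ever drifts from the interface, instead of silently leaving callers on the interface default.

```java
final class OverrideSketch {

  /** Hypothetical stand-in for the protocol interface. */
  interface Protocol {
    // Default used by implementations that do not support the call.
    default long prepareStatusTxn(long txnId) {
      return -1;
    }
  }

  /** Hypothetical stand-in for the client-side translator. */
  static final class Translator implements Protocol {
    // If the parameter type were changed (say, to int), this would stop
    // overriding the interface method and @Override would make that a
    // compile error rather than a silent fallback to the default.
    @Override
    public long prepareStatusTxn(long txnId) {
      return txnId;
    }
  }

  public static void main(String[] args) {
    Protocol client = new Translator();
    System.out.println(client.prepareStatusTxn(42L));
  }
}
```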

##########
File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/om/PrepareSubCommand.java
##########
@@ -67,13 +78,85 @@
   )
   private long txnApplyCheckIntervalSeconds;
 
+  @CommandLine.Option(
+      names = {"-pct", "--prepare-check-interval"},
+      description = "Time in SECONDS to wait between successive checks for OM" +
+          " preparation.",
+      defaultValue = "10",
+      hidden = true
+  )
+  private long prepareCheckInterval;
+
+  @CommandLine.Option(
+      names = {"-pt", "--prepare-timeout"},
+      description = "Max time in SECONDS to wait for all OMs to be prepared",
+      defaultValue = "300",
+      hidden = true
+  )
+  private long prepareTimeOut;
+
   @Override
   public Void call() throws Exception {
     OzoneManagerProtocol client = parent.createOmClient(omServiceId);
     long prepareTxnId = client.prepareOzoneManager(txnApplyWaitTimeSeconds,
         txnApplyCheckIntervalSeconds);
     System.out.println("Ozone Manager Prepare Request successfully returned " +
-        "with Txn Id " + prepareTxnId);
+        "with Transaction Id : [" + prepareTxnId + "].");
+
+    Map<String, Boolean> omPreparedStatusMap = new HashMap<>();
+    Set<String> omHosts = getOmHostsFromConfig(
+        parent.getParent().getOzoneConf(), omServiceId);
+    omHosts.forEach(h -> omPreparedStatusMap.put(h, false));
+    Duration pTimeout = Duration.of(prepareTimeOut, ChronoUnit.SECONDS);
+    Duration pInterval = Duration.of(prepareCheckInterval, ChronoUnit.SECONDS);
+
+    System.out.println();
+    System.out.println("Checking individual OM instances for prepare request " +
+        "completion...");
+    long endTime = System.currentTimeMillis() + pTimeout.toMillis();
+    int expectedNumPreparedOms = omPreparedStatusMap.size();
+    int currentNumPreparedOms = 0;
+    while (System.currentTimeMillis() < endTime &&
+        currentNumPreparedOms < expectedNumPreparedOms) {
+      for (Map.Entry<String, Boolean> e : omPreparedStatusMap.entrySet()) {
+        if (!e.getValue()) {
+          String omHost = e.getKey();
+          try (OzoneManagerProtocol singleOmClient =
+                    parent.createOmClient(omServiceId, omHost, false)) {
+            PrepareStatusResponse response =
+                singleOmClient.getOzoneManagerPrepareStatus(prepareTxnId);
+            PrepareStatus status = response.getStatus();
+            System.out.println("OM : [" + omHost + "], Prepare " +
+                "Status : [" + status.name() + "], Current Transaction Id : [" +
+                response.getCurrentTxnIndex() + "]");
+            if (status.equals(PREPARE_COMPLETED)) {
+              e.setValue(true);
+              currentNumPreparedOms++;
+            }
+          } catch (IOException ioEx) {
+            System.out.println("Exception while checking preparation " +
+                "completeness for [" + omHost +
+                "], Error : [" + ioEx.getMessage() + "]");
+          }
+        }
+      }
+      if (currentNumPreparedOms < expectedNumPreparedOms) {
+        System.out.println("Waiting for " + prepareCheckInterval +
+            " seconds before retrying...");
+        Thread.sleep(pInterval.toMillis());
+      }
+    }
+    if (currentNumPreparedOms < expectedNumPreparedOms) {
+      throw new Exception("OM Preparation failed since all OMs are not " +
+          "prepared yet.");

Review comment:
       As this is a command for admin users, can we print this message instead of throwing an exception? That would be friendlier to users.
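
One way this could look, as a rough sketch and not a definitive implementation. `report` and `failureMessage` are hypothetical helper names, the message text is illustrative, and the map mirrors `omPreparedStatusMap` from the diff. How a non-zero result would be surfaced is left open here, since `call()` returns Void in this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PrepareResultSketch {

  /** Builds a readable message naming the OMs that are still unprepared. */
  static String failureMessage(Map<String, Boolean> omPreparedStatusMap) {
    StringBuilder pending = new StringBuilder();
    for (Map.Entry<String, Boolean> e : omPreparedStatusMap.entrySet()) {
      if (!e.getValue()) {
        if (pending.length() > 0) {
          pending.append(", ");
        }
        pending.append(e.getKey());
      }
    }
    return "OM preparation did not complete before the timeout. "
        + "OMs not yet prepared: [" + pending + "].";
  }

  /** Prints the outcome and returns 0 on success, 1 otherwise. */
  static int report(Map<String, Boolean> omPreparedStatusMap) {
    if (omPreparedStatusMap.containsValue(false)) {
      // Print instead of throwing, so the admin sees no stack trace.
      System.err.println(failureMessage(omPreparedStatusMap));
      return 1;
    }
    System.out.println("All OMs are prepared.");
    return 0;
  }

  public static void main(String[] args) {
    Map<String, Boolean> status = new LinkedHashMap<>();
    status.put("om1", true);
    status.put("om2", false);
    System.out.println("Result code: " + report(status));
  }
}
```

Listing the unprepared hosts in the message is an addition beyond the original text; it is meant to save the admin from scrolling back through the per-OM status lines.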



