You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/12/11 19:14:17 UTC

[GitHub] [ozone] avijayanhwx opened a new pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

avijayanhwx opened a new pull request #1692:
URL: https://github.com/apache/ozone/pull/1692


   ## What changes were proposed in this pull request?
   
   - After getting a successful response for the OMPrepareRequest, the prepare client should use the Txn ID from the response to check every OM's preparation completeness.
   - This JIRA is partially dependent on HDDS-4569 which plans to introduce a marker file at the end of preparation. In a follow up JIRA, this will be refined to take up that logic, along with an integration test.
   - A basic acceptance test for prepare has been added. 
   - User facing message screenshot has been attached.
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-4564
   
   ## How was this patch tested?
   Manually tested. 
   Added acceptance test.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on a change in pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
errose28 commented on a change in pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#discussion_r542572110



##########
File path: hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
##########
@@ -1091,6 +1094,23 @@ message PrepareResponse {
     required uint64 txnID = 1;
 }
 
+message PrepareStatusRequest {
+    required uint64 txnID = 1;
+}
+
+message PrepareStatusResponse {
+    enum PrepareStatus {
+        // TODO
+        // HDDS-4569 may introduce new states here, like marker file found
+        // but with different txn id. We can add them as make sense.
+        PREPARE_NOT_STARTED = 1;
+        PREPARE_IN_PROGRESS = 2;
+        PREPARE_COMPLETED = 3;
+    }
+    required PrepareStatus status = 1;
+    optional uint64 currentTxnIndex = 2;

Review comment:
       Ok makes sense. Good for me to keep in mind while doing HDDS-4569.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx commented on a change in pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#discussion_r542569283



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/protocolPB/OzoneManagerRequestHandler.java
##########
@@ -623,6 +633,24 @@ private FinalizeUpgradeProgressResponse reportUpgradeProgress(
         .build();
   }
 
+  private PrepareStatusResponse getPrepareStatus(PrepareStatusRequest request)
+      throws IOException {
+    // TODO After HDDS-4569,
+    // When there is a global "prepared" state in OM, we can return
+    // PREPARE_NOT_STARTED instead of PREPARE_IN_PROGRESS appropriately.
+    PrepareStatus prepareStatus = null;
+    long txnID = request.getTxnID();
+    long ratisSnapshotIndex = impl.getRatisSnapshotIndex();
+    if (ratisSnapshotIndex != txnID) {
+      LOG.info("Last Txn Index = {}", ratisSnapshotIndex);
+      prepareStatus =  PREPARE_IN_PROGRESS;

Review comment:
       As mentioned in an earlier comment, this can undergo some minor changes after HDDS-4569. Currently, in my mind, an OM in prepared flag disabled state (state maintained in global OM level) does not know about a OmPrepareRequest (think slow follower). When the flag has been enabled but still is waiting for the flush, then it is likely in PREPARE_IN_PROGRESS mode. When the flag is enabled, and the flush is done (OMPR apply txn is fully complete), then we are expected to be PREPARE_COMPLETED state. Of course there are some edge cases that can be caused by subsequent OMPrepareRequests coming in the pipeline, but I don't expect them to be a big problem to the fundamental problem the prepare operation is trying to solve.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx commented on a change in pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#discussion_r542564473



##########
File path: hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
##########
@@ -1091,6 +1094,23 @@ message PrepareResponse {
     required uint64 txnID = 1;
 }
 
+message PrepareStatusRequest {
+    required uint64 txnID = 1;
+}
+
+message PrepareStatusResponse {
+    enum PrepareStatus {
+        // TODO
+        // HDDS-4569 may introduce new states here, like marker file found
+        // but with different txn id. We can add them as make sense.
+        PREPARE_NOT_STARTED = 1;
+        PREPARE_IN_PROGRESS = 2;
+        PREPARE_COMPLETED = 3;
+    }
+    required PrepareStatus status = 1;
+    optional uint64 currentTxnIndex = 2;

Review comment:
       @errose28 Yes, I agree. Since I expect some changes in the status enum fields as well as the logic in OzoneManagerRequestHandler#getPrepareStatus (usage of marker file vs OM transaction info), I believe we can punt the documentation to a follow up patch after HDDS-4569. That is the reason I have not added an integration test as well.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx commented on a change in pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#discussion_r542133014



##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocol/OzoneManagerProtocol.java
##########
@@ -602,4 +604,12 @@ default long prepareOzoneManager(
       throws IOException {
     return -1;
   }
+
+  default PrepareStatusResponse getOzoneManagerPrepareStatus(long txnId)

Review comment:
       Fixed.

##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/OzoneManagerProtocolClientSideTranslatorPB.java
##########
@@ -1574,6 +1574,17 @@ public long prepareOzoneManager(
     return prepareResponse.getTxnID();
   }
 
+  public PrepareStatusResponse getOzoneManagerPrepareStatus(long txnId)

Review comment:
       Fixed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#discussion_r542240479



##########
File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/om/PrepareSubCommand.java
##########
@@ -67,13 +78,85 @@
   )
   private long txnApplyCheckIntervalSeconds;
 
+  @CommandLine.Option(
+      names = {"-pct", "--prepare-check-interval"},
+      description = "Time in SECONDS to wait between successive checks for OM" +
+          " preparation.",
+      defaultValue = "10",
+      hidden = true
+  )
+  private long prepareCheckInterval;
+
+  @CommandLine.Option(
+      names = {"-pt", "--prepare-timeout"},
+      description = "Max time in SECONDS to wait for all OMs to be prepared",
+      defaultValue = "300",
+      hidden = true
+  )
+  private long prepareTimeOut;
+
   @Override
   public Void call() throws Exception {
     OzoneManagerProtocol client = parent.createOmClient(omServiceId);
     long prepareTxnId = client.prepareOzoneManager(txnApplyWaitTimeSeconds,
         txnApplyCheckIntervalSeconds);
     System.out.println("Ozone Manager Prepare Request successfully returned " +
-        "with Txn Id " + prepareTxnId);
+        "with Transaction Id : [" + prepareTxnId + "].");
+
+    Map<String, Boolean> omPreparedStatusMap = new HashMap<>();
+    Set<String> omHosts = getOmHostsFromConfig(
+        parent.getParent().getOzoneConf(), omServiceId);
+    omHosts.forEach(h -> omPreparedStatusMap.put(h, false));
+    Duration pTimeout = Duration.of(prepareTimeOut, ChronoUnit.SECONDS);
+    Duration pInterval = Duration.of(prepareCheckInterval, ChronoUnit.SECONDS);
+
+    System.out.println();
+    System.out.println("Checking individual OM instances for prepare request " +
+        "completion...");
+    long endTime = System.currentTimeMillis() + pTimeout.toMillis();
+    int expectedNumPreparedOms = omPreparedStatusMap.size();
+    int currentNumPreparedOms = 0;
+    while (System.currentTimeMillis() < endTime &&
+        currentNumPreparedOms < expectedNumPreparedOms) {
+      for (Map.Entry<String, Boolean> e : omPreparedStatusMap.entrySet()) {
+        if (!e.getValue()) {
+          String omHost = e.getKey();
+          try (OzoneManagerProtocol singleOmClient =
+                    parent.createOmClient(omServiceId, omHost, false)) {
+            PrepareStatusResponse response =
+                singleOmClient.getOzoneManagerPrepareStatus(prepareTxnId);
+            PrepareStatus status = response.getStatus();
+            System.out.println("OM : [" + omHost + "], Prepare " +
+                "Status : [" + status.name() + "], Current Transaction Id : [" +
+                response.getCurrentTxnIndex() + "]");
+            if (status.equals(PREPARE_COMPLETED)) {
+              e.setValue(true);
+              currentNumPreparedOms++;
+            }
+          } catch (IOException ioEx) {
+            System.out.println("Exception while checking preparation " +
+                "completeness for [" + omHost +
+                "], Error : [" + ioEx.getMessage() + "]");
+          }
+        }
+      }
+      if (currentNumPreparedOms < expectedNumPreparedOms) {
+        System.out.println("Waiting for " + prepareCheckInterval +
+            " seconds before retrying...");
+        Thread.sleep(pInterval.toMillis());
+      }
+    }
+    if (currentNumPreparedOms < expectedNumPreparedOms) {
+      throw new Exception("OM Preparation failed since all OMs are not " +
+          "prepared yet.");

Review comment:
       Okay, make sense.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx commented on a change in pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#discussion_r542565441



##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/protocolPB/OzoneManagerRequestHandler.java
##########
@@ -623,6 +633,24 @@ private FinalizeUpgradeProgressResponse reportUpgradeProgress(
         .build();
   }
 
+  private PrepareStatusResponse getPrepareStatus(PrepareStatusRequest request)
+      throws IOException {
+    // TODO After HDDS-4569,
+    // When there is a global "prepared" state in OM, we can return
+    // PREPARE_NOT_STARTED instead of PREPARE_IN_PROGRESS appropriately.
+    PrepareStatus prepareStatus = null;
+    long txnID = request.getTxnID();
+    long ratisSnapshotIndex = impl.getRatisSnapshotIndex();
+    if (ratisSnapshotIndex != txnID) {
+      LOG.info("Last Txn Index = {}", ratisSnapshotIndex);

Review comment:
       Yes, I can add that log line.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx merged pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
avijayanhwx merged pull request #1692:
URL: https://github.com/apache/ozone/pull/1692


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#discussion_r541831023



##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocol/OzoneManagerProtocol.java
##########
@@ -602,4 +604,12 @@ default long prepareOzoneManager(
       throws IOException {
     return -1;
   }
+
+  default PrepareStatusResponse getOzoneManagerPrepareStatus(long txnId)

Review comment:
       Please also document this javadoc for this method.

##########
File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/OzoneManagerProtocolClientSideTranslatorPB.java
##########
@@ -1574,6 +1574,17 @@ public long prepareOzoneManager(
     return prepareResponse.getTxnID();
   }
 
+  public PrepareStatusResponse getOzoneManagerPrepareStatus(long txnId)

Review comment:
       Can you add the @Override here that will let us know this is a protocol method?

##########
File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/om/PrepareSubCommand.java
##########
@@ -67,13 +78,85 @@
   )
   private long txnApplyCheckIntervalSeconds;
 
+  @CommandLine.Option(
+      names = {"-pct", "--prepare-check-interval"},
+      description = "Time in SECONDS to wait between successive checks for OM" +
+          " preparation.",
+      defaultValue = "10",
+      hidden = true
+  )
+  private long prepareCheckInterval;
+
+  @CommandLine.Option(
+      names = {"-pt", "--prepare-timeout"},
+      description = "Max time in SECONDS to wait for all OMs to be prepared",
+      defaultValue = "300",
+      hidden = true
+  )
+  private long prepareTimeOut;
+
   @Override
   public Void call() throws Exception {
     OzoneManagerProtocol client = parent.createOmClient(omServiceId);
     long prepareTxnId = client.prepareOzoneManager(txnApplyWaitTimeSeconds,
         txnApplyCheckIntervalSeconds);
     System.out.println("Ozone Manager Prepare Request successfully returned " +
-        "with Txn Id " + prepareTxnId);
+        "with Transaction Id : [" + prepareTxnId + "].");
+
+    Map<String, Boolean> omPreparedStatusMap = new HashMap<>();
+    Set<String> omHosts = getOmHostsFromConfig(
+        parent.getParent().getOzoneConf(), omServiceId);
+    omHosts.forEach(h -> omPreparedStatusMap.put(h, false));
+    Duration pTimeout = Duration.of(prepareTimeOut, ChronoUnit.SECONDS);
+    Duration pInterval = Duration.of(prepareCheckInterval, ChronoUnit.SECONDS);
+
+    System.out.println();
+    System.out.println("Checking individual OM instances for prepare request " +
+        "completion...");
+    long endTime = System.currentTimeMillis() + pTimeout.toMillis();
+    int expectedNumPreparedOms = omPreparedStatusMap.size();
+    int currentNumPreparedOms = 0;
+    while (System.currentTimeMillis() < endTime &&
+        currentNumPreparedOms < expectedNumPreparedOms) {
+      for (Map.Entry<String, Boolean> e : omPreparedStatusMap.entrySet()) {
+        if (!e.getValue()) {
+          String omHost = e.getKey();
+          try (OzoneManagerProtocol singleOmClient =
+                    parent.createOmClient(omServiceId, omHost, false)) {
+            PrepareStatusResponse response =
+                singleOmClient.getOzoneManagerPrepareStatus(prepareTxnId);
+            PrepareStatus status = response.getStatus();
+            System.out.println("OM : [" + omHost + "], Prepare " +
+                "Status : [" + status.name() + "], Current Transaction Id : [" +
+                response.getCurrentTxnIndex() + "]");
+            if (status.equals(PREPARE_COMPLETED)) {
+              e.setValue(true);
+              currentNumPreparedOms++;
+            }
+          } catch (IOException ioEx) {
+            System.out.println("Exception while checking preparation " +
+                "completeness for [" + omHost +
+                "], Error : [" + ioEx.getMessage() + "]");
+          }
+        }
+      }
+      if (currentNumPreparedOms < expectedNumPreparedOms) {
+        System.out.println("Waiting for " + prepareCheckInterval +
+            " seconds before retrying...");
+        Thread.sleep(pInterval.toMillis());
+      }
+    }
+    if (currentNumPreparedOms < expectedNumPreparedOms) {
+      throw new Exception("OM Preparation failed since all OMs are not " +
+          "prepared yet.");

Review comment:
       As this is the command for admin users, can we print out this message instead of throwing an exception? That will be more friendly to users.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx commented on a change in pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#discussion_r542133594



##########
File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/om/PrepareSubCommand.java
##########
@@ -67,13 +78,85 @@
   )
   private long txnApplyCheckIntervalSeconds;
 
+  @CommandLine.Option(
+      names = {"-pct", "--prepare-check-interval"},
+      description = "Time in SECONDS to wait between successive checks for OM" +
+          " preparation.",
+      defaultValue = "10",
+      hidden = true
+  )
+  private long prepareCheckInterval;
+
+  @CommandLine.Option(
+      names = {"-pt", "--prepare-timeout"},
+      description = "Max time in SECONDS to wait for all OMs to be prepared",
+      defaultValue = "300",
+      hidden = true
+  )
+  private long prepareTimeOut;
+
   @Override
   public Void call() throws Exception {
     OzoneManagerProtocol client = parent.createOmClient(omServiceId);
     long prepareTxnId = client.prepareOzoneManager(txnApplyWaitTimeSeconds,
         txnApplyCheckIntervalSeconds);
     System.out.println("Ozone Manager Prepare Request successfully returned " +
-        "with Txn Id " + prepareTxnId);
+        "with Transaction Id : [" + prepareTxnId + "].");
+
+    Map<String, Boolean> omPreparedStatusMap = new HashMap<>();
+    Set<String> omHosts = getOmHostsFromConfig(
+        parent.getParent().getOzoneConf(), omServiceId);
+    omHosts.forEach(h -> omPreparedStatusMap.put(h, false));
+    Duration pTimeout = Duration.of(prepareTimeOut, ChronoUnit.SECONDS);
+    Duration pInterval = Duration.of(prepareCheckInterval, ChronoUnit.SECONDS);
+
+    System.out.println();
+    System.out.println("Checking individual OM instances for prepare request " +
+        "completion...");
+    long endTime = System.currentTimeMillis() + pTimeout.toMillis();
+    int expectedNumPreparedOms = omPreparedStatusMap.size();
+    int currentNumPreparedOms = 0;
+    while (System.currentTimeMillis() < endTime &&
+        currentNumPreparedOms < expectedNumPreparedOms) {
+      for (Map.Entry<String, Boolean> e : omPreparedStatusMap.entrySet()) {
+        if (!e.getValue()) {
+          String omHost = e.getKey();
+          try (OzoneManagerProtocol singleOmClient =
+                    parent.createOmClient(omServiceId, omHost, false)) {
+            PrepareStatusResponse response =
+                singleOmClient.getOzoneManagerPrepareStatus(prepareTxnId);
+            PrepareStatus status = response.getStatus();
+            System.out.println("OM : [" + omHost + "], Prepare " +
+                "Status : [" + status.name() + "], Current Transaction Id : [" +
+                response.getCurrentTxnIndex() + "]");
+            if (status.equals(PREPARE_COMPLETED)) {
+              e.setValue(true);
+              currentNumPreparedOms++;
+            }
+          } catch (IOException ioEx) {
+            System.out.println("Exception while checking preparation " +
+                "completeness for [" + omHost +
+                "], Error : [" + ioEx.getMessage() + "]");
+          }
+        }
+      }
+      if (currentNumPreparedOms < expectedNumPreparedOms) {
+        System.out.println("Waiting for " + prepareCheckInterval +
+            " seconds before retrying...");
+        Thread.sleep(pInterval.toMillis());
+      }
+    }
+    if (currentNumPreparedOms < expectedNumPreparedOms) {
+      throw new Exception("OM Preparation failed since all OMs are not " +
+          "prepared yet.");

Review comment:
       Thanks for the review @linyiqun. I believe throwing an exception is better since the return code for the command will be non-zero. This will help in higher level applications automating this call figure out the result of the command decisively.
   
   
   > Exception while checking preparation completeness for [om2], Error : [Invalid host name: local host is: (unknown); >destination host is: "om2":9862; java.net.UnknownHostException; For more details see:  >http://wiki.apache.org/hadoop/UnknownHost]
   >Waiting for 10 seconds before retrying...
   >OM Preparation failed since all OMs are not prepared yet.
   >bash-4.2$ echo $?
   >255




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx commented on pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#issuecomment-743436944


   cc @errose28 for review.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx commented on pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#issuecomment-746692442


   Thanks for the reviews @linyiqun, @errose28 & @swagle!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] avijayanhwx commented on pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#issuecomment-744819445


   Ran the test 20 times and verified that no failures are seen (https://github.com/avijayanhwx/hadoop-ozone/runs/1553559592)  Please ignore the overall test run. It has failed due to incomplete github workflow changes. The test itself successfully ran 20 times.
   [summary.txt](https://github.com/apache/ozone/files/5692222/summary.txt)
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on a change in pull request #1692: HDDS-4564. Prepare client should check every OM individually for the prepared check based on Txn ID.

Posted by GitBox <gi...@apache.org>.
errose28 commented on a change in pull request #1692:
URL: https://github.com/apache/ozone/pull/1692#discussion_r542503880



##########
File path: hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
##########
@@ -1091,6 +1094,23 @@ message PrepareResponse {
     required uint64 txnID = 1;
 }
 
+message PrepareStatusRequest {
+    required uint64 txnID = 1;
+}
+
+message PrepareStatusResponse {
+    enum PrepareStatus {
+        // TODO
+        // HDDS-4569 may introduce new states here, like marker file found
+        // but with different txn id. We can add them as make sense.
+        PREPARE_NOT_STARTED = 1;
+        PREPARE_IN_PROGRESS = 2;
+        PREPARE_COMPLETED = 3;
+    }
+    required PrepareStatus status = 1;
+    optional uint64 currentTxnIndex = 2;

Review comment:
       I'm a bit confused on the currentTxnIndex field. If this is really the current transaction index of the system, that will always exist, so why would that be optional? If it is instead being used as the prepare index, it makes sense that it would be optional and omitted if the OM is not prepared, but in that case we should probably change the name to `preparedIndex` or something like that.
   
   Assuming this is the prepare index, we should probably document what is expected of this field in relation to the different prepareStatuses. This was my thought but let me know what you think:
   - PREPARE_NOT_STARTED -> prepare index empty
   - PREPARE_IN_PROGRESS -> prepare index present, but this is the index we are waiting to be prepared on (index of the prepare request)
   - PREPARE_COMPLETED -> OM is prepared at the prepare index.

##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/protocolPB/OzoneManagerRequestHandler.java
##########
@@ -623,6 +633,24 @@ private FinalizeUpgradeProgressResponse reportUpgradeProgress(
         .build();
   }
 
+  private PrepareStatusResponse getPrepareStatus(PrepareStatusRequest request)
+      throws IOException {
+    // TODO After HDDS-4569,
+    // When there is a global "prepared" state in OM, we can return
+    // PREPARE_NOT_STARTED instead of PREPARE_IN_PROGRESS appropriately.
+    PrepareStatus prepareStatus = null;
+    long txnID = request.getTxnID();
+    long ratisSnapshotIndex = impl.getRatisSnapshotIndex();
+    if (ratisSnapshotIndex != txnID) {
+      LOG.info("Last Txn Index = {}", ratisSnapshotIndex);
+      prepareStatus =  PREPARE_IN_PROGRESS;

Review comment:
       Just for clarification as I am working on HDDS-4569, is this the expected interpretations of status?
   flag on && (txn index == snapshot index) -> PREPARE_COMPLETED
   flag on && (txn index != snapshot index) -> PREPARE_IN_PROGRESS
   flag off  -> PREPARE_NOT_STARTED

##########
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/protocolPB/OzoneManagerRequestHandler.java
##########
@@ -623,6 +633,24 @@ private FinalizeUpgradeProgressResponse reportUpgradeProgress(
         .build();
   }
 
+  private PrepareStatusResponse getPrepareStatus(PrepareStatusRequest request)
+      throws IOException {
+    // TODO After HDDS-4569,
+    // When there is a global "prepared" state in OM, we can return
+    // PREPARE_NOT_STARTED instead of PREPARE_IN_PROGRESS appropriately.
+    PrepareStatus prepareStatus = null;
+    long txnID = request.getTxnID();
+    long ratisSnapshotIndex = impl.getRatisSnapshotIndex();
+    if (ratisSnapshotIndex != txnID) {
+      LOG.info("Last Txn Index = {}", ratisSnapshotIndex);

Review comment:
       I think that should be `!=` in the log message. Also can we add some more context to this message, like "Received prepare status request while last txn index \<index> != last snapshot index \<snapshot index>".




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org