You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/04/26 11:17:25 UTC

[GitHub] [ozone] GlenGeng opened a new pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

GlenGeng opened a new pull request #2184:
URL: https://github.com/apache/ozone/pull/2184


   ## What changes were proposed in this pull request?
   
   Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-5148
   
   ## How was this patch tested?
   
   CI
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#discussion_r621118330



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -294,6 +294,14 @@ public void pause() {
     getLifeCycle().transition(LifeCycle.State.PAUSED);
   }
 
+  @Override
+  public void reinitialize() {
+    if (getLifeCycleState() == LifeCycle.State.PAUSED) {

Review comment:
       Question: How these tests are working before? (Not able to understand that? can you shed some info)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng closed pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
GlenGeng closed pull request #2184:
URL: https://github.com/apache/ozone/pull/2184


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-826954207


   @elek for SCM HA snapshot implementation to work we require RATIS-1326, RATIS-1356, and RATIS-1369.
   The commit related to this is [00f0c85](https://github.com/apache/ratis/commit/00f0c8586cbf70e292ece73768db93f3b7888885)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant merged pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
bshashikant merged pull request #2184:
URL: https://github.com/apache/ozone/pull/2184


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-827405218


   @bharatviswa504 After installSnapshot, the lifecycle of follower's SCMStateMachine is in PAUSED state. 
   
   ```
             stateMachine.followerEvent().notifyInstallSnapshotFromLeader(roleInfoProto, firstAvailableLogTermIndex)
                 .whenComplete((reply, exception) -> {
                   if (exception != null) {
                     LOG.warn("{}: Failed to notify StateMachine to InstallSnapshot. Exception: {}",
                         getMemberId(), exception.getMessage());
                     inProgressInstallSnapshotRequest.compareAndSet(firstAvailableLogTermIndex, null);
                     return;
                   }
   
                   if (reply != null) {
                     LOG.info("{}: StateMachine successfully installed snapshot index {}. Reloading the StateMachine.",
                         getMemberId(), reply.getIndex());
                     stateMachine.pause();
                     state.updateInstalledSnapshotIndex(reply);
                     state.reloadStateMachine(reply.getIndex());
                   }
                   inProgressInstallSnapshotRequest.compareAndSet(firstAvailableLogTermIndex, null);
                 });
           } 
   ```
   
   The reason is, after `notifyInstallSnapshotFromLeader ` is called, ratis call ` stateMachine.pause();` again. 
   I am checking it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-827308695


   @GlenGeng I think pushing empty commit is better for triggering new CI run than closing/reopening.  Most importantly it keeps the PR's build history easily available.  We also gain more data points (regarding CI failures), because it triggers two builds: PR and push.  It is also slightly less noisy.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on a change in pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on a change in pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#discussion_r621117282



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -294,6 +294,14 @@ public void pause() {
     getLifeCycle().transition(LifeCycle.State.PAUSED);
   }
 
+  @Override
+  public void reinitialize() {
+    if (getLifeCycleState() == LifeCycle.State.PAUSED) {

Review comment:
       cc @hanishakoneru does OmStateMachine also need similar code?
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-828123012


   Thanks @GlenGeng for the work and others for the reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-826799231


   cc @runzhiwang  Please take a look at this PR. Thanks !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng closed pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
GlenGeng closed pull request #2184:
URL: https://github.com/apache/ozone/pull/2184


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] elek commented on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-826935387


   Let me make it clear: It's not a blocking comment as we already have a SNAPSHOT dependency, feel free to go ahead. But I am planning to suggest 2.0.1 Ratis release.
   
   Can you please share what is the Ratis commit which was required here? (Just to be sure it's included if we do a patch release..)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng edited a comment on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
GlenGeng edited a comment on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-827337814


   Thanks @elek for looking at this PR, and overall I agree with your suggestion. Thanks @bharatviswa504 for the explanation and share the commit.
   
   > I think pushing empty commit is better for triggering new CI run than closing/reopening
   
   Thanks @adoroszlai for the suggestion. I will follow it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bharatviswa504 commented on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-827349018


   @GlenGeng Can you check if test failure is related I see one of the test related to InstallSnapshot is failing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng edited a comment on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
GlenGeng edited a comment on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-827337814


   Thanks @elek for looking at this PR, and overall I agree with your suggestion. Thanks @bharatviswa504 for the explanation and the sharing of the commit and the related context.
   
   > I think pushing empty commit is better for triggering new CI run than closing/reopening
   
   Thanks @adoroszlai for the suggestion. I will follow it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-827337814


   Thanks @elek for looking at this PR, and overall I agree with your suggestion. Thanks @bharatviswa504 for the explanation and share the commit.
   
   > I think pushing empty commit is better for triggering new CI run than closing/reopening
   Thanks @adoroszlai for the suggestion. I will follow it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-827591240


   I think, in Ratis we need to fix the applyTransaction behaviour as well. If the stateMachine is in paused state, the applyIndex should not increase in the StateMachineUpdater. This is not happening right now.
   
   Then, as part of reloading the stateMachine as part of reinitializing the stateMachine, it can call unpause().


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#discussion_r621131003



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -294,6 +294,14 @@ public void pause() {
     getLifeCycle().transition(LifeCycle.State.PAUSED);
   }
 
+  @Override
+  public void reinitialize() {
+    if (getLifeCycleState() == LifeCycle.State.PAUSED) {

Review comment:
       After `notifyInstallSnapshotFromLeader` is called, ratis calls 
   ```
                   if (reply != null) {
                     LOG.info("{}: StateMachine successfully installed snapshot index {}. Reloading the StateMachine.",
                         getMemberId(), reply.getIndex());
                     stateMachine.pause();
                     state.updateInstalledSnapshotIndex(reply);
                     state.reloadStateMachine(reply.getIndex());
                   }
   ```
   `stateMachine.pause();` will make SM to be in PAUSED state, `state.reloadStateMachine(reply.getIndex())` will trigger `StateMachineUpdater#reload()` to be called, which will then call `stateMachine.reinitialize();`.
   
   This is the reason of the fix.
   
   As far as I known,  at the end of `TestSCMInstallSnapshotWithHA#testInstallSnapshot`, the followerSCM is also in PAUSED state, which is not checked before.
   
   And for `testInstallOldCheckpointFailure`, `notifyInstallSnapshotFromLeader` is not really called, since without the fix in RATIS-1369, downloading snapshot taken at index 0 is ignore by follower SCM.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on pull request #2184: HDDS-5148. Bump ratis version to 2.1.0-ff8aa66-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #2184:
URL: https://github.com/apache/ozone/pull/2184#issuecomment-828121637


   Have filed https://issues.apache.org/jira/browse/RATIS-1370 for ratis issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org