You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ratis.apache.org by GitBox <gi...@apache.org> on 2020/05/05 17:37:46 UTC

[GitHub] [incubator-ratis] hanishakoneru opened a new pull request #88: RATIS-873. Fix Install Snapshot Notification

hanishakoneru opened a new pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88


   This Jira aims to fix the following:
   
   - When Follower is in the process of installing snapshot and it gets an append entry, it replies with result INCONSISTENCY and the follower next index is updated to the snapshot index being installed. This should not happen as the snapshot installation is still in progress. Follower's next index on the leader should remain the same.
   
   - After InstallSnapshot is done, Leader should update the commitIndex of the Follower to the installed snapshot index.
   
   - After InstallSnapshot, when reloading StateMachine, any previously open LogSegment should be closed, if the last entry in the open log is already included in the snapshot.
   
   - When Follower is notified to install snapshot through StateMachine, the reply should indicate the same. It would help with debugging if the Install Snapshot success reply and Install Snapshot notified replies are distinct.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] hanishakoneru commented on a change in pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on a change in pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#discussion_r421767590



##########
File path: ratis-server/src/main/java/org/apache/ratis/server/raftlog/segmented/SegmentedRaftLog.java
##########
@@ -492,6 +492,13 @@ public void syncWithSnapshot(long lastSnapshotIndex) {
     // if the last index in snapshot is larger than the index of the last
     // log entry, we should delete all the log entries and their cache to avoid
     // gaps between log segments.
+
+    // Close open log segment if entries are already included in snapshot
+    LogSegment openSegment = cache.getOpenSegment();
+    if (openSegment != null && openSegment.getEndIndex() <= lastSnapshotIndex) {
+      fileLogWorker.closeLogSegment(openSegment);
+    }
+    cache.clear();

Review comment:
       Done.

##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -222,6 +222,10 @@ private void increaseNextIndex(AppendEntriesRequestProto request) {
     }
   }
 
+  private void increaseNextIndex(final long installedSnapshotIndex) {
+    getFollower().increaseNextIndex(installedSnapshotIndex + 1);

Review comment:
       Done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] lokeshj1703 commented on a change in pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on a change in pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#discussion_r421367469



##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -222,6 +222,10 @@ private void increaseNextIndex(AppendEntriesRequestProto request) {
     }
   }
 
+  private void increaseNextIndex(final long installedSnapshotIndex) {
+    getFollower().increaseNextIndex(installedSnapshotIndex + 1);

Review comment:
       FollowerInfo#increaseNextIndex will throw error if current next index is greater than installedSnapshotIndex + 1. I think we should do something like updateCommitIndex where we set nextIndex to max value.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] hanishakoneru commented on a change in pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on a change in pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#discussion_r421761623



##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -369,13 +381,19 @@ public void onNext(InstallSnapshotReplyProto reply) {
 
       switch (reply.getResult()) {
         case SUCCESS:

Review comment:
       Yes but I am not sure what all needs to be fixed in the regular InstallSnapshot. 
   For example, after install snapshot is done, before reloading StateMachine, we should do state.updateInstalledSnapshotIndex(reply).
   I think we should open another Jira to fix the InstallSnapshot.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] lokeshj1703 commented on a change in pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on a change in pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#discussion_r421367469



##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -222,6 +222,10 @@ private void increaseNextIndex(AppendEntriesRequestProto request) {
     }
   }
 
+  private void increaseNextIndex(final long installedSnapshotIndex) {
+    getFollower().increaseNextIndex(installedSnapshotIndex + 1);

Review comment:
       FollowerInfo#increaseNextIndex will throw error if current next index is greater than installedSnapshotIndex + 1. I think we should do something like updateCommitIndex here.

##########
File path: ratis-server/src/main/java/org/apache/ratis/server/raftlog/segmented/SegmentedRaftLog.java
##########
@@ -492,6 +492,13 @@ public void syncWithSnapshot(long lastSnapshotIndex) {
     // if the last index in snapshot is larger than the index of the last
     // log entry, we should delete all the log entries and their cache to avoid
     // gaps between log segments.
+
+    // Close open log segment if entries are already included in snapshot
+    LogSegment openSegment = cache.getOpenSegment();
+    if (openSegment != null && openSegment.getEndIndex() <= lastSnapshotIndex) {
+      fileLogWorker.closeLogSegment(openSegment);
+    }
+    cache.clear();

Review comment:
       We can potentially clear closed log segments with endIndex > snapshot index? Maybe we should move it inside the if statement?

##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -369,13 +381,19 @@ public void onNext(InstallSnapshotReplyProto reply) {
 
       switch (reply.getResult()) {
         case SUCCESS:

Review comment:
       Do we need to update commitIndex and nextIndex here as well?

##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -394,17 +412,24 @@ public void onNext(InstallSnapshotReplyProto reply) {
     @Override
     public void onError(Throwable t) {
       if (!isAppenderRunning()) {
-        LOG.info("{} is stopped", this);
+        LOG.info("{} is stopped", GrpcLogAppender.this);
         return;
       }
-      LOG.error("{}: Failed installSnapshot: {}", this, t);
+      GrpcUtil.warn(LOG, () -> this + ": Failed InstallSnapshot", t);
+      grpcServerMetrics.onRequestRetry(); // Update try counter
       resetClient(null);
       close();
     }
 
     @Override
     public void onCompleted() {
-      LOG.info("{}: follower responses installSnapshot COMPLETED", this);
+      if (!isNotificationOnly) {
+        LOG.info("{}: follower responded installSnapshot COMPLETED", this);
+      } else {
+        if (LOG.isDebugEnabled()) {
+          LOG.info("{}: follower responded installSnapshot COMPLETED", this);
+        }
+      }

Review comment:
       We can combine the if statements using below condition.
   `!isNotificationOnly || LOG.isDebugEnabled()`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] lokeshj1703 commented on a change in pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on a change in pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#discussion_r420700281



##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -369,13 +369,22 @@ public void onNext(InstallSnapshotReplyProto reply) {
 
       switch (reply.getResult()) {
         case SUCCESS:
+          LOG.info("{}: Completed InstallSnapshot. Reply: {}", this);
+          removePending(reply);
+          break;
+        case NOTIFIED:

Review comment:
       Based on my understanding I think IN_PROGRESS and NOTIFIED are similar. NOTIFIED means that follower received the request? If this is the case can we use IN_PROGRESS to reply once follower receives the request?

##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -369,13 +369,22 @@ public void onNext(InstallSnapshotReplyProto reply) {
 
       switch (reply.getResult()) {
         case SUCCESS:
+          LOG.info("{}: Completed InstallSnapshot. Reply: {}", this);

Review comment:
       We need to add reply in the arguments.

##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -369,13 +369,22 @@ public void onNext(InstallSnapshotReplyProto reply) {
 
       switch (reply.getResult()) {
         case SUCCESS:
+          LOG.info("{}: Completed InstallSnapshot. Reply: {}", this);
+          removePending(reply);
+          break;
+        case NOTIFIED:
+          LOG.info("{}: Notified to InstallSnapshot.", this);
+          removePending(reply);
+          break;
         case IN_PROGRESS:
+          LOG.info("{}: InstallSnapshot in progress.", this);
           removePending(reply);
           break;
         case ALREADY_INSTALLED:
           final long followerSnapshotIndex = reply.getSnapshotIndex();
-          LOG.info("{}: set follower snapshotIndex to {}.", this, followerSnapshotIndex);
+          LOG.info("{}: Already Installed Snapshot Index {}.", this, followerSnapshotIndex);
           getFollower().setSnapshotIndex(followerSnapshotIndex);
+          updateCommitIndex(followerSnapshotIndex);

Review comment:
       Should we also update nextIndex? 

##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -394,17 +403,19 @@ public void onNext(InstallSnapshotReplyProto reply) {
     @Override
     public void onError(Throwable t) {
       if (!isAppenderRunning()) {
-        LOG.info("{} is stopped", this);
+        LOG.info("{} is stopped", GrpcLogAppender.this);
         return;
       }
-      LOG.error("{}: Failed installSnapshot: {}", this, t);
+      GrpcUtil.warn(LOG, () -> this + ": Failed InstallSnapshot", t);
+      grpcServerMetrics.onRequestRetry(); // Update try counter
       resetClient(null);
-      close();

Review comment:
       We can add a flag in close to mark it as closed without notifying append in this case.

##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -394,17 +403,19 @@ public void onNext(InstallSnapshotReplyProto reply) {
     @Override
     public void onError(Throwable t) {
       if (!isAppenderRunning()) {
-        LOG.info("{} is stopped", this);
+        LOG.info("{} is stopped", GrpcLogAppender.this);
         return;
       }
-      LOG.error("{}: Failed installSnapshot: {}", this, t);
+      GrpcUtil.warn(LOG, () -> this + ": Failed InstallSnapshot", t);
+      grpcServerMetrics.onRequestRetry(); // Update try counter
       resetClient(null);
-      close();
     }
 
     @Override
     public void onCompleted() {
-      LOG.info("{}: follower responses installSnapshot COMPLETED", this);
+      if (!firstResponseReceived) {

Review comment:
       Why do we need this check?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] hanishakoneru commented on a change in pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on a change in pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#discussion_r421126675



##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -394,17 +403,19 @@ public void onNext(InstallSnapshotReplyProto reply) {
     @Override
     public void onError(Throwable t) {
       if (!isAppenderRunning()) {
-        LOG.info("{} is stopped", this);
+        LOG.info("{} is stopped", GrpcLogAppender.this);
         return;
       }
-      LOG.error("{}: Failed installSnapshot: {}", this, t);
+      GrpcUtil.warn(LOG, () -> this + ": Failed InstallSnapshot", t);
+      grpcServerMetrics.onRequestRetry(); // Update try counter
       resetClient(null);
-      close();
     }
 
     @Override
     public void onCompleted() {
-      LOG.info("{}: follower responses installSnapshot COMPLETED", this);
+      if (!firstResponseReceived) {

Review comment:
       I had added it as otherwise the Log is flooded with "InstallSnapshot COMPLETED" messages even when the install snapshot is in progress. I updated it to not print in case of install snapshot notification.

##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -369,13 +369,22 @@ public void onNext(InstallSnapshotReplyProto reply) {
 
       switch (reply.getResult()) {
         case SUCCESS:
+          LOG.info("{}: Completed InstallSnapshot. Reply: {}", this);
+          removePending(reply);
+          break;
+        case NOTIFIED:
+          LOG.info("{}: Notified to InstallSnapshot.", this);
+          removePending(reply);
+          break;
         case IN_PROGRESS:
+          LOG.info("{}: InstallSnapshot in progress.", this);
           removePending(reply);
           break;
         case ALREADY_INSTALLED:
           final long followerSnapshotIndex = reply.getSnapshotIndex();
-          LOG.info("{}: set follower snapshotIndex to {}.", this, followerSnapshotIndex);
+          LOG.info("{}: Already Installed Snapshot Index {}.", this, followerSnapshotIndex);
           getFollower().setSnapshotIndex(followerSnapshotIndex);
+          updateCommitIndex(followerSnapshotIndex);

Review comment:
       Done.

##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -369,13 +369,22 @@ public void onNext(InstallSnapshotReplyProto reply) {
 
       switch (reply.getResult()) {
         case SUCCESS:
+          LOG.info("{}: Completed InstallSnapshot. Reply: {}", this);

Review comment:
       Done.

##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -369,13 +369,22 @@ public void onNext(InstallSnapshotReplyProto reply) {
 
       switch (reply.getResult()) {
         case SUCCESS:
+          LOG.info("{}: Completed InstallSnapshot. Reply: {}", this);
+          removePending(reply);
+          break;
+        case NOTIFIED:

Review comment:
       Agree. Done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] hanishakoneru commented on pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#issuecomment-624922818


   Thank you @lokeshj1703 for the review. I addressed all your comments in commit [8ea4014](https://github.com/apache/incubator-ratis/pull/88/commits/8ea401453866d0d9e5a2b099fc6b8ecd6bdee33a).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] hanishakoneru commented on pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#issuecomment-624277076


   Failing unit tests pass locally.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] hanishakoneru commented on a change in pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on a change in pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#discussion_r421765301



##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -394,17 +412,24 @@ public void onNext(InstallSnapshotReplyProto reply) {
     @Override
     public void onError(Throwable t) {
       if (!isAppenderRunning()) {
-        LOG.info("{} is stopped", this);
+        LOG.info("{} is stopped", GrpcLogAppender.this);
         return;
       }
-      LOG.error("{}: Failed installSnapshot: {}", this, t);
+      GrpcUtil.warn(LOG, () -> this + ": Failed InstallSnapshot", t);
+      grpcServerMetrics.onRequestRetry(); // Update try counter
       resetClient(null);
       close();
     }
 
     @Override
     public void onCompleted() {
-      LOG.info("{}: follower responses installSnapshot COMPLETED", this);
+      if (!isNotificationOnly) {
+        LOG.info("{}: follower responded installSnapshot COMPLETED", this);
+      } else {
+        if (LOG.isDebugEnabled()) {
+          LOG.info("{}: follower responded installSnapshot COMPLETED", this);
+        }
+      }

Review comment:
       Done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] hanishakoneru commented on a change in pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on a change in pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#discussion_r421126087



##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -394,17 +403,19 @@ public void onNext(InstallSnapshotReplyProto reply) {
     @Override
     public void onError(Throwable t) {
       if (!isAppenderRunning()) {
-        LOG.info("{} is stopped", this);
+        LOG.info("{} is stopped", GrpcLogAppender.this);
         return;
       }
-      LOG.error("{}: Failed installSnapshot: {}", this, t);
+      GrpcUtil.warn(LOG, () -> this + ": Failed InstallSnapshot", t);
+      grpcServerMetrics.onRequestRetry(); // Update try counter
       resetClient(null);
-      close();

Review comment:
       Done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] hanishakoneru commented on a change in pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on a change in pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#discussion_r421761623



##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -369,13 +381,19 @@ public void onNext(InstallSnapshotReplyProto reply) {
 
       switch (reply.getResult()) {
         case SUCCESS:

Review comment:
       Yes but I am not sure what all needs to be fixed in the regular InstallSnapshot. I can update the snapshotIndex, commitIndex and nextIndex here for now.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] lokeshj1703 commented on a change in pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on a change in pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#discussion_r421942911



##########
File path: ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
##########
@@ -369,13 +381,19 @@ public void onNext(InstallSnapshotReplyProto reply) {
 
       switch (reply.getResult()) {
         case SUCCESS:

Review comment:
       Yes..Makes sense to do it in a separate jira.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] hanishakoneru commented on pull request #88: RATIS-873. Fix Install Snapshot Notification

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #88:
URL: https://github.com/apache/incubator-ratis/pull/88#issuecomment-625962831


   Thanks @lokeshj1703 for the reviews. Merged into master.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org