You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ratis.apache.org by GitBox <gi...@apache.org> on 2021/10/01 08:45:49 UTC

[GitHub] [ratis] hanishakoneru commented on a change in pull request #504: RATIS-1402. do not send extra rpc calls to follower when the follower is still installing a snapshot

hanishakoneru commented on a change in pull request #504:
URL: https://github.com/apache/ratis/pull/504#discussion_r720053673



##########
File path: ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java
##########
@@ -1624,7 +1626,7 @@ private InstallSnapshotReplyProto notifyStateMachineToInstallSnapshot(
       if (isSnapshotNull.compareAndSet(true, false)) {
         LOG.info("{}: InstallSnapshot notification result: {}", getMemberId(),
             InstallSnapshotResult.SNAPSHOT_UNAVAILABLE);
-        inProgressInstallSnapshotRequest.compareAndSet(firstAvailableLogTermIndex, null);
+        inProgressInstallSnapshotRequest.compareAndSet(firstAvailableLogTermIndex.hashCode(), 0);

Review comment:
       The hasCode() would still be different between to iterations of this call as we create a firstAvailableLogTermIndex object every time in notifyStateMachineToInstallSnapshot(). 
   Instead, we could set inProgressInstallSnapshotRequest to firstAvailableLogIndex. This will be a long value which will not change for different iterations of the same request.
   

##########
File path: ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java
##########
@@ -1604,12 +1604,14 @@ private InstallSnapshotReplyProto notifyStateMachineToInstallSnapshot(
                     LOG.debug("{}: StateMachine could not install snapshot as it is not available", this);
                   }
                 }
-              // wait until the snapshot is installed successfully or exceptionally
-              }).join();
-        } catch (Throwable t) {
-          inProgressInstallSnapshotRequest.compareAndSet(firstAvailableLogTermIndex, null);
-          throw t;
-        }
+              // wait for 10 seconds for statemachine to install snapshot
+              }).get(10, TimeUnit.SECONDS);

Review comment:
       10 seconds is too long to block the main thread and wait for this executor thread to finish. I leader could assume this follower is not responding and stop the follower log appended from its side (is there a timeout?). Even if there is no timeout, I dont think we should block the main thread for more than a second.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org