You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ratis.apache.org by GitBox <gi...@apache.org> on 2020/08/02 05:36:15 UTC

[GitHub] [incubator-ratis] runzhiwang opened a new pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

runzhiwang opened a new pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161


   ## What changes were proposed in this pull request?
   
   Implement Preventing disruptions when a server rejoins the cluster according to raft paper section 9.6
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/RATIS-993
   
   ## How was this patch tested?
   
   add ut: testPreVote
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #161: RATIS-993. Support pre vote

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-755136290


   @szetszwo I have updated this pre-vote patch. Could you help review it ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on a change in pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on a change in pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#discussion_r466182408



##########
File path: ratis-server/src/main/java/org/apache/ratis/server/impl/ServerState.java
##########
@@ -221,10 +221,10 @@ boolean hasLeader() {
   /**
    * Become a candidate and start leader election
    */
-  long initElection() {
+  long initElection(boolean preVote) {

Review comment:
       @GlenGeng Thanks for review, I define askForPreVotes method to process pre vote, in this method, it will not change voteFor.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-668914938


   @szetszwo @bshashikant @lokeshj1703 @GlenGeng Could you help review this patch ? Thank you very much.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-669657731


   > the follower is not in the conf.
   > I cannot see this requirement in the paper. Could you give me a pointer?
   
   @szetszwo You are right, it does not exist in the paper. I implement it to adapt current ratis implementation. Because in current ratis, when leader election, leader will send shutdown to follower if follower, such as follower1, is not in leader's conf at [shouldSendShutdown(candidateId, candidateLastEntry)](https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L860).  So I have two choice to make follower1 shutdown: 1.pass the pre-vote in this case, so that follower1 can shutdown in real leader election; 2.pre-vote in charge of shutdown follower1, but this will make pre-vote complicated. So I choose the first option. What do you think?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo commented on a change in pull request #161: RATIS-993. Support pre vote

Posted by GitBox <gi...@apache.org>.
szetszwo commented on a change in pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#discussion_r552490729



##########
File path: ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderElection.java
##########
@@ -64,10 +64,14 @@
 
   private ResultAndTerm logAndReturn(Result result,
       Map<RaftPeerId, RequestVoteReplyProto> responses,
-      List<Exception> exceptions, long newTerm) {
-    LOG.info(this + ": Election " + result + "; received " + responses.size() + " response(s) "
-        + responses.values().stream().map(ServerStringUtils::toRequestVoteReplyString).collect(Collectors.toList())
-        + " and " + exceptions.size() + " exception(s); " + server.getState());
+      List<Exception> exceptions, long newTerm, boolean preVote) {
+    LOG.info("{}: {} {} received {} response(s):{}",
+        this,
+        preVote ? "Pre vote " : "Election ",

Review comment:
       Let's add a hyphen, i.e. "Pre-vote".

##########
File path: ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java
##########
@@ -1026,10 +1026,84 @@ private boolean shouldSendShutdown(RaftPeerId candidateId,
   public RequestVoteReplyProto requestVote(RequestVoteRequestProto r)
       throws IOException {
     final RaftRpcRequestProto request = r.getServerRequest();
-    return requestVote(RaftPeerId.valueOf(request.getRequestorId()),
-        ProtoUtils.toRaftGroupId(request.getRaftGroupId()),
-        r.getCandidateTerm(),
-        TermIndex.valueOf(r.getCandidateLastEntry()));
+    if (r.getPreVote()) {
+      return requestPreVote(RaftPeerId.valueOf(request.getRequestorId()),
+          ProtoUtils.toRaftGroupId(request.getRaftGroupId()),
+          r.getCandidateTerm(),
+          TermIndex.valueOf(r.getCandidateLastEntry()));
+    } else {
+      return requestVote(RaftPeerId.valueOf(request.getRequestorId()),
+          ProtoUtils.toRaftGroupId(request.getRaftGroupId()),
+          r.getCandidateTerm(),
+          TermIndex.valueOf(r.getCandidateLastEntry()));
+    }
+  }
+
+  private boolean isCurrentLeaderValid() {
+    if (getInfo().isLeader()) {
+      return role.getLeaderState().map(LeaderStateImpl::checkLeadership).orElse(false);
+    } else if (getInfo().isFollower()) {
+      return state.hasLeader()
+          && role.getFollowerState().map(FollowerState::isCurrentLeaderValid).orElse(false);
+    } else if (getInfo().isCandidate()) {
+      return false;
+    }
+
+    return false;
+  }
+
+  private RequestVoteReplyProto requestPreVote(
+      RaftPeerId candidateId, RaftGroupId candidateGroupId,
+      long candidateTerm, TermIndex candidateLastEntry) throws IOException {
+    CodeInjectionForTesting.execute(REQUEST_VOTE, getId(),
+        candidateId, candidateTerm, candidateLastEntry);
+    LOG.info("{}: receive preVote({}, {}, {}, {})",
+        getMemberId(), candidateId, candidateGroupId, candidateTerm, candidateLastEntry);
+    assertLifeCycleState(LifeCycle.States.RUNNING);
+    assertGroup(candidateId, candidateGroupId);
+
+    boolean preVoteGranted = false;
+    boolean shouldShutdown = false;
+    final RequestVoteReplyProto reply;
+    synchronized (this) {
+      boolean isCurrentLeaderValid = isCurrentLeaderValid();
+      RaftPeer candidate = getRaftConf().getPeer(candidateId);
+
+      if (candidate != null) {
+        int compare = ServerState.compareLog(state.getLastEntry(), candidateLastEntry);
+        int priority = getRaftConf().getPeer(getId()).getPriority();
+        LOG.info("{} priority:{} candidate:{} candidatePriority:{} compare:{}",
+            this, priority, candidate, candidate.getPriority(), compare);
+
+        // vote for candidate if:
+        // 1. current leader is not valid
+        // 2. log lags behind candidate or
+        //    log equals candidate's, and priority less or equal candidate's
+        if (!isCurrentLeaderValid &&

Review comment:
       The condition `!isCurrentLeaderValid` should be moved to the previous if-statement.

##########
File path: ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java
##########
@@ -1026,10 +1026,84 @@ private boolean shouldSendShutdown(RaftPeerId candidateId,
   public RequestVoteReplyProto requestVote(RequestVoteRequestProto r)
       throws IOException {
     final RaftRpcRequestProto request = r.getServerRequest();
-    return requestVote(RaftPeerId.valueOf(request.getRequestorId()),
-        ProtoUtils.toRaftGroupId(request.getRaftGroupId()),
-        r.getCandidateTerm(),
-        TermIndex.valueOf(r.getCandidateLastEntry()));
+    if (r.getPreVote()) {
+      return requestPreVote(RaftPeerId.valueOf(request.getRequestorId()),
+          ProtoUtils.toRaftGroupId(request.getRaftGroupId()),
+          r.getCandidateTerm(),
+          TermIndex.valueOf(r.getCandidateLastEntry()));
+    } else {
+      return requestVote(RaftPeerId.valueOf(request.getRequestorId()),
+          ProtoUtils.toRaftGroupId(request.getRaftGroupId()),
+          r.getCandidateTerm(),
+          TermIndex.valueOf(r.getCandidateLastEntry()));
+    }
+  }
+
+  private boolean isCurrentLeaderValid() {
+    if (getInfo().isLeader()) {
+      return role.getLeaderState().map(LeaderStateImpl::checkLeadership).orElse(false);
+    } else if (getInfo().isFollower()) {
+      return state.hasLeader()
+          && role.getFollowerState().map(FollowerState::isCurrentLeaderValid).orElse(false);
+    } else if (getInfo().isCandidate()) {
+      return false;
+    }
+
+    return false;
+  }
+
+  private RequestVoteReplyProto requestPreVote(
+      RaftPeerId candidateId, RaftGroupId candidateGroupId,
+      long candidateTerm, TermIndex candidateLastEntry) throws IOException {
+    CodeInjectionForTesting.execute(REQUEST_VOTE, getId(),
+        candidateId, candidateTerm, candidateLastEntry);
+    LOG.info("{}: receive preVote({}, {}, {}, {})",
+        getMemberId(), candidateId, candidateGroupId, candidateTerm, candidateLastEntry);
+    assertLifeCycleState(LifeCycle.States.RUNNING);
+    assertGroup(candidateId, candidateGroupId);
+
+    boolean preVoteGranted = false;
+    boolean shouldShutdown = false;
+    final RequestVoteReplyProto reply;
+    synchronized (this) {
+      boolean isCurrentLeaderValid = isCurrentLeaderValid();
+      RaftPeer candidate = getRaftConf().getPeer(candidateId);
+
+      if (candidate != null) {

Review comment:
       It should check also isCurrentLeaderValid, i.e.
   ```
         if (!isCurrentLeaderValid && candidate != null) {
   ```
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo merged pull request #161: RATIS-993. Support pre vote

Posted by GitBox <gi...@apache.org>.
szetszwo merged pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on a change in pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on a change in pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#discussion_r466182843



##########
File path: ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderElection.java
##########
@@ -188,89 +197,104 @@ private boolean shouldRun(long electionTerm) {
     return shouldRun() && server.getState().getCurrentTerm() == electionTerm;
   }
 
+  private ResultAndTerm submitRequestAndWaitResult(
+      final ServerState state, final RaftConfiguration conf, final long electionTerm, boolean preVote)
+      throws InterruptedException {
+    final ResultAndTerm r;
+    final Collection<RaftPeer> others = conf.getOtherPeers(server.getId());
+    if (others.isEmpty()) {
+      r = new ResultAndTerm(Result.PASSED, electionTerm);
+    } else {
+      TermIndex lastEntry = state.getLastEntry();
+      final Executor voteExecutor = new Executor(this, others.size());
+      try {
+        final int submitted = submitRequests(electionTerm, lastEntry, others, voteExecutor, preVote);
+        r = waitForResults(electionTerm, submitted, conf, voteExecutor, preVote);
+      } finally {
+        voteExecutor.shutdown();
+      }
+    }
+
+    return r;
+  }
+
   /**
    * After a peer changes its role to candidate, it invokes this method to
    * send out requestVote rpc to all other peers.
    */
-  private void askForVotes() throws InterruptedException, IOException {
+  private boolean askForVotes(boolean preVote) throws InterruptedException, IOException {

Review comment:
       Thanks for review, I define askForPreVotes method to process pre vote, in this method, I change `while (shouldRun())` to `if (shouldRun())`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo commented on pull request #161: RATIS-993. Support pre vote

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-755332894


   > ...  I will refactor the code in next PR.
   
   @runzhiwang , thanks a lot.  Since you are working on RATIS-1273, let me do the refactoring.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang closed pull request #161: RATIS-993. Support pre vote

Posted by GitBox <gi...@apache.org>.
runzhiwang closed pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-669653007


   > 
   > ... the voters have not received heartbeats from a valid leader for at least a baseline election timeout ...
   > 
   > This requirement seems not yet implemented in the current change.
   
   @szetszwo  Thanks for review. You are right, I did not implement it in this PR because I want to make this PR simpler to review, and I will implement it after this PR. Even though without checkout timeout,  it will not introduce new bug, because in this case pre-vote passed and trigger a new leader election. I can also implement it in this PR, it's ok to me.  Which one do you prefer ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-670247608


   > When the follower is not in the conf, it should not allowed to join the group. Otherwise, any followers could join any groups.
   @szetszwo Thanks for explanation. Do you mean shutdown the follower not in the conf when pre-vote ? 
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] GlenGeng commented on a change in pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#discussion_r466130097



##########
File path: ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderElection.java
##########
@@ -188,89 +197,104 @@ private boolean shouldRun(long electionTerm) {
     return shouldRun() && server.getState().getCurrentTerm() == electionTerm;
   }
 
+  private ResultAndTerm submitRequestAndWaitResult(
+      final ServerState state, final RaftConfiguration conf, final long electionTerm, boolean preVote)
+      throws InterruptedException {
+    final ResultAndTerm r;
+    final Collection<RaftPeer> others = conf.getOtherPeers(server.getId());
+    if (others.isEmpty()) {
+      r = new ResultAndTerm(Result.PASSED, electionTerm);
+    } else {
+      TermIndex lastEntry = state.getLastEntry();
+      final Executor voteExecutor = new Executor(this, others.size());
+      try {
+        final int submitted = submitRequests(electionTerm, lastEntry, others, voteExecutor, preVote);
+        r = waitForResults(electionTerm, submitted, conf, voteExecutor, preVote);
+      } finally {
+        voteExecutor.shutdown();
+      }
+    }
+
+    return r;
+  }
+
   /**
    * After a peer changes its role to candidate, it invokes this method to
    * send out requestVote rpc to all other peers.
    */
-  private void askForVotes() throws InterruptedException, IOException {
+  private boolean askForVotes(boolean preVote) throws InterruptedException, IOException {

Review comment:
       `While (shouldRu())` may be omitted for preVote case, since it just need one round rpc.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang closed pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang closed pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #161: RATIS-993. Support pre vote

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-755185123


   will fix the failed ut.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang closed pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang closed pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-670318566


   @szetszwo Thanks for review. I have updated the patch. Could you help review it again ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang closed pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang closed pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] GlenGeng commented on a change in pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#discussion_r466128412



##########
File path: ratis-server/src/main/java/org/apache/ratis/server/impl/ServerState.java
##########
@@ -221,10 +221,10 @@ boolean hasLeader() {
   /**
    * Become a candidate and start leader election
    */
-  long initElection() {
+  long initElection(boolean preVote) {

Review comment:
       If `preVote` is true, you will not change `currentTerm`, meanwhile change `voteFor`, which may violate the raft algorithm that one can only vote once in a specific term.

##########
File path: ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderElection.java
##########
@@ -188,89 +197,104 @@ private boolean shouldRun(long electionTerm) {
     return shouldRun() && server.getState().getCurrentTerm() == electionTerm;
   }
 
+  private ResultAndTerm submitRequestAndWaitResult(
+      final ServerState state, final RaftConfiguration conf, final long electionTerm, boolean preVote)
+      throws InterruptedException {
+    final ResultAndTerm r;
+    final Collection<RaftPeer> others = conf.getOtherPeers(server.getId());
+    if (others.isEmpty()) {
+      r = new ResultAndTerm(Result.PASSED, electionTerm);
+    } else {
+      TermIndex lastEntry = state.getLastEntry();
+      final Executor voteExecutor = new Executor(this, others.size());
+      try {
+        final int submitted = submitRequests(electionTerm, lastEntry, others, voteExecutor, preVote);
+        r = waitForResults(electionTerm, submitted, conf, voteExecutor, preVote);
+      } finally {
+        voteExecutor.shutdown();
+      }
+    }
+
+    return r;
+  }
+
   /**
    * After a peer changes its role to candidate, it invokes this method to
    * send out requestVote rpc to all other peers.
    */
-  private void askForVotes() throws InterruptedException, IOException {
+  private boolean askForVotes(boolean preVote) throws InterruptedException, IOException {

Review comment:
       `While (shouldRu())` may be omitted for preVote case, since it just need one round roc.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo commented on pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-670306841


   Yes, we should shutdown the follower for that group.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang closed pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang closed pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo commented on pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-670148089


   @runzhiwang Let's just implement exactly raft paper section 9.6.
   
   When the follower is not in the conf, it should not allowed to join the group.  Otherwise, any followers could join any groups.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #161: RATIS-993. Support pre vote

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-755335868


   @szetszwo  Sure, thanks a lot.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang closed pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang closed pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo commented on pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-669326952


   > 2. the follower is not in the conf.
   
   I cannot see this requirement in the paper.  Could you give me a pointer?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-670247608


   > When the follower is not in the conf, it should not allowed to join the group. Otherwise, any followers could join any groups.
   
   @szetszwo Thanks for explanation. Do you mean shutdown the follower not in the conf when pre-vote ? 
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #161: RATIS-993. Support pre vote

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-755273565


   > There are quite many code duplications between pre-vote and vote. We may refactor the code here or in a separated JIRA.
   
   @szetszwo Thanks for review. I will refactor the code in next PR. 
   
   Besides, when s1 transferLeaderShip to s2,  s2 should skip askForPreVote, and askForVote directly. Because when s2 askForPreVote, s1 and s3 will reject it because current leader is still valid. So I add a flag: LeaderElection#force which means we should skip askForPreVote and askForVote directly, this flag will be true when s2 receive StartLeaderElectionRequest.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo commented on pull request #161: RATIS-993. Preventing disruptions when a server rejoins the cluster

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #161:
URL: https://github.com/apache/incubator-ratis/pull/161#issuecomment-669325790


   @runzhiwang thanks for working on this.
   
   > ... the voters have not received heartbeats from a valid leader for at least a baseline election timeout ...
   
   This requirement seems not yet implemented in the current change.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org