You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ratis.apache.org by GitBox <gi...@apache.org> on 2023/01/18 03:26:26 UTC

[GitHub] [ratis] kaijchen opened a new pull request, #808: RATIS-1769. Do not change priority in TransferCommand

kaijchen opened a new pull request, #808:
URL: https://github.com/apache/ratis/pull/808

   ## What changes were proposed in this pull request?
   
   This is a followup of [RATIS-1762](https://issues.apache.org/jira/browse/RATIS-1762). TransferCommand should not change priority of peers (or at least not by default).
   
   Sadly this will break backward compatibility. But version 3.0 hasn't been released, so it might be OK.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/RATIS-1769
   
   ## How was this patch tested?
   
   Manually.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] szetszwo commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1431016422

   @kaijchen , that's a good point.  We may change the existing `election transfer` command as you described:
   - If the target peer already has the highest priority, do not call `setConfiguration`.
   - If not, call `setConfiguration` to change only the target peer to the highest priority.
   
   Does it sound good?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1429844030

   > * The new `election transferLeader` command, since it won't change priorities, actually is yielding the leadership.  How about we call it `election yield`?
   
   I'm OK to call it `election yield`, but since raft paper call it transfer leadership, users may get confused.
   
   > * The existing `election transfer` command is still useful since it can transfer the leader to a lower priority peer.  How about we don't deprecate it?
   
   The existing `election transfer` is equal to `peer setPriority` plus `election transferLeader`. User can still achieve the same functionality, and the priority will be set by user explicitly (so the user knows exactly what is going on).
   
   Based on these reasons, I chose to deprecate the old command. How do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] codings-dan commented on pull request #808: RATIS-1769. Do not change priority in TransferCommand

Posted by "codings-dan (via GitHub)" <gi...@apache.org>.
codings-dan commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1414770574

   @kaijchen Thanks for working on this. I prefer the second option, we can add a new commad to ensure compatibility.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Do not change priority in TransferCommand

Posted by GitBox <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1386434153

   > Sadly this will break backward compatibility. But version 3.0 hasn't been released, so it might be OK.
   
   @szetszwo @codings-dan what do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] szetszwo commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1430294500

   @kaijchen , I got a new idea -- we may update the `election transfer` as following:
   - Since we already have the `getRaftGroup()` returned from the server, we can check priorities to decide whether to call `setConfiguration`.  It will call `setConfiguration` only if the transferee has smaller priority.
   - Then, we don't need a new command.
   
   We do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1432720999

   Manually test backward compatibility, Ratis shell version `3.0.0-SNAPSHOT`, Ratis server version `2.4.1`:
   ```console
   $ bin/ratis sh election transfer -peers 127.0.0.1:10024,127.0.0.1:10124,127.0.0.1:11124 -address 127.0.0.1:11124
   [main] INFO org.reflections.Reflections - Reflections took 164 ms to scan 1 urls, producing 5 keys and 18 values
   [main] WARN org.apache.ratis.metrics.MetricRegistries - Found multiple MetricRegistries implementations: class org.apache.ratis.metrics.impl.MetricRegistriesImpl, class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl. Using first found implementation: org.apache.ratis.metrics.impl.MetricRegistriesImpl@1e67a849
   Transferring leadership to server with address <127.0.0.1:11124> 
   Changing priority of <127.0.0.1:11124> to 1
   : caught an error when executing transfer: n0@group-ABB3109A44C1 refused to transfer leadership to peer n2 as it does not has highest priority 9: peers:[n0|rpc:127.0.0.1:10024|admin:|client:|dataStream:|priority:1|startupRole:FOLLOWER, n1|rpc:127.0.0.1:10124|admin:|client:|dataStream:|priority:0|startupRole:FOLLOWER, n2|rpc:127.0.0.1:11124|admin:|client:|dataStream:|priority:1|startupRole:FOLLOWER]|listeners:[], old=null
   Transferring leadership to server with address <127.0.0.1:11124> 
   Changing priority of <127.0.0.1:11124> to 2
   : Transferring leadership initiated
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1431023761

   > * If the target peer already has the highest priority, do not call `setConfiguration`.
   > * If not, call `setConfiguration` to change only the target peer to the highest priority.
   > 
   > Does it sound good?
   
   Yes. But how do we deal with old server (backward compatibility)?
   For example, for new servers, we don't need to change priority in the following situation.
   But for old servers, we have to increase target peer's priority to 2.
   
   | Peer | A | B | C | D | E |
   | --- | --- | --- | --- | --- | --- |
   | Priority | 1 | 1 | 1 | 1 | 1 |


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Avoid changing priorities in TransferCommand unless necessary

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1437781663

   Thanks @szetszwo for the review, updated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1432372112

   > @kaijchen , When a new client talking to an old server, the client in that case should get a `TransferLeadershipException` saying that the target server does not have the highest priority. Then, we can send the `setConfiguration` in that case.
   
   OK, sounds good.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] szetszwo commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1429110804

   @kaijchen , I am thinking how to deal with the existing and the new commands.
   
   - The new `election transferLeader` command, since it won't change priorities, actually is yielding the leadership.  How about we call it `election yield`?
   
   - The existing `election transfer` command is still useful since it can transfer the leader to a lower priority peer.  How about we don't deprecate it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Avoid changing priorities in TransferCommand unless necessary

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1432998040

   Note: sometimes transfer leadership succeeds, but the request timed out.
   This is due to the old leader hasn't received any messages from the new leader yet.
   Especially in default timeout, i.e. `TransferLeadershipRequest.timeout = 0`,
   the timeout will default to `server.randomElectionTimeout()`.
   
   This will lead to `TransferCommand` to return -1, and cause flaky test.
   So is it better to set a larger default timeout?
   
   To fix this problem, we should wait for the new leader to be elected in the shell.
   But `AbstractRatisCommand` seems didn't expect the leader to change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] szetszwo commented on a diff in pull request #808: RATIS-1769. Avoid changing priorities in TransferCommand unless necessary

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo commented on code in PR #808:
URL: https://github.com/apache/ratis/pull/808#discussion_r1112301101


##########
ratis-shell/src/main/java/org/apache/ratis/shell/cli/sh/election/TransferCommand.java:
##########
@@ -55,54 +56,74 @@ public int run(CommandLine cl) throws IOException {
     super.run(cl);
 
     String strAddr = cl.getOptionValue(ADDRESS_OPTION_NAME);
+    // TODO: Default timeout should be set to 0, which means let server decide (based on election timeout).
+    //       However, occasionally the request could timeout too fast while the transfer is in progress.
+    //       i.e. request timeout doesn't mean transfer leadership has failed.
+    //       Currently, Ratis shell returns merely based on the result of the request.
+    //       So we set a larger default timeout here (3s).
+    final long timeoutDefault = 3_000L;
+    // Default timeout for legacy mode matches with the legacy command (version 2.4.x and older).
+    final long timeoutLegacy = 60_000L;
+    final Optional<Long> timeout = !cl.hasOption(TIMEOUT_OPTION_NAME) ? Optional.empty() :
+        Optional.of(Long.parseLong(cl.getOptionValue(TIMEOUT_OPTION_NAME)) * 1000L);

Review Comment:
   Use `TimeDuration` for supporting different units such as 100ms, 1min.
   ```java
       final TimeDuration timeoutDefault = TimeDuration.valueOf(3, TimeUnit.SECONDS);
       // Default timeout for legacy mode matches with the legacy command (version 2.4.x and older).
       final TimeDuration timeoutLegacy = TimeDuration.valueOf(60, TimeUnit.SECONDS);
       final Optional<TimeDuration> timeout = !cl.hasOption(TIMEOUT_OPTION_NAME) ? Optional.empty() :
           Optional.of(TimeDuration.valueOf(cl.getOptionValue(TIMEOUT_OPTION_NAME), TimeUnit.SECONDS));
   ```



##########
ratis-shell/src/main/java/org/apache/ratis/shell/cli/sh/election/TransferCommand.java:
##########
@@ -55,54 +56,74 @@ public int run(CommandLine cl) throws IOException {
     super.run(cl);
 
     String strAddr = cl.getOptionValue(ADDRESS_OPTION_NAME);
+    // TODO: Default timeout should be set to 0, which means let server decide (based on election timeout).
+    //       However, occasionally the request could timeout too fast while the transfer is in progress.
+    //       i.e. request timeout doesn't mean transfer leadership has failed.
+    //       Currently, Ratis shell returns merely based on the result of the request.
+    //       So we set a larger default timeout here (3s).
+    final long timeoutDefault = 3_000L;
+    // Default timeout for legacy mode matches with the legacy command (version 2.4.x and older).
+    final long timeoutLegacy = 60_000L;
+    final Optional<Long> timeout = !cl.hasOption(TIMEOUT_OPTION_NAME) ? Optional.empty() :
+        Optional.of(Long.parseLong(cl.getOptionValue(TIMEOUT_OPTION_NAME)) * 1000L);
 
-    RaftPeerId newLeaderId = null;
-    // update priorities to enable transfer
-    List<RaftPeer> peersWithNewPriorities = new ArrayList<>();
-    for (RaftPeer peer : getRaftGroup().getPeers()) {
-      peersWithNewPriorities.add(
-          RaftPeer.newBuilder(peer)
-              .setPriority(peer.getAddress().equals(strAddr) ? 2 : 1)
-              .build()
-      );
-      if (peer.getAddress().equals(strAddr)) {
-        newLeaderId = peer.getId();
-      }
-    }
-    if (newLeaderId == null) {
+    final int highestPriority = getRaftGroup().getPeers().stream()
+        .mapToInt(RaftPeer::getPriority).max().orElse(0);
+    RaftPeer newLeader = getRaftGroup().getPeers().stream()
+        .filter(peer -> peer.getAddress().equals(strAddr)).findAny().orElse(null);
+    if (newLeader == null) {

Review Comment:
   Print an error message.
   ```java
         printf("Peer with address %s not found.", strAddr);
   ```



##########
ratis-shell/src/main/java/org/apache/ratis/shell/cli/sh/election/TransferCommand.java:
##########
@@ -55,54 +56,74 @@ public int run(CommandLine cl) throws IOException {
     super.run(cl);
 
     String strAddr = cl.getOptionValue(ADDRESS_OPTION_NAME);
+    // TODO: Default timeout should be set to 0, which means let server decide (based on election timeout).
+    //       However, occasionally the request could timeout too fast while the transfer is in progress.
+    //       i.e. request timeout doesn't mean transfer leadership has failed.
+    //       Currently, Ratis shell returns merely based on the result of the request.
+    //       So we set a larger default timeout here (3s).
+    final long timeoutDefault = 3_000L;
+    // Default timeout for legacy mode matches with the legacy command (version 2.4.x and older).
+    final long timeoutLegacy = 60_000L;
+    final Optional<Long> timeout = !cl.hasOption(TIMEOUT_OPTION_NAME) ? Optional.empty() :
+        Optional.of(Long.parseLong(cl.getOptionValue(TIMEOUT_OPTION_NAME)) * 1000L);
 
-    RaftPeerId newLeaderId = null;
-    // update priorities to enable transfer
-    List<RaftPeer> peersWithNewPriorities = new ArrayList<>();
-    for (RaftPeer peer : getRaftGroup().getPeers()) {
-      peersWithNewPriorities.add(
-          RaftPeer.newBuilder(peer)
-              .setPriority(peer.getAddress().equals(strAddr) ? 2 : 1)
-              .build()
-      );
-      if (peer.getAddress().equals(strAddr)) {
-        newLeaderId = peer.getId();
-      }
-    }
-    if (newLeaderId == null) {
+    final int highestPriority = getRaftGroup().getPeers().stream()
+        .mapToInt(RaftPeer::getPriority).max().orElse(0);
+    RaftPeer newLeader = getRaftGroup().getPeers().stream()
+        .filter(peer -> peer.getAddress().equals(strAddr)).findAny().orElse(null);
+    if (newLeader == null) {
       return -2;
     }
     try (RaftClient client = RaftUtils.createClient(getRaftGroup())) {
-      String stringPeers = "[" + peersWithNewPriorities.stream().map(RaftPeer::toString)
-          .collect(Collectors.joining(", ")) + "]";
-      printf("Applying new peer state before transferring leadership: %n%s%n", stringPeers);
-      RaftClientReply setConfigurationReply =
-          client.admin().setConfiguration(peersWithNewPriorities);
-      processReply(setConfigurationReply,
-          () -> "failed to set priorities before initiating election");
       // transfer leadership
-      printf("Transferring leadership to server with address <%s> %n", strAddr);
-      try {
-        Thread.sleep(3_000);
-        RaftClientReply transferLeadershipReply =
-            client.admin().transferLeadership(newLeaderId, 60_000);
-        processReply(transferLeadershipReply, () -> "election failed");
-      } catch (Throwable t) {
-        printf("caught an error when executing transfer: %s%n", t.getMessage());
+      Throwable err = tryTransfer(client, newLeader, highestPriority, timeout.orElse(timeoutDefault));
+      if (err instanceof TransferLeadershipException
+          && err.getMessage().contains("it does not has highest priority")) {
+        // legacy mode, transfer leadership by setting priority.
+        err = tryTransfer(client, newLeader, highestPriority + 1, timeout.orElse(timeoutLegacy));
+      }
+      if (err != null) {
         return -1;
       }

Review Comment:
   After changing `tryTransfer` to return a boolean, we could make the corresponding change as below:
   ```java
       try (RaftClient client = RaftUtils.createClient(getRaftGroup())) {
         // transfer leadership
         if (!tryTransfer(client, newLeader, highestPriority, timeout.orElse(timeoutDefault))) {
           // legacy mode, transfer leadership by setting priority.
           tryTransfer(client, newLeader, highestPriority + 1, timeout.orElse(timeoutLegacy));
         }
       } catch(Throwable t) {
         printf("Failed to transfer peer %s with address %s: ",
             newLeader.getId(), newLeader.getAddress());
         t.printStackTrace(getPrintStream());
         return -1;
       }
       return 0;
   ```



##########
ratis-shell/src/main/java/org/apache/ratis/shell/cli/sh/election/TransferCommand.java:
##########
@@ -55,54 +56,74 @@ public int run(CommandLine cl) throws IOException {
     super.run(cl);
 
     String strAddr = cl.getOptionValue(ADDRESS_OPTION_NAME);
+    // TODO: Default timeout should be set to 0, which means let server decide (based on election timeout).
+    //       However, occasionally the request could timeout too fast while the transfer is in progress.
+    //       i.e. request timeout doesn't mean transfer leadership has failed.
+    //       Currently, Ratis shell returns merely based on the result of the request.
+    //       So we set a larger default timeout here (3s).
+    final long timeoutDefault = 3_000L;
+    // Default timeout for legacy mode matches with the legacy command (version 2.4.x and older).
+    final long timeoutLegacy = 60_000L;
+    final Optional<Long> timeout = !cl.hasOption(TIMEOUT_OPTION_NAME) ? Optional.empty() :
+        Optional.of(Long.parseLong(cl.getOptionValue(TIMEOUT_OPTION_NAME)) * 1000L);
 
-    RaftPeerId newLeaderId = null;
-    // update priorities to enable transfer
-    List<RaftPeer> peersWithNewPriorities = new ArrayList<>();
-    for (RaftPeer peer : getRaftGroup().getPeers()) {
-      peersWithNewPriorities.add(
-          RaftPeer.newBuilder(peer)
-              .setPriority(peer.getAddress().equals(strAddr) ? 2 : 1)
-              .build()
-      );
-      if (peer.getAddress().equals(strAddr)) {
-        newLeaderId = peer.getId();
-      }
-    }
-    if (newLeaderId == null) {
+    final int highestPriority = getRaftGroup().getPeers().stream()
+        .mapToInt(RaftPeer::getPriority).max().orElse(0);
+    RaftPeer newLeader = getRaftGroup().getPeers().stream()
+        .filter(peer -> peer.getAddress().equals(strAddr)).findAny().orElse(null);
+    if (newLeader == null) {
       return -2;
     }
     try (RaftClient client = RaftUtils.createClient(getRaftGroup())) {
-      String stringPeers = "[" + peersWithNewPriorities.stream().map(RaftPeer::toString)
-          .collect(Collectors.joining(", ")) + "]";
-      printf("Applying new peer state before transferring leadership: %n%s%n", stringPeers);
-      RaftClientReply setConfigurationReply =
-          client.admin().setConfiguration(peersWithNewPriorities);
-      processReply(setConfigurationReply,
-          () -> "failed to set priorities before initiating election");
       // transfer leadership
-      printf("Transferring leadership to server with address <%s> %n", strAddr);
-      try {
-        Thread.sleep(3_000);
-        RaftClientReply transferLeadershipReply =
-            client.admin().transferLeadership(newLeaderId, 60_000);
-        processReply(transferLeadershipReply, () -> "election failed");
-      } catch (Throwable t) {
-        printf("caught an error when executing transfer: %s%n", t.getMessage());
+      Throwable err = tryTransfer(client, newLeader, highestPriority, timeout.orElse(timeoutDefault));
+      if (err instanceof TransferLeadershipException
+          && err.getMessage().contains("it does not has highest priority")) {
+        // legacy mode, transfer leadership by setting priority.
+        err = tryTransfer(client, newLeader, highestPriority + 1, timeout.orElse(timeoutLegacy));
+      }
+      if (err != null) {
         return -1;
       }
-      println("Transferring leadership initiated");
     }
     return 0;
   }
 
+  private Throwable tryTransfer(RaftClient client, RaftPeer newLeader, int highestPriority, long timeout) {
+    printf("Transferring leadership to server with address <%s> %n", newLeader.getAddress());
+    try {
+      // lift the current leader to the highest priority,
+      if (newLeader.getPriority() < highestPriority) {
+        setPriority(client, newLeader.getAddress(), highestPriority);
+      }
+      RaftClientReply transferLeadershipReply =
+          client.admin().transferLeadership(newLeader.getId(), timeout);
+      processReply(transferLeadershipReply, () -> "election failed");
+    } catch (Throwable t) {
+      printf("caught an error when executing transfer: %s%n", t.getMessage());
+      return t;
+    }
+    println("Transferring leadership initiated");
+    return null;
+  }

Review Comment:
   Let's return a boolean to indicate if the transfer is success.
   ```java
     private boolean tryTransfer(RaftClient client, RaftPeer newLeader, int highestPriority,
         TimeDuration timeout) throws IOException {
       printf("Transferring leadership to server with address <%s> %n", newLeader.getAddress());
       try {
         // lift the current leader to the highest priority,
         if (newLeader.getPriority() < highestPriority) {
           setPriority(client, newLeader.getAddress(), highestPriority);
         }
         RaftClientReply transferLeadershipReply =
             client.admin().transferLeadership(newLeader.getId(), timeout.toLong(TimeUnit.MILLISECONDS));
         processReply(transferLeadershipReply, () -> "election failed");
       } catch (TransferLeadershipException tle) {
         if (tle.getMessage().contains("it does not has highest priority")) {
           return false;
         }
         throw tle;
       }
       println("Transferring leadership initiated");
       return true;
     }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1420323212

   @codings-dan @szetszwo PTAL, thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Try to avoid changing priorities in TransferCommand

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1432761895

   Looks like the msg is incorrect: 
   
   ```
   : caught an error when executing transfer: n0@group-ABB3109A44C1 refused to transfer leadership to peer n2 as it does not has highest priority 9: 
   ```
   
   https://github.com/apache/ratis/blob/18eacaed31e4965a9c400d86409a88fea21fc18a/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L1117-L1120


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Avoid changing priorities in TransferCommand unless necessary

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1436456680

   > Note: sometimes transfer leadership succeeds, but the request timed out. This is due to the old leader hasn't received any messages from the new leader yet. Especially in default timeout, i.e. `TransferLeadershipRequest.timeout = 0`, the timeout will default to `server.randomElectionTimeout()`.
   > 
   > This will lead to `TransferCommand` to return -1, and cause flaky test. So is it better to set a larger default timeout?
   
   Set default timeout to `3s` in the new mode. The default timeout in legacy mode (fallback) is kept as `60s`.
   
   > To fix this problem, we should wait for the new leader to be elected in the shell. But `AbstractRatisCommand` seems didn't expect the leader to change.
   
   Added TODO in comments. It's better to keep changes small in this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Avoid changing priorities in TransferCommand unless necessary

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1436458778

   @szetszwo please take another look, thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Do not change priority in TransferCommand

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1414706690

   I have come up with 3 options to preserve backward compatibility.
   @szetszwo @codings-dan what do you think?
   
   1. Check server version first and send command accordingly.
   2. Add a new command and deprecate the old command.
   3. Just break the backward compatibility, since next version is 3.0.
   
   Personally I prever the 2nd way.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1423852018

   > could you add a unit test for the new command? see `ElectionCommandIntegrationTest#testElectionTransferCommand`
   
   Thanks @codings-dan, I have added `ElectionCommandIntegrationTest#testElectionTransferLeaderCommand`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1425286145

   > We should have test to transfer a leader who does not have the highest priority. Then, it should get some exception.
   
   Thanks @szetszwo for reviewing. Good idea.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] szetszwo commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1431890415

   > ... how do we deal with old server (backward compatibility)?
   
   @kaijchen , When a new client talking to an old server, the client in that case should get a `TransferLeadershipException` saying that the target server does not have the highest priority.  Then, we can send the `setConfiguration` in that case.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] szetszwo merged pull request #808: RATIS-1769. Avoid changing priorities in TransferCommand unless necessary

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo merged PR #808:
URL: https://github.com/apache/ratis/pull/808


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] kaijchen commented on pull request #808: RATIS-1769. Add TransferLeaderCommand and deprecate TransferCommand

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on PR #808:
URL: https://github.com/apache/ratis/pull/808#issuecomment-1430674627

   > * Then, we don't need a new command.
   
   The old server (prior to 2.4.1) always requires the transferee to have the highest priority for transfer leadership.
   We are keeping the old `transfer` command for backward compatibility. Is there a way to get the server version?
   
   > * Since we already have the `getRaftGroup()` returned from the server, we can check priorities to decide whether to call `setConfiguration`.  It will call `setConfiguration` only if the transferee has smaller priority.
   
   Personally I prefer to print a message like "transferee does not have highest priority, please call `setPriority` first".
   Or add a `-force` option, to set the transferee to highest priority.
   
   ## Example
   
   let's say we have a cluster like this,
   
   | Peer | A | B | C | D | E |
   | --- | --- | --- | --- | --- | --- |
   | Priority | 3 | 2 | 2 | 1 | 1 |
   
   ### New server
   
   If we want to transfer leader from A to E, we should only raise E's priority to highest.
   So if peer E is down afterwards, the old priority remains.
   
   | Peer | A | B | C | D | E |
   | --- | --- | --- | --- | --- | --- |
   | Priority | 3 | 2 | 2 | 1 | 3 |
   
   ### Old server
   
   Since the old server requires transferee to be the **only** peer with highest priority.
   The best we can do is:
   
   | Peer | A | B | C | D | E |
   | --- | --- | --- | --- | --- | --- |
   | Priority | 3 | 2 | 2 | 1 | 4 |
   
   However, the old command will reset every node's priority, and the old priority is erased.
   
   | Peer | A | B | C | D | E |
   | --- | --- | --- | --- | --- | --- |
   | Priority | 1 | 1 | 1 | 1 | 2 |
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org