Posted to issues@ratis.apache.org by "Kaijie Chen (Jira)" <ji...@apache.org> on 2023/01/19 03:42:00 UTC
[jira] [Comment Edited] (RATIS-1762) Support transfer leadership between nodes with same priority

    [ https://issues.apache.org/jira/browse/RATIS-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678463#comment-17678463 ] 

Kaijie Chen edited comment on RATIS-1762 at 1/19/23 3:41 AM:
-------------------------------------------------------------

RATIS-1762 is the core logic. Every other change is optional and depends on RATIS-1762.
||Jira||Event||Before||After||Incompatibility||
|RATIS-1762|Leader received a TransferLeadershipRequest.|The prior leader will passively wait for yieldLeadership() to transfer the leadership.|The prior leader will actively monitor the transferee's matchIndex, and send TimeoutNow once the transferee is up to date.|No.|
|RATIS-1769|Ratis shell sending a TransferLeadershipRequest.|The client will first reconfigure the priority of the nodes so that the transferee has the highest priority, and then send a TransferLeadershipRequest.|The client will just send a TransferLeadershipRequest.|Yes. A new shell cannot work with an old server.|
|RATIS-1770|Leader detected a higher priority peer.|The prior leader still accepts client requests during the transfer.|The prior leader will reject client requests during the transfer.
(By starting a TransferLeadership as described in RATIS-1762.)|No. But it may block with a previous/new request.|
|RATIS-1771|Leader received a TransferLeadershipRequest when there is a previous request pending.|The new request will be rejected.|The previous request will be aborted.|No. But the behavior will be different.|
|RATIS-1772|N/A|Refactoring only.|Should be the same.|No, as long as we don't change the proto.|
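
To make the RATIS-1771 row concrete, here is a minimal sketch of the two behaviors for a request that arrives while another is pending. It is only an illustration: the class PendingTransferSketch and its methods are hypothetical names, not Ratis APIs, and the real server code may track pending requests quite differently.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch; these names are illustrative and are not Ratis APIs. */
final class PendingTransferSketch {
    // The transfer currently in flight, or null if none has been started.
    private CompletableFuture<String> pending;

    private boolean hasPending() {
        return pending != null && !pending.isDone();
    }

    /** "Before" column: a new request is rejected while another one is pending. */
    CompletableFuture<String> startRejectingNew(String transferee) {
        if (hasPending()) {
            CompletableFuture<String> rejected = new CompletableFuture<>();
            rejected.completeExceptionally(new IllegalStateException(
                "transfer already in progress, rejecting request for " + transferee));
            return rejected;
        }
        pending = new CompletableFuture<>();
        return pending;
    }

    /** "After" column: the previous pending request is aborted by the new one. */
    CompletableFuture<String> startAbortingPrevious(String transferee) {
        if (hasPending()) {
            pending.completeExceptionally(new CancellationException(
                "superseded by a transfer to " + transferee));
        }
        pending = new CompletableFuture<>();
        return pending;
    }

    /** Completes the pending request once the transferee has become leader. */
    void complete(String newLeader) {
        if (hasPending()) {
            pending.complete(newLeader);
        }
    }

    public static void main(String[] args) {
        PendingTransferSketch leader = new PendingTransferSketch();
        CompletableFuture<String> first = leader.startAbortingPrevious("n1");
        CompletableFuture<String> second = leader.startAbortingPrevious("n2");
        System.out.println("first aborted: " + first.isCompletedExceptionally());
        System.out.println("second pending: " + !second.isDone());
    }
}
```

In both variants at most one request is pending at a time; the only difference is which of the two callers sees the failure.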


> Support transfer leadership between nodes with same priority
> ------------------------------------------------------------
>
>                 Key: RATIS-1762
>                 URL: https://issues.apache.org/jira/browse/RATIS-1762
>             Project: Ratis
>          Issue Type: Sub-task
>            Reporter: Kaijie Chen
>            Priority: Major
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The current transfer leadership implementation in Ratis depends on priority. The current leader periodically checks its followers' priorities and yields leadership to a higher-priority peer.
> In this Jira, I propose to implement the basic transfer leadership operation, which is described in section 3.10 "leadership transfer extension" of [Diego Ongaro's PhD dissertation|https://web.stanford.edu/~ouster/cgi-bin/papers/OngaroPhD.pdf]. In a future Jira, we can change the current "yieldLeaderToHigherPriorityPeer()" to use this operation.
> Steps of the transfer leadership operation:
>  # The prior leader stops accepting new client requests.
>  # The prior leader fully updates the target server’s log to match its own, using the normal log replication mechanism.
>  # The prior leader sends a TimeoutNow request to the target server. This request has the same effect as the target server’s election timer firing: the target server starts a new election (incrementing its term and becoming a candidate).
> Success condition:
>  * Once the target server receives the TimeoutNow request, it is highly likely to start an election before any other server and become leader in the next term. Its next message to the prior leader will include its new term number, causing the prior leader to step down. At this point, leadership transfer is complete.
> Failure condition:
>  * It is also possible for the target server to fail; in this case, the cluster must resume client operations. If leadership transfer does not complete after about an election timeout, the prior leader aborts the transfer and resumes accepting client requests. If the prior leader was mistaken and the target server is actually operational, then at worst this mistake will result in an extra election, after which client operations will be restored.
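
The steps and the success/failure conditions quoted above can be sketched as a small, single-threaded simulation. This is only an illustration under stated assumptions: the Target interface, the transfer() method, and the maxRounds parameter (which stands in for "about an election timeout") are hypothetical and are not part of the Ratis API.

```java
/** Hypothetical sketch of the transfer steps; none of these names are Ratis APIs. */
final class TransferLeadershipSketch {
    enum Result { TRANSFERRED, ABORTED }

    /** Assumed view of the target server as seen by the prior leader. */
    interface Target {
        long matchIndex();        // highest log index known to be replicated on the target
        void replicate();         // one round of the normal log replication mechanism
        void sendTimeoutNow();    // ask the target to start an election immediately
        boolean hasHigherTerm();  // true once a message from the target carries a newer term
    }

    private boolean acceptingClientRequests = true;

    boolean isAcceptingClientRequests() {
        return acceptingClientRequests;
    }

    /**
     * Runs the transfer for at most maxRounds iterations. maxRounds is an
     * assumption that replaces a real timer so the sketch stays deterministic.
     */
    Result transfer(Target target, long leaderLastIndex, int maxRounds) {
        acceptingClientRequests = false;          // step 1: stop accepting client requests
        boolean timeoutNowSent = false;
        for (int round = 0; round < maxRounds; round++) {
            if (target.hasHigherTerm()) {
                return Result.TRANSFERRED;        // success: the prior leader steps down
            }
            if (target.matchIndex() < leaderLastIndex) {
                target.replicate();               // step 2: bring the target's log up to date
            } else if (!timeoutNowSent) {
                target.sendTimeoutNow();          // step 3: trigger the target's election
                timeoutNowSent = true;
            }
        }
        if (target.hasHigherTerm()) {
            return Result.TRANSFERRED;
        }
        acceptingClientRequests = true;           // failure: abort and resume client requests
        return Result.ABORTED;
    }
}
```

On success the flag stays false because the prior leader is no longer the leader; on abort it is reset so that client operations resume, matching the failure condition in the description.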



--
This message was sent by Atlassian Jira
(v8.20.10#820010)