You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Richard Yu (Jira)" <ji...@apache.org> on 2020/03/22 16:11:01 UTC
[jira] [Commented] (KAFKA-9733) Consider addition of leader quorum in replication model

    [ https://issues.apache.org/jira/browse/KAFKA-9733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064328#comment-17064328 ] 

Richard Yu commented on KAFKA-9733:
-----------------------------------

[~bchen225242] Do you know how much use this has for Kafka? The design and implementation for this issue would no doubt be horrendously complex, so whats your take? 

> Consider addition of leader quorum in replication model
> -------------------------------------------------------
>
>                 Key: KAFKA-9733
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9733
>             Project: Kafka
>          Issue Type: New Feature
>          Components: clients, core
>            Reporter: Richard Yu
>            Priority: Minor
>
> Kafka's current replication model (with its single leader and several followers) is somewhat similar to the current consensus algorithms being used in databases (RAFT) with the major difference being the existence of the ISR. Consequently, Kafka suffers from the same fault tolerance issues as does other distributed systems which rely on RAFT: the leader tends to be the chokepoint for failures i.e. if it goes down, it will have a brief stop-the-world effect. 
> In contrast, giving all replicas the power to write and read to other replicas is also difficult to accomplish (as emphasized by the complexity of the Egalitarian Paxos algorithm), since consistency is so hard to maintain in such an algorithm, plus very little gain compared to the overhead. 
> Therefore, I propose that we have an intermediate plan in between these two algorithms, and that is the leader replica quorum. In essence, there will be multiple leaders (which have the power for both read and writes), but the number of leaders will not be excessive (i.e. maybe three at max). How we achieve consistency is simple:
>  * Any leader has the power to propose a write update to other replicas. But before passing a write update to a follower, the other leaders must elect if such an operation is granted.
>  * In principle, a leader will propose a write update to the other leaders, and once the other leaders have integrated that write update into their version of the stored data, they will also give the green light. 
>  * If say, more than half the other leaders have agreed that the current change is good to go, then we can forward the change downstream to the other replicas.
>  The algorithm for maintaining consistency between multiple leaders will still have to be worked out in detail. However, there would be multiple gains from this design over the old model:
>  # The single leader failure bottleneck has been alleviated to a certain extent, since there are now multiple leader replicas.
>  # Write updates will potentially no longer be bottlenecked at one single leader (since there are multiple leaders available). On a related note, there has been a KIP that allows clients to read from non-leader replicas. (will add the KIP link soon).
> Some might note that the overhead from maintaining consistency among multiple leaders might offset these gains. That might be true, with a large number of leaders, but with a small number then (capped at 3 as mentioned above), the overhead will also be correspondingly small. (How latency will be affected is unknown until further testing, but more than likely, this option will probably be. configurable depending on user requirements).
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)