You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Richard Yu (Jira)" <ji...@apache.org> on 2020/03/18 15:43:00 UTC
[jira] [Updated] (KAFKA-9733) Consider addition to Kafka's replication model

     [ https://issues.apache.org/jira/browse/KAFKA-9733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Yu updated KAFKA-9733:
------------------------------
    Description: 
Note: Description still not finished.

This feature I'm proposing might not offer too much of a performance boost, but I think it is still worth considering. In our current replication model, we have a single leader and several followers (with our ISR included). However, the current bottleneck would be that once the leader goes down, it will take a while to get the next leader online, which is a serious pain. (also leading to a considerable write/read delay)

In order to help alleviate this issue, we can consider multiple clusters independent of each other i.e. each of them are their own leader/follower group for the _same partition set_. The difference here is that these clusters can _communicate_ between one another. 

At first, this might seem redundant, but there is a reasoning to this:
 # Let's say we have two leader/follower groups (I must note that these two groups does _not_ have shared memory) for the same replicated partition.
 # One leader goes down, and that means for the respective followers, they would under normal circumstances be unable to receive new write updates.
 # However, in this situation, we can have those followers poll their write/read requests from the other group whose leader has _not gone down._ It doesn't necessarily have to be  the leader either, it can be other members from that group's ISR. 
 # The idea here is that if the members of these two groups detect that they are lagging behind another, they would be able to poll one another for updates.

So what is the difference here from just having multiple leaders in a single cluster?

The answer is that the leader is responsible for making sure that there is consistency within _its own cluster._ Not the other cluster it is in communication with.  

  was:
Note: Description still not finished.

This feature I'm proposing might not offer too much of a performance boost, but I think it is still worth considering. In our current replication model, we have a single leader and several followers (with our ISR included). However, the current bottleneck would be that once the leader goes down, it will take a while to get the next leader online, which is a serious pain. (also leading to a considerable write/read delay)

In order to help alleviate this issue, we can consider multiple clusters independent of each other i.e. each of them are their own leader/follower group for the _same partition set_. The difference here is that these clusters can _communicate_ between one another. 

At first, this might seem redundant, but there is a reasoning to this:
 # Let's say we have two leader/follower groups for the same replicated partition.
 # One leader goes down, and that means for the respective followers, they would under normal circumstances be unable to receive new write updates.
 # However, in this situation, we can have those followers poll their write/read requests from the other group whose leader has _not gone down._ It doesn't necessarily have to be  the leader either, it can be other members from that group's ISR. 
 # The idea here is that if the members of these two groups detect that they are lagging behind another, they would be able to poll one another for updates.

So what is the difference here from just having multiple leaders in a single cluster?

The answer is that the leader is responsible for making sure that there is consistency within _its own cluster._ Not the other cluster it is in communication with.  


> Consider addition to Kafka's replication model
> ----------------------------------------------
>
>                 Key: KAFKA-9733
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9733
>             Project: Kafka
>          Issue Type: New Feature
>          Components: clients, core
>            Reporter: Richard Yu
>            Priority: Minor
>
> Note: Description still not finished.
> This feature I'm proposing might not offer too much of a performance boost, but I think it is still worth considering. In our current replication model, we have a single leader and several followers (with our ISR included). However, the current bottleneck would be that once the leader goes down, it will take a while to get the next leader online, which is a serious pain. (also leading to a considerable write/read delay)
> In order to help alleviate this issue, we can consider multiple clusters independent of each other i.e. each of them are their own leader/follower group for the _same partition set_. The difference here is that these clusters can _communicate_ between one another. 
> At first, this might seem redundant, but there is a reasoning to this:
>  # Let's say we have two leader/follower groups (I must note that these two groups does _not_ have shared memory) for the same replicated partition.
>  # One leader goes down, and that means for the respective followers, they would under normal circumstances be unable to receive new write updates.
>  # However, in this situation, we can have those followers poll their write/read requests from the other group whose leader has _not gone down._ It doesn't necessarily have to be  the leader either, it can be other members from that group's ISR. 
>  # The idea here is that if the members of these two groups detect that they are lagging behind another, they would be able to poll one another for updates.
> So what is the difference here from just having multiple leaders in a single cluster?
> The answer is that the leader is responsible for making sure that there is consistency within _its own cluster._ Not the other cluster it is in communication with.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)