Posted to dev@kafka.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2015/03/26 23:33:53 UTC

[jira] [Commented] (KAFKA-2017) Persist Coordinator State for Coordinator Failover

    [ https://issues.apache.org/jira/browse/KAFKA-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382835#comment-14382835 ] 

Guozhang Wang commented on KAFKA-2017:
--------------------------------------

Talked to Onur offline about possible approaches. I would like to propose that we use ZK as our first attempt for its simplicity, while staying open to moving this state back to Kafka storage, as was done for offset management, if we find this design has unexpected shortcomings.

Also, we can persist the group / consumer registry information but keep the partition assignment result in the coordinator's main memory, since the latter can be recalculated from the former as long as the calculation is deterministic (which is true for now).
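
To illustrate why a deterministic calculation makes it safe to keep the assignment only in memory, here is a minimal sketch. It uses a simple per-topic round-robin, which is not necessarily the strategy the coordinator uses. The function name and the partition_counts input (assumed to come from topic metadata) are hypothetical.

```python
from typing import Dict, List, Tuple


def assign_partitions(
    subscriptions: Dict[str, List[str]],  # consumerId -> subscribed topics (the persisted part)
    partition_counts: Dict[str, int],     # topic -> partition count (assumed to come from topic metadata)
) -> Dict[str, List[Tuple[str, int]]]:
    """Recompute the assignment purely from the inputs, with no hidden state."""
    assignment: Dict[str, List[Tuple[str, int]]] = {c: [] for c in subscriptions}
    # Iterating in sorted order makes the result independent of the order
    # in which the registry happened to be loaded.
    for topic in sorted(partition_counts):
        members = sorted(c for c, topics in subscriptions.items() if topic in topics)
        if not members:
            continue
        for partition in range(partition_counts[topic]):
            owner = members[partition % len(members)]
            assignment[owner].append((topic, partition))
    return assignment
```

Under these assumptions, a new coordinator that loads the same registry gets the same assignment as the old one, regardless of load order, so the result itself does not need to be persisted.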

The proposed ZK data structure from discussion with Onur:

{code}
/coordinator/consumers/[groupId]:
version: int
generationId: int
partitionStrategy: string

/coordinator/consumers/[groupId]/members/[consumerId]:
version: int
subscriptions: string (topic-names separated by comma)
sessionTimeout: int
{code}
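
As a rough sketch of how a member record in this layout could be written and read back: the JSON encoding and the helper names below are assumptions for illustration only, since the proposal above just lists the fields.

```python
import json
from typing import List


def member_path(group_id: str, consumer_id: str) -> str:
    """Build the per-member znode path from the proposed layout."""
    return f"/coordinator/consumers/{group_id}/members/{consumer_id}"


def encode_member(topics: List[str], session_timeout: int, version: int = 1) -> bytes:
    """Serialize a member record; JSON is an assumed encoding, not part of the proposal."""
    record = {
        "version": version,
        "subscriptions": ",".join(sorted(topics)),  # topic names separated by comma
        "sessionTimeout": session_timeout,
    }
    return json.dumps(record, sort_keys=True).encode("utf-8")


def decode_member(data: bytes) -> dict:
    """Parse a member record and split the subscriptions back into a list."""
    record = json.loads(data.decode("utf-8"))
    raw = record["subscriptions"]
    # "".split(",") would yield [""], so treat an empty string as no topics.
    record["subscriptions"] = raw.split(",") if raw else []
    return record
```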

The above structure has a disadvantage, though: since each consumer has its own znode, coordinator migration will cause lots of ZK reads. An alternative approach would be:

{code}
/coordinator/consumers/[groupId]:
version: int
generationId: int
partitionStrategy: string
members: List[consumer] (json string)
    subscriptions: string
    sessionTimeoutMs: int
{code}
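
To make the read-count difference between the two layouts concrete, here is a small sketch against an in-memory stand-in for ZK. CountingStore and both load functions are hypothetical helpers, not a real ZooKeeper client API, and the consumerId field inside each member entry is an assumption.

```python
import json
from typing import Dict, List, Tuple


class CountingStore:
    """In-memory stand-in for ZK that counts read calls (hypothetical helper)."""

    def __init__(self) -> None:
        self.nodes: Dict[str, str] = {}
        self.reads = 0

    def set(self, path: str, data: dict) -> None:
        self.nodes[path] = json.dumps(data)

    def get(self, path: str) -> dict:
        self.reads += 1
        return json.loads(self.nodes[path])

    def children(self, path: str) -> List[str]:
        self.reads += 1
        prefix = path + "/"
        return sorted(
            p[len(prefix):]
            for p in self.nodes
            if p.startswith(prefix) and "/" not in p[len(prefix):]
        )


def load_group_per_member(store: CountingStore, group_id: str) -> Tuple[dict, dict]:
    """First layout: one read for the group, one to list members, one per member."""
    base = f"/coordinator/consumers/{group_id}"
    group = store.get(base)
    members = {
        cid: store.get(f"{base}/members/{cid}")
        for cid in store.children(f"{base}/members")
    }
    return group, members


def load_group_single_node(store: CountingStore, group_id: str) -> Tuple[dict, list]:
    """Alternative layout: a single read returns the group and all of its members."""
    group = store.get(f"/coordinator/consumers/{group_id}")
    return group, group["members"]
```

With N consumers in a group, the first layout costs N + 2 reads in this sketch and the alternative costs 1. The trade-off is that every membership change rewrites the whole group znode, and the znode's size grows with the group (ZK limits znode data to about 1 MB by default).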

> Persist Coordinator State for Coordinator Failover
> --------------------------------------------------
>
>                 Key: KAFKA-2017
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2017
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: consumer
>    Affects Versions: 0.9.0
>            Reporter: Onur Karaman
>            Assignee: Onur Karaman
>
> When a coordinator fails, the group membership protocol tries to fail over to a new coordinator without forcing all the consumers to rejoin their groups. This is possible if the coordinator persists its state so that the state can be transferred during coordinator failover. This state consists of most of the information in GroupRegistry and ConsumerRegistry.


