Posted to dev@kafka.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2012/06/01 23:15:23 UTC

[jira] [Created] (KAFKA-360) Add ability to disable rebalancing in ZooKeeper consumer

Chris Riccomini created KAFKA-360:
-------------------------------------

             Summary: Add ability to disable rebalancing in ZooKeeper consumer
                 Key: KAFKA-360
                 URL: https://issues.apache.org/jira/browse/KAFKA-360
             Project: Kafka
          Issue Type: Bug
          Components: core
            Reporter: Chris Riccomini


We need a ZooKeeper-based Kafka consumer that does not rebalance, because partitioning may be handled outside of Kafka. For example, I may have a stateful process that is meant to consume only from partition 7 of a given Kafka topic. When that process stops, I don't want another consumer to take over partition 7.

The benefit of using the ZooKeeper-based consumer (vs. the SimpleConsumer) without rebalancing is that offsets are still handled by Kafka/ZK, as is failover when a partition's leader disappears or fails.

I think the way to do this is to add a consumer config parameter that disables rebalancing for a consumer group. That way, the first consumer in the group to connect (when the ZK node is created) can specify whether rebalancing is enabled for the group. If rebalancing is disabled, consumers should be required to supply the list of partition IDs that they wish to read from. Perhaps this can be done during the createMessageStreams call?
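
To make the proposal concrete, here is a minimal sketch of the intended semantics. It is only an in-memory model, not Kafka code: GroupRegistry, GroupState, register, and the rebalance_enabled flag are hypothetical names chosen for illustration, and a dict stands in for the consumer group's ZK node.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Optional


@dataclass
class GroupState:
    # Fixed by the first consumer to register; later consumers cannot change it.
    rebalance_enabled: bool
    # consumer id -> partition IDs it asked for (only used when rebalancing is off)
    claims: Dict[str, FrozenSet[int]] = field(default_factory=dict)


class GroupRegistry:
    """Hypothetical in-memory stand-in for the consumer group's ZK node."""

    def __init__(self) -> None:
        self._groups: Dict[str, GroupState] = {}

    def register(
        self,
        group: str,
        consumer_id: str,
        rebalance_enabled: bool = True,
        partitions: Optional[Iterable[int]] = None,
    ) -> GroupState:
        existing = self._groups.get(group)
        # The first consumer decides; everyone else inherits the group's setting.
        enabled = existing.rebalance_enabled if existing else rebalance_enabled
        wanted = frozenset(partitions or ())

        if not enabled and not wanted:
            raise ValueError(
                "rebalancing is disabled for group %r: "
                "consumer %r must supply partition IDs" % (group, consumer_id)
            )

        state = existing or self._groups.setdefault(group, GroupState(enabled))
        if not enabled:
            state.claims[consumer_id] = wanted
        return state
```

With this model, the stateful process from the example would register with rebalance_enabled=False and partitions=[7]. A consumer that joins the same group later without a partition list is rejected instead of being assigned partition 7.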

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-360) Add ability to disable rebalancing in ZooKeeper consumer

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288290#comment-13288290 ] 

Jun Rao commented on KAFKA-360:
-------------------------------

It seems that the main request is to control which partition a consumer consumes. I can see 3 approaches.

(1) Patch ZookeeperConsumerConnector as proposed in the description of the jira. First, we need to disable rebalancing. However, it seems that you also want to control the partition assignment yourself, so we will need a way for a consumer to pass in the partitions it wants to consume.

(2) If you want a consumer to stick to a partition, you can model each partition as a separate topic and let consumers consume each individual topic.

(3) Change SimpleConsumer so that it's not tied to a particular broker. Instead, it can take requests containing partitions hosted on any broker. Under the covers, it figures out the right brokers to connect to and handles partition leader changes. The consumer still needs to do the offset management itself.

There are tradeoffs among these approaches, and I am not sure which one is best. With approach (1), we won't be using much of the functionality of ZookeeperConsumerConnector except for offset management. Approach (2) seems a bit ad hoc. Approach (3) will make the SimpleConsumer implementation more complicated. We probably need to discuss this a bit more in the jira.
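
As a rough illustration of approach (3), the sketch below shows a consumer that looks up the leader for a partition, caches it, and looks it up again when a fetch reports a leader change. This is not the SimpleConsumer API: LeaderAwareConsumer, NotLeaderError, and the find_leader and fetch_from callables are hypothetical names, and the caller passes in the offset because offset management stays with the consumer.

```python
from typing import Callable, Dict, List, Tuple


class NotLeaderError(Exception):
    """Raised by the (hypothetical) fetch callable when the broker no longer leads the partition."""


class LeaderAwareConsumer:
    """Hypothetical wrapper: routes each fetch to the partition's current leader."""

    def __init__(
        self,
        find_leader: Callable[[str, int], str],
        fetch_from: Callable[[str, str, int, int], List[str]],
        max_retries: int = 3,
    ) -> None:
        self._find_leader = find_leader    # (topic, partition) -> broker id
        self._fetch_from = fetch_from      # (broker, topic, partition, offset) -> messages
        self._max_retries = max_retries
        self._leaders: Dict[Tuple[str, int], str] = {}  # cached leader per partition

    def fetch(self, topic: str, partition: int, offset: int) -> List[str]:
        key = (topic, partition)
        for _ in range(self._max_retries + 1):
            broker = self._leaders.get(key)
            if broker is None:
                # No cached leader (first use, or invalidated below): look it up.
                broker = self._find_leader(topic, partition)
                self._leaders[key] = broker
            try:
                return self._fetch_from(broker, topic, partition, offset)
            except NotLeaderError:
                # Leader moved: drop the cached entry and retry with a fresh lookup.
                self._leaders.pop(key, None)
        raise RuntimeError("no leader found for %s-%d" % (topic, partition))
```

The extra bookkeeping (leader cache, invalidation, retries) is the added complexity mentioned for approach (3).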
                


       

[jira] [Updated] (KAFKA-360) Add ability to disable rebalancing in ZooKeeper consumer

Posted by "Chris Riccomini (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini updated KAFKA-360:
----------------------------------

    Affects Version/s: 0.8
    


        

[jira] [Updated] (KAFKA-360) Add ability to disable rebalancing in ZooKeeper consumer

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-360:
--------------------------------

    Issue Type: Sub-task  (was: Bug)
        Parent: KAFKA-364
    
