Posted to dev@kafka.apache.org by "Jun Rao (Commented) (JIRA)" <ji...@apache.org> on 2012/02/29 19:01:58 UTC

[jira] [Commented] (KAFKA-239) Wire existing producer and consumer to use the new ZK data structure

    [ https://issues.apache.org/jira/browse/KAFKA-239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219370#comment-13219370 ] 

Jun Rao commented on KAFKA-239:
-------------------------------

1. ZkUtils.getTopicPartitionsPath: There are a couple of reasons why we chose to use /brokers/topics/partitions to store partition data, instead of /brokers/topics: (1) it distinguishes the 0.8 ZK layout from 0.7's. In 0.7 we already store broker ids directly under /brokers/topics, so putting partition ids at the same level could cause confusion. (2) it leaves room for future extension, e.g. adding non-partition-level per-topic info in ZK.
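
For illustration, a minimal sketch of path helpers under this layout (the names and constants here are assumptions, not the patch's code):

    object ZkPaths {
      val BrokerTopicsPath = "/brokers/topics"

      def getTopicPath(topic: String): String =
        BrokerTopicsPath + "/" + topic

      // Partition ids live one level below the topic node, so they cannot be
      // confused with 0.7's broker ids directly under /brokers/topics, and
      // the topic node itself stays free for future topic-level info.
      def getTopicPartitionsPath(topic: String): String =
        getTopicPath(topic) + "/partitions"
    }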

2. ZkUtils: rename doesBrokerHostPartition to something like isPartitionOnBroker.

3. LogManager: logFlusherScheduler.startUp is called twice.

4. LogManager.getOrCreateLog: The LogManager shouldn't need to know anything about ZK. The check of whether a partition exists on a broker should be done at the KafkaApi level. Ideally, we just want to check partition ownership against a local cache, which will be populated by ZK listeners (part of kafka-44). In this patch, we can either skip the check entirely or check ZK directly (with the intention of optimizing it in kafka-44).
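
A rough sketch of that separation, with illustrative names (the real cache and its ZK listeners are kafka-44's job):

    // Ownership view consulted at the KafkaApi level; LogManager never sees ZK.
    class PartitionOwnershipCache {
      // (topic, partition) pairs hosted by this broker; kept current by the
      // ZK listeners that kafka-44 will add.
      @volatile private var owned = Set.empty[(String, Int)]

      def update(partitions: Set[(String, Int)]): Unit =
        owned = partitions

      def hostsPartition(topic: String, partition: Int): Boolean =
        owned.contains((topic, partition))
    }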

5. KafkaServer: the log cleaner scheduler is scheduled in LogManager; should we start the scheduler up there too? If there is a good reason to start it in KafkaServer instead, should we protect startUp from being called more than once?
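
If startUp does stay exposed, a compare-and-set guard is one way to make repeated calls harmless (a sketch, assuming nothing about the patch's scheduler internals):

    import java.util.concurrent.atomic.AtomicBoolean

    class GuardedScheduler {
      private val started = new AtomicBoolean(false)

      def startUp(): Unit = {
        // Only the first caller wins; later calls become no-ops instead of
        // starting a second executor.
        if (started.compareAndSet(false, true)) {
          // ... create and start the executor here ...
        }
      }
    }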

6. KafkaScheduler: what's the benefit of having a startUp method? It seems simpler to create the executor in the constructor.
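
A sketch of the constructor-based alternative (signatures are illustrative): the executor exists as soon as the object does, so there is no startUp to forget or to call twice:

    import java.util.concurrent.{Executors, TimeUnit}

    class KafkaScheduler(threads: Int) {
      // Created eagerly with the object; no separate startUp step.
      private val executor = Executors.newScheduledThreadPool(threads)

      def scheduleWithRate(fun: () => Unit, delayMs: Long, periodMs: Long): Unit =
        executor.scheduleAtFixedRate(
          new Runnable { def run(): Unit = fun() },
          delayMs, periodMs, TimeUnit.MILLISECONDS)

      def shutdown(): Unit = executor.shutdown()
    }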

7. Partition should be made logical: it should contain only the topic and the partition id. There will be a new entity, Replica, which is associated with a broker id.
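
For concreteness, the split could look like this (illustrative definitions, not the patch's):

    // Logical partition: no broker id anywhere in it.
    case class Partition(topic: String, partitionId: Int)

    // A replica ties a logical partition to the broker hosting a copy of it.
    case class Replica(brokerId: Int, partition: Partition)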

8. ProducerPool: remove unused import and fix indentation in close()

9. AsyncProducerTest: There is duplicated code that sets up mocks to form the expected partition metadata. Can we pull it into a shared helper method?
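
Something along these lines, say (MetadataClient is a hypothetical stand-in for whatever the test actually mocks; the EasyMock calls themselves are standard):

    import org.easymock.EasyMock

    trait MetadataClient {
      def partitionsFor(topic: String): Seq[Int]
    }

    object AsyncProducerTestHelper {
      // Shared helper: builds one mock that returns the expected partitions,
      // replacing the copies of this setup scattered across the test cases.
      def expectPartitions(topic: String, partitions: Seq[Int]): MetadataClient = {
        val client = EasyMock.createMock(classOf[MetadataClient])
        EasyMock.expect(client.partitionsFor(topic)).andReturn(partitions).anyTimes()
        EasyMock.replay(client)
        client
      }
    }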

10. Producer: Is zkClient in the constructor just for testing? If so, add a comment to indicate that.

11. TestUtils: why do we need checkSetEqual? Doesn't Scala's Set equality do that by default?
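
Scala collections compare by contents, so a plain equality check on two Sets is already order-insensitive:

    // Element-wise equality, regardless of construction order.
    Set(1, 2, 3) == Set(3, 2, 1)            // true
    assert(Set("a", "b") == Set("b", "a"))  // passes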

12. ZookeeperConsumerConnectorTest.testLeaderSelectionForPartition: is it really testing leader election? It seems to be testing partition ownership.

13. ZkUtils.getLeaderForPartition: This is probably fine for this patch. However, we should think about whether it is better to read the leader info from ZK directly or to use the getMetaData api.
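
For reference, reading it straight from ZK could look like this (the leader path is an assumption about the new layout; readData is the I0Itec ZkClient call Kafka already uses elsewhere):

    import org.I0Itec.zkclient.ZkClient

    // Returns None if the leader path does not exist yet.
    def getLeaderForPartition(zkClient: ZkClient,
                              topic: String,
                              partition: Int): Option[Int] = {
      val path = "/brokers/topics/" + topic + "/partitions/" + partition + "/leader"
      // second argument: return null instead of throwing if the path is missing
      Option(zkClient.readData[String](path, true)).map(_.toInt)
    }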

                
> Wire existing producer and consumer to use the new ZK data structure
> --------------------------------------------------------------------
>
>                 Key: KAFKA-239
>                 URL: https://issues.apache.org/jira/browse/KAFKA-239
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: core
>            Reporter: Jun Rao
>            Assignee: Neha Narkhede
>         Attachments: kafka-239-latest-revision-unit-tests-passing.patch, kafka-239-v2.patch
>
>
> We can assume the leader of a partition is always the first replica. Data will only be stored in the first replica. So, there is no fault-tolerance support yet. Just make the partition logical.
