Posted to dev@kafka.apache.org by "Jun Rao (Commented) (JIRA)" <ji...@apache.org> on 2012/04/18 00:48:17 UTC
[jira] [Commented] (KAFKA-332) Mirroring should use multiple producers; add producer retries to DefaultEventHandler

    [ https://issues.apache.org/jira/browse/KAFKA-332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256026#comment-13256026 ] 

Jun Rao commented on KAFKA-332:
-------------------------------

Some comments:
1. DefaultEventHandler:
1.1. It would be useful to see the retry number in the trace log.
1.2. We should catch all Throwables.

2. ProducerConfig: explain a bit more why num.retries is not appropriate for the zk-based producer. Basically, during a resend, we don't re-select brokers, so a retry goes to the same broker.

3. MirrorMaker: The usage of circularIterator is pretty fancy. Would it be simpler to just put all producers in an array and loop through it circularly? 
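
To illustrate the array-based alternative in comment 3, here is a minimal sketch in Java. It is not taken from the patch: the class name RoundRobin and its methods are hypothetical, and the items stand in for whatever producer type the mirror-maker holds. It only shows that a counter taken modulo the array size is enough to cycle through the producers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Kafka: hands out items in circular order.
final class RoundRobin<T> {
    private final List<T> items;
    private final AtomicInteger counter = new AtomicInteger(0);

    RoundRobin(List<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("need at least one item");
        }
        // Defensive copy so later changes to the caller's list have no effect.
        this.items = new ArrayList<>(items);
    }

    // Returns the next item, wrapping around at the end of the list.
    // floorMod keeps the index non-negative even after the counter overflows.
    T next() {
        int index = Math.floorMod(counter.getAndIncrement(), items.size());
        return items.get(index);
    }
}
```

With num.producers producers in the list, each call to next() would pick the producer for the next message, and the AtomicInteger keeps the selection safe if several consumer threads share the same instance.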
                
> Mirroring should use multiple producers; add producer retries to DefaultEventHandler
> ------------------------------------------------------------------------------------
>
>                 Key: KAFKA-332
>                 URL: https://issues.apache.org/jira/browse/KAFKA-332
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>            Priority: Minor
>         Attachments: KAFKA-332-v1.patch
>
>
> I'm clubbing these two together as these are both important for mirroring.
> (1) Multiple producers:
> Shallow iteration (KAFKA-315) helps improve mirroring throughput when
> messages are compressed. With shallow iteration, the mirror-maker's consumer
> does not do deep iteration over compressed messages. However, when its
> embedded producer sends these messages to the target cluster's brokers, the
> receiving broker does deep iteration to validate the messages before
> appending to the log.
> In the current (pre-KAFKA-48) request handling mechanism, one producer
> effectively translates to one server-side thread for handling produce
> requests, so there is still a bottleneck due to decompression (due to
> message validation) on the target broker.
> One way to work around this is to use broker.list with the same broker
> listed multiple times under different ids. E.g.,
> broker.list=0:localhost:9191,1:localhost:9191,2:localhost:9191,... which
> effectively emulates multiple server-side threads. It would be better to
> just add a num.producers option to the mirror-maker and instantiate that
> many producers.
> (2) Retries:
> If the mirror-maker uses broker.list and one of the brokers is bounced for
> any reason, messages can get lost. Message loss can be reduced/avoided if
> the brokers are behind a VIP and if retries are supported. This option will
> not work for the zk-based producer because the decision of which broker to
> send to has already been made, so retries would go to the same (potentially
> still down) broker. (With KAFKA-253 it would work for zk-based producers as
> well, but that is only in 0.8).
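
The retry behaviour discussed in this thread (a bounded number of retries, the retry number visible in the log, and every Throwable caught) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the attached patch: RetrySketch, sendWithRetries and backoffMs are hypothetical names, the send is modelled as a plain Callable, and System.err stands in for a trace logger.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch, not Kafka's DefaultEventHandler.
final class RetrySketch {
    // Calls send once, then retries up to numRetries more times on any failure.
    static <T> T sendWithRetries(Callable<T> send, int numRetries, long backoffMs)
            throws Exception {
        if (numRetries < 0) {
            throw new IllegalArgumentException("numRetries must be >= 0");
        }
        Throwable last = null;
        for (int attempt = 0; attempt <= numRetries; attempt++) {
            try {
                if (attempt > 0) {
                    // Stand-in for a trace-level log line carrying the retry number.
                    System.err.println("TRACE retry " + attempt + " of " + numRetries);
                }
                return send.call();
            } catch (Throwable t) {
                // Catch everything, not just Exception, so no failure skips the retry.
                last = t;
                if (attempt < numRetries) {
                    Thread.sleep(backoffMs);
                }
            }
        }
        throw new Exception("send failed after " + numRetries + " retries", last);
    }
}
```

Note that the sketch re-invokes the same send each time. As described above, that only helps if the retried send can reach a different broker, for example when the brokers are behind a VIP.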
